Not even cold yet

Tasks spawned on external peripherals need a machine host, the smallest specialized devices and HD televisions. The one I received was still warm, running hot, the user logged in. In my hands was a snatched appliance, formerly in service, now a piece for automation, breathing heat in the desert between plastic keys and LCD.

“I wouldn’t develop here,” he said, nudging a solid chunk of dust from the heatsink. He tapped and blew away the fragments that remained, then put the parts back in their place. “They would have to pay me first.”

If that were true, I would not be where I was. I could not say it was purely for love anymore, but I was still a chef in a kitchen of free brilliants. Just over the next project was another opportunity, even so a permutation of prior knowns, pieced together a little different, a new same. I fancied the person to replace me would be an angel, knowing the superset of things, perfect in the ways I was not.

“She would quit a few days later,” he scoffed.

Trivial ownage

Something I never thought I would do: owning my own machine. On Windows, physical access works just as well as it does on OpenBSD. First, recovering a wifi password:

Second, resetting admin privs:

1. Reboot and hold F8.
2. Boot into safe mode with Command Prompt.

> net user root /add
> net localgroup Administrators root /add
> net localgroup Users root /delete

New investments into an existing codebase

Working with someone else’s code was a funny feeling: I wanted to do the minimum possible addition that would add a feature, copying his style (and, possibly, his code). For the domain, the CSS could be repeated; that was a useful crash course. I manually typed each block, like an unguided tutorial.

The rest of it is straightforward, but that is a testament to the original author. The program is a collection of objects stored in an array, and general enough to accommodate new contexts. It does not use any frameworks, and relative paths keep assumptions clean.

Usually the story goes that you inherit a massive, ten-thousand-line codebase with obfuscated identifiers and hundred-line functions. Every piece of inline PHP longer than eighty characters made me feel like a thief in a temple of Mozart, each stone a note belonging, each addition an affront to aesthetics.

Loosening up the parser

pdftotext.exe is nice, but its output is not always reliable. This is not the fault of the tool, but the nature of converting generated PDFs into plain text. The generator assumes the recipient is a human being and not a machine. Simplifying the algorithm required a paradigm shift: when it comes to money, a human will verify the output.

Instead of identifying specific pieces of text, I start reading everything after a certain number of lines. Tcl vacuums everything into a list devoid of commas, and this is placed into an Excel cell. A human could parse it with some trouble, but I count it a usable iteration versus opening each multi-page PDF one at a time.
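A minimal sketch of that idea, in Python rather than Tcl. The skip count, the function name, the file name, and the sample text are all assumptions made up for illustration; this is not the original script. It drops a fixed number of leading lines, squeezes the rest into one space-separated string, and writes it through the csv module so that embedded commas stay inside a single cell.

```python
import csv
import io


def flatten_after(text, skip_lines):
    """Drop the first skip_lines lines, then join the remaining words."""
    kept = text.splitlines()[skip_lines:]
    words = []
    for line in kept:
        # split() with no arguments discards blank lines and runs of spaces.
        words.extend(line.split())
    return " ".join(words)


# Hypothetical pdftotext output: two header lines, then the part worth reading.
sample = "ACME Corp\nInvoice\n\nWidgets  3\nTotal   1,200.00\n"
cell = flatten_after(sample, skip_lines=2)

# One row per PDF: the (hypothetical) file name, then everything else in one cell.
buffer = io.StringIO()
csv.writer(buffer).writerow(["invoice.pdf", cell])
print(buffer.getvalue())
```

The quoting is left to the csv writer on purpose: amounts such as 1,200.00 keep their commas, and the person verifying the sheet still sees one cell per document.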

The next step would be to print out the source, mark it up, and be ready the following day to bang out code, a la HN’s edw(?). In this case, “good enough” meets pragmatics: the data will be verified manually, so there is a cap on the extent of automation.

Using PHP in local server mode

The PHP interpreter has a server mode to test files. It is mentioned in the documentation in an early comment. I’ve been trying to use Tcl instead of batch scripting to make future applications “space-friendly” [1] for users, so I wrote this as a quick start:

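# Path to the PHP binary; adjust for your install.
set php {C:/php-5/php.exe}
# Serve the php directory under WORKDIR as the document root.
cd $::env(WORKDIR)/php

# Start the built-in server; this call blocks until the server stops.
exec $php -S localhost:8000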

Now you can visit that address and interact with your PHP scripts. Another quick-and-dirty method is to save files as .cgi and point Apache to the PHP binary:

  echo '<p>hello world</p>';

[1] Integrating batch scripts with Tcl is easy, but spaces in path names passed into batch scripts can lead to failing demos.

Calculate SHA-1 with Microsoft’s fciv

I had to calculate the SHA-1 hash of the PHP zip archive, and the File Checksum Integrity Verifier lets you do just that. It runs on the command line. The installation instructions need to be followed carefully, but otherwise everything is golden. I’ve only tried it on one download so far, though.

My PATH is too full to accommodate new binaries, so invoking fciv can be a hassle. But it’s worth the trouble. If you have not been checking hashes for important downloads like Apache, or on any site that offers a hash, it’s a good habit to start now.
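For machines where fciv is not installed, the same check can be sketched with Python's standard hashlib module. This is a stand-in, not a description of fciv itself; the function names are made up here, and the demo file holds the bytes "abc", whose SHA-1 is a well-known test vector, in place of a real download and its published hash.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare against a hash copied from a download page, ignoring case."""
    return sha1_of_file(path) == published_hex.strip().lower()


# Demo file standing in for a download (assumption: contents are b"abc").
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name

ok = matches_published(demo_path, "A9993E364706816ABA3E25717850C26C9CD0D89D")
os.remove(demo_path)
print(ok)
```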

Builders and re-builders

There was a divergence early that made it easy to determine whether my outlook was at first the same or different from others: given access to the multiple spreadsheets generated from a database of opaque tables, do you use it for its ends or rebuild an edifice of guesswork and gumption? I chose the former, and no one else did.

The problem is the data can be changed by columns, things that cannot be kept static in contracts, but added, removed, or shifted by outside requests, and these updates are not made known to us. We are ghosts in the machine, tolerated but without caste: unkempt, unruly – surely to the masters of those tables, sentient bugs in a process of manual forms.

Exceptions cannot be tolerated on our end, because we hold queries just as sacred: no impure data can go into the database; abort on error. Yet there is hesitance in the automation, because each column is a string: semantics are impossible except by regular expressions that must be at once flexible and strict, future-proof but dynamic.

This is a ripe field for machine-learning. We have an infinitude of data pieces; all we need do is build a training set to recognize certain strings. Features like words in the string, “cadence,” average length, and so on could be enough to guess; in our case, all we need is a guess that the target columns no longer hold names of fruits but prices, and layer a task queue on top.
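A rough sketch of what such a guess could look like, using a hand-rolled nearest-centroid classifier instead of a real machine-learning library. Everything here is an assumption for illustration: the three features (length, share of digits, share of letters), the tiny training set, and the labels. "Cadence" is not modeled, and the task queue is left out.

```python
def features(value):
    """Turn one cell into numbers: scaled length, digit share, letter share."""
    text = value.strip()
    size = max(len(text), 1)
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return (len(text) / 10.0, digits / size, letters / size)


def centroid(values):
    """Average the feature vectors of a whole column."""
    rows = [features(v) for v in values]
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def guess_label(column, centroids):
    """Pick the label whose centroid is closest (squared Euclidean distance)."""
    point = centroid(column)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(point, centroids[label]))

    return min(centroids, key=distance)


# Hypothetical training set: what each kind of column has looked like before.
training = {
    "fruit": ["apple", "banana", "cherry", "kiwi"],
    "price": ["1.99", "$12.50", "0.75", "3"],
}
centroids = {label: centroid(values) for label, values in training.items()}

# A column that is supposed to hold fruit names but arrived like this.
incoming = ["4.25", "$0.99", "10.00"]
label = guess_label(incoming, centroids)
print(label)
```

If the guessed label disagrees with what the column is supposed to hold, the import can abort before anything reaches the database, which keeps the abort-on-error rule intact.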

Better gains

Admission to a server brings along opportunities, this time in the form of additional users. Projects are otherwise opaque unless shared by request, and I want to avoid any misunderstanding. Even with free software and problems to solve, things get political too easily. Does management know the quiet feud simmering; do they encourage it as a kind of competition? Scarcity lends its own awareness; altruism isn’t the first instinct to people who have never heard of the Free Software Foundation, GNU, BSD, etc.

Not that anyone is obligated to share. I’ve mentioned it before. But I don’t gain anything by sharing what I know, and that makes me cautious. I don’t want to be cautious. I just want to write code and solve problems. To exult in useful, elegant algorithms, emergent amid cycles of repeat-fail. If anything, the best advice I can give has nothing to do with programming.

Learn marketable skills. Build a portfolio. Engineer an escape out of this cardboard box. Do the opposite of things in the ever after, because everything I do is the prolonging of a dream.

Not even knowing the server helped

Of the inherited codebase, server logs showed an IP address with visits from two different browsers: one a modern version of Internet Explorer, and another that was much older. Although the problem had been resolved, I was tasked to find the root cause. The only other interesting tidbit in the logs was that someone had used a debug=true flag as part of a GET request, which was not a typical use case. It was otherwise inconclusive.

Later, I found the problem had been fixed on the user end by turning off Compatibility Mode for Intranet sites. Unable to debug from the server, and unable to collect enough data – screenshots, emails, anything else that had been sent to the supervisor – I was unable to determine anything.

This experience allowed me to learn about Apache logs, grep, and remote file transfer, even though the issue had not been serious. What I should do is write down an SOP and a script for the task of “check server logs.”
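A first pass at that script might look like the sketch below. It assumes Apache's combined log format, and the sample lines, addresses, and timestamps are invented for the example. It answers the two questions from this incident: which IP addresses showed up with more than one user agent, and which requests carried debug=true.

```python
import re
from collections import defaultdict

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "agent"
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (IPs seen with several user agents, requests containing debug=true)."""
    agents_by_ip = defaultdict(set)
    debug_requests = []
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        agents_by_ip[match["ip"]].add(match["agent"])
        if "debug=true" in match["request"]:
            debug_requests.append((match["ip"], match["request"]))
    mixed = {ip: sorted(agents) for ip, agents in agents_by_ip.items() if len(agents) > 1}
    return mixed, debug_requests


# Hypothetical log lines; real use would read them from the access log file.
sample = [
    '192.0.2.10 - - [01/Jan/2014:10:00:00 +0000] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1)"',
    '192.0.2.10 - - [01/Jan/2014:10:05:00 +0000] "GET /index.php?debug=true HTTP/1.1" 200 734 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1)"',
    '192.0.2.20 - - [01/Jan/2014:10:06:00 +0000] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1)"',
]
mixed, debug_requests = summarize(sample)
print(mixed)
print(debug_requests)
```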

Thoughts on hosting ASP.NET

We didn’t want to plunk down for a full version of IIS, so we looked at different options. Apache 2.2 with mod_aspdotnet works for .NET 2.0 according to its instructions, and .NET 4.0.x can be configured with an additional directive. However, the default C# ASP.NET web site project created from Visual Studio fails with an “IIS integration pipeline” error with .NET 4.5. Creating the project using .NET 3.5 works, though.

The second approach is using IIS Express 8.0, and this is what we’ll go with. We get to use the latest .NET framework and use Visual Studio. Being a fan of Mojolicious, I like <%’s as much as anyone, but my teammates prefer the Microsoft stack.

Modifying IIS Express 8.0 for simple hosting involves the following:

  • modify My Documents\IISExpress\config\applicationhost.config with your site details
  • open a port for iisexpress.exe in Windows Firewall

Here is a summary of my investigations:

Server                         Project                            Result
Apache 2.2 w/ mod_aspdotnet    hello.aspx                         Works as instructed
Apache 2.2 w/ mod_aspdotnet    Visual Studio C# ASP.NET web       Appears to work
 + AspNetVersion directive     site with target .NET 3.5          (blank page)
 (4.0.30319)
Apache 2.2 w/ mod_aspdotnet    Visual Studio C# ASP.NET web       Error: integrated pipeline
 + AspNetVersion directive     site with target .NET 4.5
 (4.0.30319)
IIS Express 8.0                .NET 4.5 project                   Works