Convert files to Unicode with VBA

I had a case where Unicode characters would show normally in Notepad, but opening the same files in Excel resulted in gibberish. I still wanted to use VBA because of automation via cscript. The FileSystemObject approach and the OpenText() subroutine both failed to reproduce the Unicode characters. Instead, I used a combination of ADODB.Stream and brute-force conversion of each non-ASCII character to a &#xx; reference.

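Set objStream = CreateObject("ADODB.Stream")
objStream.Type = 2                   ' 2 = adTypeText
objStream.CharSet = "UTF-8"
objStream.Open                       ' the stream must be open before loading
objStream.LoadFromFile strInputPath  ' strInputPath is a placeholder for the source file path

strData = objStream.ReadText         ' read the whole file as text
objStream.Close
strData = ITSlugFunction(strData)  ' see link below

strData = "<html>" & strData & "..."

' write string to C:\test.html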

You can embed this into an Outlook mail message and get Unicode characters from the command-line. Yes, Outlook will automatically treat &#xx; characters as Unicode!
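
The conversion step can be sketched roughly as below. This is a guess at what a helper like ITSlugFunction does, not the original code: the name ToNumericEntities is hypothetical, and it assumes the goal is decimal &#...; references for everything outside ASCII. It avoids typed declarations so that it should also run under cscript.

```vba
' Hypothetical stand-in for ITSlugFunction; not the original helper.
' Replaces every non-ASCII character with a decimal numeric character
' reference such as &#233; and leaves ASCII characters untouched.
Function ToNumericEntities(strText)
    Dim i, code, low, result
    result = ""
    i = 1
    Do While i <= Len(strText)
        code = CLng(AscW(Mid(strText, i, 1)))
        ' AscW returns a signed 16-bit value, so shift negatives into 0..65535
        If code < 0 Then code = code + 65536
        ' A high surrogate followed by a low surrogate forms one code point
        If code >= 55296 And code <= 56319 And i < Len(strText) Then
            low = CLng(AscW(Mid(strText, i + 1, 1)))
            If low < 0 Then low = low + 65536
            If low >= 56320 And low <= 57343 Then
                code = (code - 55296) * 1024 + (low - 56320) + 65536
                i = i + 1   ' skip the low surrogate that was just consumed
            End If
        End If
        If code < 128 Then
            result = result & ChrW(code)
        Else
            result = result & "&#" & code & ";"
        End If
        i = i + 1
    Loop
    ToNumericEntities = result
End Function
```

For example, ToNumericEntities("caf" & ChrW(233)) returns caf&#233;. Because the result is plain ASCII, it can then be written out with an ordinary FileSystemObject text file.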


Reading the well-written

I thought probability theory would be easy, but I was wrong. It feels like black magic, even when you have the solution. My prerequisites were weak sauce, so I knew I had to fix it. I decided to work through “Fundamental Concepts of Algebra.”

Where am I going to find the time to grind through these books? I also need to apply the stuff, so that my programming does not atrophy. I’ve pondered moonlighting as a sys admin at home. Wouldn’t that be a great second job: coming up with learning materials and infrastructure for mom and pop, typical users, and gleaning killer UX experience along the way?

The answer is yes! Of course. I could create a whole ecosystem here. To them, I could be Apple and Google. I could be their rockstar. I could show them my talent. My skills would grow with their computer literacy.

Back to the math: so, you can take any set of objects – logs, men, or rocks – and establish a one-to-one correspondence with the positive integers, counting up only as far as the set requires. That cardinal value – that *number* – ties concrete notions to the abstract. Now we can operate on them with all the finesse of mathematics.

Old dog, older tricks

I’ve been learning math. The books are cheap and the reasons are greater. I realized everyone could queue up to whatever I had made and improve on it, and I couldn’t see the end. It’s unfortunate that I have to divide my time between theory and application, in the sense that I am between being late and being behind. My one consolation is being in a place frozen in a kind of stasis, a world in lag, catching up only as events dictate.

I started programming initially with the hope of having a mentor. It would have been nice to have direction. Now that compass is mine, because I did not recognize it earlier: to build one’s network, to cultivate positive experiences with all, to not be so ideologically centered.

Those are my sins and I will own them. In ten years, perhaps I will have grown up a little.

Wayfarer paradise

You know the one thing that trumps a developer? It is the early hire who is not a developer by title, but who has picked up enough sense from homework to make polished pieces. Nose to the grindstone and biding time, seizing the moment when the original software breaks – insert self into opportunity.

Domain knowledge is gold, and failure led to replacement: when there is fear, it is best to let go. The users will benefit from someone who has been with them in the trenches, who has developed the discipline of “not cutting corners,” and maintenance effort is thus delegated.

I offered further training, but the explicit understanding was that these “developers” would only be treated as “technical users.” How could I in good conscience then proceed?

Art of the selfless

I made myself redundant by sharing my code. Fearful and insecure, I did the only thing I could: let it go, let it grow, and by fortune inherit destiny. In my gut, that stone hurled overboard into a shadow sea; my mind, bereft of sugar and running on steamed veggies: I had to sow a form of chaos, to plant new roots for all our good.

We don’t have a culture of sharing. Is it because we do not have a reasonable repo? Maybe it would help if I put up the first words for it, the initial tutorials, and encouraged us to strengthen each other. This is the same regret as when the idea for a game fell through in school: I did not bring the biggest investment, and no one wanted to chip in to nothing.

Previously, I wished to establish a “developer culture.” Aren’t I four steps closer to my wish? Then why does it feel like I am at Microsoft, an alpha of one, against factions?

Expeditionary pages

In the grand tradition of self-reference, I realized one could write pages which summarized other pages. Your corpus was your work, and this new article would serve as a curation of selected content. These “expeditionary pages” give your notebook a longer shelf life, because lookups cost less than a complete linear search. You also do not have to set aside some guessed number of pages for an index; you can keep writing as a journal, entry by entry.

I would like my next notebook to be gridded, because I have been drawing flowcharts and little printing squares for Fortran. The goto statement remains formidable, but initial design lends sanity to the testing. Modifying the program remains awkward, as each branching requires a grasping familiarity of what then happens. It would be nice to say my brain is approaching the sharpness of a well-run stack, but that would be lying.

Cards in the forest

Field widths almost demand intimacy with the input data – across all possible ranges, even. It was probably easier when the inputs were punched onto cards, or prepared by another trusted program, with all significant digits accounted for. The stored value and its subsequent printing can vary greatly, mostly because of these elements:

  • did the user punch a decimal point or not?
  • is the read() field width in FORMAT given as Fx.0 or Fx.y?

Here is sample output from reading in two numbers with different field widths:

[Image: field-width sample output]

And a summary of observations:

  • blanks do not become zero
  • gaps are collapsed
  • arguments are “numbers on a card” and nothing more

RemoveDuplicates() doesn’t always work

Explicitly specify the columns argument and surround the array variable with parentheses. The documentation speaks of defaults and conveniences; do not believe it.

Range("A:A").RemoveDuplicates Columns:=(Array(1)), Header:=xlYes
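
A slightly longer sketch of the same advice, with the column list held in a variable, which is the case where the parentheses seem to matter most. The sub name, the range A:C, and the choice of columns are illustrative assumptions:

```vba
Sub DedupeOnFirstTwoColumns()
    Dim cols As Variant
    ' Compare rows on the first and second columns of the range
    cols = Array(1, 2)
    ' The extra parentheses pass cols by value; without them the call
    ' is commonly reported to fail with a runtime error
    ActiveSheet.Range("A:C").RemoveDuplicates Columns:=(cols), Header:=xlYes
End Sub
```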

I’ve started using scratch tables, ephemeral views (if spreadsheets were single tables). I’ll spawn one at the end with Sheets.Add(), process it, and then delete it. It feels like each solution is subtly different, and I am gradually building a collection of useful subroutines. Things are getting cleaner, too.
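
The spawn, process, delete cycle might look like the sketch below. The sub name and the placeholder cell write are assumptions; the real processing would go in their place.

```vba
Sub RunWithScratchSheet()
    Dim ws As Worksheet
    ' Spawn the scratch sheet at the end of the workbook
    Set ws = Worksheets.Add(After:=Sheets(Sheets.Count))

    ' Placeholder for the real processing on the scratch sheet
    ws.Range("A1").Value = "scratch data"

    ' Delete the sheet without the confirmation prompt, then restore alerts
    Application.DisplayAlerts = False
    ws.Delete
    Application.DisplayAlerts = True
End Sub
```

Turning DisplayAlerts back on right after the delete keeps later prompts working as usual.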

The common elements I’ve been using are column enumerations, a sheet to act as a trace output console, and a lot of abuse with evaluations. Here I concatenate a literal ampersand into a runtime formula, along with escaped double quotes and a UNC workbook path:

Cells(row_no,1) = "=VLOOKUP(J" & row_no & ", '\\net\path\[wb.xlsx]" & _
  "Sheet1'!B:D, 3, 0)" & Chr(38) & """yes"""

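To check the quoting before writing to a cell, the same string can be built in a function and printed to the Immediate window. The function and sub names here are hypothetical, and the workbook path is the same placeholder used above:

```vba
' Hypothetical helper that builds the formula text for one row
Function BuildLookupFormula(row_no As Long) As String
    ' Chr(38) is the ampersand; """yes""" yields "yes" with its quotes
    BuildLookupFormula = "=VLOOKUP(J" & row_no & ", '\\net\path\[wb.xlsx]" & _
        "Sheet1'!B:D, 3, 0)" & Chr(38) & """yes"""
End Function

Sub ShowLookupFormula()
    ' Print the formula text for row 5 to the Immediate window
    Debug.Print BuildLookupFormula(5)
End Sub
```

For row 5 this prints =VLOOKUP(J5, '\\net\path\[wb.xlsx]Sheet1'!B:D, 3, 0)&"yes", which is the text Excel receives as the formula.
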
OpenBSD 5.5 on VirtualBox with AMD A6

(I’ve got Windows 8.1.) Here are the settings I used:

  • Version: OpenBSD 64-bit
  • Enable I/O APIC
  • Extended Features: Enable PAE/NX
  • Processor(s): 1 CPU
  • Hardware Virtualization: Enable VT-x/AMD-V
  • Uncheck Nested Paging
  • install55.iso file
  • BIOS: disable secure boot

Probably a lot of these settings are unnecessary if you pick the right version of OpenBSD to start. I wanted 32-bit to stick to the single-threaded kernel, but whatever.