Yesterday my friend Khan posted this:
TL;DR – the second semicolon is not really a semicolon.
Source code for programs is stored in plain text files. With few exceptions, all words are in English and all characters come from the set of printable 7-bit ASCII characters. If you are crafty, or you copy and paste from a WYSIWYG editor (like Microsoft Word), you can end up inserting special characters that look like valid characters.
If you run into this yourself, first try this:
- re-type the broken line of code by hand
- if the re-typed line works, delete the broken line
- keep calm and carry on
Works like a charm, every time. Eric Kolve taught me that trick for solving these kinds of problems (almost 20 years ago!). But of course you want to know why, and typing is a pain, so keep reading.
Whenever every character looks perfect but the code won’t compile or run, I first open the file in Vim, or copy and paste the text into it. I did this with the two lines Khan presented. Right away I noticed a difference in the semicolons, thanks to the fixed-width font my console uses (“Monospace, 9pt”, apparently). I confirmed it with the Vim ga command on both semicolons: the first is 0x3b (a proper 7-bit ASCII semicolon, as any interpreter/compiler would expect) and the second is 0x037e (U+037E, the Greek question mark), a special character far outside the 7-bit ASCII range that looks exactly like your usual semicolon.
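If you would rather not inspect characters one at a time, the same check can be scripted. Here is a minimal Python sketch that reports every character outside the 7-bit ASCII range, along with its position and Unicode name. The function name find_non_ascii and the two sample lines are made up for illustration; they are not Khan's actual code.

```python
import unicodedata

def find_non_ascii(text):
    """Return (line, column, char, code point, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:  # anything above 0x7F is outside 7-bit ASCII
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col_no, ch, f"U+{ord(ch):04X}", name))
    return hits

# Hypothetical sample: the second line ends in U+037E instead of ';'.
sample = "int a = 1;\nint b = 2\u037e\n"
for hit in find_non_ascii(sample):
    print(hit)
```

For the sample above, this flags only the last character of the second line and names it GREEK QUESTION MARK.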
Other tools I sometimes use for these types of problems: od (“octal dump”), xxd (a hex dumper), diff (put in two different files or do the diff using Vim windows), or the unicode command (when I’m looking for more information on a single character).
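To get a feel for what a hex dumper shows, here is a small Python sketch that prints the raw bytes of each character. It assumes the file is saved as UTF-8, which I have not checked for Khan's snippet; the helper name hex_bytes is my own.

```python
def hex_bytes(text, encoding="utf-8"):
    """Return the encoded bytes of text as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))

print(hex_bytes(";"))       # the ASCII semicolon is a single byte
print(hex_bytes("\u037e"))  # the look-alike takes two bytes in UTF-8
```

Under that assumption the real semicolon shows up as the single byte 3b, while the look-alike shows up as the two bytes cd be, which is the pattern to look for in od or xxd output.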
I really like the book CODE by Charles Petzold for learning about character codes and computers in general. This book is something of a gentle introduction to the kinds of material covered in computer science machine architecture classes.
Original post by Khan on Facebook (not public).