Monday, June 19, 2017

Toggle syntax highlighting to catch bugs

I usually write code with syntax highlighting enabled.  While preparing my final commit message, I view the proposed diff in its own color scheme (red for removed lines, green for added lines, white for context lines).  Even though I've spent hours working on a patch, I often spot mistakes in my newly added lines as soon as the color scheme changes.  Apparently, psychologists already knew about this phenomenon: “Once you’ve learned something in a particular way, it’s hard to see the details without changing the visual form.”  The article suggests other visual changes like using a different font or printing to paper.

I use two other, related hacks to help me find mistakes in my code:

  • go to sleep and review my code in the morning
  • do something completely unrelated (watch a video, play a game, read a book, work on a different problem) to force my mind to lose as much of its mental model as possible, then review my code again

In each case, the new perspective often reveals details that I overlooked before.

Saturday, May 20, 2017

Sensitive survey questions

Do you steal from your employer? Do you lie on your taxes? Have you cheated on your wife?  If you want to gather statistical information about these questions, you can't ask directly.  Most respondents will lie.  I'm aware of three methods for addressing the problem; two of them are quite clever.

Bogus Pipeline

The first one is not particularly clever.  Hook the subject to a machine.  Tell them it's a lie detector even though it's not. Ask them to respond honestly and pose a few baseline questions to which you know the answer (What's your name? What day is it? etc). After each answer, have the machine indicate that it detected truth.  Now ask the subject to respond deceptively and ask more baseline questions. After each response, have the machine indicate that it detected a lie.  Now hide the machine's truth/lie indicator and ask your questions.  Most subjects will tell the truth.

This is called a bogus pipeline. It's complicated to implement, requires physical access to the subject, and isn't as accurate as other techniques.

Randomized Response

Ask the subject to flip a coin without revealing the result.  If it lands heads, they should answer truthfully. If it lands tails, they should answer yes (or whatever the socially unfavorable answer is).  Now ask your question.  Applying some simple math to the aggregate responses, you can accurately estimate the percentage you want to know: half the subjects say yes automatically, so the observed yes rate is 0.5 × (true rate) + 0.5, which means the true rate is 2 × (observed rate) − 1.

This one's pretty helpful, but it requires the subject to have a coin (who uses coins anymore?).  The subject must also be smart enough to recognize that the coin gives them deniability. It seems obvious, but it's not obvious to everyone.
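As a sanity check, here's a small simulation of the coin-flip scheme.  The 30% true rate and the sample size are made up for illustration; the point is that the estimator recovers the rate without any individual answer being incriminating.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	r := rand.New(rand.NewSource(1))
	const trueRate = 0.30 // hypothetical true "yes" rate we hope to recover
	const n = 1000000

	yes := 0
	for i := 0; i < n; i++ {
		if r.Float64() < 0.5 { // heads: answer truthfully
			if r.Float64() < trueRate {
				yes++
			}
		} else { // tails: always answer "yes"
			yes++
		}
	}

	observed := float64(yes) / n
	// Half the subjects say yes automatically, so observed = 0.5*p + 0.5.
	estimate := 2*observed - 1
	fmt.Printf("observed yes rate %.3f, estimated true rate %.3f\n", observed, estimate)
}
```

With a million simulated subjects the estimate lands very close to the 30% we started with.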

Unmatched Count

Construct an innocuous survey along these lines: "How many of the following statements are true about you? I own a dog. I drink coffee. I've been married. I have brown hair."  Construct a second survey, identical to the first but add your sensitive statement, "I cheat on my taxes".  For each subject, randomly give them one survey or the other.  Calculate the average answer for each type of survey.  The difference between the two averages tells you the percentages you want to know.

This one's my favorite. Since the subject only tells you their final count, it's obvious to them that they've divulged no sensitive information.  The math is similarly easy: the innocuous statements contribute the same average count to both surveys, so the difference between the two averages is exactly the fraction of subjects for whom the sensitive statement is true.
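A quick simulation of the two-survey scheme makes this concrete.  All the population rates below are invented for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Hypothetical population rates for the four innocuous statements
// (dog owner, coffee drinker, ever married, brown hair).
var baseRates = []float64{0.40, 0.60, 0.50, 0.20}

const sensitiveRate = 0.25 // hypothetical rate we hope to recover

// answer returns one subject's count of true statements.
func answer(r *rand.Rand, withSensitive bool) int {
	count := 0
	for _, p := range baseRates {
		if r.Float64() < p {
			count++
		}
	}
	if withSensitive && r.Float64() < sensitiveRate {
		count++
	}
	return count
}

func main() {
	r := rand.New(rand.NewSource(1))
	const n = 500000

	var sumA, sumB float64
	for i := 0; i < n; i++ {
		sumA += float64(answer(r, false)) // survey without the sensitive item
		sumB += float64(answer(r, true))  // survey with it
	}

	// The innocuous items average out, so the difference is the sensitive rate.
	fmt.Printf("estimated sensitive rate: %.3f\n", sumB/n-sumA/n)
}
```

The estimate comes out near the 25% built into the simulation, even though no subject ever revealed which statements were true for them.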

Do you know of any other techniques?

Tuesday, May 02, 2017

Switching to OpenBSD

Short story:

After 12 years, I switched from macOS to OpenBSD.  It's clean, focused, stable, and consistent, and it lets me get my work done without any hassle.

Long story:

When I first became interested in computers, I thought operating systems were fascinating. For years I would reinstall an operating system every other weekend just to try a different configuration: MS-DOS 3.3, Windows 3.0, Linux 1.0 (countless hours recompiling kernels).  In high school, I settled down and ran OS/2 for 5 years until I graduated college. I switched to Linux after college and used it exclusively for 5 years. I got tired of configuring Linux, so I switched to OS X for the next 12 years, where things just worked.

I was pretty happy with OS X.  It gave me Unix and mostly got out of the way so that I could write software.  I wrote about enjoying Apple's simplicity.  Snow Leopard even spent an entire release cycle just fixing bugs and improving performance.

But Snow Leopard was 7 years ago. These days, OS X is like running a denial of service attack against myself.  macOS has a dozen apps I don't use but can't remove. Updating them requires a restart.  Frequent updates to the browser require a restart.  A minor Xcode update requires me to download a 4.3 GB file.  My monitors frequently turn off and require a restart to fix.  A system's availability is a function of mean time between failure and mean time to repair.  For macOS, both numbers are heading in the wrong direction for me. I don't hold any hard feelings about it, but it's time for me to get off this OS and back to productive work.
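For reference, the standard availability formula:

```latex
\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}
```

Shorter time between failures and longer time to repair both drive availability down.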

So where do I go now?  We own 5 Chromebooks and they have great availability.  Updates are infrequent, small, fast and nearly transparent.  Unfortunately, I need an OS where I can write and compile code.  I also want it to run on older, commodity hardware so I can replace a broken laptop for $400 instead of $2,000.

I considered several Linux distributions.  Lubuntu seemed promising, but it was too bloated for my taste.  A couple years ago, I tried Ubuntu on a Dell XPS Developer Edition for a few months.  Even with hardware designed for Linux, it was too fragile. Desktop Linux has also become even more complex than when I used it a decade ago.  I just want to get my work done, not feed and maintain an OS.

I was reminded of OpenBSD during the Heartbleed scare.  While everyone else was complaining about OpenSSL and claiming that open source had failed, the OpenBSD developers quietly drew their machetes and hacked out hundreds of thousands of lines of bad code, forking off LibreSSL where they can keep it clean and stable.  The OpenBSD community is like that: focus on what's really important, hold your code to a high standard, ignore all the distractions.  They're not trying to live in the past, just trying to make the future a place worth living.

Anyway, I found OpenBSD very refreshing, so I created a bootable thumb drive and within an hour had it up and running on a two-year-old laptop.  I've been using it for my daily work for the past two weeks and it's been great.  Simple, boring and productive.  Just the way I like it.  The documentation is fantastic.  I've been using Unix for years and have learned quite a bit just by reading their man pages.  OS releases come like clockwork every 6 months and are supported for 12.  Security and other updates seem relatively rare between releases (roughly one small patch per week during 6.0).  With syspatch in 6.1, installing them should be really easy too.

I also enjoy that most things are turned off in OpenBSD by default.  The base installation is sparse.  It assumes that I'll enable a service or install a tool if I want it.   So I'm not constantly facing updates for software I never use.

My experience with OpenBSD is still young, but I really like what I see so far.

Thursday, April 06, 2017

Using Project Fi

I signed up for Project Fi last month.  It's been a real pleasure to use.  I expected it to be a step down compared to Ting, but I was wrong.  For my use cases, Fi is slightly better in a couple ways.  Fi's customer support isn't as good, but it's acceptable and better than most phone companies.

Billing

Project Fi canceled my Google Voice account during sign up, but it transferred all my GV account credit over to the new account.  The credit wasn't visible on the first bill but appeared on the second.  It was nice not to lose those GV funds.  (Fi also transferred my voicemail greetings and blocked numbers from GV).

I really like the way that Fi charges for data.  I chose the 1 GB plan (since there's no penalty for overages).  Last month I consumed 1.165 GB of data.  On Ting I would have paid $10 for crossing into the second GB.  On Fi, I paid only $1.65 extra (0.165 GB × $10/GB), exactly covering my overage.

Fi provides free data-only SIM cards whose usage is just added to your account.  Since my family uses VoIP for all phone calls (either SIP or Google Voice) and SMS, everyone just needs data.  I gave everyone a data-only SIM card and it's been working great.  The SIM cards work in every device we've tried, old and new.  This avoids the $20/month charge per phone line.  Fi billing breaks out usage for each device.  The only downside is that you can only order one data SIM at a time.  It took me a few weeks to place orders for all the cards I needed.

Coverage

Ting is built on T-Mobile's network.  It has great coverage almost everywhere I go.  The one exception is the northwest quarter of my grocery store.  Data signal in that part of the store was always missing.

Since Project Fi automatically switches between T-Mobile, Sprint and US Cellular networks, depending on signal strength, I expected Fi to do better.  It did.  This dead spot is no longer a problem.  In this particular store, Fi often switches to the Sprint network then switches back to T-Mobile when I go elsewhere.  Never underestimate the power of a layer of abstraction.  Data-only SIM cards only use T-Mobile, so their coverage is identical to what I had on Ting.

I use Signal Spy to see which network my phone is currently using.  I find it gratifying to drive through town and watch Fi switch networks.  I look forward to some road trips this summer to see how effective it is in that scenario.

Fi's switching algorithm sometimes sticks with a network whose signal is slightly weaker than the alternatives.  Theoretically this could hurt battery life, but I've never had it impact connectivity.  It does annoy my OCD a little.

Conclusion

Overall, I'm very happy with Project Fi and will probably stick with it after I return from Europe.  I hope Fi's international coverage is as good as the US coverage.

Wednesday, March 29, 2017

Cancel a Stripe subscription in App Engine

Stripe's API for canceling subscriptions requires that you send an HTTP DELETE request.  If you want to cancel the subscription at the end of the current billing period (instead of immediately), you need to include a body parameter at_period_end=true in the request.  That's fine.  RFC 7231 section 4.3.5 says that a DELETE request is allowed to have a body but it "has no defined semantics".

Unfortunately, App Engine's urlfetch service silently removes the body from all outgoing DELETE requests.  As a result, canceling a Stripe subscription from App Engine always happens immediately, even if you asked to cancel at the end of the billing period.   Google has known about the problem since 2008 but never fixed it.  The glitch impacts many APIs other than Stripe.

Fortunately, you can use App Engine's sockets API to construct a workaround.  In Go, you build an HTTP client whose transport uses the socket API to make outbound network connections.  That uses Go's built-in support for HTTP DELETE requests and avoids the bug in App Engine's urlfetch service.

import (
	"net"
	"net/http"

	"google.golang.org/appengine/socket"
)

// c is the App Engine request context (from appengine.NewContext(r)).
httpClient := &http.Client{
	Transport: &http.Transport{
		Dial: func(network, addr string) (net.Conn, error) {
			return socket.Dial(c, network, addr)
		},
	},
}
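With a client like that in hand, the cancel request itself can be sketched as below.  The subscription ID and API key are placeholders, and the endpoint path and at_period_end parameter follow Stripe's API as described above; treat the exact details as illustrative rather than authoritative.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// cancelRequest builds the DELETE request with the at_period_end=true body.
// Sent through the socket-backed client, the body survives intact.
func cancelRequest(subID, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest("DELETE",
		"https://api.stripe.com/v1/subscriptions/"+subID,
		strings.NewReader("at_period_end=true"))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(apiKey, "") // Stripe takes the secret key as the username
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	req, err := cancelRequest("sub_123", "sk_test_example")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL) // send with httpClient.Do(req)
}
```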


Tuesday, March 28, 2017

Programming Languages by Spec Size

I was curious which programming language has the smallest specification.  Which one has the largest?  For each language, I printed the spec to a PDF and counted the pages.  Go is the smallest.  C++ is the largest.

This is a very rough estimate of language complexity since each specification varies in style and purpose.  For example, Prolog includes Annex A which is only informative but doubles the size.  Haskell 2010 includes Part II - Libraries which defines the standard library, not the language.  Anyway, here's the list:

If you'd like me to add other languages, comment with a link to the spec and I'll update this post.

Monday, March 20, 2017

gsutil: No module named google_compute_engine

While trying to run gsutil ls on a Compute Engine VM, I received a stack trace like this:

Traceback (most recent call last):
  ...
ImportError: No module named google_compute_engine

It turns out that gsutil doesn't like Python installed from Linuxbrew.  It really wants to use the system Python.  The following works fine:

env CLOUDSDK_PYTHON=/usr/bin/python gsutil ls ...
Easy enough to fix.  Nobody else had documented trouble with this configuration, so here you go, Internet.