Pages

Thursday, March 29, 2012

Migrated WordGap from Python to Go 1

WordGap, an anagram search tool running on Google AppEngine, was originally written in Python. I recently rewrote it in Go 1 and added support for the SOWPODS word list. In terms of number of lines, the code size is roughly the same as the Python version, but the run-time performance has gained a considerable boost: the average latency of anagramming requests has dropped from ~500ms to ~30ms! Go has strongly exceeded my expectation and is now my language of choice for AppEngine app development.

Sunday, March 18, 2012

Initializing large arrays in Go for AppEngine

Go supports composite literals to construct arrays, slices, or maps. A list of strings, for example, may be created as follows:
words := []string{"AA", "AAH", "AAHED"}
This is convenient and works really well for small lists. If you try to initialize an array with hundreds of thousands of elements, though, Go compilation will significantly slow down even if the entire list can comfortably fit into the memory. Worse yet, you won't be able to deploy your app to AppEngine. A typical error message looks like this:
Error 422: --- begin server output ---
Compile failed:
2012/03/18 00:00:00 go-app-builder: Failed building app: failed running 6g: signal 9
--- end server output ---
Rolling back the update.
Error 422: --- begin server output ---
The solution is to construct large data structures at run-time by parsing data encoded in external files. For example, store a long list of words in a text file, one word per line; during initialization, convert them into a string slice:
import (
  "io/ioutil"
  "strings"
)

func loadStringList(filename string) []string {
  content, err := ioutil.ReadFile(filename)
  if err != nil {
    panic(err)
  }
  return strings.Split(string(content), "\n")
}

func init() {
  words := loadStringList("words.txt")
  ...
}
For more complicated data structures, you may want to use the gob package to do efficient deserialization.