Pages

Friday, July 10, 2015

Code Golf at Google

Update: Phil Johnson from ITWorld wrote an article on this.


What is code golf

Code golf is a type of programming competition where contestants strive to write the shortest code that solves a specific problem. Solutions are solely ranked by their code size as long as they pass all tests within certain time and space limits. This mostly sedentary recreational activity has nothing to do with swinging clubs on golf links other than the similarity between the two scoring systems, where lower is better.

Excelling at conventional programming competitions (e.g. ACM ICPC, Code Jam, or TopCoder) demands mastery of algorithms and speed of coding. Code golf, however, celebrates a different skill set. Expert golfers exhibit not only algorithmic prowess but also a deep knowledge of arcane language features.


Code golf at Google

During a two-week leave from Google in June 2014, I was reflecting on the convoluted Java frameworks widely adopted at work. Those hefty frameworks brought coding structures and conventions to large engineering teams; meanwhile, they also sucked the fun of programming like a Pastafarian monster slurping all the tomato sauce on a plate of spaghetti.

To bring back the fun while fostering a culture that hails code brevity, I decided to start a 20% project to create an internal code golf platform — a playground for Google engineers to show off wits, a fountain of programming tricks for the studious, and a symbol of defiance against the trend of technical complexity permeating through the company. Later in the development, it occurred to me that the system could also be purposed as a recruiting tool: interviewers would benefit from a growing repertoire of coding questions with high-coverage test cases against which candidates's code can be automatically validated. And that became the ostensible mission statement on the code golf site to justify the business case ;)


Design choices

The site I designed (http://go/codegolf if you're a Googler) differentiates itself in two major aspects from public code golf attractions such as http://codegolf.stackexchange.com/ and http://golf.shinh.org/:

Ease-of-use

Each programming puzzle asks you to write a single function rather than a full program. This eliminates the need to write any boilerplate I/O logic. It's pretty easy to write code directly in the browser which displays instant test results, but if you prefer offline testing, all test cases used by the online judge are conveniently available in JSON.

As of July 2015, the site supports automated testing of code in JavaScript, Go, Python, C++, and Haskell. Depending on the language you choose, the corresponding function signature is automatically formatted for you. Once your solution passes all the tests, it instantly gets ranked on the leaderboard against other entries in the same language category based on the total byte size.

Competitiveness

The site runs a new contest every week. Scores and authors of the top solutions are always shown on the leaderboard, but the source code won't be revealed until the contest is over — at which point, challengers are allowed to submit even shorter code, though that won't affect the finalized ranking. To encourage early submissions, time is used as a tie-breaker. At the end of a contest, everyone who has solved a problem receives a performance score based on their ranking, relative code size, and participation rate of the contest. The very top contestants also get special badges of honor.


Tech stack

The code golf site runs on Google App Engine and Compute Engine. It's a concoction of a frontend server in Go and JavaScript plus a collection of backend sandboxes. The biggest challenge was building those sandboxes to execute untrusted code in a controlled environment. Another tricky part was modeling the data to enable problem writing in a language-agnostic way.


Code golf community at Google

Since its launch in mid-June 2014, the code golf site has garnered a growing number of loyal users. Hundreds of Googlers have participated in the weekly contests. Some of them confessed in private that they had developed an unhealthy addiction that negatively impacted their productivity, to which I remedied by refraining from offering any monetary rewards to code golf winners ;)

Code golf tournaments have covered a diverse range of challenges from deceptively trivial CS101 exercises to number theory riddles. The themes also vary greatly, including tributes to acts of Googliness, travesty of the performance review system, and taunts on unpopular company decisions.

What fascinated me most was the extraordinary quality of winning entries. Sometimes I would run a contest that's identical to one on codegolf.stackexchange.com. The top voted entries on Stack Exchange are just not on par with the best solutions I've witnessed at Google, which exude an elevated level of creativity and craftsmanship rarely found elsewhere. Reading the code can easily trigger a software engineer's imposter syndrome, a profound feeling of humbleness, tinged with a sense of proudness to be working with these talented people.

For example, in real world scenarios, I've never encountered a single use case of division by zero; but at code golf, a former mathematics professor once won a JavaScript challenge by employing an ingenious "m/=0" expression to reduce code size. He later wrote a 1200-word monograph to dissect his winning entry and shed light on previous iterations. From his article, I also learned that in JavaScript, calling Math.min() with zero arguments returns infinity — an interesting tidbit that may never be useful.

I feel obliged to share some outstanding one-liner solutions for a classic programming problem below: Given a list of strings representing a grid of cells ("0"s for dead cells, "1"s for live cells), write a function g() that returns the next iteration following the rules of Conway's Game of Life. For example, given the input ["111", "110", "100"], the expected output is ["101", "001", "110"].

JavaScript (114B)
function g(b,y,e){for(j=s='';q=e&&b[j++]*16+8;s+=q&1)for(x=9;x--;q>>=(10+e[y-x%3+1])[j+x/3|0]);return s||b.map(g)}
Go (118B)
func g(a[]string)(r[]string){for i,p:=range a{r=append(r,"");for j,q:=range p{x:=1/-^i*3;q^=18;for;x/3<3-i/^-len(a);x++{q<<=("0"+a[i+x/3-1]+"0")[j+x%3]%2};r[i]+="01"[q/16&1:][:1]}};return}
Python (123B)
g=lambda a,e=enumerate:[`[+(2<`zip(*a[i+i/~i:i+2])[j+j/~j:j+2]`.count('1')<4+int(c))for j,c in e(x)]`[1::3]for i,x in e(a)]
C++ (138B)
template<class V>V g(V a){V b=a;for(r:b)for(p:r){p^=50;for(R:a)for(P:R)p<<=&R-&a[&r-&b[1]]<3u&3u>&p-&r[&P-&R[1]]&P;p=48+p/16%2;}return b;}
Haskell (104B)
z=zip[2..] f i=drop(i-3).take i g a=[[("0001"++c:"0000")!!mod(read$f i a>>=f j)9|(j,c)<-z r]|(i,r)<-z a]

(If you're a Googler, check out the full contest results and discussions at http://go/codegolf/11)

I was the sole maintainer of the code golf site for a while. Over time, more contributors volunteered to design problems, review tests, write editorials, implement features, and eventually take over the site ownership when I left Google. It's a bittersweet feeling to leave my brainchild to more capable hands. I don't really feel attached to any code I wrote, but the amazing code golf subculture at Google and the friendships developed within the community will always have a special place in my heart.

14 comments:

  1. Nice! Can any of this be used for interviewing candidates ?

    ReplyDelete
    Replies
    1. Yes, I used a few code golf questions for interviews. Of course I never asked candidates to optimize for code size :)

      Delete
  2. I'm trying to run the c++ version, but it won't compile because the loop variables in the range for loops do not have declarations.

    ReplyDelete
    Replies
    1. Implicit type in range-based for loop is an allowed C++ extension.

      Delete
    2. No, that is not valid c++14 code.

      Delete
  3. Will this be made publicly accessible?

    ReplyDelete
    Replies
    1. I hope so! That'll be a lot of fun.

      Delete
  4. Wow, memories, in the 80s in Digital Equipment we had the battle of the languages, using the game of life, there was even a submission from the cobol team. The was an APL version which was one line. DEC was the Google of its era as the place to work
    Jerry

    ReplyDelete
  5. The haskell doesn't appear to compile, or at least, I couldn't figure out how to format it so it would. Yay significant whitespace.

    ReplyDelete
    Replies
    1. Here is the properly formatted Haskell solution (three lines)

      z=zip[2..]
      f i=drop(i-3).take i
      g a=[[("0001"++c:"0000")!!mod(read$f i a>>=f j)9|(j,c)<-z r]|(i,r)<-z a]

      Delete
  6. Code Golf is really fun and it helps to hack programming languages, but most of the time the best results are obtained in the same programming languages Ruby/Perl/Python/Bash. Here is the proof (among 4000 code golfers): https://www.codingame.com/leaderboards/global/puzzle/thor-codesize

    ReplyDelete
  7. Judging by byte size shows arcane skills, but doesn't seem practical. Judging by a combination of readability and efficiency would promote practical skills. Of course, judging readability is not as easy and is more subjective, but at least the participants would be improving skills they might actually use.

    ReplyDelete
  8. There is dependably another or better approach to accomplish something in software development. Programming apparatuses and dialects are always advancing and you need to continue adapting new innovations and methods. What google does for it's employees are one of the best routines to enhance the productivity. One of the mobile app development company california follows same type of activities to enhance their employees.

    ReplyDelete