allenc allencheung

Putting the Engineering into Software Engineering

It’s almost become cliché to make fun of the fragile nature of software development, professional or otherwise, as a discipline unworthy of being labeled engineering. The lack of security in voting software; maps that give laughably bad directions; trading systems that lose money at unprecedented rates: once computing became mainstream enough to stoke fears of widespread Y2K failures, people noticed how badly most software systems behave.

It’s bad enough that there are regular debates on whether software engineering, as a discipline, should be held in the same regard as other “real” engineering disciplines. The rigor, the precision, and the consequences of failure in other engineering fields, be they aerospace or chemical or civil, seem an order of magnitude higher than in the way most software feels like it’s built. Personally, my new-found appreciation for mechanical watches comes in no small part from the engineering of watch movements.

When I last wrote about this 3 years ago, I concluded that at sufficient scope and scale, software projects start taking on the characteristics of major engineering endeavors. It’s not productive to compare the engineering effort of constructing a bridge to that of throwing together a personal blog; something closer in scale to Microsoft Windows is a more appropriate software engineering analogue to bridges and aircraft and skyscrapers. And to be clear, there’s still plenty of merit to that line of argument.

But I’ve also been thinking about software that can withstand the test of time, software that has been hardened by evolving requirements and diverse use cases and physical constraints. Such a system needs to be large in scope, and it also needs to exhibit the rigor and detail boasted by other engineering feats, combined with a level of longevity and robustness. It’s not so much about adding user-visible features as about keeping the lights on, without drama, over a sustained period of time.

And I think the google3 version control system is a great example.

A quick summary: google3 is the central code repository for Google’s truly massive codebase, across all departments, all offices, all teams. It’s so big that traditional source control tools no longer work[1], so an internal team has built and continues to maintain a custom set of tools to manage this monolith. When the white paper was published 2 years ago, the system had already been running and supporting Google’s engineering teams for over 16 years.

The feat isn’t just in the number of lines of code and files, but rather in the level of consideration given to the design of the software and its evolution. If one of the hallmarks of “real engineering” is rigorous attention to detail, a large system like google3 shows the same characteristics in its construction and ongoing maintenance, the same sense of dependability that is the hallmark of critical infrastructure.


  1. Things tend to go south when just the relevant code plus its dependencies can’t fit on a hard drive and the code cannot be built in full, within a reasonable timeframe, on modern desktops.

By allen
