Warnings as Errors in Production Environments

March 5, 2014

I listen to Accidental Tech Podcast [1], and episode 54 featured a hearty discussion on whether enabling warnings as errors in production (or, in some cases, development) is a good idea. The context was whether that kind of strict error checking could have prevented the #gotofail SSL bug that affected iOS and OS X in late February, and whether turning on something like Clang’s -Weverything [2] in production would have caught it sooner.
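For readers who haven't seen the bug, here is a simplified sketch of the pattern. It is not Apple's actual code, and the names `check_step` and `verify` are made up for illustration. A duplicated `goto fail;` line always jumps to the exit with the error code still at zero, so the later check never runs. As far as I know, Clang's unreachable-code warning (-Wunreachable-code, which -Weverything turns on) is the kind of diagnostic that would flag the dead code.

```cpp
#include <cassert>

// Hypothetical stand-in for one verification step; returns 0 on success.
int check_step(bool ok) { return ok ? 0 : -1; }

// Simplified illustration of the "goto fail" pattern.
// Returns 0 when verification is reported as successful.
int verify(bool hash_ok, bool signature_ok) {
    int err = 0;
    if ((err = check_step(hash_ok)) != 0)
        goto fail;
        goto fail;  // duplicated line: always taken, with err still 0
    // Unreachable: the signature check below never runs.
    if ((err = check_step(signature_ok)) != 0)
        goto fail;
fail:
    return err;
}
```

With this sketch, `verify(true, false)` reports success even though the signature is bad, because the second check is skipped entirely.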

One side of the argument says that warnings are exactly what the name implies: things to be aware of, but not system critical, and over time they would rightfully be glossed over or ignored. The other side holds that all warnings are meaningful, and that by forcing errors in both development and production, the developer is strongly incentivized to fix potential issues, with the further implication that failure to heed these warnings is a development smell (i.e., the methodology equivalent of a code smell). It’s a pragmatist versus idealist debate at heart.
If you’ve read my writing you can pretty quickly guess which side I lean towards.

Using warnings-as-a-hammer is a very hard line to draw in the sand of production systems. In development, this adherence implies a reverence for maintaining a certain style of code and level of systems integration, which is itself insufficient as an indicator of code quality; compilers can’t teach and enforce good coding by themselves. While in the best case this can draw attention to potential problems, I find it much more likely that the mediocre engineers on whom this would have the most impact would simply find ways around the restriction, because that is faster. If the C++ compiler warns of a bare (int*) cast, then the path of least resistance is to rewrite it as a reinterpret_cast<int*>.
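A minimal sketch of that workaround, assuming the warning in question is something like -Wold-style-cast (my guess at the flag; the function names and the `void*` context parameter are hypothetical). Both functions do exactly the same thing, and the compiler can verify neither of them; the second merely stops the warning from firing.

```cpp
#include <cassert>

// Hypothetical callback-style helpers that receive their context as void*.

// C-style cast: a flag such as -Wold-style-cast would warn here.
int read_c_style(void* ctx) {
    return *(int*)ctx;
}

// Same behavior with the warning silenced. The compiler still cannot check
// that ctx really points to an int, so nothing has become safer.
int read_reinterpret(void* ctx) {
    return *reinterpret_cast<int*>(ctx);
}
```

Called with the address of an `int`, both return the same value. Called with anything else, both are equally wrong, and only one of them tells you so at compile time.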

In production, warnings-as-errors is even worse, as it places engineering sanctity above business realities. Code development and execution are means to an end, and businesses would be paralyzed if every library, framework, or point-release upgrade had the potential to throw a warning that would grind the entire system to a halt. Much like how Google builds its infrastructure by assuming component failure and spending its effort on making recovery smooth (e.g., a Google hard drive dies and needs replacement every few seconds), the right response to recoverable problems in production is to log, triage, and prioritize, not to explode in the most spectacular way possible.
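As a rough sketch of what "log, triage, and prioritize" could look like in code: the handler below records a recoverable problem and still serves the request, and the log can later be read back with the most frequent warnings first. Everything here (`WarningLog`, `handle_request`, the deprecated-field scenario, the in-memory storage) is a hypothetical illustration, not a description of any real system.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory warning log; a real system would ship these to a
// logging service instead.
class WarningLog {
public:
    using Entry = std::pair<std::string, int>;

    // Record a recoverable problem and let the caller carry on.
    void warn(const std::string& message) { ++counts_[message]; }

    // Triage view: distinct warnings with their counts, most frequent first.
    std::vector<Entry> by_frequency() const {
        std::vector<Entry> out(counts_.begin(), counts_.end());
        std::stable_sort(out.begin(), out.end(),
                         [](const Entry& a, const Entry& b) {
                             return a.second > b.second;
                         });
        return out;
    }

private:
    std::map<std::string, int> counts_;
};

// Hypothetical request handler: a deprecated field is logged, not fatal.
int handle_request(bool uses_deprecated_field, WarningLog& log) {
    if (uses_deprecated_field) {
        log.warn("deprecated field used");
    }
    return 200;  // the request is still served
}
```

The design choice is that the warning changes what the team works on next, not whether the current request succeeds.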

There are also many systems where this rule simply would not work, because the production environment is not completely in your control or because the cost of patching others’ warnings and errors is prohibitive.

Any code that runs in the browser is a victim of an unstable production environment: different browsers, and the different plugins and extensions on top of your users’ browsers, make reporting even errors a completely lost cause. I remember turning on error logging on an older version of Square’s Dashboard and quickly flooding the system with so many Internet-Explorer-specific errors (with minified stack traces, if they existed) that it pretty much killed the logging server and the email server that was dutifully trying to notify us of what we thought we wanted to know. It wasn’t just the volume that made this worthless: many of these errors were non-actionable, and even acknowledging them became a waste of time for the team.

As for worrying about library and framework warnings, fretting over others’ minor inadequacies is a particularly insidious form of Not Invented Here syndrome. Even assuming the libraries are open source (i.e., they can easily be forked and modified), the opportunity cost of tracking down and fixing minor integration issues, which surface as warnings, is tremendous, and that is if they can be patched at all. Warnings about deprecation with no clear successor, runtime issues that are already gracefully handled, or just plain old mislabeling of INFO as WARNING level messages: there are plenty of false positives among warnings that seasoned engineers have rightfully learned to ignore in favor of doing more important things.

Tools can help us write better code up to a point. Most of the time, the code exists to serve a purpose that is only somewhat correlated with its quality.

[1] The general “tech” label is a bit of a misnomer, though: the hosts and topics focus on Macs, iPhones, and areas of tech that interoperate with Apple’s systems.
[2] Interestingly, GCC doesn’t appear to have an equivalent, and perhaps for good reason.

Comments 2

Kevin Peck says:

March 6, 2014 at 5:35 am

For my code I do my best to keep it warning free. I have started jobs at so many locations where there are thousands of warnings in the code with most of them being very easy fixes. Once you are trained to ignore warnings they grow quickly and it is very easy to ignore the ones that are helpful. Nothing like filling the output of every Jenkins build process with thousands of extra lines of warnings.
Turn off the warnings you don’t care about. Why see them every day if you have no desire to ever fix them? Of course, this is much easier to do in a statically typed language like Java or C#. When it comes to JavaScript, you are bound to get warnings in the console depending on the current browser. Still, you can use JSHint or JSLint to weed out code warnings.
I have not yet used warnings as errors for a production build. I do treat warnings seriously and do my best to eliminate them. They are another helpful tool in my toolbox.
Warnings can help you become a better coder. Learn how to fix it once and don’t continue to make that mistake.

1. Allen Cheung says:
  
  March 9, 2014 at 7:14 pm
  
  Agreed that warnings can and usually do help you code better, but production is an expensive place to learn those lessons.
  