Monday, February 3, 2014

Coder vs Engineer

Stack Overflow is a fabulous resource for programmers. When I have programming questions, the first page of Google results is filled with links to its pages, and they usually have the answers I need. So why do I often feel depressed after browsing its questions?

The answer came to me this weekend: it's a hangout for coders, not engineers.

The question that prompted this revelation was yet another request for help with premature optimization. The program in question was tracking lap times for race cars, and the OP (original poster, for those not familiar with the acronym) was worried that he (she?) was extracting the list of cars and sorting it after every update. He saw this as a performance and garbage-collection hit that would happen “thousands of times a second.”

That last line raised a red flag for me: I'm not a huge race fan, but I can't imagine why you would expect to update lap times so frequently. The Daytona 500, for example, has approximately 40 cars, each of which takes approximately a minute per lap. Even if they draft, you have a maximum of 40 updates per second, for a rather small set of objects.
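The estimate above can be written out as a few lines of arithmetic. This is only a sketch using the approximate figures from the paragraph (40 cars, one minute per lap); the `claimed` value of 1000 is an assumption standing in for the low end of "thousands of times a second."

```python
# Back-of-envelope bound on lap-time updates, using the post's rough figures.
CARS = 40          # approximate Daytona 500 field
LAP_SECONDS = 60   # approximate time per lap


def average_updates_per_second(cars: int, lap_seconds: float) -> float:
    """Each car produces one lap-time update per lap."""
    return cars / lap_seconds


def peak_updates_per_second(cars: int) -> int:
    """Worst case: the whole field crosses the line within the same second."""
    return cars


average = average_updates_per_second(CARS, LAP_SECONDS)
peak = peak_updates_per_second(CARS)
claimed = 1000  # assumed low end of "thousands of times a second"

print(f"average: {average:.2f} updates/s")
print(f"peak:    {peak} updates/s")
print(f"claimed rate is at least {claimed / peak:.0f}x the peak")
```

Even the burst case is a small fraction of the rate the OP was worried about, and the average is under one update per second.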

To me, this is one of the key differences between a coder and an engineer: not attempting to bound the problem. Those interview questions about counting gas stations in Manhattan are all about this. You don't have to be exact, but if you can't set a bound to a problem, you can't find an effective solution. Sure, updating and sorting thousands of cars, thousands of times a second, that might have performance issues. But that's not how real-world races work.

Another difference is that, having failed to bound the problem (indeed, even to identify whether there is a problem), the coder immediately jumps to writing code. And that was the case for the people who answered this particular question: they created a variety of solutions that each solved some interpretation of the OP's problem.

And I think that's what really bothers me: that coders will interpret a question in whatever way makes their coding easiest. This was driven home by a recent DZone puzzle. The question was how to remove duplicates from a linked list, “without using a buffer.” It's a rather poorly-worded question: what constitutes a buffer?

There were a few people who raised that question, but by far the majority started writing code. And some of the implementations were quite inventive in their interpretation of the question. The very first response limited the input list to integers, and used a bitset to track duplicates (at the worst, that would consume nearly 300 MB of RAM; a “buffer” seems modest by comparison). Another respondent seemed to believe that a Java ArrayList satisfied the “linked list” criterion.
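To put a number on the bitset: assuming one bit for every non-negative 32-bit integer (my assumption about where the figure comes from), the worst case is 2^31 bits, or about 268 MB. For contrast, here is a sketch of one possible reading of "without a buffer," namely constant extra space, paid for with quadratic time. The `Node` class and helper names are hypothetical, and this is not offered as the answer the puzzle intended.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Worst-case bitset size, assuming one bit per non-negative 32-bit integer.
BITSET_BYTES = 2**31 // 8  # 268,435,456 bytes


@dataclass
class Node:
    """Hypothetical singly linked list node, for illustration only."""
    value: int
    next: Optional[Node] = None


def remove_duplicates(head: Optional[Node]) -> Optional[Node]:
    """Remove later duplicates in place: O(1) extra space, O(n^2) time."""
    current = head
    while current is not None:
        runner = current
        while runner.next is not None:
            if runner.next.value == current.value:
                runner.next = runner.next.next  # unlink the duplicate
            else:
                runner = runner.next
        current = current.next
    return head


def from_list(values: list[int]) -> Optional[Node]:
    """Build a linked list from a Python list (test helper)."""
    head: Optional[Node] = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head: Optional[Node]) -> list[int]:
    """Collect node values into a Python list (test helper)."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(f"bitset worst case: {BITSET_BYTES / 1e6:.0f} MB")
print(to_list(remove_duplicates(from_list([3, 1, 3, 2, 1, 3]))))
```

Whether two pointers count as a "buffer" is exactly the kind of question that should be settled before choosing between these approaches.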

Enough with the rant. Bottom line is that this industry needs to replace coders with engineers: people who take the time to understand the problems that they're tasked to solve. Before writing code.

1 comment:

WillS said...

I couldn't agree more. My college education is in Chem. Eng., one of the "classic" engineering disciplines, and one of the very first things we were taught in the engineering courses was to "fully understand the problem". Gather what you do know, use that to derive as much of what you don't know as possible, including the bounds of the problem, and then set about writing the calculations (code) to solve it.

I've experienced the coding vs. engineering issue throughout my 35-year career as a software engineer and I couldn't begin to tell you all the times I've seen code written in haste that had to be severely modified or just thrown out altogether because it didn't solve the right problem, or the problem wasn't analyzed deeply enough to discover all the requirements, or all the myriad other ways a programming effort can fail due to not fully understanding the problem.

I've frustrated many a manager who expected me to just sit down and start throwing code at a problem, by instead analyzing the problem for a good portion of the expected development time. That frustration was usually very short-lived when the resulting program got through QA on the first pass and went into production, never to be touched again except for enhancements.

I place the blame for the pervasive coding mentality in the software community these days at the feet of our higher education institutions and corporate management. Colleges and universities for teaching the languages and algorithm theory but not how to properly solve the problems to which they will be applied. And corporate management for their myopic view of software development costs - looking only at speed to production, not post-production costs such as maintenance, bug fixing, and recovery from issues caused by bugs. Somehow it's OK to continually push for software to be delivered in the shortest possible time frame even when it results in an operations staff running around like a horde of crazed headless chickens trying to keep up with all the applications that are crashing, running out of memory, failing very ungracefully, or otherwise needing to be restarted several times a day. When the traffic data company I worked for until recently was acquired by a larger company, the OPS manager at the new company did an analysis of just the restarts of the several hundred applications that made up the traffic data system. Over 15,000 restarts per month! Anyone care to estimate the costs associated with that much *useless* work by the operations staff?

Bottom line: we need more engineering and less coding, and we need the universities to teach it and the corporations to encourage it. I've done the former my entire career, been swimming against the tide for most of that time, and had many a salmon day as a result of it. But in the end, my software has always solved the *correct problem* correctly and thoroughly, been robust enough to handle unexpected situations gracefully, and has changed more than a few minds over the years about the right way to go about solving a programming problem.

Will Sappington (hey dude! long time. I'll touch base by email.)