Optimization for Dummies

A somewhat inconspicuous-looking article on optimizing third-party code has stirred up quite a conversation/flame war among many in the Mac developer community. The comment thread on the article is already quite long, and it seems to me that the people with opposing viewpoints are just talking right past each other at this point, so I thought I'd add my own perspective here.

The gist of the article is that the author, Ankur, needed to draw some gradients in one of his applications, downloaded the source code for CTGradient, an open-source library that provides a class to draw various types of gradients, and decided that it had way more functionality than he needed in his own program. So he went through the code and basically removed everything he didn't need for this single application. That's all fine and good, and the article is actually an interesting look at performing various refactorings and dead code removal.

The problem arises in that he implies that if a developer decides to use some open-source code and doesn't go through and strip out every ounce of functionality that they're not immediately using, they're encouraging "code bloat" and their code is not "optimized". Several commenters asked what kind of performance/memory gain he actually saw, and the only numbers he provided were from the "Real Memory" column in Activity Monitor, which is a pretty crude measure. The conversation went downhill from there.

I think one main point of miscommunication here is over the terminology the original author uses for some of the things he's talking about. When you talk about "optimizing" code, I, and most other developers I know of, think of making the code run faster. The author reinforces this by stating "...you can optimize this thing till it runs like a Ferrari". That certainly sounds like he's talking about making the code faster, but the vast majority of what he's doing is simply stripping out code that he's not going to be using. This has very little effect in and of itself - it saves a few KB in disk space, and if the code truly is unused, it probably won't even get paged into memory in the first place. There are a few places where the code probably runs faster, but the gains look pretty minimal in the big picture. However, arguing the nitty-gritty details of his particular optimizations is missing the bigger point...

Engineering is all about tradeoffs, and in computer software, this typically means choosing between things such as memory usage, disk usage, CPU usage, and so on. Every one of these, however, inevitably comes up against the constraint of development time. You can spend days, weeks, or months optimizing your code in various ways, but it's all for naught if you don't eventually ship your application. This means that you can't do everything, and have to pick and choose what areas of your code to work on, whether to add new features or shore up existing ones, and how much time to spend optimizing performance and memory usage.

What is conspicuously missing from the article is any sort of evidence that CTGradient was actually causing any sort of performance or memory problem in his application in the first place. Now, more may have gone on than he included in his write-up, but it sounds like he simply looked at the code, decided that it was obviously too big and bloated, and set to work spending quite a bit of time hacking it down to the bare minimum he needed. Finding and fixing performance and memory problems in real applications, however, is rarely so simple. It's rare that you can glance at a piece of code and immediately deduce that it's going to be problematic for your application, causing slowdowns or whatnot. Most such bottlenecks are discovered as a result of rigorous testing, using tools provided by Apple such as Sampler, Shark, and Instruments (among others) to dig into the details of what your app is actually doing and where it's spending its time. Upon discovering such a problem, you can then go in and spend your precious time fixing what most needs to be fixed. It's not that the modifications he makes don't actually reduce code size and memory footprint (they do) or increase performance (still not really sure about this without empirical data), but had he tested first to find what needed fixing, the time spent doing all this could very well have been better spent on something that actually needed it.

I actually use CTGradient in a couple of my projects, and I use the code basically untouched. This is not because I'm "lazy" and would rather count my customers' money while cackling evilly than spend the time to strip out everything I don't use, but rather because I have plenty of other things to spend time on: optimizing my app in ways that make a difference, and adding features that people want and need. None of my tests of my drawing code have ever shown any performance problems arising from using CTGradient as-is, so my motto is, if it ain't broke, don't fix it. The critical flaw in Ankur's argument in his post is that he never showed any evidence that anything was broken in the first place.
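
For what it's worth, the "measure first" idea can be sketched in a few lines. This is only a rough illustration in Python, not a substitute for a real profiler like Shark or Instruments, and it has nothing to do with the actual CTGradient code: the function names (measure, rank_by_cost) and the two stand-in workloads are made up for the example.

```python
import time

def measure(label, func, repeats=5):
    """Return (label, best wall-clock seconds) over several runs of func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return label, best

def rank_by_cost(timings):
    """Sort (label, seconds) pairs so the most expensive step comes first."""
    return sorted(timings, key=lambda pair: pair[1], reverse=True)

# Hypothetical stand-ins for two steps of a drawing routine.
def draw_gradient():
    sum(i * i for i in range(1_000))

def layout_text():
    sum(i * i for i in range(200_000))

if __name__ == "__main__":
    results = rank_by_cost([
        measure("draw_gradient", draw_gradient),
        measure("layout_text", layout_text),
    ])
    # Whatever lands at the top of this list is where the time should go.
    for label, seconds in results:
        print(f"{label}: {seconds * 1000:.3f} ms")
```

The point is simply that the ranking comes from measurements rather than from eyeballing the source. If the gradient step turns out to be a rounding error next to everything else, trimming it buys you nothing.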

2007-11-24