Explain the Path Not Taken

In my travels around the blogosphere, there is a common refrain in technical advice. For any given problem, there are n number of ways of doing it, but there is one most right way.  When writers write, they have a limited amount of time, so they tell you the ‘right’ way.  They may tell you not to solve your problem a certain way; but often times they don’t tell you *why*.

Why the hell not?

The answer you didn’t choose is as important as the answer you chose.  The reasons behind it are just as important, if not more so.

Recently, I was tasked with improving the performance of an advanced search query.  The query was taking about 5 minutes to execute.  No, that’s not a typo.  After diagnosing and benchmarking the query with the resident Oracle DBA, we started brainstorming fixes; and the idea was to get the low hanging fruit out of the way.  Here are the list of possible ways to fix this issue:

1. Force the user to use more than scant criteria for an advanced search (in this case, by requiring the user to enter more information into other indexed fields)
2. Improve the query performance through adding the appropriate indexes
3. Parallelizing the query since parts of it used a full-table scan.

Pros and Cons:
Approach 1:
Pro: Allows us to defer the actual query performance issue
Con: Makes user unhappy; alters existing functionality

Approach 2:
Pro: Low hanging fruit. Easy fixes; easy to benchmark
Con: May not get us ‘all the way there’; may have un-intended side effects.

Approach 3:
Pro: Divvies up the Query into multiple threads
Con: Possible to slow the query down since those threads have to share information

We tried the latter two approaches.  I documented the first one; but did not implement it. The second approach netted a 78% performance increase when we added two indexes.
 The third approach ended up actually reducing the original query by a miniscule amount but it was three times slower than approach #2. Ultimately we added the indexes and reduced the query time by 78%; a non-trivial amount; and sufficient since the feature is used very rarely.   To put more man-hours into a fix for a feature that is so rarely used would be folly.

My point is, it’s important for the developer that comes after me to know all of this.  If the client says, “78% isn’t good enough” then the next developer will need to know what I already tried; otherwise he’ll simply waste time re-duplicating effort.

The next time you see a blog post on the internet detailing a fix, ask them what didn’t work — you may be surprised at the answer.

Leave a Reply