Tuesday, August 31

Obfuscating the package structure for clients

So you've got yourself a nice clean package structure. You're slowly building up your application organically, adding features here and there, driven by customer demand, releasing early and often, covering all the hip methodologies of the day. Then along comes a client who wants to make changes so drastic that they warrant sub-classing. We'll call this fictional client Acme. Let's say one of the classes to be altered is:

net.rudiment.calories.Calculator

So you make Calculator abstract, or an interface, and you create two sub-classes:

net.rudiment.calories.calculators.Generic
net.rudiment.calories.calculators.Acme
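
For the sake of illustration, here's a minimal sketch of what that split might look like. The calories() method, its parameters, and Acme's rounding rule are all made up for the example; the only things taken from above are the class names and the packages noted in the comments.

```java
// Sketch only: in the real tree each class would live in its own file,
// under the package named in the comment above it.

// net.rudiment.calories.Calculator
abstract class Calculator {
    /** Calories for a food item, given grams of fat, carbohydrate and protein. */
    abstract int calories(int fatGrams, int carbGrams, int proteinGrams);
}

// net.rudiment.calories.calculators.Generic
class Generic extends Calculator {
    int calories(int fatGrams, int carbGrams, int proteinGrams) {
        // The usual 9/4/4 kcal-per-gram factors.
        return 9 * fatGrams + 4 * carbGrams + 4 * proteinGrams;
    }
}

// net.rudiment.calories.calculators.Acme
class Acme extends Calculator {
    int calories(int fatGrams, int carbGrams, int proteinGrams) {
        int raw = 9 * fatGrams + 4 * carbGrams + 4 * proteinGrams;
        // Hypothetical client rule: Acme wants totals rounded up
        // to the next multiple of ten.
        return ((raw + 9) / 10) * 10;
    }
}
```

Code elsewhere in the system only ever refers to Calculator, so which sub-class it gets handed is a deployment decision rather than a coding one.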

Seems simple enough. Perhaps years from now you'll have quite the collection of client-specific calorie calculators:

net.rudiment.calories.calculators.Cyberdyne
net.rudiment.calories.calculators.Hudsucker
net.rudiment.calories.calculators.Kumatsu
net.rudiment.calories.calculators.Oscorp
net.rudiment.calories.calculators.Tyrell
net.rudiment.calories.calculators.UAC
net.rudiment.calories.calculators.Vandelay

But what if calorie calculators aren't the only part of the system that gets commonly personalized for clients? What if you've got a few dozen classes spread out amongst the enormous source tree that have been abstracted and sub-classed like this? That makes it a little tedious to do site-wide client-specific changes. For example, if you needed to work on all the Acme-specific classes, you'd have to search them all out. It might be a little more convenient if you'd started with a structure like this:

net.rudiment.clients.acme.calories.Calculator

This also makes it easier to clean up the code tree if you ever lose Acme as a client. You simply delete the root of the Acme-specific classes and presto, all gone. However, it makes it tricky to alter the abstract super-classes: when you need to review and test all the sub-classes to ensure nothing broke, you have to search them all out rather than have them all conveniently right there in a single sub-package.

My current project is nearly five years old and contains over three-thousand classes in over six-hundred packages. Over the years we've used both of the techniques described above. And, as I explained, both have their drawbacks.

I'm curious to hear your opinions on the issue.

Tuesday, August 24

Checks and Balances: Don't trust yourself

I consider myself to be a pretty smart guy. Perhaps that's a little conceited. But, one of the reasons I consider myself to be so smart is that I don't trust myself to always make the best decisions. Whenever I have to make an important decision, and I have the luxury of time, I bounce it off my right-hand man. More often than not, he's eager and willing to disagree with me. Debating the issue with him sometimes opens my eyes to a point of view I might not have considered. I didn’t hire him to be my lackey; I hired him to complement my skill set.

My second in command reminds me of myself a decade ago: excited about new techniques and technologies and anxious to apply them. I, however, have become the old codger I used to despise, always choosing the safe and established route and thinking very long term about decisions, trying to see the big picture. How boring I have become. But I have much greater responsibilities now. The success of my company and its products depends on me making the right choices.

I love being the man in charge. I thrive on the responsibility, and I’m confident in myself to make the right choices. However, I’m modest enough to realize that I don’t know everything. Listening to somebody that disagrees with me helps me to make a better informed decision. I don’t become defensive; I become a sponge. I soak it up. It’s not my goal to change the other person’s mind, so arguing with them is futile. My goal is to succeed, so I listen and consider their comments.

A prime example of my mentality is this very blog. I don't post these rants because I think I know everything and I think I’m doing you a favor by sharing my wealth of wisdom. I post my ideas and opinions here with the hope that somebody will disagree with me. Not only disagree, but make an intelligent rebuttal. Open my eyes. Show me the error of my ways. Prove me wrong.

Are you up for the challenge? If so, read my last few posts and speak your mind. I'm dying to hear your opinion. Enlighten me!

Monday, August 23

How long will it take you to build this?

My colleague and I get this question a dozen times a day.

One day the big boss man pulls me aside and asks, "When I ask you how long it's going to take to build something, you tell me a long time but finish it quicker than expected. When I ask your colleague the same question, he tells me it will be quick, but takes a lot longer than expected. What's up with that?"

Well, there are several reasons, of course. First of all, allow me to explain why we give differing time estimations. I can't read my colleague's mind, so I'm speculating on his thought process, but I believe that he answers the question literally. He hears the word "build" and interprets it as "code". In his mind, he believes that he can "build" the project in so many hours, and he does. However, what the boss is really asking is how long it will take to discuss, design, code, test, review, change, discuss again, redesign, re-code, test again, package it up, document it if necessary, and deliver it to the client. These are all the factors that I consider when asked "how long?" which is why my time estimations are so much larger than my colleague's.

OK, so why then do I overestimate and he underestimates? Again, there are several factors. He underestimates because he is again answering the question literally. My colleague suspects it will take him a certain number of hours to complete a project, and he answers the question as if he were going to devote said hours to the project. However, what the boss is really asking is, "When can you finish this new project while still contributing to all your current projects and compensating for forthcoming interruptions such as impromptu meetings, phone calls, and cleaning the latest Outlook virus off my system?" These are the reasons that I overestimate; I anticipate a plethora of distractions, and when I'm lucky enough to get an hour or two of solid uninterrupted work done, I manage to deliver projects ahead of my anticipated schedule.

The real humor in all this is that we've been iterating this cycle for so many years now that the big boss man, admittedly, interpolates our time estimates and expects projects in half the time I declare and twice as long as my colleague proclaims.

Saturday, August 21

Exception Obfuscation: Hiding what really went wrong

In my last entry, I discussed my colleague's plea to pollute [sic] our precious four-year-old over-one-hundred-thousand-lines JDK-1.1-compliant code base with some fancy schmancy new collections APIs. Well, this week he's at it again with what he calls "exception chaining" and I call "exception obfuscation."

Exception

public Exception(Throwable cause)
Constructs a new exception with the specified cause and a detail message of (cause==null ? null : cause.toString()) (which typically contains the class and detail message of cause). This constructor is useful for exceptions that are little more than wrappers for other throwables (for example, PrivilegedActionException).

Parameters:
cause - the cause (which is saved for later retrieval by the Throwable.getCause() method). (A null value is permitted, and indicates that the cause is nonexistent or unknown.)
Since:
1.4
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Exception.html#Exception(java.lang.Throwable)

Yup, it's another whiz bang new feature in the JDK 1.4 API. You can take a perfectly good exception and obfuscate it inside another exception. And what, pray tell, does my colleague want to use this shiny new bauble for? He wants to take all those pesky methods that declare multiple exceptions, like IOException and SocketException and SQLException, and wrap them up into a neutral exception like SomethingBrokeException so that the calling classes don't have to declare catch blocks for all the possible types of exceptions; they can just catch the one generic exception.
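
To make the proposal concrete, here's a rough sketch of the wrapping he has in mind. SomethingBrokeException is the name used above; ReportLoader, readFile(), runQuery() and load() are hypothetical stand-ins I've invented for the example, and the boolean flags exist only to force a failure.

```java
import java.io.IOException;
import java.sql.SQLException;

// The neutral wrapper. It relies on the Exception(Throwable) constructor
// quoted above, which arrived in JDK 1.4.
class SomethingBrokeException extends Exception {
    SomethingBrokeException(Throwable cause) {
        super(cause);
    }
}

class ReportLoader {
    // Hypothetical low-level steps, each declaring its own checked exception.
    static void readFile(boolean fail) throws IOException {
        if (fail) throw new IOException("disk unplugged");
    }

    static void runQuery(boolean fail) throws SQLException {
        if (fail) throw new SQLException("table missing");
    }

    // The wrapped version: callers declare or catch exactly one type.
    // Unchecked exceptions are not caught here, so they still propagate.
    static void load(boolean failFile, boolean failQuery) throws SomethingBrokeException {
        try {
            readFile(failFile);
            runQuery(failQuery);
        } catch (IOException e) {
            throw new SomethingBrokeException(e);
        } catch (SQLException e) {
            throw new SomethingBrokeException(e);
        }
    }
}
```

Note that the catch blocks haven't gone away. They've just moved down a level, into load().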

"Well," you might ask, "why doesn't he just catch Exception? That will satisfy all the declared exceptions." Yes, but it will also encompass some undeclared exceptions, and when they get thrown, we want them to propagate, because they might indicate the system is in an unstable state, and in some cases it's better to crash than to keep processing and possibly cause even more damage.

OK, so he's got a valid complaint and a viable solution. What's the problem? The problem is what about calling classes (either present or future) that care about the type of exception thrown? What if the calling class wants/needs/knows-how to handle IOExceptions differently from SQLExceptions? Now it's catching a generic SomethingBrokeException, but it cares about what broke!

"Well," my colleague argues, "if the calling class really cares about what type of exception was the root of the problem, it can call the new getCause() function." So the point is that we can replace a try/catch block with an if/else block, and the calling classes that don't care about the problem can just catch the generic exception and thus save the programmer a little typing. Poppycock, I say!

I'm pretty big on analogies. Being a CTO, I commonly have to explain complex technological concepts to non-technical people (like my clients, and the CEO), and I always do so through analogies. So allow me to address this issue likewise: It's like you're a general contractor and I've hired you to build a new room onto my mansion. You've fitted the new room with contemporary five-prong electrical outlets so I can take advantage of the next generation of appliances. But since I don't have any such appliances, you're also offering to sell me some new-five-prong-to-old-three-prong adapters. Gee, thanks a lot! Now I've got a house with two different types of electrical outlets, and I have to keep a box full of adapters in the closet for special occasions. You've taken something that worked just fine, didn't make it work any better, and just made it harder for me to use with my existing tools, so that it might be easier to use with future tools.
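
Analogies aside, here's roughly what the two styles look like side by side for a caller that does care what broke. Everything in the Caller class (the method names, the returned strings, the trick of passing in the exception to throw) is invented for the comparison; only getCause() and the exception types come from the discussion above.

```java
import java.io.IOException;
import java.sql.SQLException;

class SomethingBrokeException extends Exception {
    SomethingBrokeException(Throwable cause) {
        super(cause);
    }
}

class Caller {
    // Hypothetical operation in the current style: it declares what it throws.
    static void unwrapped(Exception toThrow) throws IOException, SQLException {
        if (toThrow instanceof IOException) throw (IOException) toThrow;
        if (toThrow instanceof SQLException) throw (SQLException) toThrow;
    }

    // The same operation in the proposed style: everything is wrapped.
    static void wrapped(Exception toThrow) throws SomethingBrokeException {
        if (toThrow != null) throw new SomethingBrokeException(toThrow);
    }

    // Style 1: try/catch. The compiler insists every declared type is handled.
    static String handleWithCatchBlocks(Exception toThrow) {
        try {
            unwrapped(toThrow);
            return "ok";
        } catch (IOException e) {
            return "retry the I/O";
        } catch (SQLException e) {
            return "roll back";
        }
    }

    // Style 2: if/else on getCause(). A forgotten branch compiles just fine.
    static String handleWithGetCause(Exception toThrow) {
        try {
            wrapped(toThrow);
            return "ok";
        } catch (SomethingBrokeException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                return "retry the I/O";
            } else if (cause instanceof SQLException) {
                return "roll back";
            } else {
                return "unknown failure";
            }
        }
    }
}
```

Both handlers make the same decisions. The difference is that in the second one, nothing but the programmer's memory guarantees that the if/else chain covers every cause the wrapped method can produce.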

Tuesday, August 10

If it ain't broke, don't fix it

Last week I had a little debate with my right-hand man. We finally took the plunge and switched from Microsoft J++ to Eclipse as our primary IDE. This move required a massive change to the source tree due to a bug in Eclipse where it refuses to compile if you have a subdirectory with the same name as a class; but I digress. This isn't another blog rant about what's good and bad about Eclipse. We both put all our other work aside and spent the entire day performing a massive reorganization of the code tree... and this was a major pain in the ass thanks to Microsoft Visual SourceSafe; but again I am straying from the main topic. We made Eclipse happy, and it returned the favor.

So back to the debate. J++, being the dinosaur that it is, compiles and conforms to the 1.1 version of the JDK. Eclipse, on the other hand, supports the 1.4 version of the JDK. So my right-hand man sends me an instant message telling me that he's going to start using the newer 1.4 version of the collections API. That set off the klaxons in my head.

What's the harm in using the newer collections API? Well, before I answer that question, let me first ask what's wrong with the older versions?
  1. Are the older collections broken? Nope.
  2. Do the newer virtual machines support the older collections? Of course.
  3. I am aware that the newer collections offer [theoretical] performance gains by sacrificing thread safety, but our system has no performance problems, so I'll refer you to the quote by Joseph Newcomer in my prior blog post.
So there's certainly no need to switch to the new API, but why wouldn't we? Well, here are my reasons:
  1. Inconsistency within our code base. We have over 115,000 lines of Java code in our product. It's over four years old. If we start using the newer collection API now, there's inevitably going to be a situation where we are returned an old collection from an older object and have to convert it into a newer collection in order to pass it into a newer object. That means we'll need to build, or find and possibly license, a set of utility classes for converting from one collection type to another. This is discussed in an old article from Sun entitled Converting Between Old and New Collections. This headache can be avoided by continuing to use the older collections.
    Side note: I have another mantra about not passing generic collections in and out of functions, but that's a future rant.
  2. Portability. Yeah, the major platforms have ports of the latest JDK versions, but there's always the possibility that we'll have to port to a platform that's a little behind the curve. If that scenario ever surfaces, we'll have to re-code all the classes that take advantage of the shiny new toys in the later versions of the JDK.
So I had to exercise my seniority and put the kibosh on the new collections.
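
To give a flavor of the conversion glue described in reason 1, here's a minimal sketch. CollectionBridge and its methods are hypothetical names of my own; the java.util calls they lean on (the copy constructors, Collections.list and Collections.enumeration) are standard. It's written with generics against a current JDK, so treat it as an illustration of the idea rather than code that would have compiled on our 1.1 baseline.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

class CollectionBridge {
    // Hypothetical older object: hands back an old-style Vector.
    static Vector<String> legacyNames() {
        Vector<String> names = new Vector<>();
        names.addElement("Acme");
        names.addElement("Tyrell");
        return names;
    }

    // Hypothetical newer object: wants a new-style List.
    static int countNames(List<String> names) {
        return names.size();
    }

    // Old to new: copy a Vector into an ArrayList.
    static List<String> toList(Vector<String> old) {
        return new ArrayList<>(old);
    }

    // Old to new: drain an Enumeration into a List.
    static List<String> toList(Enumeration<String> old) {
        return Collections.list(old);
    }

    // New to old: copy any Collection into a Vector.
    static Vector<String> toVector(Collection<String> modern) {
        return new Vector<>(modern);
    }

    // New to old: view any Collection as an Enumeration.
    static Enumeration<String> toEnumeration(Collection<String> modern) {
        return Collections.enumeration(modern);
    }
}
```

A call site that straddles the two worlds then reads something like countNames(CollectionBridge.toList(legacyNames())), and every such seam in the code base needs one of these conversions.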

Monday, August 9

Quotes to Code By

Whenever I come across a quote that strikes me as insightful or witty, I print it out in a large font and stick it on my wall for visitors to appreciate. Here are some of my favorites...

Inspiration...
Good, better, best, never let it rest, until your good is better, and your better best.
-- Shepparton-based steel manufacturer J. Furphy & Sons

Second place is the first loser.
-- Anonymous

I want to achieve as much as I can in this sport, tactically outwitting the opposition to win. I want to time trial as fast as I can physically go. I want to be a key member of a strong team that can ride aggressively and win and make other riders suffer in pursuit.
-- Emma James, April 2002
Wisdom...
We can take the time to fix it now, or we can have this discussion again next month.
-- Thomas E. Davis

There is never time to do it right, but always time to do it over.
-- Anonymous

The trouble with doing something right the first time is that nobody appreciates how difficult it was.
-- Anonymous

With each passing year I realize that the prior year I didn't know jack shit.
-- Thomas E. Davis

Growth for the sake of growth is the ideology of a cancer cell.
-- Edward Abbey

...too many cooks working on code in the early days causes bad architecture. Software development works best when a single person creates the overall architecture and only later parcels out modules to different developers. And if you add developers too fast, development screeches to a halt, a phenomenon well understood since 1975.
-- Joel Spolsky (referring to Brooks’ The Mythical Man-Month)

Think about the design decisions you made a year ago.
Think about how ignorant they seem in retrospect.
Think about the decisions you are making today.
Think about how they will seem a year from now.
-- Thomas E. Davis
Compromises...
The cost of flexibility is complexity. Every time you put extra stuff into your code to make it more flexible, you are usually adding more complexity. If your guess about the flexibility needs of your software is wrong, you've only added complexity that makes it more difficult to change your software. You're obviously not getting the payback. The alternative is to use the Extreme Programming approach and not put the flexibility in at all. Extreme Programming says, since most of the time we get it wrong, just don't put the flexibility in there. If you strive to keep your design as simple as possible by avoiding speculative flexibility, then it's easier to change the code because you have less complication to deal with. The code is easier to understand and easier to change. As a result, you can make changes much more quickly.
-- Martin Fowler

Simplicity is about acknowledging the tricks exist but not using them.
-- Kent Beck

Optimization matters only when it matters. When it matters, it matters a lot, but until you know that it matters, don't waste a lot of time doing it. Even if you know it matters, you need to know where it matters. Without performance data, you won't know what to optimize, and you'll probably optimize the wrong thing. The result will be obscure, hard to write, hard to debug, and hard to maintain code that doesn't solve your problem. Thus it has the dual disadvantage of (a) increasing software development and software maintenance costs, and (b) having no performance effect at all.
-- Joseph Newcomer

Good. Fast. Cheap.
Pick any two!
-- Anonymous

I have worked with people who thought 80 hours a week made them better programmers, but from my perspective, they were so worn out that they got less done. Managers saw the long hours and were impressed by their dedication and loyalty, but all I saw was people spending hours on trivial problems because their brains were so fogged they were incapable of the five minutes of thought that would have pointed out a better solution.
-- Anonymous
Miscellany...
If you call its code,
it's an API.
If it calls your code,
it's a framework.
-- Simon Brunning

...while databases are slower, in many cases much slower, than procedural code, they have an important property: they can be used to answer unanticipated questions acceptably quickly. How quickly is acceptably quickly? Well, if the database can come back with an answer faster than it takes a skilled programmer to come up with a special purpose program to answer the question, it has done its job.
-- Anonymous post to slashdot.org
A disgruntled coworker...
"All this could be much better" should be our company motto.