Daily Archives: August 24, 2011

CS101: Clarity Trumps Everything

Clarity is the #1 priority when writing code. Clarity trumps everything else; it’s even more important than the code being correct! One of the biggest wins a serious programmer can offer is writing clear, readable code.

Source code is for the human reader; it isn’t written that way for the computer’s benefit. If it were up to the computer, humans would use the same ones and zeros it does. It would be much more efficient that way. But we (inefficient) humans need ways to write and talk about computer programs. We use many invented languages, computer languages, to do that. And when we write in these languages, it is crucial to be clear.

Clarity is more important than correctness (which is still very important!), because clear code can be fixed if it’s incorrect. Unclear code is harder to work with. It’s harder to find bugs in it, let alone fix them. Clear code also helps you debug as you write; you are more likely to catch your own errors when you write clear code.

Clarity is more important than comments (which are still very important!), because clear code is readable! Part of clarity is using good object names, and that helps the code be self-documenting. On the other hand, writing clear code means writing good comments, so don’t skip commenting just because you think you write very clear code.

To be honest, writing good comments is such a close second that it’s hard not to argue it’s tied for first. It’s nearly a wash as to which mess I’d rather not clean up: clear code with no comments, or code with good comments but no clarity. On balance, I think it would be easier to add my own comments to the clear code than to clear up the commented code.

Clarity is important because it enables change. Clear code has a longer life. Unclear code can be difficult (or impossible) to maintain. I have sometimes found it easier to throw code away and start fresh than to try to work with “a big ball of mud.” Clear code is maintainable and reusable.

The Lesson for Today

Always think about the next programmer who has to read and understand your code. That programmer is quite possibly you in six months when you have to revisit your code. Except now you’ve forgotten it all and it looks as new to you as to anyone else! Writing clear code benefits you when you revisit it as well as when you write it.

Write code that a human being can read. (Assume a human being who is a programmer and knows the language. You don’t have to write code for novices.)

Write a Formal Letter

Think of writing code as you think of writing a formal letter.

When you write a formal letter, you have two goals: you have a message to communicate, and you must follow the protocol of a formal letter. Your message comes through when your writing is clear and good. Following the protocol is a matter of knowing and following some syntax rules.

A message + formal syntax. The result is a document with a context.

Code is the same way.  It has a specific message to communicate, and there is a formal syntax the writing must follow.

Following the syntax of the programming language is just one part of the formality of writing code.  In a formal letter, every part has meaning. In some contexts (international diplomacy, for example), every sentence, every punctuation mark has meaning. Even the spacing between parts can mean something in some contexts.

So when you write code, think of it as a message you wish to communicate and make your code look clean and readable.

Literate Programming

Literate Programming is a specialized approach to writing code that enables a programmer to write at a more human, logical level, somewhat like a story. The source is designed to be very readable by humans, and literate programming tools translate it into both code for the computer and documentation for the reader. It’s a huge win when you have the language, tools and practice to use it.

In the meantime we can borrow from the idea of source code as a book that can be read sensibly.  A source code file should unfold from top to bottom like a story with a beginning, a middle and an end.  The idea is to give the reader a good understanding of the function and purpose as they read.

For example, in Java this applies to how you write your class and interface files. Where do you put the constructors? Where do you put the static properties and methods versus the non-static ones? Where do you put the get and set methods? Is there a specific place for standard methods such as toString, equals, compareTo and hashCode?

The real point is this: the order of things in a code file is important.  The order communicates information.  If there is no structure in your source code, if the order of things has no meaning, then you lose the chance to communicate to the reader. You lose the chance to give yourself a known environment! Skilled programmers use the ideas behind literate programming and structure their source code to communicate more richly.

This means that once you know a programmer’s structure, you can navigate their code better. You know where things are. You know what they are likely to be named. When code has structure and order, it has a familiar shape.
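
To make the Java questions above concrete, here is a minimal sketch of a class file with a fixed section order. The Version class and the particular order (constants, fields, constructors, factories, accessors, behaviour, standard methods) are assumptions made up for illustration; the point is only that some order is chosen and then repeated in every file.

```java
import java.util.Objects;

// Hypothetical class; the numbered section order is one possible convention.
final class Version implements Comparable<Version> {

    // 1. Static constants
    static final Version INITIAL = new Version(1, 0);

    // 2. Instance fields
    private final int major;
    private final int minor;

    // 3. Constructors
    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // 4. Static factory methods
    static Version parse(String text) {
        // Expects text of the form "major.minor", for example "2.5".
        String[] parts = text.split("\\.");
        return new Version(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // 5. Accessors
    int getMajor() { return major; }
    int getMinor() { return minor; }

    // 6. Behaviour
    Version nextMinor() { return new Version(major, minor + 1); }

    // 7. Standard methods, always last and always in the same order
    @Override
    public String toString() { return major + "." + minor; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Version)) return false;
        Version that = (Version) other;
        return major == that.major && minor == that.minor;
    }

    @Override
    public int hashCode() { return Objects.hash(major, minor); }

    @Override
    public int compareTo(Version that) {
        int byMajor = Integer.compare(major, that.major);
        return byMajor != 0 ? byMajor : Integer.compare(minor, that.minor);
    }
}
```

A reader who has seen one file laid out this way knows that toString sits near the bottom of the next one, and that anything static and constant is at the very top.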

Clarity, Clarity, Clarity

The order of things is one way you make source code clear. I’ve also mentioned the importance of good comments. (I’ll talk about good comments vs bad comments another time.) There are other important ways you write clear code.

The names of your variables go a long way in communicating your intent. The compiler couldn’t care less, but a longer, descriptive name can be a blessing for a human reader. Don’t be afraid to use descriptive variable names. The days of compilers restricting name length are safely behind us, and editing environments help remember and type names for us. Besides, programmers should be good typists, right?

At the same time, don’t go nuts. Verbosely long names can make the code look cluttered and ungainly. You want some degree of streamlining; that’s a part of clarity, too.
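
As a small illustration of the difference names make, the sketch below computes the same value twice. The InvoiceMath class, its methods and its parameters are all hypothetical; only the naming differs between the two versions.

```java
// Hypothetical example: identical arithmetic, different names.
final class InvoiceMath {

    // Terse: correct, but the reader has to reverse-engineer what a, b and n mean.
    static double calc(double a, double b, int n) {
        return a * n * (1 + b);
    }

    // Descriptive: the signature and the intermediate name state the intent.
    static double totalWithTax(double unitPrice, double taxRate, int quantity) {
        double subtotal = unitPrice * quantity;
        return subtotal * (1 + taxRate);
    }
}
```

Going too far in the other direction, with something like totalPriceOfAllUnitsIncludingApplicableSalesTax, buries the same one-line formula under its own name.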

The thing about variable names is that you want a naming convention. You want a way of going about the business of variable names that gives you the same name in the same situation. I’ll talk more about that another time, but one example is that my loop indexes are always ix. The i changes to other letters, but the two-letter name ending in x is a standard I use faithfully.
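
Here is a rough sketch of what that loop-index convention might look like in a nested loop. The GridMath class is made up for illustration, and using jx for the inner loop is an assumption about how the "two letters ending in x" rule extends.

```java
// Hypothetical example of two-letter loop indexes ending in x.
final class GridMath {

    // Sums every cell of a (possibly jagged) two-dimensional array.
    static int sum(int[][] grid) {
        int total = 0;
        for (int ix = 0; ix < grid.length; ix++) {           // outer index: ix
            for (int jx = 0; jx < grid[ix].length; jx++) {   // inner index: jx
                total += grid[ix][jx];
            }
        }
        return total;
    }
}
```

One side benefit of a convention like this is that a text search for ix turns up far fewer false matches than a search for a bare i.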

After years of doing this, I sometimes surprise myself by how well variable names match across a project. Sometimes on a large project I’ll start to create a new class and then realize something like it already exists in the project or in a library. If it’s one I made in the last few years, chances are the one I just started looks surprisingly similar. Source code is one place where you want to be predictable. You want some moves to be instinctive.

A more subtle form of clarity, one that takes experience, has to do with the form of the code. The way the code goes about its business can be opaque or transparent. Source code should be as transparent in its intention as possible. Source code should speak its mind about what it’s doing. There isn’t space here to go into that, but I’m sure I’ll return to the topic often down the road.
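
As one small, hedged illustration of opaque versus transparent form, the sketch below writes the Gregorian leap-year rule two ways. The LeapYears class and its method names are invented for this example; both methods return the same answer for every input.

```java
// Hypothetical example: the same rule, written opaquely and transparently.
final class LeapYears {

    // Opaque: correct, but the rule is buried in the arithmetic.
    static boolean check(int y) {
        return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
    }

    // Transparent: each part of the rule gets a name, and the return
    // statement reads almost like the rule spoken aloud.
    static boolean isLeapYear(int year) {
        boolean divisibleBy4 = year % 4 == 0;
        boolean isCenturyYear = year % 100 == 0;
        boolean divisibleBy400 = year % 400 == 0;
        return divisibleBy4 && (!isCenturyYear || divisibleBy400);
    }
}
```

The second version is longer, but a reader can check it against the rule without working through the modular arithmetic in their head.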

Let me end with an example of a very simple form of clarity. It has to do with lining things up in your source code. Compare the two variable lists below:

    protected Logger myLog;
    protected HashMap<String,String> myMappedFields;
    protected HashMap<String,QueryChild> myChildEntityTypes;
    protected String myInstanceId;
    protected WebServiceRequest myRequest;
    protected SAXOnDemandMapping myFieldMap;
    protected SAXOnDemandEntity myCurrentEntity;

Just aligning the names makes the code easier to read:

    protected Logger                       myLog;
    protected HashMap<String,String>       myMappedFields;
    protected HashMap<String,QueryChild>   myChildEntityTypes;
    protected String                       myInstanceId;
    protected WebServiceRequest            myRequest;
    protected SAXOnDemandMapping           myFieldMap;
    protected SAXOnDemandEntity            myCurrentEntity;

It all boils down to the first rule:

Always write source code to be read!