Embrace Fear

I used to be scared of databases.

They’re the closest thing to a black box that I deal with on a daily basis.

Cryptic messages point to malformed queries; debugging consists of PRINT statements; and learning the inner workings is akin to learning another computer architecture.

Databases were foreign animals to me. I open up a box to another world; another construct; and quite like the Matrix: I have no idea how it works.

There was my problem. Right there. In bold, even.

I wasn’t particularly proud of this fact.

If you’re a professional software developer, you’re probably on the wrong end of the database-skill curve; I know I [still] am.

For developers, SQL and Databases are a tool. We’d use XML and XSLT if it were more performant.  There’s an entire industry that uses Object Relational Mapping to get around SQL and Databases. But the problem isn’t with ORMs, it’s when we use them because we fear the unknown.

Verily I say unto you:

Embrace this fear. Let it work for you.

Without it, we’d blindly write queries; we may not parameterize variable input; and we might do something stupid, like mis-sizing a database field.

When you aren’t afraid of the unknown anymore, it’s time to question why. That fear is healthy; ignoring it isn’t.
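To make the ‘parameterize variable input’ point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The users table, the find_user_ids helper, and the hostile input string are all made up for illustration; the point is only that a placeholder sends the input to the database as data rather than as SQL text.

```python
import sqlite3


def find_user_ids(conn, name):
    # The ? placeholder binds `name` as a value; it is never spliced
    # into the SQL string, so quotes inside it have no special meaning.
    rows = conn.execute(
        "SELECT id FROM users WHERE name = ? ORDER BY id", (name,)
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical in-memory table, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

safe_ids = find_user_ids(conn, "alice")
# A classic injection attempt is treated as an odd-looking name and matches nothing.
hostile_ids = find_user_ids(conn, "alice' OR '1'='1")
```

Had the query been built by string concatenation, the second call would have matched every row in the table.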

Configuration Pain – Creating your own Custom Configuration Section

 

Recently I was asked to implement a ‘smarter’ application for managing temporary files on our company’s webfarm. We use a number of third-party tools that don’t clean up after themselves very well (or we may just be using them incorrectly), and as a result we frequently run low on disk space because viewstate files and the temporary files of other third-party utilities are never cleaned up.

Goal: Build an application that can be configured by the end user to delete directories and files as needed (like viewstate, activePDF temp files) and run as a scheduled task. Any logging should be done to the Windows event log.

There are a number of methods that could be used to enable the user to set up their configuration options, but in keeping with the .NET model for configuration, I picked using XML and the app.config to store the settings using a custom configuration section.

I ran into an issue during the course of development because I had the wrong XML layout. I had written each folder with child elements:

<folder>
     <path>D:\MyPath\myFolder</path>
     <description>MyDescriptionhere</description>
     <fileType>*.txt</fileType>
</folder>

instead of with attributes:

<folders>
     <folder id="1" path="D:\MyPath\myFolder" description="MyDescriptionhere" fileType="*.txt" />
</folders>

This creates a problem because of how .NET handles custom sections. As far as I can tell, the configuration API reads each ConfigurationProperty from an XML attribute, so it doesn’t like the child-element syntax of the former. Instead of writing code to serialize and deserialize it myself, I chose to stick with the configuration API as written. I had that choice because I was implementing a new ‘feature’; if you’re maintaining a legacy project, you may not have that luxury.

Once you know the format your XML will take, it’s a simple matter of creating the custom configuration section classes in C#. The first class is the custom configuration section itself. A class also needs to be created for each collection of XML items (the list of folders in <folders>), and another for each element and its properties (the folder element and its path, description, and fileType properties).

 

using System;
using System.Configuration;

namespace DelViewState_CS
{
    // Maps the custom <delViewStateConfigSettings> section of app.config.
    public class DelViewStateConfigSettings : ConfigurationSection
    {
        // The <folders> collection of <folder> elements.
        [ConfigurationProperty("folders", IsDefaultCollection = true)]
        public FoldersCollection Folders
        {
            get { return (FoldersCollection)base["folders"]; }
        }

        // The required timeout attribute on the section element.
        [ConfigurationProperty("timeout", IsRequired = true)]
        public int Timeout
        {
            get { return (int)base["timeout"]; }
            set { base["timeout"] = value; }
        }
    }
}

 

The configuration section has two properties: Timeout (a poorly named legacy property that indicates how long it’s been since a file was last accessed) and Folders, a FoldersCollection that contains the individual folders and their properties.

Here is the C# code for the FoldersCollection class:

using System;
using System.Configuration;

namespace DelViewState_CS
{
    // The <folders> element: a collection of <folder> elements keyed by id.
    public sealed class FoldersCollection : ConfigurationElementCollection
    {
        protected override ConfigurationElement CreateNewElement()
        {
            return new FoldersElement();
        }

        // The integer id attribute is the key of each element.
        protected override object GetElementKey(ConfigurationElement element)
        {
            return ((FoldersElement)element).Id;
        }

        public override ConfigurationElementCollectionType CollectionType
        {
            get { return ConfigurationElementCollectionType.BasicMap; }
        }

        // Name of the child elements inside <folders>.
        protected override string ElementName
        {
            get { return "folder"; }
        }

        public FoldersElement this[int index]
        {
            get { return (FoldersElement)BaseGet(index); }
            set
            {
                if (BaseGet(index) != null)
                {
                    BaseRemoveAt(index);
                }
                BaseAdd(index, value);
            }
        }

        // Look a folder up by its path. BaseGet(object) searches by element key
        // (the integer id), so a path has to be matched by walking the collection.
        new public FoldersElement this[string path]
        {
            get
            {
                foreach (FoldersElement folder in this)
                {
                    if (folder.Path == path)
                    {
                        return folder;
                    }
                }
                return null;
            }
        }

        public bool ContainsKey(string key)
        {
            // Keys are boxed ints (the id attribute), so compare their string
            // form; casting a boxed int to string would throw.
            foreach (object obj in BaseGetAllKeys())
            {
                if (obj.ToString() == key)
                {
                    return true;
                }
            }
            return false;
        }
    }
}

The final class we have to write is the folder element itself.  This contains the necessary properties of the folder element:

 

using System;
using System.Configuration;

namespace DelViewState_CS
{
    // A single <folder> element; each property maps to an XML attribute.
    public sealed class FoldersElement : ConfigurationElement
    {
        // Unique key of the folder within the collection.
        [ConfigurationProperty("id", IsRequired = true, IsKey = true)]
        public int Id
        {
            get { return (int)base["id"]; }
            set { base["id"] = value; }
        }

        [ConfigurationProperty("path", IsRequired = true)]
        public string Path
        {
            get { return (string)base["path"]; }
            set { base["path"] = value; }
        }

        [ConfigurationProperty("description", IsRequired = false)]
        public string Description
        {
            get { return (string)base["description"]; }
            set { base["description"] = value; }
        }

        // File pattern to clean up, e.g. *.txt
        [ConfigurationProperty("fileType", IsRequired = false)]
        public string FileType
        {
            get { return (string)base["fileType"]; }
            set { base["fileType"] = value; }
        }
    }
}

Once all of that is written, it’s a simple matter of accessing the custom section and reading these properties (and in my case, deleting the files associated with them). Accessing the custom configuration section is as simple as creating a static factory method in the customconfigurationsection.cs file:

public static DelViewStateConfigSettings GetDelViewStateConfigSettingsSection()
{
    // Open the app.config that belongs to the running executable.
    System.Configuration.Configuration config =
        System.Configuration.ConfigurationManager.OpenExeConfiguration(
            System.Configuration.ConfigurationUserLevel.None);

    // The path is "sectionGroup/section"; the cast yields null if the
    // section is missing or is of a different type.
    return config.GetSection("delViewStateConfigSettingsGroup/delViewStateConfigSettings")
        as DelViewStateConfigSettings;
}
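The post doesn’t show the cleanup pass that consumes these settings, so here is a rough, language-neutral sketch of the idea in Python rather than C#. Everything in it is an assumption for illustration: the find_stale_files name, the dictionaries standing in for folder elements, and especially the unit of the timeout (treated here as days since last access). It only lists candidates; deleting them and logging is left to the caller.

```python
import fnmatch
import os
import time


def find_stale_files(folders, max_age_days, now=None):
    """Return files in the configured folders not accessed for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    stale = []
    for folder in folders:
        # A missing or empty fileType means "every file in the folder".
        pattern = folder.get("fileType") or "*"
        for name in sorted(os.listdir(folder["path"])):
            full_path = os.path.join(folder["path"], name)
            if (
                os.path.isfile(full_path)
                and fnmatch.fnmatch(name, pattern)
                and os.stat(full_path).st_atime < cutoff
            ):
                stale.append(full_path)
    return stale
```

Keeping the search separate from the delete makes it easy to do a dry run first and to log each file before removing it.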

 

There are plenty of methods for writing to the event log, but I followed the guidelines laid out here.

In the end, it took a little over a week from the time that I first learned about the task to the time it was completed (testing, checkin, cleanup).  There are a number of ways it could be made better:

  • Using delegates and anonymous methods to shorten the written code that deletes the files and writes to the event log
  • Pull the custom section into its own XML file that is referenced in the app.config, to allow for drag and drop placement.
  • Checking the files for being ‘locked’ (which happens sometimes) and working around that.  For the time being, I’m logging those happenings to the event log so I can find out why they’re happening and fix the actual problem.

There are more improvements to come. In the next blog post on this topic, I’ll talk about enhancing this utility by getting it to scope out IIS virtual directories and auto-fill that information.

Who let the Comments out?

There are a multitude of blog posts out there that detail when you should and should not comment, but Jeff Atwood of Coding Horror puts it best:

Code can only tell you how the program works; comments can tell you why it works. Try not to shortchange your fellow developers in either area.

We’ve all heard tales of college students whose professors required that they extensively document their code, sometimes at the risk of a letter grade.

Once upon a time, in a land not so far away, a diligent young computer science student sat eagerly attentive in her C++ programming class. Her formidable instructor was droning on and on about how Programming Comments were the basis of all that is good in the world of software design. Then at last, those famous words, “If you do not comment your code, you will automatically drop one (1) letter grade.”

Academia is one thing, but what about programming texts? Shouldn’t they advocate the sane way of commenting code?

I wish.

I pulled open my old C++ book today, just to get a refresher on function pointers and things I’d forgotten (since C# makes use of function pointers in anonymous methods and lambda expressions), and happened to open up to the page on comments.

There were no fewer than eight bullet points on commenting code. From Structured & Object-Oriented Problem Solving using C++, Third Edition by Andrew C. Staugaard:

 

Debugging Tip

Programming Comments are an important part of the program documentation and should be used liberally (emphasis mine).  […] At a minimum, the program should include the following comments:

  • The beginning of the program should be commented with the programmer’s name, date the program was written, date the program was last revised, and the name of the person doing the revision.
  • The beginning of the program should be commented to explain the purpose of the program, which includes the problem definition and program algorithms.  This provides an overall perspective by which anyone, including you, can begin debugging or maintaining the program.
  • Preprocessor directives should be commented as to their purpose.
  • Constant and variable definitions should be commented as to their purpose.
  • Major sections of the program should be commented to explain the overall purpose of the respective section.
  • Individual program lines should be commented when the purpose of the code is not obvious relative to the application.
  • All major subprograms (functions in C++) should be commented just like the main program function.
  • The end of each program block (right curly brace) should be commented to indicate what the brace is ending.

Well, at least he’s thorough. Some of the advice is sage, and other parts are downright redundant in the real world if you’re using source control and anything but Notepad.

So here’s my revised list — directed at that very same college student who is stuck reading that text.

 

Debugging Tip

Program Comments are an important part of the program documentation and should be used to help any programmer understand what is happening. […] Use the following guidelines:

  • The beginning of the program should be commented with the programmer’s name, date the program was written, date the program was last revised, and the name of the person doing the revision.
  • The beginning of the program should be commented to explain the purpose of the program, which includes the problem definition and program algorithms.  This provides an overall perspective by which anyone, including you, can begin debugging or maintaining the program.
  • Preprocessor directives should be commented as to their purpose, and directives should not change the meaning of reserved words.
  • Constant and Variable definitions should have a clear name that explains their purpose, so comments aren’t necessary.
  • Major sections of the program should be commented to explain the overall purpose of the respective section when its purpose isn’t reasonably clear.
  • All major subprograms (functions in C++) should be commented just like the main program function.
  • The end of each program block (right curly brace) should be commented to indicate what the brace is ending.
  • Write code that explains itself. Comments should only be used to tell someone WHY you did something, not what you did. That’s what code is for.
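As a small illustration of that last point, here is a sketch in Python. The retry_delay_seconds function and the payment gateway it mentions are hypothetical; what matters is the difference between a comment that repeats the code and one that records the reason behind it.

```python
def retry_delay_seconds(attempt):
    # A "what" comment would just restate the line below:
    #   "raise 2 to the power of attempt and cap it at 60"
    #
    # A "why" comment records what the code cannot say:
    # the (hypothetical) payment gateway locks an account after rapid
    # retries, so back off exponentially and never wait over a minute.
    return min(2 ** attempt, 60)
```

The name carries the what (a retry delay, in seconds); the comment only has to carry the why.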

Instead of using the beginning of the textbook to discuss basics that any CS101 book should cover, I’d use it to cover source control, and the integral part it plays in keeping track of code changes — something that the author wanted the bullet points to do, but which don’t make sense in modern source control systems.

If you have new developers on your team, do them (and yourself) a favor and point them to one of the references I mentioned. Friends don’t let friends comment liberally!

NB: These ideas aren’t original to me; I got them from reading Coding Horror and reading Code Complete.

 

The Cost of OpenID

I was chatting with a friend today about implementing OpenID for a possible startup idea, and we talked about the cost of OpenID vs its benefits.

The conclusion of the chat was that if you’re a bootstrapped startup, it almost never makes sense to spend the time and money implementing your own authentication system.

Don’t take my word for it; look at the numbers.

Let’s say the average software developer in your area makes just $65,000 a year (or $34 per hour)*. If they build their own authentication system from the ground up, you could be looking at between 80 and 120 hours of time spent on it. At the top of that range, given a 60% productivity rate, that comes out to 72 productive hours on the project, or $4080 (120 hours at $34) spent on programming this one part of the system. That’s a conservative estimate. There are many ways to do authentication, and everyone gets it wrong sometimes.

What’s the cost of implementing OpenID? Five hours’ worth of initial work and $500 at most? If you went with the ‘Plus’ option, it’s only $100 per year; that’s a savings of roughly $3800!

That’s $3800 that could be used for far more important things than building an authentication system, and it’s one less part of the system you have to maintain in the future — which represents an immeasurable cost savings towards other projects.
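For anyone who wants to check the arithmetic, here it is as a short Python sketch. The figures are the ones quoted above; treating the 120-hour upper bound as the billed time and counting only the first year of the ‘Plus’ plan are my assumptions about how the numbers fit together.

```python
HOURLY_RATE = 34             # dollars per hour, as quoted above
ROLL_YOUR_OWN_HOURS = 120    # upper end of the 80-120 hour estimate
PRODUCTIVITY_PERCENT = 60
OPENID_SETUP_HOURS = 5
OPENID_PLAN_PER_YEAR = 100   # the 'Plus' option, first year only

# Hours of real progress inside the 120 hours that get paid for.
productive_hours = ROLL_YOUR_OWN_HOURS * PRODUCTIVITY_PERCENT // 100

roll_your_own_cost = ROLL_YOUR_OWN_HOURS * HOURLY_RATE
openid_cost = OPENID_SETUP_HOURS * HOURLY_RATE + OPENID_PLAN_PER_YEAR
savings = roll_your_own_cost - openid_cost

print(productive_hours, roll_your_own_cost, openid_cost, savings)
```

Swap in your own local rate and estimates; the gap stays wide unless the integration work balloons well past five hours.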

In most cases, the cost doesn’t justify the benefit of rolling your own system. Let someone else take care of it.

No Aliases in Where Clauses

I was bitten by the ‘no aliases allowed in WHERE clauses’ rule. It also explained Linq-to-SQL’s seemingly strange syntax at the same time.

SQL doesn’t process a query in the order developers write it. SELECT is written first but evaluated near the end:

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause
  6. ORDER BY clause

 

So, when you have an aliased column, the WHERE clause can’t use it, because WHERE is evaluated before SELECT has defined what the alias means. So what happens when you want to use the aliased column in the WHERE clause?

There’s a hack you can use:

Wrap the statement in an outer SELECT, and put the WHERE clause that uses the alias on that outer query.

For instance:

select * from ( select a + b as aliased_column from my_table ) dt where dt.aliased_column = whateveryouputhere

It does it rather nicely. You can find the question and answer (not mine) on Stack Overflow.
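Here is the wrapped form as a runnable sketch, using Python’s built-in sqlite3 module. The readings table and its values are invented for the example. One caveat: as far as I know SQLite is lenient and tolerates an alias in WHERE anyway, so this only demonstrates the derived-table form that also works on stricter engines.

```python
import sqlite3

# Hypothetical table, in memory, just to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1, 2), (2, 2), (5, 5)])

# The inner SELECT defines the alias; by the time the outer WHERE runs,
# aliased_column is an ordinary column of the derived table dt.
rows = conn.execute(
    """
    SELECT dt.aliased_column
    FROM (SELECT a + b AS aliased_column FROM readings) AS dt
    WHERE dt.aliased_column = ?
    """,
    (4,),
).fetchall()
```

Only the (2, 2) row sums to 4, so that is the only row that comes back.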

 

Take the Long View

I was at the bank this afternoon depositing much-needed money into my very dry bank account. At about 3:50pm, all of the computers in the bank suddenly slowed to a crawl. We stood there at our respective tellers, glancing around for five minutes while the tellers asked each other if they were able to ‘get in’. They apologized, customers sighed, and generally no one got anything done until the ‘computers’ came back up.

Being a software guy, my first instinct is to blame the programmers.  The tired, overworked developers who were tasked with developing and deploying Version 2 Of The Bank Software; these developers (if they’re anything like any other programmer out there) probably had a budget that was overrun, a deadline that was missed, and a manager breathing down their neck.  They probably developed the application in some easy to use language for rapid development, and if they’re anything like me (and every other programmer out there) they aren’t great programmers. In fact, if you want to get statistical, there might be 1/3 of them that are ‘above average’, 1/3 ‘average’, and 1/3 that downright suck.

They probably work in a cubicle farm as part of a consulting firm, and they’re probably the lowest bidder for the project.

Now, barely any of the decisions made are the developer’s fault. And the business people at the bank probably thought they were getting the best bang for their buck — and they did.  The consulting firm probably guarantees top-notch quality and superior customer service; and the developers probably read The Pragmatic Programmer religiously.

With all of the business decisions that were made, the bank probably didn’t put an emphasis on ‘fast’. They probably didn’t say, “We need it in one year, for this cost, and we need it to be ‘fast’.” They probably assumed (much like you and I would if we weren’t programmers) that it would be lightning fast. Or maybe they’re so used to substandard application speed (I’m looking at you, Outlook) that they don’t even bat an eyelash when the system goes ‘down’ for five minutes on a busy Friday afternoon. 

The question is, Why.  Why does the system slow to a crawl on Friday afternoons?

Because business people made decisions based on the near-term; based on contractual obligations, based on cost; and based on maximizing profit.

Probably no one even thought about the customer who has to wait in line.

But what’s the effect? So a customer has to wait for five minutes — no big deal, right?

Well, if that customer is his own business person, then that’s five minutes less they have to do any work. That’s five minutes of their life that isn’t spent doing whatever their business is.

Then there’s the multiplier effect of that. Someone who was waiting for that person now waits 5 extra minutes, and someone else waits five extra minutes because that person was waiting on person #1, and so on.

How do you quantify that?

Quantify it as lost business.  That three weeks you could spend delaying deployment and load testing your application could save you money when that business decides to use you for its next version.

As a developer, you’ve got to do your learning outside of work. No longer is yours a nine-to-five job. It is a font of continual learning. Spend an hour a night just catching up on the intricacies of your programming language. Premature optimization is the root of all evil, to be sure; but skipping the optimization that would save your bank’s customers time is a recipe for never getting that bank as a customer again.

Get out of your cubicle farm, and see your software in action. Spend five minutes on a Friday afternoon with your customer and see whether or not their software slows down inexplicably. If it does, fix it. If it doesn’t, congratulations; you’re in the minority.

Bad Acting

During my time in the Army, I used to get rankled when I’d see Television or Movies portray soldiers.  Generally, they’d get something wrong; and usually, it wasn’t just something minor.  It didn’t usually bug me when it was clear they weren’t trying to get it right. It bothered the hell out of me when it looked like they tried to get it right, but failed in some way that could have been prevented if they would have spent $1000 and hired a military consultant for a day.

The same thing bothers me about naming in source code.  Well, not the military part, but the ‘bad acting’ part.  Here are a few bad names I’ve seen in source code recently, and they are all from commercial applications.

timeout  – Describes how old a file should be before it’s deleted

narrative – Refers to the ‘help’ field for a form or page.

nounset – where noun can be anything, like ‘bird’ or ‘category’ or ’email’ or ‘password’. The suffix ‘set’ is added to it to denote a ‘set’ of that noun. In some cases, it refers to a ‘template’.

That’s just a few; and there are more out there, probably in the application you work on.

What’s so wrong with bad naming?

Let’s say you have a new developer come onto your team. It doesn’t matter if this developer is just out of college or if he’s a 10-year veteran. He now has to learn what those words mean. He then has to map those names to the intent behind them, and remind himself daily of what they mean.

Better names would have been:

timeout – fileage

narrative – instructions

nounset – nouns, or nounTemplate (in databases, a table is already considered a collection of things, so adding ‘set’ is redundant).

Teach Programmers to Fish

Last week, Jeff Atwood and Joel Spolsky discussed open-sourcing Stack Overflow. This week, Micah Martin (of Coding Context) asked a Stack Overflow question about releasing his site, Wikipedia Maze, as Open Source. 

Both had the fundamental question:

Should I release my code to the open-source community?

Short answer: Just releasing a project as Open Source is a mistake, for the same reason that handing a hungry man a fish is a mistake.

I’ve trolled SourceForge for years, but my interest in the myriad of projects up there is precisely zero. Why? Because handing me code doesn’t tell me why it’s useful to me. It’s like putting me in a library with no card catalogue.

I’m not saying that you should never release a project as open-source, but that if you’re going to do so, it’s your job to make it worth something to other programmers, because the best code in the world isn’t worth anything if no one knows it’s there.

Going Fishing

Consider this gem from Quake:

float InvSqrt (float x){
    float xhalf = 0.5f*x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i>>1);
    x = *(float*)&i;
    x = x*(1.5f - xhalf*x*x);
    return x;
}

If you were a smart programmer who got things done, you’d probably instantly recognize this for what it is: Awesomeness wrapped in bacon.

For the rest of us, it’s a clever piece of code that’s discussed in detail here:

The magic of the code, even if you can’t follow it, stands out as the i = 0x5f3759df – (i>>1); line. Simplified, Newton-Raphson is an approximation that starts off with a guess and refines it with iteration. Taking advantage of the nature of 32-bit x86 processors, i, an integer, is initially set to the value of the floating point number you want to take the inverse square of, using an integer cast. i is then set to 0x5f3759df, minus itself shifted one bit to the right. The right shift drops the least significant bit of i, essentially halving it.

Using the integer cast of the seeded value, i is reused and the initial guess for Newton is calculated using the magic seed value minus a free divide by 2 courtesy of the CPU.
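If you’d rather poke at it than read about it, here is a rough Python port of the same trick. It is a sketch, not the Quake code: struct.pack and struct.unpack stand in for the pointer casts, the inv_sqrt name is mine, and Python does the final arithmetic in double precision, so the low bits will differ slightly from the C version.

```python
import math
import struct


def inv_sqrt(x):
    """Approximate 1/sqrt(x) for positive x with the magic-constant trick."""
    xhalf = 0.5 * x
    # Reinterpret the 32-bit float's bits as a signed 32-bit integer.
    i = struct.unpack("<i", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives the initial guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a float again.
    y = struct.unpack("<f", struct.pack("<i", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - xhalf * y * y)


def relative_error(x):
    # How far the approximation is from the exact value, as a fraction.
    exact = 1.0 / math.sqrt(x)
    return abs(inv_sqrt(x) - exact) / exact
```

Calling relative_error on a handful of positive inputs shows the single Newton step already lands within a fraction of a percent of the exact answer.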

If you were like most programmers and you scrolled across this function, you’d probably wonder what the heck it did. In fact, unless someone took the time to tell you, you might even skip over it.

 

But what if you saw the article on Slashdot? Even if you had no idea about it beforehand, seeing more than just the code immediately helps you as a programmer.

The end result?

You are now a better programmer because someone took the time to explain the code, instead of just releasing it into the wild.

Teach Me to Fish

If you’re going to release your site as open-source, then go the extra steps and take the users through the problems you had. Make it into an n-part blog post that details the neat parts. Write a book. Record a Podcast. Do something other than just releasing the code.

There are plenty of lessons we could learn from Stack Overflow and WikipediaMaze, but seeing the end result isn’t as useful to other programmers as taking them through your code, and showing them why you made the decisions you did. You get recognition and Google traffic, and the programming community learns how to fish.

Build your Company’s online Brand, before someone else does

By now, you may have heard about Adam Savage’s skirmish with AT&T over a very familiar story: Man gets overcharged for internet access, takes case to internet, and wins.

His Tweet (a message on Twitter) simply read: “AT&T is attempting to charge me 11k for a few hours of web surfing in Canada.”

His message spread amazingly quickly on Twitter. Within three hours, AT&T was the second most discussed topic on Twitter, second only to Michael Jackson.

By the end of the day, the carrier was “very gracious about taking care of it all,” Savage said, deciding to free him of those costs.

So holds the power of the internet.  This sort of thing has happened before and it will happen again. It’s even happened to me.

The funny thing is, the age old adage “There’s no such thing as bad press” isn’t true anymore, and companies like AT&T and Verizon know just how easy it is to lose customers. Why do you think they require service contracts?

I’ve been known to jump from one service provider to the other if they start treating me badly, and with this newest text messaging problem, I probably won’t jump back to AT&T any time soon. I’ll just wait for the iPhone to become a non-exclusive item.

The complainers, the people that bad-mouth your service, now have a voice. A very loud, very obnoxious, and very addictive voice. You used to be able to dismiss them out of hand; but now one bad service call with one customer can provide years of online enjoyment.

So, what do you do?

You can’t make everyone happy; that’s obvious. But you can take those singular “We messed up” moments, and turn them into something good. In Soccer, we call those pivotal moments in games the Moment of Truth. It is the occasion when the referee has to make a decision that will forever shape players’ perception of him throughout the match (and possibly his career). It happens in almost every match, and getting that singular moment right can do a lot to repair or extend credibility where you might otherwise have little.

In software, that means openness. Not the Government 2.0 kind that really isn’t, but the kind where you address complaints and complainers head on. You can no longer pretend that they don’t exist. Steve Krug wrote in Don’t Make Me Think about how company FAQ pages never really tell you anything you really want to know, and he’s right: they ought to. Businesses now must hire ‘Social Media’ experts or ‘Community Managers’ to address customer service issues in this new age of the internet. As Jeff Atwood writes:

To have a personal brand, you must do something remarkable:

 

  • lead a user group
  • create a popular open-source project
  • write a blog
  • publish a book
  • publish articles
  • speak at conferences

Do whatever you like. Pick one, pick them all, or pick something that’s not on this list.* As long as it’s public, and it advances your skills, you’re creating a personal brand. And that will help your career far more than technical chops ever will.

His advice is as salient for businesses as it is for individuals (and software companies doubly so!). What can businesses do to build their online brand?

  • participate in social media (not just press releases, please)
  • encourage your employees to participate in social media
  • only censor the business proprietary stuff
  • let the warts shine through
  • become personable

Having a ‘blessed’ social media account is not the same thing as 5,000 employees having blogs and Twitter accounts. One voice is meaningless, but a sea of employees generally talking about their day/life/work makes me interested in those things. I never gave Dell a second thought until I saw that they have a lot of employees with Twitter accounts. Now their brand is more personal to me. Microsoft was an early adopter of this, encouraging its employees to write blogs. The question is, why isn’t your company building its online brand through employee crowd-sourcing?

If you don’t build your company’s online brand and keep it polished, someone will tarnish it, and you’ll only have yourself to blame.

Website Registration … (Redux)

In our last wonderful adventure, I lamented the state of website registration, and took umbrage with SQL Server Central’s approach of requiring a user to log in before viewing the article.

Since then, there have been two developments:

  • SQL Server Central’s registration requirement can be beaten with Firebug.
  • The CEO of Red Gate Software personally* told me that they’d get rid of the registration requirements for that website.

All it took was Joel Spolsky mentioning it on his Google talk about Stack Overflow. 

As an aside, while I’m very pleased with Red Gate’s move, I can’t believe it took a celebrity calling them out for them to change their tune.  If they had taken an afternoon and conducted a usability test, I find it hard to believe they wouldn’t have found out that this is an issue for users.

Of course, this brings out another problem.  I’m using the blogs of two well-known programmers to add credence to what I believe as a programmer. Why shouldn’t I? They’re smart guys, and they get things done.  The problem it causes is that for all the progress we’ve had in Software Engineering, the ‘best-practices’ are still a collection of blog posts inter-mingled with light reading.

 

Unlike other disciplines, there is no Software Engineering bible that contains the canonical Dos and Don’ts. There can never be.  Once a book is published, it could be obsolete.  We move too quickly for any one book on Software engineering to have a long half-life.

Our ‘Bible’ is and probably always will be a collection of blog posts that reference books that are either already irrelevant, or soon will be.

 

As an example, Steve Krug’s book, Don’t Make Me Think, covers web 1.0 very well, but doesn’t cover web 2.0 at all. How should web designers handle Ajax? What is obtrusive, and what are the best practices? We only have blogs to tell us, and perhaps that’s the way it should be.