Wednesday, October 24, 2007

Wikis and Documentation

I've identified some reasons why revision control is critical for source code. Without it:
  • Organizing concurrent development (branches) is error-prone and time-consuming.
  • Reverting code to prior versions is error-prone and time-consuming.
  • Identifying changes for a file is error-prone and time-consuming.
Am I missing something?

Should uncorrelated documentation be under revision control? I'm not referring to javadoc or any documentation that is explicitly tied to the source code. I'm referring to the documentation that accompanies a software project: code style guidelines, build procedures, review procedures, etc. Ideally, these documents should not be tied to any version of the software. For such documents:
  • There is no need to support branching.
  • The only relevant version is the most recent one.
  • Identifying changes to historical versions has less value.
I think a WikiEngine is ideal for this sort of thing. I have little experience using a wiki for this purpose, so I'm wondering if anyone has a different take on the issue. Are wikis too free-form for documentation? Fellow coders, please comment.

Build Procedures and Estimation

How do you estimate the resources a task will need without any idea which dependencies are affected by your work?

Process for using the build procedure to help estimate software work:
  1. Run the build procedure (make).
  2. Make a preliminary change in the module you anticipate modifying to complete the task, or create stub code for a module you anticipate changing, and connect it to the existing software.
  3. Run the build procedure again.
  4. Observe which modules were rebuilt following your change (see Note).
  5. Use the list of modules from (4) to compose a list of modules that may need to be changed and tested to accommodate the task.
  6. Use the list from (5) to estimate the task.
Note: This assumes that your build procedure correctly manages dependencies between modules/files. Every build tool I know supports this capability first and foremost.
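
Steps 3 through 5 rely on the build tool working out which modules depend on the one you touched. As a rough illustration of that bookkeeping, here is a minimal Python sketch. The module names and the DEPENDS_ON table are hypothetical, and a real build tool reads this information from its makefile rather than from a dictionary.

```python
from collections import defaultdict, deque


def modules_to_rebuild(depends_on, changed):
    """Return every module that must be rebuilt when `changed` is modified."""
    # Invert the graph: for each module, which modules depend on it directly?
    dependents = defaultdict(set)
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(module)

    # Walk outward from the changed module, breadth first.
    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)


# Hypothetical project layout: module -> modules it depends on.
DEPENDS_ON = {
    "app": ["parser", "report"],
    "parser": ["lexer"],
    "report": ["format"],
    "lexer": [],
    "format": [],
}

print(modules_to_rebuild(DEPENDS_ON, "lexer"))  # ['app', 'lexer', 'parser']
```

The printed list plays the role of the list in step 5: the modules that may need changes and testing, and therefore belong in the estimate.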

I don't think that this is particularly new or pioneering information for the computer nerds out there.

Great Aspectations

I was trying to create some progress output for some software I'd just written in Perl. I wanted to use a simple framework to accomplish this. The only problem was that it was an impossible task without tightly coupling the framework to the code. I could use delegates, callbacks, and other techniques, but I'd still have to integrate that functionality into the software I was building. Some would argue that this is a necessary evil, but since this code was part of an evolutionary prototype, I couldn't risk adding more work for later on.

I'd heard about aspect-oriented programming from Software Engineering Radio, so I found a Perl module that provides some of that functionality. Seven lines of code later I had some progress output.
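
To give a feel for the idea, here is a minimal sketch in Python rather than Perl. It is not the code from my prototype, and it does not use the Perl module mentioned above. The Importer class and the helper names (with_before_advice, weave, report_progress) are made up for illustration. The point is that the progress reporting is attached from the outside, and the existing methods are never edited.

```python
import functools
import sys


def with_before_advice(func, advice):
    """Return a wrapper that calls `advice` with the function name, then `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        advice(func.__name__)
        return func(*args, **kwargs)
    return wrapper


def weave(target, names, advice):
    """Replace each named attribute of `target` with an advised version."""
    for name in names:
        setattr(target, name, with_before_advice(getattr(target, name), advice))


# Existing code (hypothetical), written with no knowledge of progress output.
class Importer:
    def load_records(self):
        return [1, 2, 3]

    def summarize(self, records):
        return sum(records)

    def run(self):
        return self.summarize(self.load_records())


progress_log = []


def report_progress(name):
    # The "aspect": record and announce each step as it starts.
    progress_log.append(name)
    print(f"[progress] entering {name}", file=sys.stderr)


# Attach the progress output without touching the Importer source.
weave(Importer, ["load_records", "summarize"], report_progress)
result = Importer().run()
```

Removing the single weave call removes the progress output again, which is what makes this attractive for prototype code.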

I don't know if aspect-oriented programming is here to stay, but it certainly sped up the creation of my prototype code.

Build Procedures

A "simple" build procedure could be defined thusly:

  1. Check the code out of the revision control system.
  2. make
Here are some reasons to use the simple build:
  1. A build procedure requiring more than one or two steps is error-prone.
  2. When building-to-release, there are no second chances.
  3. Any change to the build procedure upsets (N minus 1) developers, where N=number of developers and 1=the developer making the change. It is hard to meaningfully change a one-step procedure.
What do you do when there are multiple clients receiving your software, and each receives a slightly different version? For example,
  1. Client 1 receives a reentrant version of your software, delivered with selected source code files.
  2. Client 2 receives your software as a non-reentrant fully-compiled lib.a and header files.
  3. Client 3 receives your software as a lib.so and header files for Linux.
  4. Client 4 receives your software as a DLL for Windows.
  5. Client 5 receives your software with certain capabilities disabled.
  6. ...
When there are tens of clients, what is the best approach towards maintaining a simple release procedure? Here's my best guess:
  1. Maintain (under revision control, of course) a different make script (or makefile) for each client.
  2. Have a top-level script that runs every customer's build. This catches a shared code change that inadvertently breaks a customer's build. It also delineates which releases will need regression testing.
  3. Use labels/tags to specify when a customer's software is released. The problem I see here is that the tag database can get too large if you tag too frequently. A large number of tags can be tedious to browse. A good compromise is to use a convention for tagging that simplifies browsing.
  4. Try to convince customers to move towards the ONE-TRUE-BUILD. It may save them money if they don't have to pay for the extra support that their own personal build requires. It saves the software company money because it decreases the quantity of code to maintain.
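
The top-level script from item 2 could look something like the following Python sketch. It assumes one makefile per client, named <client>.mk and kept in a clients/ directory. That layout, the "all" target, and the function names are my own assumptions for illustration. The example at the bottom uses a stand-in runner so it works without make installed; leave the run argument at its default to invoke make for real.

```python
import subprocess
from pathlib import Path


def client_makefiles(directory):
    """Find per-client makefiles, assuming they are named <client>.mk."""
    return sorted(Path(directory).glob("*.mk"))


def build_command(makefile, target="all"):
    """Command line that builds one client from its own makefile."""
    return ["make", "-f", str(makefile), target]


def build_all(makefiles, run=subprocess.run):
    """Build every client and return the names of the clients that failed."""
    failed = []
    for makefile in makefiles:
        completed = run(build_command(makefile))
        if completed.returncode != 0:
            failed.append(Path(makefile).stem)
    return failed


def fake_run(command):
    # Stand-in for subprocess.run: pretend only client2's build breaks.
    returncode = 1 if command[2].endswith("client2.mk") else 0
    return subprocess.CompletedProcess(command, returncode)


failed = build_all(["clients/client1.mk", "clients/client2.mk"], run=fake_run)
print(failed)  # ['client2']
```

Every client is built even after one fails, so a single run lists all the customers affected by a shared change.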
This blog only contains the nerd stuff. The first few entries were formerly housed at schwerlog.