Managing Websites

"How I Work" article #3

Like many people these days, I have quite a number of web sites and web pages that I'm working on. There's this one, of course; the one on my internal file server that I maintain for my family; a couple, mostly experimental, hanging off my DSL line; a 'professional' page at work; an open source project or two; and so on. If I had to upload every page with ftp whenever I changed it, they'd get updated even less often than they actually do. Here's what I do instead.

NOTE: I still need to add links to most of the sample code; the current makefile template is here.

The Basics: Content Management

These days, there are three popular approaches to building and maintaining a web site:

My approach falls somewhere between the HTML editor and the web-based CMS, except that (as one might expect) it takes full advantage of the Unix/Linux software development environment. In effect, I treat a website exactly as if it were a software project. I use my favorite editor (emacs) to create and edit files, a version-control system (cvs) to maintain a version history and archive, and of course make to drive offline formatting and uploading.

The Joy of make

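The make(1) program is usually thought of as a software-development tool: its main purpose is to control the process of compiling and linking programs. It does this by means of a file in each directory, called Makefile, that contains three kinds of information: targets (the files to be built), dependencies (the files each target is built from), and rules (the commands that build a target from its dependencies).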

make works backwards from its list of targets, through the applicable dependencies and rules, comparing timestamps to determine which files have changed since the last time their targets were built. It then builds only the targets that are out of date.
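
A minimal sketch, with made-up file names, of a Makefile for a page assembled from three fragments:

```make
# Hypothetical example: none of these file names come from a real site.
# Target:       index.html
# Dependencies: the three fragments listed after the colon
# Rule:         the tab-indented command that rebuilds the target
index.html: header.inc body.inc footer.inc
	cat header.inc body.inc footer.inc > index.html
```

The first make builds index.html. Running it again does nothing until one of the fragments has been edited, because until then index.html is newer than everything it depends on.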

This means that make is perfect for such web-related tasks as creating and maintaining a consistent directory structure, creating indices and tables of contents, applying off-line formatting programs, and updating your site by uploading the files that you've changed. Let's see how that works.

Setting It Up

The first thing most people do when they're adding a new directory to their website is to add the directory to their local working copy with a command like

	mkdir foo

Then they dive in and start editing HEADER.html or index.html (depending on whether or not they want Apache's automatic index). Sometimes they'll copy in a template file before they start editing; usually this comes from another directory at the same level, if they can find one.

What I do is this:

  1. Edit Makefile in the parent directory to add the new subdirectory to the SUBDIRS list.
  2. make setup

That's it. And the make command is bound to C-xC-m in my Emacs configuration, so I don't have to leave the editor to do it. The setup target does the following:

  1. Creates any directories in the SUBDIRS list that don't exist yet.
  2. Constructs a Makefile in every subdirectory that doesn't already have one. This is easier than it sounds, because all of the rules are included from a master template file called webdir.make.
  3. Goes into each subdirectory and does make setup-dir, which creates HEADER.html if it doesn't already exist.
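
The three steps above could be sketched roughly as follows. This is not the actual webdir.make, just an illustration of the idea: SUBDIRS, setup, setup-dir and HEADER.html are the names used above, while WEBDIR_MAKE, the contents of the generated Makefile and the stub heading are assumptions. It relies on GNU make.

```make
# Sketch of what a shared template such as webdir.make might contain.
# Assumes GNU make, and that the including Makefile sets SUBDIRS first.

# Assumption: remember the template's absolute path so that generated
# Makefiles can include it from any depth.
WEBDIR_MAKE := $(abspath $(lastword $(MAKEFILE_LIST)))

.PHONY: setup setup-dir

# 1. create missing subdirectories, 2. give each one a Makefile if it
# lacks one, 3. run setup-dir inside it.
setup:
	@for d in $(SUBDIRS); do \
	    test -d $$d || mkdir $$d || exit 1; \
	    test -f $$d/Makefile || \
	        printf 'SUBDIRS =\ninclude %s\n' '$(WEBDIR_MAKE)' > $$d/Makefile; \
	    $(MAKE) -C $$d setup-dir || exit 1; \
	done

# Create a stub HEADER.html only if the directory has none yet.
setup-dir:
	@test -f HEADER.html || \
	    echo '<h1>$(notdir $(CURDIR))</h1>' > HEADER.html
```

Because every step is guarded by a test, running make setup a second time leaves existing directories, Makefiles and HEADER.html files alone.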

Since I'm going to start being more consistent about making sure that every web document is its own directory (see the previous file in this series, Documents Are Directories), the next version of the webdir.make template is going to distinguish between those directories that represent collections (and have their index.html file constructed automatically), and those that represent documents.

Making It Pretty

There are four main ways of giving your website a consistent 'look and feel':

Naturally, it's all managed using make -- I simply have a rule that tells make how to build HTML pages out of whatever format I'm using. In most cases the 'source code' I write is basically HTML with a few extra tags, like <header> and <footer>, and I give these files a .ht (almost HTML) or .xh (eXtended HTML) extension. Then I have a couple of make rules that do the work:

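  # Teach make the .xh suffix, then say how to turn a .xh file into .html.
  # The second command lists the generated page in .cvsignore if it
  # isn't there already.
  .SUFFIXES: .xh .html
  .xh.html:
	$(PROCESS) $< > $@
	{ grep -s $@ .cvsignore ; } || echo $@ >> .cvsignore

  # Every .xh source in this directory, and the .html page built from it.
  XH_FILES= $(wildcard *.xh)
  XH_HTML= $(XH_FILES:.xh=.html)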

There's also a definition for PROCESS, of course, that varies from site to site.

From that point on it's automatic: whenever I change a .xh source file and say make upload (see the next section), the corresponding .html file gets built (because the upload target depends on it) before it gets uploaded. No fuss, no problems.
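
For illustration only, here is one shape a PROCESS definition could take, together with the dependency that makes the rebuild automatic. The site.sed script is hypothetical, and making FILES include $(XH_HTML) is an assumption about how the upload rules find the generated pages.

```make
# Hypothetical: site.sed is a sed script that replaces the extra tags
# (<header>, <footer>) with this site's real markup.
PROCESS = sed -f site.sed

# Assumption: FILES (the list the upload rules depend on) includes the
# generated pages, so they are rebuilt before anything is sent.
# $(sort ...) also removes duplicates.
FILES = $(sort $(wildcard *.html) $(XH_HTML))
```

Any command that takes the source file as an argument and writes HTML to standard output would fit the same slot, since the rule invokes it as $(PROCESS) $< > $@.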

Getting It Up

The essential thing about using make to manage your website is that you have to have a command that will upload a file without the need for user input. In particular, it mustn't stop to ask you for a password, because it's going to be executed many times (at least once in each directory).

The ssh/scp and rsync methods are the easiest to put into a Makefile, so that's what I use these days on most of my sites. The scp and rsync commands can be used interchangeably for uploading files; I use ssh to run the mkdir command for making new directories.

Here's the make magic for uploading:

# mkdir.log:  First see if we need to make the remote directory.
mkdir.log:
	@echo making remote directory
	-ssh $(HOST) mkdir $(DSTPATH)
	echo `date`  mkdir $(DSTPATH) > mkdir.log

put:: mkdir.log

# put.log:  This is the one that does most of the work.
#	Note that as a side effect we make put.bak, which is the list
#	of the most recent files so we can retry the command if it fails.
put.log:: $(FILES) $(IMAGES)
	rsync -a -u -v -e ssh $? $(HOST):$(DSTPATH)/
	echo `date`: $?					 >> put.log
	echo $?						  > put.bak

# put uses the -u flag to rsync to keep from clobbering remote files
#	that have been changed on the server.

# put:	
put:: put.log
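
The rules above rely on a few variables that each directory (or the shared template) has to define. The values below are placeholders, not real hosts or paths, and FILES is assumed to be set elsewhere to the list of pages.

```make
# Placeholder values: substitute your own account, host and path.
HOST    = user@www.example.org
DSTPATH = public_html/foo
IMAGES  = $(wildcard *.png *.jpg *.gif)

# With scp instead of rsync, the upload command in put.log would be
#	scp -p $? $(HOST):$(DSTPATH)/
# but scp has no equivalent of rsync's -u, so it overwrites remote files
# unconditionally.
```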

So far I don't have any sites that allow web updates, but I'm working on some. Those will of course use cvs for the parts that visitors can change. Watch this space.

Remember that I mentioned, back in Setting It Up, that the make command is bound to C-xC-m in my Emacs configuration? It prompts for the target, but in most of my web directories put is the default, so I just hit C-m (that's Enter, for the ASCII-impaired) again. That's all: make a change, type three more keystrokes, and it's on the Web.

Attack of the WURM

As my understanding of website management (and the number of websites under my care) increases, my Web Upload Recursive Makefiles change. The latest incarnation of the project is called WURM, and it's very much a work in progress -- so much so, in fact, that it doesn't even have a project directory yet. So here are a couple of somewhat disjointed notes.

When things get a little more formalized, I'll put in a link to the WURM project. Meanwhile, I'll use WURM as an example for the next document in this series, The Project as Website.

$Id: index.html,v 1.5 2007-12-01 23:53:36 steve Exp $
Steve Savitzky <steve @>