Like many people these days, I have quite a number of web sites and web pages that I'm working on: this one, of course; the one on my internal file server that I maintain for my family; a couple of mostly experimental sites hanging off my DSL line; a 'professional' page at work; an open-source project or two; and so on. If I had to upload every page with ftp whenever I changed it, they'd get updated even less often than they actually do. Here's what I do instead.

NOTE: I still need to add links to most of the sample code; the current makefile template is here.
These days, there are three popular approaches to building and maintaining a web site: hand-editing raw HTML and uploading it by hand, using a WYSIWYG HTML editor, or using a web-based content management system (CMS).

My approach falls somewhere in between the HTML editor and the web-based CMS, except that (as one might expect) it takes full advantage of the Unix/Linux software development environment. In effect, I treat a website exactly as if it were a software project. I use my favorite editor (emacs) to create and edit files, a version-control system (cvs) to maintain a version history and archive, and of course make to drive offline formatting and uploading.
make
The make(1) program is usually thought of as a software-development tool: its main purpose is to control the process of compiling and linking programs. It does this by means of a file in each directory, called Makefile, that contains three kinds of information:

- Targets: the things to be built. Most are files; some, like clean, just name an action to perform.
- Dependencies: the files each target is built from, so a target gets rebuilt whenever one of its dependencies changes.
- Rules: recipes that say how to build one kind of file from another -- for example, that a file with a .c suffix can be turned into one with a .o suffix by running the C compiler on it.
What make does is to work backwards from its list of targets, through the applicable dependencies and rules, and compare timestamps to determine which files have changed since the last time their targets were built. Then it only builds the targets that are out of date.
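If you haven't met make before, here's a minimal sketch of a Makefile showing all three kinds of information; the file names are invented for illustration:

	# hello is built from hello.o, which is built from hello.c.
	hello: hello.o
		cc -o hello hello.o

	hello.o: hello.c
		cc -c hello.c

	# clean names an action rather than a file.
	clean:
		rm -f hello hello.o

Typing make compares timestamps: if hello.c is newer than hello.o, both hello.o and hello get rebuilt; if nothing has changed, make does nothing at all.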
This means that make is perfect for such web-related tasks as creating and maintaining a consistent directory structure, creating indices and tables of contents, applying off-line formatting programs, and updating your site by uploading the files that you've changed. Let's see how that works.
The first thing most people do when they're adding a new directory to their website is to add the directory to their local working copy with a command like

mkdir foo

Then they dive in and start editing HEADER.html or index.html (depending on whether or not they want Apache's automatic index). Sometimes they'll copy in a template file before they start editing; usually this comes from another directory at the same level, if they can find one.
What I do is this:

- Edit the Makefile in the parent directory to add the new subdirectory to the SUBDIRS list.
- Type make setup.
That's it. And the make command is bound to C-x C-m in my Emacs configuration, so I don't have to leave the editor to do it. The setup target does the following:
- Creates any directories in the SUBDIRS list that don't exist yet.
- Puts a Makefile in every subdirectory that doesn't already have one. This is easier than it sounds, because all of the rules are included from a master template file called webdir.make (sketched below).
- Recursively runs make setup-dir, which creates HEADER.html if it doesn't already exist.
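I haven't linked to webdir.make yet, so take this as a sketch of what the setup machinery could look like rather than the actual template; the template file names (Makefile.tpl, HEADER.tpl) are placeholders of my own, and MF_DIR stands for the directory holding the templates, as described in the notes at the end:

	# Create missing subdirectories, give each one a Makefile,
	# and then run its setup-dir target.
	setup:
		for d in $(SUBDIRS); do \
		  test -d $$d || mkdir $$d; \
		  test -f $$d/Makefile || cp $(MF_DIR)/Makefile.tpl $$d/Makefile; \
		  (cd $$d && $(MAKE) setup-dir); \
		done

	# Create HEADER.html from a template if it doesn't already exist.
	setup-dir:
		test -f HEADER.html || cp $(MF_DIR)/HEADER.tpl HEADER.html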
Since I'm going to start being more consistent about making sure that every web document is its own directory (see the previous file in this series, Documents Are Directories), the next version of the webdir.make template is going to distinguish between those directories that represent collections (and have their index.html file constructed automatically), and those that represent documents.
There are four main ways of giving your website a consistent 'look and feel':

- Server-side processing -- PHP, ASP, JSP, Java servlets, Apache mod_perl, or the old standby of server-side includes. Very effective, and it allows pages to be customized for each reader, but it puts a burden on the server and can lead to security holes. Most ISPs permit only one or two of these methods, usually PHP or ASP plus server-side includes.
- Off-line preprocessing -- running your pages through a formatting program before you upload them. This is what I do.
Naturally, it's all managed using make -- I simply have a rule that tells make how to build HTML pages out of whatever format I'm using. In most cases the 'source code' I write is basically HTML with a few extra tags, like <header> and <footer>, and I give these files a .ht (almost HTML) or .xh (eXtended HTML) extension. Then I have a couple of make rules that do the work:
	.SUFFIXES: .xh .html

	.xh.html:
		$(PROCESS) $< > $@
		{ grep -s $@ .cvsignore ; } || echo $@ >> .cvsignore

	XH_FILES= $(wildcard *.xh)
	XH_HTML= $(XH_FILES:.xh=.html)
There's also a definition for PROCESS, of course, that varies from site to site.
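PROCESS can be anything that reads a source page and writes HTML on standard output. As a purely illustrative stand-in (not my actual definition), a sed filter can splice in shared header and footer fragments:

	# Hypothetical stand-in for PROCESS: replace the <header> and
	# <footer> tags with the contents of shared include files.
	# header.inc and footer.inc are invented names.
	PROCESS = sed -e '/<header>/r header.inc' -e 's/<header>//' \
	              -e '/<footer>/r footer.inc' -e 's/<footer>//'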
From that point on it's automatic: whenever I change a .xh source file and say make upload (see the next section), the corresponding .html file gets built (because the upload target depends on it) before it gets uploaded. No fuss, no problems.
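The only wiring needed is a dependency from the upload target to the generated files. A sketch, following the same log-file pattern as the put rules shown below (HOST and DSTPATH are per-site variables):

	# upload.log records the last upload; only .html files rebuilt
	# since then show up in $? and get sent.
	upload:: upload.log
	upload.log: $(XH_HTML)
		rsync -a -u -v -e ssh $? $(HOST):$(DSTPATH)/
		date > upload.log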
The essential thing about using make to manage your website is that you have to have a command that will upload a file without the need for user input. In particular, it mustn't stop to ask you for a password, because it's going to be executed many times (at least once in each directory).
- ftp -- this is an old standby, but it's not very secure. My ISP no longer supports it, but a lot of hosting services still do. In order to make it work non-interactively you have to put the machine name, user name, and password into a .netrc file in your home directory.
- ssh and scp -- the modern, secure replacements for rsh and rcp. Basically they let you execute arbitrary commands and copy files over a secure, encrypted channel. The best way to keep them from asking for a password is to use ssh-agent(1); there's a sketch of the setup after this list.
- rsync -- a recursive, efficient version of rcp (remote copy) that only transmits the minimum required to make the remote copy look like the local one. It's ideal if there's any chance of being interrupted, since it recovers from partial transfers automatically. Works very well over ssh.
- curl and WebDAV -- WebDAV is a set of extensions to HTTP that let you get directory listings and upload files over the web; it's what Microsoft calls 'web folders'. The curl(1) command does much of what the old standby wget(1) does, and it also lets you upload using PUT. If your server supports WebDAV this is a pretty good way to go.
- cvs(1) -- a version control system that can be operated in client-server mode, either over ssh or with its own password-protected server. It's perfect for maintaining code (e.g., your CGI scripts) on a remote server; it's a little clumsy if all you want to do is upload files, but it's by far the best way to manage a site that is maintained by multiple authors or that allows users to make changes (comments, for example) via the web.
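For the ssh-based methods, the one-time setup looks something like this; the hostname is a placeholder, and ssh-copy-id is just one convenient way to install the key (appending ~/.ssh/id_rsa.pub to the server's authorized_keys by hand works too):

	# Generate a key pair (once) and install the public key on the server.
	ssh-keygen -t rsa
	ssh-copy-id user@www.example.com

	# Start an agent and hand it the key; ssh, scp, and rsync-over-ssh
	# can now run in this session without prompting for a password.
	eval `ssh-agent`
	ssh-add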
The ssh/scp and rsync methods are the easiest to put into a Makefile, so that's what I use these days on most of my sites. The scp and rsync commands can be used interchangeably for uploading files; I use ssh to run the mkdir command for making new directories.

Here's the make magic for uploading:
	# mkdir.log: First see if we need to make the remote directory.
	mkdir.log:
		@echo making remote directory
		-ssh $(HOST) mkdir $(DSTPATH)
		echo `date` mkdir $(DSTPATH) > mkdir.log

	put:: mkdir.log

	# put.log: This is the one that does most of the work.
	# Note that as a side effect we make put.bak, which is the list
	# of the most recent files so we can retry the command if it fails.
	put.log:: $(FILES) $(IMAGES)
		rsync -a -u -v -e ssh $? $(HOST):$(DSTPATH)/
		echo `date`: $? >> put.log
		echo $? > put.bak

	# put uses the -u flag to rsync to keep from clobbering remote files
	# that have been changed on the server.
	# put:
	put:: put.log
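Since put.bak records the files from the most recent attempt, recovering from a failed upload can be as simple as a rule along these lines (a sketch; my real makefile may spell it differently):

	# Resend the files listed in put.bak, e.g. after a dropped connection.
	retry:
		rsync -a -u -v -e ssh `cat put.bak` $(HOST):$(DSTPATH)/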
So far I don't have any sites that allow web updates, but I'm working on some. Those will of course use cvs for the parts that visitors can change. Watch this space.
Remember the previous section where I mentioned that the make command is bound to C-x C-m in my Emacs configuration? It prompts for the target, but in most of my web directories put is the default, so I just hit C-m (that's Enter, for the ASCII-impaired) again. That's all: make a change, type three more keystrokes, and it's on the Web.
As my understanding of website management (and the number of websites under my care) increases, my Web Upload Recursive Makefiles change. The latest incarnation of the project is called WURM, and it's very much a work in progress -- so much so, in fact, that it doesn't even have a project directory yet. So here are a couple of somewhat disjointed notes.
- The Makefile in any web directory starts by defining the location of MF_DIR, the directory that contains the makefile templates. It can be either a relative path or an absolute one; this lets me keep a lot of my websites, software projects, and so on in one huge directory tree. (A sketch of such a header appears after these notes.)
- The templates work by including (via make's include directive) a website configuration file called WURM.cf, in the top-level web directory. This file contains definitions for things like the destination hostname and directory.
- Each directory is checked for a file called WURMfile; if there is one, it gets used instead of the Makefile. This means that you can have a directory with its own Makefile, for example a stand-alone software project, as a subdirectory of your website. Handy for us open-source developers.
- A subdirectory with neither a WURMfile nor a Makefile can be recursively uploaded using rsync -a. This is sometimes useful when you have a directory full of data that goes on more than one site. You can also do it when you need to ignore the subdirectory's WURMfile.
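Here's a sketch of the header described in the first note above; the directory and template names are invented, since WURM itself is still in flux:

	# Top of a per-directory Makefile in a WURM-managed site.
	# MF_DIR locates the makefile templates; relative or absolute both work.
	MF_DIR = ../../tools/wurm

	SUBDIRS = images notes
	include $(MF_DIR)/webdir.make
	# webdir.make, in turn, includes the site-wide WURM.cf.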
When things get a little more formalized, I'll put in a link to the WURM project. Meanwhile, I'll use WURM as an example for the next document in this series, The Project as Website.