The census XML project

A non-technical guide to the census xml file format

<?xml version="1.0"?>
<censusxml version="0.1">
  <census>
    <year>1891</year>
    <country>England and Wales</country>
    <header>
      <administrative_county>Kesteven</administrative_county>
      <civil_parish>Washingborough</civil_parish>
      <municipal_borough></municipal_borough>
      <municipal_ward></municipal_ward>
      <urban_sanitary_district></urban_sanitary_district>
      <place>Washingborough</place>
      <rural_sanitary_district>Lincolnshire</rural_sanitary_district>
      <parliamentary_borough>Sleaford</parliamentary_borough>
      <ecclesiastical_parish>Washingborough</ecclesiastical_parish>
    </header>
    <household>
      <person>
        <address>Manor House</address>
        <name>Ann Henrietta Curtois</name>
        <relationship>Head</relationship>
        <condition>Wid</condition>
        <age_male></age_male>
        <age_female>61</age_female>
        <occupation>Living on own means</occupation>
        <employer></employer>
        <employed></employed>
        <neither></neither>
        <birthplace>Herefordshire Tiberton cum Medley</birthplace>
        <disability></disability>
      </person>
    </household>
  </census>
</censusxml>

There is a direct analogy between a census form and the way that xml files store data. The form consists of a series of boxes, into some of which the census enumerator entered data. Every box has a name (e.g. "Name and surname of each person"), and many of them also have a value (e.g. John Smith).

In an xml file, these boxes are replaced by tags, which look like this: <name>. In fact, every box is replaced by two tags: an opening tag <name> and a closing tag </name>, the latter containing a slash to show that it's a closing tag. Anything that appears between those two tags is the data stored in the box, so our man named John Smith would be stored as <name>John Smith</name>.
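For readers who like to see this from a program's point of view, here is a minimal sketch in Python, using the standard xml.etree.ElementTree module (an illustrative choice, not part of this project). It reads a single "box" and pulls out its name and its value.

```python
import xml.etree.ElementTree as ET

# One "box" from the form: an opening tag, the data, and a closing tag.
box = ET.fromstring("<name>John Smith</name>")

print(box.tag)   # the name of the box
print(box.text)  # the value written in the box
```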

And that's pretty much all there is to it. A census xml file is just a series of these tags representing the boxes on the form, and some of them will have information between them to represent the data entered onto the form. An opening tag followed directly by its closing tag, with nothing in between, corresponds to an empty box on the form.
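As a rough illustration of how a program might treat filled and empty boxes, the Python sketch below reads a cut-down person record. The helper name box_value is made up for this example, and treating an absent box the same as an empty one is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# A shortened person record: one empty box and two filled ones.
person = ET.fromstring(
    "<person>"
    "<name>Ann Henrietta Curtois</name>"
    "<age_male></age_male>"
    "<age_female>61</age_female>"
    "</person>"
)

def box_value(person, box_name):
    """Return the text in a box, or "" if the box is empty or absent.

    Hypothetical helper for illustration only.
    """
    return (person.findtext(box_name) or "").strip()

for box_name in ("name", "age_male", "age_female"):
    print(box_name, "=", repr(box_value(person, box_name)))
```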

An example is shown in the grey box: it stores the data for a 61-year-old lady named Ann Henrietta Curtois. The data itself has been coloured red to make it easy to pick out, and the indentation is there to make the file easier to follow. Notice that the file by itself is rather dull to look at. To display it nicely, with headers for the categories and so on, it would need to be uploaded to a website such as this one or (hopefully one day) opened in your favourite genealogy program. With a little know-how, you can also get your web browser to display such files nicely.

How does one go about making one of these files? You can type it all in by hand if you want, though this is tedious and it is difficult to avoid mistakes. Alternatively, you can use one of the utilities on this website to create or edit the file; they are designed so that all you have to do is type in the data, and the tags are taken care of for you.
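To give a flavour of what "taking care of the tags" can mean, here is a small Python sketch that turns typed-in data into a <person> block. It is not the code behind this website's utilities; the function make_person is a hypothetical name, and the sketch assumes the data arrives as box name and value pairs.

```python
import xml.etree.ElementTree as ET

def make_person(data):
    """Build a <person> element from a dict of box name -> value.

    Hypothetical helper for illustration only.
    """
    person = ET.Element("person")
    for box_name, value in data.items():
        # Each entry becomes an opening tag, the value, and a closing tag.
        ET.SubElement(person, box_name).text = value
    return person

person = make_person({
    "name": "Ann Henrietta Curtois",
    "relationship": "Head",
    "age_female": "61",
})
xml_text = ET.tostring(person, encoding="unicode")
print(xml_text)
```

Because the tags are generated rather than typed, a closing tag cannot be forgotten or misspelt, which is the main risk when writing the file by hand.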

Slightly more advanced stuff

A file must follow certain rules in order to work. All data must sit between an opening tag and its closing tag. The first line, <?xml version="1.0"?>, is required because it tells the computer that this is an XML file. The first tag in the file must be <censusxml> and the last must be </censusxml>, and each individual set of census data is contained in <census> tags. Each block of census data must contain both <year> and <country> data. The data for each individual person goes between <person> tags, and the header data goes between <header> tags. (In some censuses, the header information appears on a separate sheet from the data for the individual persons.)

Beyond that, the format is fairly flexible. Omitting a set of tags that would be empty, for example, will not break anything. In the file shown, the <person> data sits inside a set of <household> tags. If there were another person in the household, their <person> data could follow the first person's inside the same <household> tags; alternatively, the <household> tags can be omitted altogether.
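The rules above can be checked mechanically. The Python sketch below is one possible way to do so and is not an official validator for this format: the function name check_census_xml and the wording of its messages are invented for this example, and it only tests the rules stated in this section.

```python
import xml.etree.ElementTree as ET

def check_census_xml(text):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical checker covering only the rules described in this guide.
    """
    problems = []
    # Rule: the file starts with the <?xml version="1.0"?> line.
    if not text.startswith("<?xml"):
        problems.append("missing <?xml ...?> first line")
    # Rule: every opening tag has a matching closing tag.
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return problems + [f"not well-formed: {err}"]
    # Rule: the outermost tags are <censusxml> ... </censusxml>.
    if root.tag != "censusxml":
        problems.append("outermost tag is not <censusxml>")
    # Rule: each <census> block contains <year> and <country> data.
    for number, census in enumerate(root.findall("census"), start=1):
        for required in ("year", "country"):
            if not (census.findtext(required) or "").strip():
                problems.append(f"census {number}: no <{required}> data")
    return problems

sample = (
    '<?xml version="1.0"?>'
    '<censusxml version="0.1"><census>'
    "<year>1891</year><country>England and Wales</country>"
    "</census></censusxml>"
)
print(check_census_xml(sample))
```

Optional tags such as <household>, or boxes left out because they are empty, are deliberately not checked, since the format allows them to be omitted.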