Statistics and the International Genealogical Index (IGI)
The only approach that I have seen is Martin Ecclestone's The diffusion of English surnames (Local Historian, 1989) which examines the 1988 edition of the IGI. The following is a paraphrase of his illuminating article. This was groundbreaking material at the time. Now, in this age of the IGI on CD-ROM and IGI on-line, are these latest versions able to deliver similar or indeed enhanced statistical information?
All statistics in the following are copyright © 1989 Martin Ecclestone, who has graciously approved their reproduction here.
Mr Ecclestone wrote this article before the advent of the CD-ROM version of the IGI. A great advantage of the earlier fiche version was its amenability to statistical analysis. The numbered frames of the fiche made it straightforward to count the number of entries per surname and the proportion of a county's entries that any given surname constitutes. Repeating the exercise for each of the 39 English counties allows the geographical distribution of that surname's frequency to be tabulated.
An obvious drawback is the wide range of dates in the IGI, from 1538 to about 1900, together with the well-known inconsistency of geographical coverage. Do these drawbacks nullify any findings? Mr Ecclestone attempts to address these issues.
He addresses the dates objection by constructing a histogram of 2760 dates randomly selected from the Index. The resulting graph reveals a steady growth in entries from 1538, peaking in 1837, after which there is a dramatic drop. This occurs because many parish record transcriptions stop in 1837, when the St Catherine's House records begin.
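For anyone wishing to repeat this kind of dates check on their own extract, the following is a minimal sketch in Python. It is not Mr Ecclestone's procedure, which was carried out by hand from the fiche; the function names and the handful of sample years are my own illustrative assumptions, not data from the article.

```python
import random
from collections import Counter


def sample_years(all_years, k, seed=0):
    """Draw k years at random (without replacement) from a list of event years.

    The seed is fixed only so that a run can be repeated.
    """
    rng = random.Random(seed)
    return rng.sample(all_years, k)


def decade_histogram(years):
    """Count how many years fall in each decade, keyed by the decade's first year."""
    counts = Counter((year // 10) * 10 for year in years)
    return dict(sorted(counts.items()))


# Made-up event years, for illustration only (not taken from the IGI).
example_years = [1538, 1542, 1701, 1709, 1836, 1837]
histogram = decade_histogram(example_years)
for decade, count in histogram.items():
    print(f"{decade}s: {'#' * count}")
```

With a real extract, the list of years would first be passed through `sample_years` (the article used a sample of 2760 dates) and the result handed to `decade_histogram`.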
The histogram and the following table reveal that the 1988 IGI entries are chiefly representative of eighteenth-century England.
Note: The number of frames excludes frames with no surname.
IGI County Coverage
Column 3 of the above table is the ratio of each county's 1801 population to its number of IGI frames. This ratio is 8.39 for England as a whole, but varies between 4.27 (Bedfordshire) and 37.9 (Huntingdonshire). High values represent counties that are under-represented in the IGI in relation to their 1801 population, whilst low values show the counties whose registers are the most complete or have been most fully transcribed.
"The tabulated ratios may be used to convert the number of frames containing a particular surname into an estimate of the 1801 population of that surname."
Mr Ecclestone cites the example of the surname Fuller. There are 7.5 frames of Fullers in the Bedford county index. Thus he estimates there were 32 (7.5 x 4.27) Fullers alive in that county in 1801. Applying this method to the remaining counties results in an estimate of 4275 Fullers for England as a whole.
With my own name, Dance, there are 22 frames for the county of Worcester, which equates to a population of 130 people in 1801. I know from the censuses that the actual population in 1851 was 144, so the 130 estimate is a reasonable one. It is, however, important to cleanse the IGI data of any duplicates or patron submittals.
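The frames-to-population conversion described above is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic, not anything from Mr Ecclestone's article: the function name is my own, and the dictionary holds only the two county ratios quoted in the text, so the remaining counties would have to be filled in from his table.

```python
# Ratio of 1801 population to IGI frames, per county.
# Only the two values quoted in the text are included here;
# the rest would need to be taken from Ecclestone's table.
RATIOS = {
    "Bedfordshire": 4.27,
    "Huntingdonshire": 37.9,
}


def estimate_1801_population(frames_by_county, ratios=RATIOS):
    """Estimate a surname's 1801 population per county as frames x ratio."""
    estimates = {}
    for county, frames in frames_by_county.items():
        if county not in ratios:
            raise KeyError(f"no ratio available for {county}")
        estimates[county] = frames * ratios[county]
    return estimates


# The Fuller example from the text: 7.5 frames in Bedfordshire.
fuller = estimate_1801_population({"Bedfordshire": 7.5})
print(round(fuller["Bedfordshire"]))  # 32
```

Summing the per-county estimates over all counties would give the national figure. Any duplicates or patron submittals should be removed from the frame counts before they are fed in.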
IGI Births/Marriages/Deaths Coverage
Martin Ecclestone says that "a measure of completeness of the English index is the proportion of births and marriages that are recorded as IGI entries at different periods." He gives the proportion of marriages (derived from random sampling) as:
This is then compared with an independent estimate of the number of marriages that actually occurred during the same decades. The same procedure is used to compare IGI baptismal records with total births.
The above table summarises his results from seven selected decades. "It demonstrates that births and marriages are more or less equally recorded except during the sixteenth century" and "apart from the Commonwealth period... the IGI is 40% to 50% complete between 1600 and 1837."
Mr Ecclestone concludes that, for the 18th century, the IGI contains almost half of the records possible. (This figure needs to be adjusted for individual counties, as shown in the first table.) Although the median date varies for each county, "since surname distributions change rather slowly, it is felt that those which are obtained from the IGI data are probably fair descriptions of the mid-eighteenth century situation."
The article then proceeds to give some case studies from actual surname examples, and shows how their diffusion can be measured. Overall, it is a fascinating article. If you are interested in the possibilities of the IGI, then seek out a copy.
The Local Historian is published by the British Association for Local History (BALH).