I have often wondered whether the distribution and incidence data of a surname could be condensed into a formula, say, so that comparisons could be made against other names, or against the same name at various periods.
The Smallshaw Name Identification Number
One possibility advanced has been the Smallshaw Name Identification Number, devised by the late Ronald Smallshaw. Its aim is simplicity of method. The GRO (England and Wales) birth registrations are summed for 1870 and 1970, then averaged. The leading county is noted. Thus:
Smallshaw 8 Lancashire
(Source: Smallshaw, Ronald, Smallshaw Name Identification Number, in the Journal of One-Name Studies, Vol. 5, pp. 12-13, 80-81, 123-124, 244-245, 354-355; Vol. 6, pp. 20-22, 63-64, 186.)
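As a rough illustration of how little arithmetic is involved, here is a minimal Python sketch of the calculation as I read it: total the registrations in each of the two years, average the two totals, and take the county with the most entries. The function names, the (year, county, count) record layout and the counts themselves are my own inventions for the example; they are not Smallshaw's, and the published method may treat ties or rounding differently.

```python
from collections import Counter

def smallshaw_number(registrations, years=(1870, 1970)):
    """Return (average of the yearly totals, leading county).

    registrations: iterable of (year, county, count) tuples for one surname.
    This record layout is an assumption made for the sketch.
    """
    per_year = Counter()
    per_county = Counter()
    for year, county, count in registrations:
        if year in years:
            per_year[year] += count
            per_county[county] += count
    if not per_county:
        raise ValueError("no registrations found in the chosen years")
    average = sum(per_year[y] for y in years) / len(years)
    # Assumption: the "leading county" is the one with most entries over both years.
    leading_county = per_county.most_common(1)[0][0]
    return average, leading_county

def format_signature(surname, registrations, years=(1870, 1970)):
    average, county = smallshaw_number(registrations, years)
    return f"{surname} {average:g} {county}"

# Invented counts, chosen only so that the result has the same shape as the example above.
sample = [
    (1870, "Lancashire", 6),
    (1870, "Cheshire", 1),
    (1970, "Lancashire", 7),
    (1970, "Yorkshire", 2),
]
print(format_signature("Smallshaw", sample))  # Smallshaw 8 Lancashire
```

A one-namer who already holds the GRO index entries in a spreadsheet would only need to export the year, county and count columns to feed something like this.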
The Smallshaw number has been criticised for producing misleading results for rare names, and for names that migrated to England whilst still having a formidable presence in, say, Scotland. Critics also point out that the single year 1870 might not be statistically representative for a name, and that a ten-year run (1860-69 and 1970-79) would be more accurate. Perhaps a simple suffix denoting the area and year range would help? For example: Smallshaw 8 Lancashire (10EWS)
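The ten-year variant could be sketched along the same lines. Here I am assuming that the figure wanted is the mean number of registrations per year within each run, averaged across the two runs, so that it stays comparable with the single-year number, and that a suffix such as "10EWS" stands for a ten-year run covering England, Wales and Scotland. Both are my reading, and the surname and counts below are invented.

```python
def ten_year_signature(surname, registrations,
                       runs=((1860, 1869), (1970, 1979)), area_code="EWS"):
    """Build a string such as 'Name 3 County (10EWS)'.

    registrations: iterable of (year, county, count) tuples (assumed layout).
    runs: inclusive (start, end) year ranges, assumed to be of equal length.
    """
    registrations = list(registrations)
    run_means = []
    county_totals = {}
    for start, end in runs:
        n_years = end - start + 1
        total = 0
        for year, county, count in registrations:
            if start <= year <= end:
                total += count
                county_totals[county] = county_totals.get(county, 0) + count
        run_means.append(total / n_years)  # mean registrations per year in this run
    if not county_totals:
        raise ValueError("no registrations found in the chosen runs")
    number = sum(run_means) / len(run_means)
    leading = max(county_totals, key=county_totals.get)
    run_length = runs[0][1] - runs[0][0] + 1
    # round() is one possible choice; the rounding rule is not specified anywhere.
    return f"{surname} {round(number)} {leading} ({run_length}{area_code})"

# Invented data for a hypothetical surname.
run_sample = [(y, "Lanarkshire", 3) for y in range(1860, 1870)]
run_sample += [(y, "Lanarkshire", 2) for y in range(1970, 1980)]
run_sample += [(y, "Lancashire", 1) for y in range(1970, 1980)]
print(ten_year_signature("Exampleton", run_sample))  # Exampleton 3 Lanarkshire (10EWS)
```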
Another possibility is a visual representation. Lasker surveyed all the General Register Office (England and Wales) marriage entries in the March 1975 quarter, and created a pair of graphs showing, for each of one hundred common surnames, the distribution along a West-East and a South-North axis. The following is the Jones pair:
The graphs vividly depict the predominance in Wales and the Midlands, and the relatively low occurrence of the name Jones in the north of England. Presumably this technique could be extended to include Scotland as well.
How the data behind the graphs was actually formulated is not covered in as much step-by-step detail as I would like:
"The graphs depict for each surname, the probabilities of local excesses or deficiencies in frequency from west to east and from south to north. The number of occurrences of each surname expected in each district if all surnames had a uniform geographic distribution was subtracted from the observed number of occurrences and a measure of the probability of each deviation occurring by chance was recorded. The west-east and south-north distributions of these values were fitted by the curves... The degree of departure of points on these curves from zero is thus an indication of the probability of increased (or decreased) frequency of the surname at the longitude or latitude at the number of kilometres east (left-hand diagram) or north (right hand diagram) of the Ordnance reference point."
(Source: G.W. Lasker (1985), Surnames and Genetic Structure.)
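The first half of that description, observed minus expected per district, is easy enough to sketch. The passage does not say which "measure of the probability" Lasker used, so the Python below substitutes a standardised residual (a z-score under a binomial approximation) purely as a stand-in, and it makes no attempt at the curve fitting. The function names, district names, totals and eastings are all invented for illustration.

```python
import math

def district_deviations(surname_counts, district_totals):
    """Return {district: (observed, expected, z)} for one surname.

    surname_counts: {district: entries bearing the surname}
    district_totals: {district: all entries in that district}
    Expected counts assume the surname is spread uniformly, i.e. it takes the
    same share of every district's entries.
    """
    surname_total = sum(surname_counts.get(d, 0) for d in district_totals)
    grand_total = sum(district_totals.values())
    share = surname_total / grand_total
    result = {}
    for district, n in district_totals.items():
        observed = surname_counts.get(district, 0)
        expected = n * share
        # Stand-in measure: binomial standard deviation, not Lasker's own statistic.
        sd = math.sqrt(n * share * (1 - share))
        z = (observed - expected) / sd if sd > 0 else 0.0
        result[district] = (observed, expected, z)
    return result

def west_east_profile(deviations, eastings_km):
    """Pair each district's z-score with its easting, ordered west to east."""
    return sorted((eastings_km[d], dev[2]) for d, dev in deviations.items())

# Invented figures: three districts of equal size, surname concentrated in the west.
district_totals = {"West": 1000, "Middle": 1000, "East": 1000}
surname_counts = {"West": 60, "Middle": 30, "East": 10}
eastings_km = {"West": 200, "Middle": 400, "East": 600}

devs = district_deviations(surname_counts, district_totals)
for easting, z in west_east_profile(devs, eastings_km):
    print(easting, round(z, 2))
```

Positive values mark districts with more entries than a uniform spread would give, negative values fewer. Plotting the pairs against easting (and a matching set against northing) gives the raw points through which a curve of the Lasker kind would then be fitted.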
One-namers habitually collect all the GRO references to their 'name'. Is there any way a data template could be programmed for the automatic creation of graphs such as the above? The Smallshaw Number is suited to localised surnames, the Lasker graphs to distributed surnames. Would the marriage of the two methods create an acceptable surname signature?