GENUKI Gazetteer maintenance

This information is for GENUKI county maintainers and documents how to integrate the gazetteer into their pages, and develop and maintain their county sections. Unfortunately we have no information for the Channel Islands.

Source files

The searches are all performed on a MySQL database using SQL queries. Changes are not made to the actual database itself, but to a set of source files which are used to replace the entire database when the need arises.

The source files are just a set of comma-separated text files, one per county, each named places.csv. The master files are in the usual place; contact Phil Stringer via the link at the bottom of the page if you can't remember where that is. There are no links into these files, to make it difficult for search engines to find them and to prevent them being harvested by somebody else.

When a county section is being maintained by the county maintainer, its places.csv file is placed in the standard place in your county web pages. Again, contact Phil if you can't remember where that is. When you first take on maintenance of your county section, just place a copy of the central file in your area and ask for a rebuild before you make any changes. Otherwise changes that may be made to the central copy could be lost. The central copy is never changed once there is a devolved section.

A special program is used to collect all the county sections and build the database from source. When any section has been changed, a database rebuild is required to make your change live. A simple rebuild is performed at 6:00 am every day, so you should see your changes live the next day.

The fields in the places.csv file are as follows, but minor changes may occur during development of this facility. Each field description contains the field name used in the MySQL database to identify it clearly in any notes, and the size and type.

  1. CCC (3 char) - Chapman county code, upper case.
  2. Location - You can use either of the following two formats:
    • GRIDREF (8 char) UK Ordnance Survey grid reference - full 8 characters. There are hints for using online maps to find exact locations. It is not possible to use Irish Ordnance Survey grid references, as the scripts used to search and maintain the gazetteer are written in Perl, and appropriate conversion routines for Irish OS grid references are not available.
    • LAT (9 char) LON (9 char) Latitude and longitude - specified as a pair of comma separated numbers, e.g. "54.602699,-5.935707". In the actual database, the location uses an internal format based on the UK OS grid ref, which has been extended to cover the west of Ireland. This provides a single reference key, and a mechanism for selection. This is only visible on some URLs. On all the display screens the location appears as a UK OS grid reference or, for all of Ireland, as latitude and longitude. If you need the reference key, use the link from the gaz script which gives a list of tabular results, and look at the link to the gazetteer entry on it.
    • If the location field is left blank (as a temporary measure!) the centre location of the county will be used when this entry gets incorporated into the database, and the APPROX field will have 'C' put in automatically regardless of what is in the csv file.
  3. APPROX (1 char) - Flag indicating whether the grid reference is exact or approximate. The original data source contained approximate grid references giving the kilometre square, but not the exact point within it. A 'Y' or 'Yes' indicates an approximate reference and an 'N' or 'No' an exact one. A 'C' can be used to note that the location specified is that of the centre of the county (but in such cases it is better to leave the location field empty). A 'P' can be used to indicate that the location is that of the centre of the parish rather than the actual location of the place.
  4. PLACE (32 char) - The place name.
  5. PRIME (boolean) - A flag indicating whether this is the primary entry for the Town/Parish page in the subsequent URL. 'Yes' is used for the primary entry, 'No' for the rest. For each different URL in column G there must only be one place entry with this flag set to 'Yes'.
  6. MOREPLACE (32 char) - Additional comments about the location. There are frequently multiple places in a county with the same name, and this field can be used to help distinguish them. E.g. there are at least 5 Broughtons in Lancashire, and we could include here 'near Preston'.
  7. URL (90 char) - The URL of the Town/Parish page covering the area where this place is. This is typically the historic parish or township this place was in but things may have changed in modern times with the building of new towns etc. Nevertheless use this field to point to the page where you will place information about this place.
    Kain, R.J.P., Oliver, R.R., Historic Parishes of England and Wales: an Electronic Map of Boundaries before 1850 with a Gazetteer and Metadata [computer file]. Colchester, Essex: History Data Service, UK Data Archive [distributor], 17 May 2001. SN: 4348.
    is a very good source of boundary information to help you decide which town/parish page to associate place names with.
  8. UNSPEC (boolean) - Alias flag. Some places have alternative names, e.g. English and Welsh names for the same place. Choose a name that you want to be the first to appear (primary name) and create a normal gazetteer entry for it. For all the other names create additional entries with the same gridref, but for these, set this flag to Yes. For the alias entries field E (PRIME) will always be No. This is the old technique for specifying aliases. It is much easier now to use the Alias field (Column N) rather than having separate entries for aliases.
  9. BARONY (32 char) - The name of the barony in which the place is located in Ireland. For England and Wales this field can be used to hold the hundred, and for Scotland the district. For Ireland this field is used to link townlands to the relevant parish. As we do not have any web pages for most of the parishes, the normal link via the URL field cannot be made. N.B. The name of the barony does not get displayed, so there is no requirement for multiple entries if a place sits over a border.
  10. PARISH (32 char) - The name of the civil parish in which the place is located. For Ireland this field is used to link townlands to the relevant parish. If parish boundaries run through a townland, put the others in here as well, using a colon : character as a separator. Do not put in space characters next to the colon. As we do not have any web pages for most of the Irish parishes the normal link via the URL field cannot be made.
  11. TYPE (32 char) - The type of place e.g. parish, townland, hamlet. For Ireland all parishes should have the text Parish in this field and townlands the text Townland as this is used to link townlands to their parishes when we have no URL for them.
  12. QUOTE (32 char) - The name of the file containing a quote describing the place. If present this quote will appear in gazetteer entry web pages. It is planned to use the quotes extracted by Mel Lockie from Lewis's Topographical dictionaries and these are currently stored at /big/Gazeteer/quotes. This field just contains the name of the file, and not the directory in which it is held.
  13. Notes - This field will never get entered into the database, but is a place within the csv file to hold any notes that the maintainer may need, particularly during development of new place entries.
  14. Aliases - This does not become a database field, but is used by the database rebuild process to create additional entries with the same contents as the current entry but with the alias as the place name, the Prime flag set to 'N' and the Unspec(Alias) flag set to 'Y'. If there is more than one alias, use a : as a separator in the list. Avoid leaving spaces at the start and end of alias names.
  15. FHS - The code(s) for the FHS(s) covering this Town/Parish.
  16. OTHERCCC - Some parishes and places have county boundaries running through them. This field helps handle these and can hold the county code(s) of the other counties, separated by colon : characters if there is more than one other county.
  17. HIDEME - If a parish has a county boundary running through it and we have multiple entries, then this field can be used to hide the less important ones. If you code a Y character here then it won't be returned as the result of a general search. However if we only want the results for the county it is in, then it will be shown. To help identify the predominant entry, code an N character for them. (N is the default) This technique can also be used for towns lying over county boundaries.
  18. ID - An identifier, unique within the county, for a town/parish. Each entry for which PRIME is set should have a unique ID that never changes so we can consistently refer to the town parish even if the url or location gets adjusted in the future. This is a character string and it is suggested that it be based on the name of the town/parish with additional characters where there are multiple ones with the same name.

    We plan to use this as a unique key in scripts etc. to identify a place and its database entry, so it needs to be something easy to remember and not too long. Don't put in spaces or any odd characters that may cause problems when we use it as a parameter to a script. In the database it will have the county code as a prefix to make it unique, which leaves you the flexibility to choose your own values. You don't need to put the county code in the csv file; the upload will add it in for you. The prefix will NOT be used to identify the county; it is just there to make it easier to choose a unique value.

    So some examples might be:

    • Lytham
    • BroughtonP for the Broughton near Preston.
    • BroughtonS for the Broughton near Salford.
    • BroughtonF for Broughton-in-Furness.

    It doesn't have to be the full place name, just a unique code that can be guessed quite easily.

    Now is the time to choose sensible values. At some stage we will have to automate the choice and after that there will not be an opportunity to make any changes.
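
As an illustration of the field list above, the following Python sketch reads one county's places.csv into named fields and expands the Aliases column in the way the rebuild process is described as doing. It is not the actual rebuild program (the real scripts are written in Perl). The lower-case field names, the assumption that the file has no header row, and the sample row (with a placeholder grid reference and an example.org URL) are all invented for this example.

```python
import csv
import io

# Column order as listed above; these lower-case names are illustrative only.
FIELDS = [
    "ccc", "location", "approx", "place", "prime", "moreplace", "url",
    "unspec", "barony", "parish", "type", "quote", "notes", "aliases",
    "fhs", "otherccc", "hideme", "id",
]


def read_places(text):
    """Yield one dict per row, padding short rows with empty strings."""
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        padded = row + [""] * (len(FIELDS) - len(row))
        yield dict(zip(FIELDS, (cell.strip() for cell in padded)))


def expand_aliases(entry):
    """Return the entry followed by one copy per colon-separated alias."""
    expanded = [entry]
    names = (name.strip() for name in entry["aliases"].split(":"))
    for alias in filter(None, names):
        # Same contents, but alias as place name, Prime 'N', Unspec 'Y'.
        expanded.append(dict(entry, place=alias, prime="N", unspec="Y", aliases=""))
    return expanded


# Made-up sample row: the grid reference and URL are placeholders.
sample_row = [
    "LAN", "SD000000", "N", "Morecambe", "Yes", "",
    "https://example.org/Morecambe", "No", "", "", "", "", "",
    "Poulton le Sands", "", "", "", "Morecambe",
]
sample_text = ",".join(sample_row)

entries = [e for row in read_places(sample_text) for e in expand_aliases(row)]
for e in entries:
    print(e["place"], e["prime"], e["unspec"])
```

A latitude and longitude pair contains a comma, so in a real file it would need to be quoted to stay within the single location column; csv.reader handles quoted fields of that kind.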

The database has been constructed from source data that was based on post-1974 counties. So there are some additional sections that need to be dealt with once you have started, as local knowledge is required to determine which county some places are in. There may also be additional entries supplied by other people since a devolved copy of the county section was taken.

Providing additional information

Here are some tasks that county maintainers could undertake to improve the gazetteer. It may well be worthwhile recruiting a competent volunteer to do the bulk of the work but quality control procedures are likely to be needed, and cooperation to ensure the correct URLs are used on the entries.

Additional source files

Take a look at the statistics page to see if there are additional entries that need adding to a county section or which need evaluating to see if they are in that county.

The gazetteer started as a list of parishes in each pre-1974 county along with approximate grid references. Subsequently a large file of placenames with approximate grid references and post-1974 counties was obtained. A special program was written to compare each entry in this file with the parish database and choose the pre-1974 county. The technique used was to look for all places within 3 miles. If all were within the same pre-1974 county then that was chosen. If more than one county was found they were flagged as needing a manual choice as were any with no nearby parishes. Those for which a unique pre-1974 county was found were added into the database, and most entries are now in there. The rest are held in separate files for each county.
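
The county-matching technique described above can be sketched roughly as follows. This is only an illustration, not the special program itself: it assumes each location has already been converted to an easting and northing in metres, and the function name, the tuple layout and the sample coordinates are all hypothetical.

```python
import math

THREE_MILES_M = 3 * 1609.344  # three statute miles in metres


def choose_county(place, parishes, radius_m=THREE_MILES_M):
    """Return (county, needs_manual_choice) for one place.

    place is an (easting, northing) pair in metres; parishes is a list of
    (easting, northing, county_code) tuples for the pre-1974 parishes.
    """
    px, py = place
    nearby = {
        county
        for x, y, county in parishes
        if math.hypot(x - px, y - py) <= radius_m
    }
    if len(nearby) == 1:
        return nearby.pop(), False  # every nearby parish agrees
    return None, True  # several counties, or no nearby parish at all


# Made-up coordinates, for illustration only.
parishes = [(0, 0, "LAN"), (1000, 0, "LAN"), (4000, 0, "WES"), (20000, 0, "YKS")]

print(choose_county((-2000, 0), parishes))
print(choose_county((500, 0), parishes))
print(choose_county((100000, 0), parishes))
```

The first call finds only one county within range, the second finds two and the third finds none, so the last two would be flagged for a manual choice.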

Access the files via the statistics page and copy and paste from there. The file entries contain hot links to the online mapping tools as an aid to processing their contents.

Other gazetteer settings

There are a couple of entries in the county database entry which are used by some of the search routines:

Updating Irish sections

The Irish information has come from a number of sources and needs some tidying up. The work is not complicated, just a steady check of the entries to combine duplicates and add locations to make the data more useful.

Usage

A number of cgi scripts are available to access the gazetteer information in a number of ways.

Controlling placement in the search results

This section was written when the only search scripts were places & nearby and refers to the displays they produce.

The search results are designed so that the places they show are sorted according to the distance from the start point, with all the places covered by an individual Town/Parish page grouped together with it. This is achieved primarily by sorting by distance, but also by using information in various database fields.

The search results initially appear as two sections: the first with links to GENUKI pages, and then the rest. The second section holds those entries in the database with an empty URL field. It is expected that over time the second section will completely disappear as URLs are added to the existing entries.
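
One plausible reading of this ordering is sketched below in Python. It is not taken from the search scripts: the dictionary keys, the use of a boolean for the PRIME flag, and the rule that a group is placed by the distance of its primary entry (or its nearest entry when it has none) are all assumptions made for the example.

```python
def order_results(entries):
    """Split entries into (linked, unlinked) lists in display order.

    Each entry is a dict with 'place', 'url', 'prime' (bool) and
    'distance' keys.
    """
    linked = [e for e in entries if e["url"]]
    unlinked = sorted(
        (e for e in entries if not e["url"]), key=lambda e: e["distance"]
    )

    # Place each Town/Parish group by its primary entry when there is one,
    # otherwise by the nearest entry sharing that URL.
    group_distance = {}
    for e in sorted(linked, key=lambda e: (not e["prime"], e["distance"])):
        group_distance.setdefault(e["url"], e["distance"])

    # Groups in distance order; within a group, primary first, then by distance.
    linked.sort(
        key=lambda e: (group_distance[e["url"]], e["url"], not e["prime"], e["distance"])
    )
    return linked, unlinked


# Made-up entries, for illustration only.
results = [
    {"place": "A hamlet", "url": "u1", "prime": False, "distance": 1.0},
    {"place": "A", "url": "u1", "prime": True, "distance": 2.0},
    {"place": "B", "url": "u2", "prime": True, "distance": 1.5},
    {"place": "C", "url": "", "prime": False, "distance": 0.5},
]
linked, unlinked = order_results(results)
print([e["place"] for e in linked])
print([e["place"] for e in unlinked])
```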

The entries that are grouped under a Town/Parish page entry all have the URL of the Town/Parish page. The thing that distinguishes them as being subsidiary is the PRIME flag, which is set to 'Yes' just for the Town/Parish page entry.
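
Because the grouping relies on each URL having a single primary entry, a maintainer might want to check a county section before asking for a rebuild. The Python sketch below is a hypothetical helper, not part of the gazetteer tools. It assumes entries are dictionaries with 'url' and 'prime' keys, accepts 'Y' as well as 'Yes' for the flag, and also reports URLs with no primary entry at all, which goes slightly beyond what the field description requires.

```python
from collections import Counter


def prime_problems(entries):
    """Return {url: primary_count} for every URL whose count is not 1."""
    urls = {e["url"] for e in entries if e["url"]}
    primes = Counter(
        e["url"]
        for e in entries
        if e["url"] and e["prime"].strip().lower() in ("yes", "y")
    )
    # Counter gives 0 for URLs that never had the flag set.
    return {url: primes[url] for url in urls if primes[url] != 1}


# Made-up entries, for illustration only.
section = [
    {"url": "u1", "prime": "Yes"},
    {"url": "u1", "prime": "Yes"},
    {"url": "u2", "prime": "Yes"},
    {"url": "u2", "prime": "No"},
    {"url": "u3", "prime": "No"},
    {"url": "", "prime": "No"},
]
problems = prime_problems(section)
print(sorted(problems.items()))
```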

The places that can appear at the start of the subsidiary group as just a list of place names separated by commas without a distance and grid reference are defined as follows. They have the same grid reference and URL as the Town/Parish page entry but they also have the unspecific location flag set. This means that the place is somewhere within the Town/Parish but we don't know or won't say exactly where it is. This can also be used for alternative names where places have changed their name over time. E.g. Poulton le Sands is now called Morecambe. If you have an alternate name or alias for a place create an identical entry to the primary, but put the alias in the place name field and set the UNSPEC/alias field to Yes.

Boundaries

We have found some sources of KML data which can be used to plot boundary information on maps.