Preparing Barrington Atlas Geographic Data for Upload

The procedure partially outlined below is now deprecated, and we're building better tooling to manage the process. See BADataMunger and BADigitizationSpec.

Directory Conversion

  • Use BADataMunger

Create coordinate data

  • Using ArcGIS create a geodatabase with the necessary coordinates
    • Standard local location is /badigit/extractions/ba###extractions.mdb
    • Record copies kept, zipped, in AWMC mass storage /ms/home/awmc/BAExtracted
    • Create the following feature classes (all DD, WGS1984):
      • BA###Grid (polygon)
      • BA###Points (point)
      • BA###Lines (line)
      • BA###Polygons (polygon)

Grids

  • In BA###Grid, create grid squares by tracing over the registered raster; make sure corners snap; populate a 15-character text attribute called GridSquare
  • Create a new feature class in the personal geodatabase containing the centroids of the grid squares
  • Using ArcMap or ArcCatalog data export, create two shapefiles (ephemeral), one for the contents of BAGrid, the other for BAGridCentroids:
    • ba###gridsquares.shp (ephemeral)
    • ba###gridcentroids.shp (ephemeral)
  • Using ogr2ogr, create a KML version of each, like:
    • ogr2ogr -f KML -dsco NameField=GridSquare ba###gridsquares.kml ba###gridsquares.shp
    • ogr2ogr -f KML -dsco NameField=GridSquare ba###gridcentroids.kml ba###gridcentroids.shp
  • Open both KML files. Copy the Placemarks (only the Placemarks) out of ba###gridcentroids.kml and paste them after the last Placemark in ba###gridsquares.kml, inside the same Folder element. Save the combined result as ba###gridcombined.kml (ephemeral)
  • Use source:system/trunk/data/geoentities/kml/kmlgridspiff.xsl to transform ba###gridcombined.kml into ba###grid.kml
    • After verifying the output and inserting an appropriate content header (XML comment) that includes a license statement, put copies in:
      • /afs/isis/depts/awmc/web/pleiades/data/kml
      • /ms/depts/awmc/BAExtracted/kml
    • Make an appropriate announcement via a news item on pleiades.stoa.org
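
The manual copy-and-paste of Placemarks described above could also be scripted. The following is only a minimal sketch, assuming both files have the usual kml/Document/Folder/Placemark layout; the function names merge_placemarks and local are hypothetical and are not part of BADataMunger or the existing tooling.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def merge_placemarks(squares_kml, centroids_kml):
    """Return the grid-squares KML text with the centroid Placemarks
    inserted after its last Placemark, inside the same Folder."""
    squares_root = ET.fromstring(squares_kml)
    centroids_root = ET.fromstring(centroids_kml)

    # Keep the KML namespace as the default one when serializing.
    # Note: register_namespace changes module-wide state.
    if squares_root.tag.startswith("{"):
        ET.register_namespace("", squares_root.tag[1:].split("}")[0])

    # Assumption: the first Folder is the one holding the grid squares.
    folder = next(el for el in squares_root.iter() if local(el.tag) == "Folder")
    placemarks = [el for el in centroids_root.iter() if local(el.tag) == "Placemark"]

    # Find the position of the last existing Placemark in the Folder.
    children = list(folder)
    last = max(
        (i for i, child in enumerate(children) if local(child.tag) == "Placemark"),
        default=len(children) - 1,
    )
    for offset, placemark in enumerate(placemarks, start=1):
        folder.insert(last + offset, placemark)
    return ET.tostring(squares_root, encoding="unicode")


# Tiny hand-made inputs standing in for the two ogr2ogr outputs.
NS = "http://www.opengis.net/kml/2.2"
squares = (
    f'<kml xmlns="{NS}"><Document><Folder><name>squares</name>'
    "<Placemark><name>A1</name></Placemark>"
    "<Placemark><name>B1</name></Placemark>"
    "</Folder></Document></kml>"
)
centroids = (
    f'<kml xmlns="{NS}"><Document><Folder><name>centroids</name>'
    "<Placemark><name>A1</name></Placemark>"
    "<Placemark><name>B1</name></Placemark>"
    "</Folder></Document></kml>"
)
combined = merge_placemarks(squares, centroids)
```

The combined text would then be saved as ba###gridcombined.kml before running kmlgridspiff.xsl. The result should still be checked by hand, since real ogr2ogr output may differ from the layout assumed here.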

Points

  • The BA###Points feature class should have the following attributes (easiest to just copy from an earlier map and dump content):
    • Label: text, 255: entirety of the label as it appears on the map
    • GridSquare: text, 25: the grid square designation (e.g., A1)
    • Type: text, 50: identifier strings from source:PleiadesEntity/trunk/data/place-types.vdex
    • Approximate: short: 0 = exact (solid symbol), 1 = approximate (hollow symbol), 2 = very approximate (hollow symbol with question mark)
    • Material: text, 50: abbreviations for materials for mines/quarries as they appear on the map
    • disambiguator: long: sequential (one-up) numbers that help the munger disambiguate like features that fall in the same grid square
    • orientation: short: 0-359 degrees, for point symbols like bridges and passes
  • On-screen digitize, going grid square by grid square
  • Zip up the map file, feature database etc. and keep an updated copy in /ms/depts/awmc/BAExtracted
  • Using ArcCatalog, export the feature class to an "xml workspace document," being sure to select the "normalized (larger)" geometry representation
  • Create a BADataMunger config file for this map, named like BATL###_config.xml and commit it under source:BADataMunger/trunk/config
    • As a minimum, it should have map number, creator, contributors and rights; as you explore the data for this map, you may need to add entries for:
      • disambiguators
      • suppressors
      • multiples
      • directives for anomalously formatted tables
  • Manually inspect for idiosyncrasies in structure or format (e.g., Sardinia/Corsica or Map 87 vs. Map 87 inset), and take remedial steps as necessary
  • Save the directory from MSWord as "web page"; "Web Options" must first be set as follows:
    • Target browser: MSIE 5.0 or later
    • Disable features not supported by this browser = yes
    • Rely on CSS for font formatting = yes
    • Save new web pages as Web archives = no
    • Encoding: save this document as Unicode (UTF-8)
    • (others irrelevant)
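
The attribute constraints listed above for the BA###Points feature class can be checked mechanically before export. This is a rough sketch only: the function name check_point_record and the dictionary representation of a record are assumptions made for illustration, and the sample Type value is a placeholder rather than a confirmed identifier from place-types.vdex.

```python
# Maximum text widths, taken from the attribute list for BA###Points.
TEXT_LIMITS = {"Label": 255, "GridSquare": 25, "Type": 50, "Material": 50}


def check_point_record(record):
    """Return a list of problems found in one digitized point record."""
    problems = []
    for field, limit in TEXT_LIMITS.items():
        value = record.get(field) or ""  # treat missing/None as empty text
        if len(value) > limit:
            problems.append(f"{field} exceeds {limit} characters")
    # 0 = exact, 1 = approximate, 2 = very approximate
    if record.get("Approximate") not in (0, 1, 2):
        problems.append("Approximate must be 0, 1 or 2")
    # orientation only applies to symbols such as bridges and passes
    orientation = record.get("orientation")
    if orientation is not None and not 0 <= orientation <= 359:
        problems.append("orientation must be between 0 and 359 degrees")
    return problems


# Hypothetical sample record; the Type string is a placeholder.
sample = {
    "Label": "Roma",
    "GridSquare": "C2",
    "Type": "settlement",
    "Approximate": 0,
    "Material": None,
    "disambiguator": None,
    "orientation": None,
}
issues = check_point_record(sample)
```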

Create data for Pleiades upload

This section is now out of date and needs updating to reflect use of BADataMunger.

  • Using ArcCatalog, export the feature class to an "xml workspace document," being sure to select the "normalized (larger)" geometry representation
  • Run source:system/trunk/data/geoentities/arcgis/arc2csv.xsl on the resulting xml
  • Using the standard "get external data" function in Access, pull this information into a table in the munge database, naming the table like BA065Geo
  • Update the table references in the following queries to match the table name:
    • aaaJoinUnion
    • aaaNonnullDisambiguate
  • Run the aaaNonnullDisambiguate action query. This changes null disambiguator entries in the imported data into zero-length strings, which are needed for the aaaJoinUnion query to work correctly
  • Run the doXMLOutput() subroutine in the xmlOutput module - this will create xml files on the local filesystem at msaccess\..\xml\batlas###\xml-raw\ (see source:system/trunk/data/geoentities/xml)
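
The reason the null-to-empty-string step matters is presumably that, in SQL, a null never compares equal to anything in a join condition, so rows with a null disambiguator would silently drop out. The sketch below illustrates the effect using SQLite as a stand-in for Access; the table names, column names and sample rows are hypothetical and do not reproduce the actual aaaJoinUnion or aaaNonnullDisambiguate queries.

```python
import sqlite3


def count_joined(conn):
    """Count rows that match on both label and disambiguator."""
    return conn.execute(
        "SELECT COUNT(*) FROM geo JOIN directory "
        "ON geo.label = directory.label "
        "AND geo.disambiguator = directory.disambiguator"
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE geo (label TEXT, disambiguator TEXT);
    CREATE TABLE directory (label TEXT, disambiguator TEXT);
    INSERT INTO geo VALUES ('Roma', NULL), ('Portus', '1');
    INSERT INTO directory VALUES ('Roma', ''), ('Portus', '1');
    """
)

# Before the fix, the row with a null disambiguator cannot match.
before = count_joined(conn)

# Equivalent in spirit to the action query: nulls become zero-length strings.
conn.execute("UPDATE geo SET disambiguator = '' WHERE disambiguator IS NULL")

# After the fix, both rows join.
after = count_joined(conn)
```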

Process the XML files for batch upload to Pleiades

This section is now out of date and needs updating to reflect use of BADataMunger.

  • Using xsltproc, transform dir.xml to produce a transform batch file that fires appropriate transforms for each data file (there is a DOS batch file to do this step: http://icon.stoa.org/trac/pleiades/browser/system/trunk/data/geoentities/makexformbat.bat)
  • The result of this step (xformbatlas###.bat) will invoke xsltproc to massage all the files in xml-raw into the proper arrangements and namespaces, and put the results in xml-cooked
  • Use the data in xml-cooked for batch upload, etc.
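
For readers without the DOS batch file to hand, the idea of generating one xsltproc call per data file can be sketched as follows. This is an illustration under stated assumptions, not the logic of makexformbat.bat: the function name build_xslt_commands and the stylesheet name cook.xsl are hypothetical, and the real process selects an appropriate transform for each file, whereas this sketch applies a single stylesheet to all of them. It only builds the command lines and does not run them.

```python
from pathlib import PurePosixPath


def build_xslt_commands(filenames, stylesheet, raw_dir="xml-raw", cooked_dir="xml-cooked"):
    """Build one 'xsltproc -o OUTPUT STYLESHEET INPUT' command per file."""
    commands = []
    for name in sorted(filenames):
        source = PurePosixPath(raw_dir) / name
        target = PurePosixPath(cooked_dir) / name
        commands.append(["xsltproc", "-o", str(target), stylesheet, str(source)])
    return commands


# Hypothetical file and stylesheet names, for illustration only.
commands = build_xslt_commands(["b.xml", "a.xml"], "cook.xsl")
```

Each command list could then be passed to subprocess.run, or joined with spaces and written out as a line of a batch file.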

Loading Script

http://.../load_entities?sourcedir=/path/to/dir