Rethinking gazetteers and interoperability

Greg Janée, UCSB

  • ADL gazetteer protocol
  • interoperability use cases
  • gazetteer protocol revisited
  • question for the workshop
  • protocol history
    • motivation
      • define gazetteer's role in ADL architecture
      • functional definition to complement the Gazetteer Content Standard (GCS)
    • development
      • 2001 - codeveloped protocol with ESRI
      • 2003 - minor additions, deletions
      • retargetable server
        • full GCS compliant schema
        • lite schema
      • web-based client
  • protocol characteristics
    • defines gazetteer model
      • simplified, focus on interoperability
      • compatible with, mappable from GCS
    • seven query types
    • abstract specification
    • http+xml instantiation
    • separate thesaurus protocol
      • Z39.19 thesaurus model
  • gazetteer model (diagram)
  • limitations/problems
    • no support for qualified placename queries
      • e.g., find "Santa Barbara, CA"
      • onerous to implement using low-level facilities
    • conflicting and unpredictable query semantics: 2 ways of specifying hierarchical relationships in a query
      • spatially contained within California vs. has relationship PartOf to California
      • from the user's standpoint, results are implementation-specific, unpredictable and variable
  • interoperability use cases
    • harvest
      • aggregate distributed, esp. local gazetteers
      • need
        • protocol (OAI-PMH)
        • representation standard(s) (GCS)
    • lookup
      • find place by name or other description
      • but users want to do more than lookup places
        • cities, street addresses (geocoding), zip codes, airport codes, area code, specialized ways for distinct communities, URL of a flickr photo that has geotags ...
          • we've stepped outside the gazetteer model
          • but users want/expect this in the same interface (e.g., google maps)
        • this use case is broader than gazetteer!
        • users want to locate places in a wide variety of ways, and via other proxy objects/documents
    • reverse lookup
      • find nearby places
      • find nearby places of a given type
      • find nearest place of a given type
    • geoparse
      • identify and geolocate place references in documents
      • e.g., geonames.org rss-to-georss converter
      • uses gazetteers, but not dynamic protocol -- presupposes, perhaps, use of a harvesting protocol
    • ontology
      • inferencing over knowledge base of places
      • requirements ... either you have
        • unique ids, ontology of relationships, or
        • unification of facts
        • does this require a protocol? no ... more likely harvest and post-process
  • gazetteer protocol, revisited
    • harvest - already better supported by OAI-PMH
    • lookup - too limited, rigid
    • reverse lookup - supports near, but not nearest
    • geoparse - n/a
    • ontology - n/a
  • question
    • should we rethink gazetteer interoperability
      • from: we have an entity (gazetteer) and have a protocol for accessing that entity
      • to:
        • multiple protocols oriented around use cases and functionality
        • that various kinds of entities (gazetteers, geocoders, etc.) participate in and implement use cases to varying degrees
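To make the closing question concrete, here is a minimal Python sketch of "protocols oriented around use cases" that different kinds of entities implement to varying degrees. Everything in it is an assumption made for illustration: the `Lookup` and `ReverseLookup` interfaces, the `Place` record, the two services, and the sample entries (coordinates are approximate). None of it comes from the ADL gazetteer protocol. It also shows the near vs. nearest distinction from the reverse lookup use case.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class Place:
    name: str
    type: str
    lat: float
    lon: float


# Hypothetical use-case interfaces, one per use case rather than one per entity.
@runtime_checkable
class Lookup(Protocol):
    def find_by_name(self, name: str) -> List[Place]: ...


@runtime_checkable
class ReverseLookup(Protocol):
    def find_near(self, lat: float, lon: float, radius_km: float,
                  place_type: Optional[str] = None) -> List[Place]: ...

    def find_nearest(self, lat: float, lon: float,
                     place_type: Optional[str] = None) -> Optional[Place]: ...


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine) on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class SimpleGazetteer:
    """An entity that takes part in both lookup and reverse lookup."""

    def __init__(self, places):
        self.places = list(places)

    def find_by_name(self, name):
        return [p for p in self.places if p.name.lower() == name.lower()]

    def _ranked(self, lat, lon, place_type):
        # (distance, place) pairs, closest first, optionally filtered by type.
        hits = [(distance_km(lat, lon, p.lat, p.lon), p) for p in self.places
                if place_type is None or p.type == place_type]
        return sorted(hits, key=lambda h: h[0])

    def find_near(self, lat, lon, radius_km, place_type=None):
        return [p for d, p in self._ranked(lat, lon, place_type) if d <= radius_km]

    def find_nearest(self, lat, lon, place_type=None):
        ranked = self._ranked(lat, lon, place_type)
        return ranked[0][1] if ranked else None


class AirportCodeFinder:
    """A non-gazetteer entity that takes part in lookup only."""

    def __init__(self, codes):
        self.codes = dict(codes)

    def find_by_name(self, name):
        place = self.codes.get(name.upper())
        return [place] if place else []


# Illustrative entries; coordinates are approximate.
PLACES = [
    Place("Santa Barbara", "city", 34.42, -119.70),
    Place("Goleta", "city", 34.44, -119.83),
    Place("Santa Barbara Airport", "airport", 34.43, -119.84),
    Place("Los Angeles", "city", 34.05, -118.24),
]
```

A client could then ask a service which use cases it supports (for example `isinstance(service, ReverseLookup)`) instead of assuming that every service is a full gazetteer.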

Comments

  • JF@SB: well-illustrated distinction between gazetteer and geoservices
    • we can agree on the data model more readily than on common service
    • argument that there is no common service
  • JF@Metacarta: the goal of having many things able to ride on top of a simple "protocol"
    • copy that approach -- what's the lowest common denominator / minimal stack that could support all these more specific needs
    • what's the minimal set of data, and then more specific services could add to that minimal set
  • Jordan:
    • casting in the minimal triangle - specify any two of the corners and get the third?
  • ??:
    • keep transformation at the center of the model, within this triangle, for getting the whole triangle on the basis of a subset of its vertices
  • ??:
    • services are "finders" -- and these are not part of the gazetteer model
      • ubiquitous need for the finder service
      • gazetteers can help support the find service
  • JR@Edinburgh:
    • web 2.0 and folksonomy and mashup
    • standard, REST-based geodelivery services
    • Web Services (XML, SOAP, etc). UDDI, registry-based discovery
    • a lot of the slots are already there - decompose functional requirements on the basis of use cases
    • what about the e-Science/Grid computing communities?
      • web services within workflow engines that consume geographical information
    • service-oriented approach driven by use cases
    • what do people actually want
    • part of the research agenda: are there crossmappings from the existing (competing?) protocols and approaches
  • ??:
    • is "gazetteer" a noun or a verb
    • a lot of noun aspects
    • but there are services that don't map well into the traditional gazetteer data model
    • a more restrictive definition of the term might leave more room for a description of the range of services
  • LH
    • how to represent a gazetteer as a data object in a larger collection?
    • metadata about the gazetteer?
    • extension of gazetteers to non-geospatial resources
    • much of the information we want to interact with is in systems that are not sophisticated computational environments
    • needed: toolkit so people can build, maintain and serve out gazetteers
  • Ray?
    • SRW or SRU for query and search aspects? next generation from Z39.50
    • instead of having to invent something new, piggyback on something that already exists
    • Search and Retrieval Web Service
  • dealing with qualified place names
    • include in gazetteer?
    • construct/preindex from hierarchical relationship data in the gazetteer?
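One way to read the "construct/preindex" option: walk each entry's part-of chain once and index every "name, ancestor name" combination, so that a qualified query such as "Santa Barbara, CA" becomes a single dictionary lookup. The Python below is a minimal sketch under stated assumptions: the flat `ENTRIES` table, its ids, the single `part_of` parent per entry, and the function names are all hypothetical and are not taken from the ADL protocol or the GCS. Only two-part qualified names are indexed.

```python
from collections import defaultdict

# Hypothetical gazetteer entries: id -> names (including abbreviations) and parent id.
ENTRIES = {
    "us": {"names": ["United States", "USA"], "part_of": None},
    "ca": {"names": ["California", "CA"], "part_of": "us"},
    "sb-ca": {"names": ["Santa Barbara"], "part_of": "ca"},
    "ph": {"names": ["Philippines"], "part_of": None},
    "iloilo": {"names": ["Iloilo"], "part_of": "ph"},
    "sb-iloilo": {"names": ["Santa Barbara"], "part_of": "iloilo"},
}


def normalize(qualified_name):
    """Lower-case each comma-separated part and trim surrounding whitespace."""
    return ", ".join(part.strip().lower() for part in qualified_name.split(","))


def ancestors(entries, entry_id):
    """Yield the ids of all ancestors by following part_of links upward."""
    parent = entries[entry_id]["part_of"]
    while parent is not None:
        yield parent
        parent = entries[parent]["part_of"]


def build_qualified_index(entries):
    """Map every normalized 'name, ancestor name' string to matching entry ids."""
    index = defaultdict(set)
    for entry_id, entry in entries.items():
        for ancestor_id in ancestors(entries, entry_id):
            for name in entry["names"]:
                for qualifier in entries[ancestor_id]["names"]:
                    index[normalize(f"{name}, {qualifier}")].add(entry_id)
    return index


def lookup(index, qualified_name):
    """Return the sorted ids of entries matching a qualified place name."""
    return sorted(index.get(normalize(qualified_name), set()))


INDEX = build_qualified_index(ENTRIES)
```

With this table, `lookup(INDEX, "Santa Barbara, CA")` returns only the California entry, while an unqualified "Santa Barbara" would match both. The index is only as good as the relationship data behind it: an entry with no part-of links cannot be found by a qualified name.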

  • John@metacarta
    • attribution string in metacarta apis - licensing?
    • question: is there a calculus of creating a name for a location
      • if well-defined, people could publish functions that generate name strings
      • a gaz protocol that has as few elements in the data model as possible would be good

  • JF
    • are you arguing for a specific gazetteer model tailored for special applications?
    • all Greg's cases ... for all of them the simple statement is the gazetteer model is an essential, but not sufficient, solution for these problems
    • a refinement of that assertion: is the problem that the gaz models are too complex and it's a simple gazetteer model that needs to be this component, or is the simplification part of the service for that application?
    • don't want to forego the notion that there are complex, rich gazetteers somewhere
    • but those models are big, unwieldy, and difficult to use to serve other applications
    • are we seeking an abstraction layer that simplifies access to these gazetteers?
  • the problem of super models?
    • are there bridges from one to the other?
  • minimal set of model is use case dependent?
  • other ways to use the gazetteer model?
    • can you use it for events?
    • gazetteer model applies to named things
    • are addresses a form of naming?
    • association of labels with coordinates