Metadata Standard for the Montana GIS Portal

Introduction
This is a standard for metadata documents submitted for publication on the Montana GIS Portal at http://gisportal.mt.gov. The GIS Portal is a central location for the discovery of Geographic Information System data about Montana, and it provides a map viewer that allows anyone to view GIS data sets that are available through web services.

This standard has been adopted as Information Infrastructure Standards 1200.XS3 and 1200.XS4 by the Montana Department of Administration. It has also been adopted as a Best Practice by the Montana Association of Geographic Information Professionals.

Metadata is data about data. In the context of the GIS Portal, a metadata document is an XML file that contains descriptive information about a GIS data set. The XML file must conform to a standard that allows the portal software to load it into a searchable index, provide basic descriptive information about the data set, and provide information that allows users to retrieve the data set.

The standard's requirements are divided into three levels of compliance: mandatory technical requirements necessary for the correct operation of the portal software; mandatory informational content that ensures portal users have the minimum information necessary to evaluate, retrieve, and use the data set; and recommended content that allows users to fully understand the data.

Mandatory Technical Requirements

  • FGDC Compliance
    Metadata documents submitted to the portal must be valid XML documents that conform to the structure set out in the Federal Geographic Data Committee's (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM), version 2.

    There is extensive information about the CSDGM at http://www.fgdc.gov/metadata. The web site has links to free and commercial software tools available to assist publishers in creating FGDC-compliant XML metadata documents. The FGDC published a workbook describing the CSDGM, and a copy of this is available from the GIS Portal at http://gisportal.msl.mt.gov/metadata/workbook.pdf.

    The CSDGM contains information about what metadata elements are required and optional. Many of these requirements have been discredited, and it is not recommended that GIS Portal publishers pay attention to them. For the purposes of the GIS Portal, the FGDC standard is a list of elements that are available to fill out. Some metadata software tools have various levels of enforcement of the element requirements from the CSDGM. GIS Portal publishers may exercise discretion on whether to obey the rules set up by these tools.

  • GIS Portal Functional Requirements
    The following metadata elements are displayed by the portal when users find metadata records or are used by the portal to classify and provide links to the data. Documents without proper entries in these fields will not be accepted by the portal. In this and subsequent lists, the metadata element name is followed by the XML path to the element in the metadata file and a description of the required content.
    • Title (idinfo/citation/citeinfo/title). Metadata must contain a title for the data set. The title should describe the region covered by the data set (such as "Montana", "Bozeman", or "Smith Watershed") and the subject (such as "Highways" or "Water Wells"). If the data set, or the particular copy of the data set referenced by the metadata, is not being actively maintained to be up-to-date, the title should include a date, for example "2006 Color Orthophoto of Helena, Montana" or "Yellowstone County School Districts, 1980".
    • Abstract (idinfo/descript/abstract) is a concise summary of important information about the data set. In about three paragraphs or less, say everything you would want someone to know about the data, assuming this is all they were going to read.
    • Time Period of Content (idinfo/timeperd/timeinfo) is a date, dates, or range of dates when the data set was valid. Dates must be entered in the metadata in YYYYMMDD format, such as 20080131. If you do not know the exact date, you may just fill out a year or year and month, as in 1995 or 199708. The FGDC standard allows you to fill this section with a publication date, but this is strongly discouraged, unless you have no information about when the data was really collected.
    • Contact Organization (idinfo/ptcontac/cntinfo/cntorgp/cntorg) is used by the portal to fill out the Publisher seen by users in the metadata search results. This element may appear under cntperp (Contact Person Primary) rather than cntorgp (Contact Organization Primary). The portal's advanced search page allows users to search for values from this field.
    • Point of Contact (idinfo/ptcontac) should be a person or office that can answer questions about the data set. The address, city, state, and ZIP code must be filled out.
    • Theme Keywords (idinfo/keywords/theme). There must be a theme keyword section that contains a themekt element (Theme Keyword Thesaurus) whose value is "ISO 19115 Topic Category" and one or more themekey elements (Theme Keywords) whose values are chosen from the list in Appendix A. The portal's advanced search page allows users to restrict their searches to records containing one of these keywords.
    • Place Keywords (idinfo/keywords/place/placekey). The portal will show these in the Coverage Area section of the metadata results.
    • Online Linkage (idinfo/citation/citeinfo/onlink) is a link to a downloadable data file, a web site about the data, or a specification that allows the portal to add a web service to its map viewer. See Appendix B for the rules for this element.
    • Bounding Coordinates (idinfo/spdom/bounding) are the latitude and longitude coordinates of a rectangle that encloses the region covered by the data set. The portal's advanced search page allows users to search for data sets whose bounding coordinates intersect, or are within, any geographic region.
    • Resource Description (distinfo/resdesc) must be filled out with a choice from a pre-set list of values. Its value is shown in the Content Type section of the search results and it helps control how the portal creates a link to the data. The portal's advanced search page allows users to search for values from this field. See Appendix B for the rules for this element.
    • Global Unique ID (Esri/PublishedDocID). For the portal to recognize that a revised document you publish is the same as a document you previously published, you must insert a Global Unique ID into the document. Instructions for this are in Appendix C.
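
As a rough illustration of the functional requirements above, the following Python sketch reports which of the listed XML paths are absent or empty in a metadata document. The function name, the path list, and the sample document are hypothetical, and this is not the portal's own validation logic. The Global Unique ID path is spelled as in the Appendix C sample.

```python
import xml.etree.ElementTree as ET

# Paths from the list above, relative to the <metadata> root element.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/timeperd/timeinfo",
    "idinfo/ptcontac",
    "idinfo/keywords/theme",
    "idinfo/keywords/place/placekey",
    "idinfo/citation/citeinfo/onlink",
    "idinfo/spdom/bounding",
    "distinfo/resdesc",
    "Esri/PublishedDocID",  # element name as written in the Appendix C sample
]

def missing_elements(xml_text):
    """Return the required paths that are absent or empty in the document."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in REQUIRED_PATHS:
        element = root.find(path)
        # An element counts as filled out if it has text or child elements.
        is_empty = element is None or (
            len(element) == 0 and not (element.text or "").strip()
        )
        if is_empty:
            missing.append(path)
    return missing

# Hypothetical, deliberately incomplete document.
sample = """<metadata>
  <idinfo>
    <citation><citeinfo>
      <title>Montana Highways</title>
    </citeinfo></citation>
    <descript><abstract>Highway centerlines for Montana.</abstract></descript>
  </idinfo>
  <distinfo><resdesc>Downloadable Data</resdesc></distinfo>
</metadata>"""

for path in missing_elements(sample):
    print("missing:", path)
```

A check like this only confirms that the elements are present. It does not judge whether their content meets the descriptions above.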

Mandatory Information
The portal managers will review your metadata and, if it is available, the data it describes. If they feel they do not understand it or if the following sections are not filled out properly, they may ask you to provide more information before they publish your metadata.

  • Originator (idinfo/citation/citeinfo/origin) is the agency or person primarily responsible for creating the data set. If, for example, this data was obtained from the Census Bureau and you made several corrections to the data, you would still probably list the Census Bureau as the originator. You may enter multiple originators if you feel this is necessary.
  • Publication Date (idinfo/citation/citeinfo/pubdate) may be the first time this data set was made available to the public or the date when this specific version of the data was released. Dates must be entered in the metadata in YYYYMMDD format, such as 20080131. If you do not know the exact date, you may just fill out the year or year and month, as in 1995 or 199708.
  • Publisher (idinfo/citation/citeinfo/pubinfo/publish) is probably YOUR ORGANIZATION. Who is making this particular version of the data available to the public? If this version of the data is essentially unchanged from something you obtained from somewhere else (aside from easily accomplished format or projection changes), this should be the organization you obtained it from that is primarily responsible for making it public. If the organization does not make it public but provided it to you, with permission to distribute it, you might want to claim status as the publisher.
  • Spatial Reference Information (spref) is the definition of the coordinate system and datum of the data. The format of this section is highly variable depending on the type of coordinate system you are using. If your metadata tool does not fill this section out automatically, refer to the instructions in the CSDGM workbook.
  • Entity and Attribute Information (eainfo) is where you should put a list of the tables and data fields that come with your data, and explanations of what they contain. Fields that have a limited domain should have a list of the allowable values and, if applicable, their meanings. See Appendix D for some examples.
  • Distributor (distinfo/distrib/cntinfo) contains information on who to contact about getting a copy of the data.
  • Metadata Date (metainfo/metd) is the latest revision date of the metadata. Dates must be entered in the metadata in YYYYMMDD format, such as 20080131. If you do not know the exact date, you may just fill out the year or year and month, as in 1995 or 199708.
  • Metadata Contact (metainfo/metc) is the person who wrote the metadata.
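
The date rule that recurs in the lists above (YYYYMMDD, or just a year or a year and month) could be checked with a small Python function such as the sketch below. The function name is hypothetical. It covers only calendar dates, so free-text values such as the "Unpublished Material" publication dates in Appendix E are outside its scope.

```python
import re
from datetime import date

def is_valid_metadata_date(value):
    """Accept YYYY, YYYYMM, or YYYYMMDD strings that name a real date."""
    # Four digits, optionally followed by a two-digit month and then a day.
    if not re.fullmatch(r"\d{4}(\d{2}(\d{2})?)?", value):
        return False
    year = int(value[0:4])
    month = int(value[4:6]) if len(value) >= 6 else 1
    day = int(value[6:8]) if len(value) == 8 else 1
    try:
        date(year, month, day)  # rejects values such as month 13 or February 31
    except ValueError:
        return False
    return True

for candidate in ["20080131", "1995", "199708", "2008-01-31", "20080231"]:
    print(candidate, is_valid_metadata_date(candidate))
```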

Other Important Metadata Fields
You are strongly encouraged to fill out these fields. In some cases, the portal managers may ask you to fill some of these out if they feel an element from this list is especially important for your data set.

  • Purpose (idinfo/descript/purpose) describes a specific purpose the data was developed for, or something you want to see other people use it for. You don't have to imagine a purpose for the data if you don't want to put anything here.
  • Access Constraints (idinfo/accconst). If there are restrictions on who may obtain the data, or promises they have to make beforehand, put them here.
  • Use Constraints (idinfo/useconst). Are there uses that the data should be restricted to or discouraged from?
  • Progress (idinfo/status/progress) indicates whether the data set can be considered as a finished product. If the data is constantly being updated, you can still say it is "complete", as in having all the latest data you know about.
  • Update Frequency (idinfo/status/update). How often is the data updated?
  • Browse Graphic (idinfo/browse). If you have an online picture or map that features the data, put a link to it in this section. The portal will then show users a thumbnail image of the picture and/or a link to it.
  • Completeness Report (dataqual/complete). Is there some subset of the data that is missing? For example, "No data is available for Dawson County" or "Streams less than two miles long have been omitted".
  • Attribute Accuracy Report (dataqual/attracc/attraccr). If there is something you should say about the accuracy of any of the data's attribute fields, put it here. For example, "Each well ID number was independently checked against the source map by two technicians" or "Standard deviation for Calcium concentration in control samples was 8 mg/L".
  • Horizontal Positional Accuracy Report (dataqual/posacc/horizpa/horizpar). Tell us what you know about the accuracy of the coordinates, such as "GPS coordinates were not differentially corrected, and the receiver reported accuracy values of between 10 meters and 40 meters during the survey."
  • Vertical Accuracy Report (dataqual/posacc/vertacc/vertaccr). Describe the accuracy of any elevation information in the data set. For example, "Elevations were estimated from a topographic map whose contour interval is 80 feet".
  • Source Information (dataqual/lineage/srcinfo). From what documents or data sets did you obtain the information in this data set? See Appendix E for examples.
  • Process Steps (dataqual/lineage/procstep). What did you do to create the data? See Appendix E for examples.


Appendix A
Theme Keywords

Metadata documents for the Montana GIS Portal must have a Theme Keyword section that contains a Theme Keyword Thesaurus whose value is "ISO 19115 Topic Category" and at least one Theme Keyword whose value is taken from the following list. A sample XML theme keyword section that contains all the values follows.

  • farming
  • biota
  • boundaries
  • climatologyMeteorologyAtmosphere
  • economy
  • elevation
  • environment
  • geoscientificInformation
  • health
  • imageryBaseMapsEarthCover
  • intelligenceMilitary
  • inlandWaters
  • location
  • oceans
  • planningCadastre
  • society
  • structure
  • transportation
  • utilitiesCommunication

<theme>
  <themekt>ISO 19115 Topic Category</themekt>
  <themekey>farming</themekey>
  <themekey>biota</themekey>
  <themekey>boundaries</themekey>
  <themekey>climatologyMeteorologyAtmosphere</themekey>
  <themekey>economy</themekey>
  <themekey>elevation</themekey>
  <themekey>environment</themekey>
  <themekey>geoscientificInformation</themekey>
  <themekey>health</themekey>
  <themekey>imageryBaseMapsEarthCover</themekey>
  <themekey>intelligenceMilitary</themekey>
  <themekey>inlandWaters</themekey>
  <themekey>location</themekey>
  <themekey>oceans</themekey>
  <themekey>planningCadastre</themekey>
  <themekey>society</themekey>
  <themekey>structure</themekey>
  <themekey>transportation</themekey>
  <themekey>utilitiesCommunication</themekey>
</theme>
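
The theme keyword rule above can be sketched as a small Python check. The function name and the sample keywords section are hypothetical. The check accepts a keywords section if any theme section names the ISO thesaurus and contains at least one keyword from the list, which is one reading of the rule and not necessarily how the portal itself tests it.

```python
import xml.etree.ElementTree as ET

ISO_THESAURUS = "ISO 19115 Topic Category"

# The allowable values listed above.
ISO_TOPICS = {
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "society",
    "structure", "transportation", "utilitiesCommunication",
}

def has_valid_iso_theme(keywords_xml):
    """True if a <theme> names the ISO thesaurus and has a listed keyword."""
    keywords = ET.fromstring(keywords_xml)
    for theme in keywords.findall("theme"):
        thesaurus = (theme.findtext("themekt") or "").strip()
        if thesaurus != ISO_THESAURUS:
            continue  # other thesauri are allowed but do not satisfy the rule
        keys = [(key.text or "").strip() for key in theme.findall("themekey")]
        if any(key in ISO_TOPICS for key in keys):
            return True
    return False

# Hypothetical keywords section with one local theme and one ISO theme.
sample_keywords = """<keywords>
  <theme>
    <themekt>None</themekt>
    <themekey>roads</themekey>
  </theme>
  <theme>
    <themekt>ISO 19115 Topic Category</themekt>
    <themekey>transportation</themekey>
  </theme>
</keywords>"""

print(has_valid_iso_theme(sample_keywords))
```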


Appendix B
Resource Description and Online Linkage

The portal uses the Resource Description and Online Linkage sections of your metadata to determine whether to describe your data as a GIS layer, and whether it can be downloaded or added to the map viewer.

Resource Description may have the following values. It is anticipated that the Montana GIS Portal will concentrate on the first three types.

  • Live Data and Maps (Web services that may be added to map applications)
  • Downloadable Data (GIS data files available for download)
  • Offline Data (Data files that you have to order)
  • Static Map Images (Map images available for download)
  • Document
  • Applications
  • Geographic Services
  • Clearinghouse
  • Map Files
  • Geographic Activities

Online Linkage is a link to a web service, a file containing GIS data or a map, or a web site that has more information about the data or how to obtain it. Depending on the value of the Resource Description and the format of the Online Linkage, the portal will create links in the search results pages that allow the data to be downloaded or added to the map viewer. The rules for this are as follows. If a combination is found that does not follow these rules, the portal will classify the record as an "Other Document" and make a link with the Online Linkage labeled "Go to Website".

  • If the Resource Description is Live Data and Maps
    • The portal will assume the data is accessible through an ArcIMS service, and make a button for adding the service to the map viewer, if the Online Linkage has the following form.
      • http://<server>/image/<service>
    • The portal will assume the data is accessible through an OGC Web Mapping Service, and make a button for adding the service to the map viewer, if the Online Linkage has any of the following forms.
      • http://<server>/<servlet-path>/com.esri.wms.Esrimap
      • http://<server>/<OGC Type>/<path>
        (OGC Type is wfs, wms, or wcs)
      • http://<server>/<path>/service=<OGC Type>
      • http://<server><path><text>request=getmap<text>

  • If the Resource Description is Downloadable Data, the portal will assume the data is available in a downloadable file, and make a button labeled "Download Data" for the link, if the Online Linkage has a file extension of zip, gz, tar, tgz, dbf, shp, rar, xls, txt, dwg, dxf, dgn, or e00 and has one of the following forms:
    • http://<server>/<path>/<filename>.<extension>
    • ftp://<server>/<path>/<filename>.<extension>

  • If the Resource Description is Static Map Images, the portal will assume the record refers to a downloadable map file and make a button labeled "View Map" for the link, if the Online Linkage has a file extension of gif, jpg, jpeg, bmp, pdf, pmt, tif, tiff, cal, pct, pict, eps, mxd, av, mpg, mpeg, wmv, img, or rm and has one of the following forms:
    • http://<server>/<path>/<filename>.<extension>
    • ftp://<server>/<path>/<filename>.<extension>
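
The two file-based rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the portal's code: the function name is hypothetical, extensions are compared without regard to case (the rules do not say), and the web service patterns for Live Data and Maps are not modeled, so the function returns None for that value.

```python
import posixpath
from urllib.parse import urlparse

# Extensions as listed in the rules above.
DOWNLOAD_EXTENSIONS = {
    "zip", "gz", "tar", "tgz", "dbf", "shp", "rar",
    "xls", "txt", "dwg", "dxf", "dgn", "e00",
}
MAP_EXTENSIONS = {
    "gif", "jpg", "jpeg", "bmp", "pdf", "pmt", "tif", "tiff", "cal",
    "pct", "pict", "eps", "mxd", "av", "mpg", "mpeg", "wmv", "img", "rm",
}

def link_label(resource_description, online_linkage):
    """Return the button label suggested by the file-based rules."""
    if resource_description == "Live Data and Maps":
        return None  # web service rules are outside this sketch
    parsed = urlparse(online_linkage)
    # Take the text after the last dot in the URL path, lowercased (assumption).
    extension = posixpath.splitext(parsed.path)[1].lstrip(".").lower()
    if parsed.scheme in ("http", "ftp"):
        if (resource_description == "Downloadable Data"
                and extension in DOWNLOAD_EXTENSIONS):
            return "Download Data"
        if (resource_description == "Static Map Images"
                and extension in MAP_EXTENSIONS):
            return "View Map"
    # Any other combination is treated as an "Other Document".
    return "Go to Website"

print(link_label("Downloadable Data", "ftp://ftp.example.org/data/wells.zip"))
print(link_label("Static Map Images", "http://example.org/maps/county.pdf"))
print(link_label("Downloadable Data", "http://example.org/about.html"))
```

The example URLs are placeholders. A mismatch between the Resource Description and the file extension, such as Downloadable Data with a pdf link, falls through to "Go to Website".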


Appendix C
Global Unique IDs

Each metadata document must have a global unique ID (GUID) inserted in it. This allows the portal to recognize a document it has encountered before. If you upload a document to the portal a second time, or if the portal harvests the folder the document is in more than once, and the document does not contain a GUID the portal has seen before, the portal will assume it is looking at a new document and load a duplicate copy of it into the database.

The GUID section should be inserted into the metadata with a plain-text editor immediately before the last line of the file. The last line of the file should be "</metadata>". An example of a GUID section and the last line of a metadata file is shown below.

  <Esri>
   <PublishedDocID>
    {13B2A163-4EE2-4204-B553-6309DD3434C2}
   </PublishedDocID>
  </Esri>
 </metadata>

The GUID is the number between the curly braces. You must provide a different GUID for each file, and make sure you do NOT copy any GUID you see in any document. There are many free GUID generators on the Web that you can use to create unique GUIDs for your metadata files, such as http://www.guidgenerator.com/ and http://www.famkruithof.net/uuid/uuidgen.
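
A GUID can also be generated locally. The Python sketch below uses the standard library's uuid module to create a random GUID and places a GUID section immediately before the closing tag, mirroring the layout of the example above. The function names are hypothetical, and writing the GUID in upper case simply follows the example; whether the portal requires that is not stated here.

```python
import uuid

def new_guid_section():
    """Build a GUID section laid out like the example above."""
    guid = "{" + str(uuid.uuid4()).upper() + "}"
    return (
        " <Esri>\n"
        "  <PublishedDocID>\n"
        "   " + guid + "\n"
        "  </PublishedDocID>\n"
        " </Esri>\n"
    )

def insert_guid(metadata_text):
    """Insert a new GUID section before the closing </metadata> tag."""
    closing = "</metadata>"
    index = metadata_text.rfind(closing)
    if index == -1:
        raise ValueError("no closing </metadata> tag found")
    if "<PublishedDocID>" in metadata_text:
        # Keep an existing GUID so the portal still recognizes the document.
        return metadata_text
    return metadata_text[:index] + new_guid_section() + metadata_text[index:]

# Hypothetical minimal document.
document = "<metadata>\n <idinfo/>\n</metadata>"
print(insert_guid(document))
```

Running the function again on its own output leaves the document unchanged, so a revised document keeps the GUID it was first published with.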


Appendix D
Entity and Attribute Samples

These examples show the XML-formatted metadata. We hope you will be able to relate these examples to the fields your metadata editor tool presents you with. Most metadata tools will create Entity and Attribute sections that are more complicated than these. You may leave the extra information there -- these examples show a minimum set of information that is still very useful.

Example 1
This data set is a county shapefile. Shapefiles have intrinsic feature ID (FID) and Shape fields, and this one has a FIPS code field and a county name field. You could list all possible values for the FIPS code and county name, but it is not necessary.

<eainfo>
  <detailed>
    <enttyp>
      <enttypl>county.dbf</enttypl>
    </enttyp>
    <attr>
      <attrlabl>FID</attrlabl>
      <attrdef>Internal feature number</attrdef>
    </attr>
    <attr>
      <attrlabl>Shape</attrlabl>
      <attrdef>Feature Geometry</attrdef>
    </attr>
    <attr>
      <attrlabl>FIPS</attrlabl>
      <attrdef>
        Federal Information Processing Standard code for the county
      </attrdef>
    </attr>
    <attr>
      <attrlabl>County</attrlabl>
      <attrdef>County Name</attrdef>
    </attr>
  </detailed>
</eainfo>

Example 2
This data set is a personal geodatabase feature class named "signs". The business table for a feature class has the same name as the feature class. This one contains an ObjectID field, a Shape field, a field with a code that describes the sign's condition (whose values are not explained anywhere in the database), and a field with a code that describes the signpost type. The geodatabase has another table that lists the allowable signpost type codes and their meanings. In this case, it is very important for the metadata to explain the condition codes, and the metadata author thought it would be useful to also make a list of the signpost type codes and their meanings.

<eainfo>
  <detailed>
    <enttyp>
      <enttypl>signs</enttypl>
    </enttyp>
    <attr>
      <attrlabl>ObjectID</attrlabl>
      <attrdef>Internal feature number</attrdef>
    </attr>
    <attr>
      <attrlabl>Shape</attrlabl>
      <attrdef>Feature Geometry</attrdef>
    </attr>
    <attr>
      <attrlabl>Condition</attrlabl>
      <attrdef>Code for sign condition</attrdef>
      <attrdomv>
        <edom>
          <edomv>1</edomv>
          <edomvd>Good</edomvd>
        </edom>
        <edom>
          <edomv>2</edomv>
          <edomvd>Fair</edomvd>
        </edom>
        <edom>
          <edomv>3</edomv>
          <edomvd>Poor</edomvd>
        </edom>
      </attrdomv>
    </attr>
    <attr>
      <attrlabl>Type</attrlabl>
      <attrdef>
        Signpost Type Code. Code values are explained in the TypeCode table.
      </attrdef>
    </attr>
  </detailed>
  <detailed>
    <enttyp>
      <enttypl>TypeCode</enttypl>
      <enttypd>List of Signpost Type Codes and their meanings</enttypd>
    </enttyp>
    <attr>
      <attrlabl>Type</attrlabl>
      <attrdef>
        Signpost Type Code. Code values are explained in the Definition field.
      </attrdef>
      <attrdomv>
        <edom>
          <edomv>1</edomv>
          <edomvd>Steel</edomvd>
        </edom>
        <edom>
          <edomv>2</edomv>
          <edomvd>Aluminum</edomvd>
        </edom>
        <edom>
          <edomv>3</edomv>
          <edomvd>Wood</edomvd>
        </edom>
      </attrdomv>
    </attr>
    <attr>
      <attrlabl>Definition</attrlabl>
      <attrdef>Explanation of the Signpost Type Code.</attrdef>
    </attr>
  </detailed>
</eainfo>


Appendix E
Source Information and Process Step Samples

This example shows the XML-formatted metadata. We hope you will be able to relate this example to the fields your metadata editor tool presents you with. Most metadata tools will create metadata that is more complicated than this. You may leave the extra information there -- this example shows a minimum set of information that is still very useful.

This example describes the sources and processing steps for a well database. Some of the well locations were downloaded from the Montana Ground Water Information Center, while others were digitized with GPS. The attributes for the GPS wells were obtained from an imaginary private database.

<lineage>
  <srcinfo>
    <srccite>
      <citeinfo>
        <origin>
          Montana Bureau of Mines and Geology
          Ground-Water Information Center

        </origin>
        <pubdate>2000</pubdate>
        <title>Well Log Data</title>
        <pubinfo>
          <pubplace>Butte, Montana</pubplace>
          <publish>Montana Bureau of Mines and Geology</publish>
        </pubinfo>
        <onlink>http://mbmggwic.mtech.edu</onlink>
      </citeinfo>
    </srccite>
    <srctime>
      <timeinfo>
        <sngdate>
          <caldate>2000</caldate>
        </sngdate>
      </timeinfo>
    </srctime>
    <srccontr>
      The locations and attributes of wells whose Source field has a value of "1" were obtained from this source.
    </srccontr>
  </srcinfo>
  <srcinfo>
    <srccite>
      <citeinfo>
        <origin>Smith Surveyors</origin>
        <pubdate>Unpublished Material</pubdate>
        <title>GPS Well Coordinates</title>
        <othercit>
          1817 14th Avenue North
          Kalispell, MT 59901
          406-755-9999

        </othercit>
      </citeinfo>
    </srccite>
    <srctime>
      <timeinfo>
        <sngdate>
          <caldate>20010822</caldate>
        </sngdate>
      </timeinfo>
    </srctime>
    <srccontr>
      The locations of the wells whose Source field has a value of "2" were obtained via GPS survey by this source.
    </srccontr>
  </srcinfo>
  <srcinfo>
    <srccite>
      <citeinfo>
        <origin>Jones Consulting</origin>
        <pubdate>Unpublished Material</pubdate>
        <title>Well log data</title>
        <othercit>
          817 12th Street West
          Whitefish, MT 59937
          406-863-9999

        </othercit>
      </citeinfo>
    </srccite>
    <srctime>
      <timeinfo>
        <sngdate>
          <caldate>20010714</caldate>
        </sngdate>
      </timeinfo>
    </srctime>
    <srccontr>
      The attributes of the wells whose Source field has a value of "2" were obtained from this source.
    </srccontr>
  </srcinfo>
  <procstep>
    <procdesc>
      Download the well log spreadsheet from GWIC, load data as a table in ArcMap, set up an event theme using the latitude and longitude fields, and export data to a personal geodatabase feature class.
    </procdesc>
    <procdate>20010904</procdate>
  </procstep>
  <procstep>
    <procdesc>
      Load GPS well location table, and set up event theme. Load well attribute data as an Access table. Join attribute data to GPS locations using the Well ID field. Export joined tables as a feature class.
    </procdesc>
    <procdate>20010904</procdate>
  </procstep>
  <procstep>
    <procdesc>Combine the GPS well data with the GWIC well data.</procdesc>
    <procdate>20010905</procdate>
  </procstep>
</lineage>