Format Conversion Feasibility
Work Package 4 of Telematics for Libraries project BIBLINK (LB 4034)
The BIBLINK Project

6.2 Mapping Dublin Core to UNIMARC

Dublin Core (DC) was designed primarily to provide simple descriptions of networked resources. It is particularly relevant to the Web, and some applications of DC embed metadata in the headers of HTML documents. The examples in this section show DC elements embedded in HTML, but this does not imply that BIBLINK will receive Dublin Core data only in this form. Publishers might find it more appropriate to supply Dublin Core as simple ASCII text, and any BIBLINK conversion tool would have to handle that as well. Fortunately, Dublin Core was designed to be syntax-independent, so the precise syntax used will not affect the mapping tables themselves.
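Because the mapping does not depend on syntax, a conversion tool needs only a small front end for each input form. As a minimal sketch, assuming DC is embedded in META tags named DC.element or DC.element.qualifier as in the examples in this section, the following Python uses the standard library's html.parser to collect the elements. The names DCMetaParser and parse_dc_meta are illustrative only and are not part of any BIBLINK tool.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (element, qualifier, value) triples from DC META tags."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip()
        content = attr.get("content")
        if content is None or not name.lower().startswith("dc."):
            return
        parts = name.split(".")
        if not parts[1]:
            return  # "DC." with no element name: nothing to map
        element = parts[1].lower()
        # Anything after the element name is treated as a qualifier
        # (an assumption based on the DC.creator.personal examples).
        qualifier = ".".join(parts[2:]).lower() or None
        self.elements.append((element, qualifier, content))


def parse_dc_meta(html_text):
    """Return the DC elements found in html_text, in document order."""
    parser = DCMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.elements
```

A front end for DC supplied as plain ASCII text would produce the same triples, so the mapping step that follows would not need to change.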

These mappings attempt to indicate the level of detail required in a Dublin Core record to produce a minimal but viable UNIMARC record.

Table I: Summary Mapping from Dublin Core to UNIMARC

Dublin Core     UNIMARC
-----------     -------

Title           200 $a Title Proper
                200 $e Other Title Information (for subtitle)
                517 $a Other Variant Titles (for other titles)

Creator         700 $a Personal Name - Primary Intellectual Responsibility, or if more than one:
                701 $a Personal Name - Alternative Intellectual Responsibility
                710 $a Corporate Body Name - Primary Intellectual Responsibility, or:
                711 $a Corporate Body Name - Alternative Intellectual Responsibility
                200 $f First Statement of Responsibility

Subject         610 $a Uncontrolled Subject Terms
                606 Topical Name Used as Subject (for LCSH and MeSH)
                675 UDC
                676 DDC
                680 LCC
                686 Other Classification Systems

Description     330 $a Summary or Abstract

Publisher       210 $c Name of Publisher, Distributor, etc.

Contributors    701 $a Personal Name - Alternative Intellectual Responsibility
                711 $a Corporate Body Name - Alternative Intellectual Responsibility
                200 $g Subsequent Statement of Responsibility (if role known)

Date            210 $d Date of Publication, Distribution, etc.

Type            608 Form, Genre or Physical Characteristics Heading

Format          336 $a Type of Computer File (provisional)

Identifier      001 (mandatory for UNIMARC)
                010 (ISBN)
                011 (ISSN)
                020 (National Bibliography Number)
                300 $a (URL)

Source          324 Original Version Note

Language        101 Language of the Item
                300 General Note

Relation        300 General Note

Coverage        300 General Note

Rights          300 General Note
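For conversion software, Table I can be held as a simple lookup structure. The sketch below is illustrative only: the dictionary transcribes the table, and the names DC_TO_UNIMARC, candidate_fields and elements_sharing_tag are hypothetical.

```python
# Table I transcribed as: DC element -> candidate UNIMARC fields.
# A subfield is given only where the table names one.
DC_TO_UNIMARC = {
    "title": ["200 $a", "200 $e", "517 $a"],
    "creator": ["700 $a", "701 $a", "710 $a", "711 $a", "200 $f"],
    "subject": ["610 $a", "606", "675", "676", "680", "686"],
    "description": ["330 $a"],
    "publisher": ["210 $c"],
    "contributors": ["701 $a", "711 $a", "200 $g"],
    "date": ["210 $d"],
    "type": ["608"],
    "format": ["336 $a"],
    "identifier": ["001", "010", "011", "020", "300 $a"],
    "source": ["324"],
    "language": ["101", "300"],
    "relation": ["300"],
    "coverage": ["300"],
    "rights": ["300"],
}


def candidate_fields(element):
    """Return the UNIMARC fields Table I lists for a DC element."""
    return DC_TO_UNIMARC.get(element.strip().lower(), [])


def elements_sharing_tag(tag):
    """Return the DC elements that can map to the given UNIMARC tag."""
    return sorted(
        element
        for element, fields in DC_TO_UNIMARC.items()
        if any(field.split()[0] == tag for field in fields)
    )
```

Calling elements_sharing_tag("300") lists Coverage, Identifier, Language, Relation and Rights, which shows how heavily this mapping leans on the General Note field.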

6.3 Comments on the Dublin Core - UNIMARC mapping

Part of the reason for producing mapping tables between metadata formats is to discover where the important problems lie. These problem areas can be significant when the mapping is from a relatively simple metadata format to a more complex one, as is certainly the case with this mapping from Dublin Core to UNIMARC. MARC formats, when they are used for bibliographic data, tend to be closely tied to particular cataloguing rules such as AACR2. For example, the distinction between main and added entries that AACR2 defines for choosing access points is formalised in USMARC as the distinction between fields 100 (Main Entry -- Personal Name) and 700 (Added Entry -- Personal Name). Caplan and Guenther, in their Dublin Core-USMARC mapping, point out that DC CREATOR, which does not embody the concepts of main and added entry, cannot be easily mapped to USMARC [5]. UNIMARC similarly contains fields for Primary Intellectual Responsibility (700, 710 and 720), Alternative Intellectual Responsibility (701, 711 and 721) and Secondary Intellectual Responsibility (702, 712 and 722), but in practice it is more flexible than USMARC. The UNIMARC manual suggests that if the given cataloguing code does not embody the concept of main entry "all persons, corporate bodies or families having equal responsibility may be coded as if they had alternative responsibility" [6].

In this section each of the Dublin Core (DC) metadata elements will be taken in turn and any difficulties noted. The definitions of the DC elements are taken from the Reference Description issued by OCLC [7].

6.3.1 Title

The name given to the resource by the CREATOR or PUBLISHER.

UNIMARC:

NOTES:

EXAMPLES:

200 1#$aDublin Core Metadata Element Set$eReference Description

200 1#$aOCLC/NCSA Metadata Workshop Report
517 1#$aDublin Core Report

6.3.2 Creator

The person(s) or organization(s) primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources.

Qualifier possible: TYPE.

UNIMARC:

NOTES:

  • The first problem is how to distinguish personal names from corporate names in order to choose the right UNIMARC field. It would help if this distinction could be made in DC using the TYPE qualifier.
  • There is some debate about whether DC CREATOR should be mapped to 700/710 (Primary Intellectual Responsibility) or 701/711 (Alternative Intellectual Responsibility). Because DC elements have no concept of main entry and are repeatable, there is no easy way to determine from DC CREATOR who is primarily responsible for the intellectual content. In this situation the UNIMARC manual suggests Alternative Intellectual Responsibility. A possible compromise is to map to 700/710 when there is a single DC CREATOR element and to repeated 701/711 fields when there is more than one.
  • 700/701 Indicator 2 specifies whether a personal name is entered in direct order (= 0) or with inversion (= 1). Conversion software would have to be aware of this.
  • 710/711 Indicator 1 distinguishes between corporate names (= 0) and meetings (= 1).
  • 710/711 Indicator 2 denotes the form of entry for a corporate name: inverted form (= 0), under place or jurisdiction (= 1) or direct order (= 2).
EXAMPLES:

This is an example only. In practice the corporate bodies might better be described as DC CONTRIBUTOR rather than DC CREATOR:

<META NAME="DC.title" CONTENT="OCLC/NCSA Metadata Workshop Report">
<META NAME="DC.creator.corporate" CONTENT="Online Computer Library Center">
<META NAME="DC.creator.corporate" CONTENT="National Center for Supercomputing Applications">
<META NAME="DC.creator.personal" CONTENT="Stuart Weibel">
<META NAME="DC.creator.personal" CONTENT="Jean Godby">
<META NAME="DC.creator.personal" CONTENT="Eric Miller">
<META NAME="DC.creator.personal" CONTENT="Ron Daniel">

200 1#$aOCLC/NCSA Metadata Workshop Report
701 #0$aStuart Weibel
701 #0$aJean Godby
701 #0$aEric Miller
701 #0$aRon Daniel
711 02$aOnline Computer Library Center
711 02$aNational Center for Supercomputing Applications
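The compromise described in the notes (700/710 for a single DC CREATOR, repeated 701/711 otherwise) could be sketched as follows. This is an illustration under stated assumptions rather than a definitive rule: the function name map_creators is hypothetical, the personal/corporate distinction is assumed to arrive through a qualifier as in the example, a comma in a personal name is taken as a sign of inversion, corporate names are assumed to be in direct order, and meetings are not detected.

```python
def map_creators(creators):
    """Map DC CREATOR values to UNIMARC 70x/71x field strings.

    creators is a list of (name, kind) pairs, where kind is
    "personal" or "corporate".
    """
    # A single creator is treated as primary (700/710); several
    # creators are all treated as alternative (701/711).
    primary = len(creators) == 1
    fields = []
    for name, kind in creators:
        if kind == "personal":
            tag = "700" if primary else "701"
            # Assumption: a comma means "Surname, Forename" (inverted).
            indicator2 = "1" if "," in name else "0"
            indicators = "#" + indicator2
        elif kind == "corporate":
            tag = "710" if primary else "711"
            # Indicator 1 = 0 (corporate name, not a meeting);
            # Indicator 2 = 2 (direct order), both assumed here.
            indicators = "02"
        else:
            raise ValueError(f"unknown creator kind: {kind!r}")
        # Simplification: the whole name is placed in $a.
        fields.append(f"{tag} {indicators}$a{name}")
    # Order fields by tag; sorted() is stable, so input order is
    # preserved within each tag.
    return sorted(fields, key=lambda field: field[:3])
```

Given the seven creators from the example above, the function returns the four 701 fields followed by the two 711 fields, in the same form as the UNIMARC lines shown.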

6.3.3 Subject

The topic of the resource, or keywords or phrases that describe the subject or content of the resource. The intent of the specification of this element is to promote the use of controlled vocabularies and keywords. This element might well include scheme-qualified classification data (for example, Library of Congress Classification Numbers or Dewey Decimal numbers) or scheme-qualified controlled vocabularies (such as MEdical Subject Headings or Art and Architecture Thesaurus descriptors) as well.

Qualifier possible: SCHEME.
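The SCHEME qualifier is what allows a converter to choose among the subject fields listed in Table I. A minimal sketch, assuming the scheme arrives as a short label such as "LCSH" or "DDC" (the labels and the name subject_tag are hypothetical), might look like this. Unrecognised schemes fall back to 610 here as a cautious choice; a real tool would need to decide which further schemes belong in 606 and which in 686.

```python
# Scheme labels (assumed spellings) mapped to the tags in Table I.
SUBJECT_SCHEME_TO_TAG = {
    "lcsh": "606",  # Topical Name Used as Subject
    "mesh": "606",
    "udc": "675",
    "ddc": "676",
    "lcc": "680",
}


def subject_tag(scheme=None):
    """Choose a UNIMARC subject tag for a DC SUBJECT value."""
    if scheme is None:
        return "610"  # Uncontrolled Subject Terms
    # Unknown schemes are treated as uncontrolled terms (an assumption);
    # 686 Other Classification Systems is not assigned automatically.
    return SUBJECT_SCHEME_TO_TAG.get(scheme.strip().lower(), "610")
```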

UNIMARC:

NOTES:

6.3.4 Description

A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. Future metadata collections might well include computational content description (spectral analysis of a visual resource, for example) that may not be embeddable in current network systems. In such a case this field might contain a link to such a description rather than the description itself.

UNIMARC:

EXAMPLE:

330 ##$aClassification schemes have a role in aiding information retrieval in a network environment, especially for providing browsing structures for subject-based information gateways on the Internet. Advantages of using classification schemes include improved subject browsing facilities, potential multi-lingual access and improved interoperability with other services. Classification schemes vary in scope and methodology, but can be divided into universal, national general, subject specific and home-grown schemes. What type of scheme is used, however, will depend upon the size and scope of the service being designed.

6.3.5 Publisher

The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.

UNIMARC:

NOTES:

EXAMPLE:

210 ##$cOnline Computer Library Center

6.3.6 Contributor

Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element (for example, editors, transcribers, illustrators, and convenors).

Qualifier possible: TYPE.

UNIMARC:

NOTES:

6.3.7 Date

The date the resource was made available in its present form. The recommended best practice is an 8 digit number in the form YYYYMMDD as defined by ANSI X3.30-1985 or ISO 8601-1988. In this scheme, the date element for the day this is written would be 19961203, or December 3, 1996. Many other schema are possible, but if used, they should be identified in an unambiguous manner.
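A converter can check that a DC DATE follows the recommended YYYYMMDD form before carrying it into 210 $d. The sketch below is an illustration only: the name date_to_210d is hypothetical, and recording just the year in $d, with both indicators blank, is an assumption rather than something this report specifies.

```python
import re
from datetime import datetime


def date_to_210d(dc_date):
    """Validate a YYYYMMDD date and return a 210 $d field string."""
    text = dc_date.strip()
    # Require exactly eight ASCII digits before parsing, because
    # strptime alone is lenient about missing leading zeros.
    if not re.fullmatch(r"[0-9]{8}", text):
        raise ValueError(f"not a YYYYMMDD date: {dc_date!r}")
    # strptime raises ValueError for impossible dates such as month 13.
    datetime.strptime(text, "%Y%m%d")
    # Assumption: only the year is recorded in $d.
    return f"210 ##$d{text[:4]}"
```

Dates supplied in any other scheme raise ValueError here and would need separate handling once the scheme has been identified.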

Qualifier possible: TYPE

UNIMARC:

NOTES:

6.3.8 Type

The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types. A preliminary set of such types can be found at the following: <URL:http://www.roads.lut.ac.uk/Metadata/DC-ObjectTypes.html>

UNIMARC:

NOTES:

6.3.9 Format

The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. The intent of specifying this element is to provide information necessary to allow people or machines to make decisions about the usability of the encoded data (what hardware and software might be required to display or execute it, for example). As with RESOURCE TYPE, FORMAT will be assigned from enumerated lists such as registered Internet Media Types (MIME types). In principle, formats can include physical media such as books, serials, or other non-electronic media.

UNIMARC:

NOTES:

6.3.10 Identifier

String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). Other globally-unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names would also be candidates for this element.

Qualifier possible: SCHEME.
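Table I sends identifiers to different UNIMARC fields depending on their kind, so here too the SCHEME qualifier does most of the work. The following sketch is illustrative only: the name map_identifier and the scheme labels are hypothetical, the use of $a for ISBN and ISSN is an assumption, a value is treated as a URL when no scheme is given and it begins with a common URL prefix, and National Bibliography Numbers (020) and the record identifier (001) are left out.

```python
# Scheme labels (assumed spellings) mapped to (tag, subfield).
IDENTIFIER_SCHEME_TO_FIELD = {
    "isbn": ("010", "a"),
    "issn": ("011", "a"),
    "url": ("300", "a"),  # Table I places URLs in a General Note
}

URL_PREFIXES = ("http://", "https://", "ftp://")


def map_identifier(value, scheme=None):
    """Return (tag, subfield, value) for a DC IDENTIFIER, or None."""
    value = value.strip()
    # With no SCHEME qualifier, guess "url" from the prefix.
    if scheme is None and value.lower().startswith(URL_PREFIXES):
        scheme = "url"
    key = (scheme or "").strip().lower()
    if key not in IDENTIFIER_SCHEME_TO_FIELD:
        return None  # unknown scheme: leave for manual handling
    tag, subfield = IDENTIFIER_SCHEME_TO_FIELD[key]
    return (tag, subfield, value)
```

Returning None for unrecognised schemes keeps the converter from guessing a field for identifiers the table does not cover.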

UNIMARC:

NOTES:

6.3.11 Source

The work, either print or electronic, from which this resource is derived, if applicable. For example, an html encoding of a Shakespearean sonnet might identify the paper version of the sonnet from which the electronic version was transcribed.

UNIMARC:

NOTES:

6.3.12 Language

Language of the intellectual content of the resource. Where practical, the content of this field should coincide with the Z39.53 three character codes for written languages.

Qualifier possible: SCHEME.

UNIMARC:

NOTES: