Minimum Data Set
Work Package 3 of Telematics for Libraries project BIBLINK (LB 4034)
The BIBLINK Project

BIBLINK - LB 4034
D3.1 Minimum Data Set

Name of Client: European Commission

Distribution List: Pat Manson, European Commission; Project Partners

Author: Robina Clayphan

Authorised by: Ross Bourne

Contractual Date: August 1997

Date of Issue: September 1997

Issue: 1.0

Reference: 5003/del-86

Total Number of Pages:

Contact Details for Level-7:

The British Library
National Bibliographic Service
Boston Spa
Wetherby
W.Yorkshire
LS23 7BQ

Telephone: 01937 546969
Facsimile: 01937 546586
Internet: robina.clayphan@bl.uk

1. Document Control

Issue | Date of Issue | Comments
0.1 | August 1997 | Internal distribution for comment
1.0 | September 1997 | Final version for delivery to Commission

1.1 Abstract

This report lists the national bibliographic functions that BIBLINK records will support and presents the list of data elements which the national bibliographic agencies wish to receive from publishers for the BIBLINK demonstrator.

1.2 Keywords

BIBLINK, Library, bibliographic, national library, data set, Dublin Core, MARC, CIP

1.3 Glossary

BIBLINK demonstrator a multi-national demonstrator, developed as part of the BIBLINK project, which provides an environment for the transmission of bibliographic records between publishers and NBAs.

Bibliographic record a discrete bibliographic description stored either manually or electronically.

CD-ROM Compact Disc Read Only Memory.

CIP Cataloguing-In-Publication records, created using information supplied pre-publication by the publisher.

Deposit of publications a system in operation in most countries, usually legally enforced, whereby publishers must deposit one or more copies of every publication with nominated libraries. Often referred to as Legal Deposit.

DOI Digital Object Identifier

DTD Document Type Definition.

Dublin Core a metadata format defined on the basis of international consensus which defines a minimal information resource description for use in a WWW environment. The term 'Dublin' is used as Dublin, Ohio is the location of OCLC's headquarters.

Electronic mail a means for an originator of information to distribute information to an unlimited number of recipients via a value added network service which mimics the functions of the paper postal services.

Electronic publication document, file, journal, etc. made available in electronic form.

Electronic publisher see publisher.

email see Electronic mail.

Format in the context of bibliographic control, the formalised structure in which the specific elements of bibliographic description are accommodated.

HTML Hypertext Mark-up Language. The standard language used for creating Web documents.

HTTP HyperText Transfer Protocol. The protocol used for communication between Web clients and servers.

IFLA International Federation of Library Associations and Institutions.

Internet Publisher an organisation or person who publishes documents on the Internet. These will be on-line documents.

ISBD International Standard Bibliographic Description. There are seven specific ISBDs as well as the general ISBD(G): monographs -(M), serial publications -(S), cartographic material -(CM), non-book material -(NBM), printed music -(PM), antiquarian publications -(A), computer files -(CF).

ISBN International Standard Book Number.

ISSN International Standard Serial Number.

Legal Deposit see Deposit of Publications.

MARC MAchine Readable Cataloguing. A family of formats based on ISO 2709 for the exchange of bibliographic and other related information in machine readable form. For example, USMARC and UNIMARC.

Metadata information about a publication as opposed to the content of the publication; includes not only bibliographic description but also other relevant information such as its subject, price, conditions of use, etc.

National Bibliography a listing of all national publications. May include all publications produced in that country, or in the language of that country, or sometimes about that country.

NBA National Bibliographic Agency.

Off-line publication an off-line publication is an electronic document which is bibliographically identifiable, which is stored in machine readable form on an electronic storage medium. For example a CD-ROM.

On-line publication see on-line resource.

Serial a publication in any medium issued in successive parts bearing numeric or chronological designations and intended to be continued indefinitely. Serials include periodicals; newspapers; annuals (reports, yearbooks, etc.); the journals, memoirs, proceedings, transactions etc. of societies; and numbered monographic series.

SGML (ISO 8879) Standard Generalised Mark-up Language. ISO standard for document description, separating contents and structure.

SSSH Simplified SGML for Serials Headers.

UNIMARC see MARC

URL Uniform Resource Locator. The standard way to give the address of a source of information on the WWW. It contains four different parts: the protocol type, the machine name, the directory path and the file name. For example: http://WWW2.echo.lu/libraries/en/libraries.html
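The four parts named in the URL entry can be illustrated with a short Python sketch. It is not part of the BIBLINK tools; the function name `url_parts` and the dictionary keys are hypothetical, chosen only to mirror the wording of the glossary.

```python
from urllib.parse import urlsplit
import posixpath


def url_parts(url):
    """Split a URL into the four parts named in the glossary entry.

    The key names are illustrative and are not defined by BIBLINK.
    """
    parts = urlsplit(url)
    # The path component holds both the directory path and the file name.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol type": parts.scheme,
        "machine name": parts.netloc,
        "directory path": directory,
        "file name": filename,
    }


example = url_parts("http://WWW2.echo.lu/libraries/en/libraries.html")
for label, value in example.items():
    print(f"{label}: {value}")
```

For the example address in the glossary this separates `http`, `WWW2.echo.lu`, `/libraries/en` and `libraries.html`.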

2. Management Overview

2.1 Executive Summary

This document is the final report on the minimum data set task in work package 3. It is drawn from a background and issues paper produced for consideration by the partners and the results of the subsequent discussion at the workshop in Mo i Rana on 28 July 1997.

The partners reached consensus on the functions of national bibliographies that will be supported by BIBLINK records and the set of data elements necessary to do so. This report details those functions and presents a listing of the 18 data elements.

It has been agreed that this minimum data set will use the format and syntax of the Dublin Core and a definitive mapping will be produced for inclusion in the final version of D4.1.

2.2 Relationship to Other Work Packages

WP1 Metadata Formats

D1.1 produced a list of data elements that the national libraries would like to receive for electronic publications. This was a comprehensive list produced to survey the territory early in the project.

WP3 Consensus Building

D3.2 resulted in a refinement of the scope and objectives of the project.

WP4 Format Conversion (running concurrently with WP3)

Interim Report D4.1 contained a discussion paper which proposed a subset of elements from the original list, limiting the elements to those that would support the objectives agreed in D3.2. A preliminary mapping of this proposed set to Dublin Core was produced for discussion. The paper showing this preliminary mapping and outlining related issues is appended as Annex B to this document.

WP3 Minimum Data Set

The discussion paper, appended as Annex A to this report, resulted in agreement as to the functions of a national bibliography and CIP that should be supported by BIBLINK records and the final list of data elements is produced here. A definitive mapping to Dublin Core will be produced to form part of the final version of D4.1.

WP6 Authentication

Consideration of Draft Report D6.1 resulted in agreement that one data element should be included to allow for a simple checksum operation.

3. Introduction

3.1 Introduction

In the BIBLINK demonstrator, national bibliographic agencies will receive bibliographic data from publishers via a limited number of interfaces and convert it to MARC formats. In order to produce a usable MARC record, a certain minimum number of data elements will be required. One of the tasks of WP3 was to reach consensus between the partners as to the minimum data set to be transmitted.

An initial survey of the data requirements of national libraries for electronic publications was carried out at an early stage in the project. This exercise produced an extensive list of elements which can be found in D1.1 section 6. Since that time the scope and objectives of the project have been refined. BIBLINK now intends to produce CIP-type records. The original list was therefore reduced to contain only those elements that are needed to support the CIP functions of a national bibliography. The revised list was proposed for discussion by the partners.

Due to the timing of and interdependencies between work packages, the background work for this report, including the revised list of elements, was interleaved with work on WP4 - Format Conversion Feasibility. To facilitate mappings between the proposed BIBLINK data set and other formats, the paper that would logically form the body of this report was produced in advance and appeared as Section 9 of the interim report D4.1. It is not considered necessary to rewrite that paper, but it is appended as Annex A to this report for ease of reference.

The partners reached consensus on the minimum data set required for BIBLINK on the basis of that initial paper, one other suggested data set and the preliminary mappings produced for work package 4, informed by the results of the publisher consensus meetings reported in D3.2. This report outlines those discussions and presents the agreed data set.

4. The Minimum Data Set

An analysis of the uses made of the national bibliography, the functions of CIP records and consideration of relevant standards can be found in the original document produced to facilitate the mappings in WP4. That paper is appended as Annex A to this document; readers wanting the relevant background to the following matters are advised to read the Annex first.

4.1 The Purpose of the Minimum Data Set

The discussions at the workshop were based on the proposals in the discussion paper and it was confirmed that the functions listed in Annex A.3.1 were those that should be supported by the BIBLINK demonstrator. As a minimum, therefore, the data obtained from publishers must be sufficient to support the following activities:

  1. registering the existence of a publication
  2. creation of a record of an acceptable standard for the national bibliography
  3. selection of publications - this applies at two levels: 1) for the national bibliographic agencies to select publications for inclusion in the national bibliography, 2) for users of the national bibliography to select publications to acquire
  4. acquisition of publications - this also applies at two levels: 1) to allow national libraries to approach the publisher about deposit of the publication, 2) to provide sufficient information to enable users of the national bibliography to buy the publication.

It is recognised that it becomes difficult to distinguish between the 'acquisition' and 'access' functions in relation to many online documents. Access is not a traditional function of a national bibliography but this need not be considered a problem for the demonstrator: there is no desire to exclude information simply because it does not fit the traditional pattern.

4.2 The Data Elements

To expedite agreement on the minimum data set, consideration was given to three suggested sets. The first was the revised set produced by the survey of national libraries (Annex A.6), the second that considered by the British Library (Annex A.4.2). The third set was that already in use on the web forms at the Koninklijke Bibliotheek (KB) for publishers to enter data about their electronic publications. The last two of these three lists were merged and the individual elements discussed, modified and extended. The resulting set of elements is shown below and will be used for BIBLINK.

BIBLINK encompasses both on-line and off-line publications and both serials and monographic publications, so this list is sufficiently comprehensive to allow data applicable to both types to be recorded. It is recognised that the meaning of the element names may appear ambiguous to a publisher generating the data. A comprehensive 'User Manual' explaining the meaning of and appropriate entries for each element will be created to be used in conjunction with the BIBLINK tools.

The following list shows the agreed set of data elements.

Element | Comment
Author | Person or body primarily responsible for the intellectual content of the publication.
Title |
Publisher | Agency responsible for producing the publication.
Place of Publication |
Price | Defined as a 'simple, retail price'. Anything more complex, such as price for site licence, numbers of users etc. will form part of the 'Terms and Conditions' element.
Extent (Size) | For offline publications - for recording details such as the number of physical objects.
Keywords | Possibly from a controlled vocabulary.
Description | Possibly an abstract.
Edition/version statement | This element is named to accommodate the differences in terminology used by libraries and producers of electronic documents.
Date of publication |
System requirements | This incorporates 'Mode of Access'. Offline: e.g. 386SX or higher with 4MB RAM etc. Online: Internet via WWW, Adobe etc.
Format | HTML, pdf etc.
Language | Of the publication
Terms and conditions | For online items
Frequency | For serials
Identifier | URL, ISBN, DOI etc.
Contributor | Statement(s) of responsibility for multiple contributions. Can be qualified by DC 'Type' qualifier
Checksum | Simple MD5 hash applied either by NBA or publisher.
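As an illustration of how the Checksum element might work in practice, the following Python sketch holds the 18 agreed elements as a simple record and computes an MD5 hash over the other 17. This report does not specify what the hash is applied to or how a record is serialised, so the "name=value" line serialisation, the function names and the sample values below are all assumptions made for the example.

```python
import hashlib

# The 18 agreed element names, in the order listed in this section.
ELEMENTS = (
    "Author", "Title", "Publisher", "Place of Publication", "Price",
    "Extent (Size)", "Keywords", "Description", "Edition/version statement",
    "Date of publication", "System requirements", "Format", "Language",
    "Terms and conditions", "Frequency", "Identifier", "Contributor",
    "Checksum",
)


def compute_checksum(record):
    """Return an MD5 hex digest over every element except Checksum.

    Assumption: values are serialised as "name=value" lines in the fixed
    element order, encoded as UTF-8. BIBLINK does not define this here.
    """
    lines = []
    for name in ELEMENTS:
        if name == "Checksum":
            continue
        for value in record.get(name, []):
            lines.append(f"{name}={value}")
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()


def add_checksum(record):
    """Return a copy of the record with its Checksum element filled in."""
    unknown = set(record) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"not in the minimum data set: {sorted(unknown)}")
    signed = dict(record)
    signed["Checksum"] = [compute_checksum(record)]
    return signed


def verify_checksum(record):
    """Check that the stored Checksum matches the rest of the record."""
    return record.get("Checksum") == [compute_checksum(record)]


# Hypothetical sample values; each element holds a list so it can repeat.
sample = add_checksum({
    "Title": ["An Example Electronic Journal"],
    "Publisher": ["Example Press"],
    "Format": ["HTML"],
    "Frequency": ["Quarterly"],
    "Identifier": ["http://www.example.org/journal/"],
})
print(sample["Checksum"][0], verify_checksum(sample))
```

Either party could run the same calculation: the publisher when creating the record, or the NBA on receipt to detect accidental alteration. A fixed element order is used so that both sides hash identical bytes.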

4.3 Mapping the Elements to Dublin Core

An initial mapping of the revised national libraries element list was carried out as part of work package 4 and presented as a paper which provided information for the workshop discussions. It is appended here as Annex B. Now that the minimum data set has been agreed, additional work will be necessary to map the modified element set to DC elements wherever possible and define extensions to DC to accommodate elements that cannot easily be mapped. This will be produced in the final version of D4.1.

In the Dublin Core all elements are optional and repeatable. Further discussion will be necessary to decide which elements should be regarded as compulsory for which type of publication if the record is to be acceptable in the BIBLINK context.

Further discussion of the implementation of DC and the syntax for extensions and modifications can be found in D4.1.
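To make the remaining mapping work concrete, the sketch below pairs each agreed element with a Dublin Core label where the names or definitions in Annex A.5.3 correspond closely, and marks the rest as candidates for extensions. It is purely illustrative and is not the definitive mapping, which is to be produced in D4.1; in particular, pairing 'Terms and conditions' with RIGHTS is an assumption, and the names `TENTATIVE_DC_MAPPING` and `split_by_mapping` are hypothetical.

```python
# Illustrative only: NOT the definitive BIBLINK mapping (deferred to D4.1).
# None marks an element with no obvious DC counterpart in Annex A.5.3.
TENTATIVE_DC_MAPPING = {
    "Author": "CREATOR",
    "Title": "TITLE",
    "Publisher": "PUBLISHER",
    "Place of Publication": None,
    "Price": None,
    "Extent (Size)": None,
    "Keywords": "SUBJECT",
    "Description": "DESCRIPTION",
    "Edition/version statement": None,
    "Date of publication": "DATE",
    "System requirements": None,
    "Format": "FORMAT",
    "Language": "LANGUAGE",
    "Terms and conditions": "RIGHTS",  # assumption, see lead-in
    "Frequency": None,
    "Identifier": "IDENTIFIER",
    "Contributor": "CONTRIBUTORS",
    "Checksum": None,
}


def split_by_mapping(mapping):
    """Separate elements with a DC label from those needing an extension."""
    mapped = {name: label for name, label in mapping.items() if label}
    unmapped = [name for name, label in mapping.items() if label is None]
    return mapped, unmapped


mapped, unmapped = split_by_mapping(TENTATIVE_DC_MAPPING)
print("Direct DC candidates:", ", ".join(sorted(mapped)))
print("Possible extensions:", ", ".join(unmapped))
```

Under these assumptions, elements such as Price, Frequency and Checksum fall into the group for which extensions to DC would have to be defined.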

Annex A - Data Elements in CIP Records

A.1 Introduction

Consensus has been reached that the objective of the BIBLINK demonstrator is to produce a CIP-type record for electronic publications using data supplied by publishers. To do this, national libraries must define a minimum set of data elements that would form a usable record for national bibliographies. In this chapter we first briefly identify the function of a national bibliography and the function of CIP records. Various available 'core' bibliographic descriptions are discussed. The data fields identified by the BIBLINK partners are then considered. These data fields were identified as the requirements of national libraries and listed in the BIBLINK report D1.0 - Metadata Formats but it must be borne in mind that these were specified before the objective of producing a CIP-type record was agreed. The original list of elements is therefore comprehensive rather than minimal, and in this section we propose a reduction to the minimal requirements of CIP.

A.2 The Function of a National Bibliography and CIP records

BIBLINK is concerned with the production of CIP-type records for national bibliographies. The area where the functions of CIP intersect with those of the national bibliography is therefore the legitimate territory where the minimum set of data elements can be identified.

In order to suggest a minimum data set it is necessary to take a step back and identify the functions supported by CIP records in a national bibliography. As a minimum, the records produced by the BIBLINK demonstrator should support those same functions. Whilst not wanting simply to impose traditional patterns on new technology, this will serve as a useful starting point for building consensus.

A.2.1 National Bibliographies

As discussed in D0.1 - Scoping Document, Section 5, selection criteria for national bibliographies have never been uniform in terms of content or coverage. At the moment most national bibliographic agencies are considering their policies regarding the inclusion of electronic publications but no common practice has yet emerged. In addition, the main tool for bibliographic control, the deposit of publications, may or may not be a legal requirement. National bibliographies can therefore be seen to be very different animals in different countries.

Despite these differences, the primary role of a national bibliography is the same throughout: to create a record of the nation's published output however this may be defined. This record then serves various purposes for different user communities. By extension, a secondary role can be seen: that of serving the future as well as the present by providing a historical record of publications even if the items themselves no longer exist. Patterns of publication will be discernible to the yet unborn researcher.

The main users of national bibliographies are librarians and other information professionals whose role is that of intermediary between sources of information and those seeking to find and use that information.

These information professionals use the national bibliography for the following purposes:

  1. as a tool for selection of publications
  2. as a tool for acquisition of publications
  3. as a source for complete catalogue records
  4. as a source of information about the practice and interpretation of cataloguing standards applied by national libraries - either to apply themselves or to discern the underlying principles
  5. for stock management - to check the currency of collections and availability of literature in new areas

Records for electronic publications are needed for exactly the same purposes. The picture becomes slightly less clear-cut in the case of online documents as technology permits direct access to the publication, thereby blurring the distinction between access and acquisition. More may therefore be demanded of a record than has traditionally been the case.

A.3 CIP Records

CIP programmes provide publishers with the opportunity to register the existence of a publication at the earliest possible moment. In addition, they can be seen to support three of the functions listed above. The data supplied by the publisher is used to create records which can be included in the national bibliography and supplied to third parties and, by providing timely information about publications, CIP records assist in the tasks of selection and acquisition. To some extent they act as an alerting service for the library community.

A.3.1 BIBLINK and CIP

As a minimum, therefore, the data obtained from publishers must be sufficient to enable the BIBLINK demonstrator to support the following activities:

  1. registering the existence of a publication
  2. creation of a record of an acceptable standard for the national bibliography
  3. selection
  4. acquisition

It is worth discussing the acquisition function a little further as this is one area where the use made of information about paper publications will differ considerably from electronic publications. In the sense used by national bibliographies, data relating to the acquisition of a publication can be seen as supporting purchasing: in other words, giving the user the information required to contact the supplier of the publication in order to acquire it. For online publications which have to be paid for, the same may be true. At the moment, however, a commercial model for electronic publications has not fully evolved and, moreover, many publications that NBs may wish to record are available without charge. Information contained in the record data can therefore lead directly to the publication rather than the supplier.

Access for the end-user is not the usual function of a national bibliography, but of a library catalogue. It would obviously be inappropriate to exclude access information from BIBLINK simply because it is not a traditional function of the national bibliography. How BIBLINK chooses to deal with this issue should be discussed by the partners as part of the consensus building process.

A.4 British Library Data Element Sets

In considering the elements appropriate to a 'core' CIP type record it may be useful to consider particular 'core' data element sets proposed at the British Library.

A.4.1 Core Data for British Library Catalogue

The British Library has recently specified a three layer model for the record structure in the BL catalogue: the first layer is the core containing the standardised description and authority data, the second layer special bibliographic data and the third layer the data needed for stock management. For details of the required basic record for UKMARC see D4.1, Annex B, the full text of which is available as http://portico.bl.uk/nbs/marc/correc.html. This example of a core record requirement is closely related to the as yet unpublished UNIMARC minimal level record (Guideline no. 5). Details of requirements for records describing electronic resources have yet to be finalised for UKMARC and so cannot be included in this Annex.

The BL core data contains a level of detail in the description that closely corresponds to the full second level as defined in AACR2 Chapter 1.0D. It is the level of description to which the BL will aspire for all newly created material. The second level is a fuller record than the AACR2 first level and as such may be considered too extensive for BIBLINK purposes.

BIBLINK may require that other data elements be mandatory or mandatory if applicable. At present the General Material Designation is optional. For records describing electronic resources, the term ‘[electronic resource]’ may well be necessary, although this could be generated automatically on conversion. The full address of the publisher could be another element which would be mandatory for smaller publishers unless it is held elsewhere in the package. Certainly price statement(s) and projected publication date are candidates.

A.4.2 British National Bibliography Minimum CIP Level

The absolute minimum CIP data that could be included in the British National Bibliography is informally defined by the British Library as being "enough information to support a buying decision". These data items are listed below.

The British Library has recently experimented with cataloguing CD-ROMs for the British National Bibliography. Although problems were encountered with installation and in finding the information from display screens, satisfactory records could be made using existing UKMARC fields. The records are considerably longer than those produced for paper publications, largely accounted for by the 'system requirements' entry held in a notes field. A final policy decision on which fields to show in BNB has yet to be made. A pilot copy of BNB (for internal use only) has been produced containing the records.

"System requirements" would therefore need to be added to the above list to support a buying decision for a CD and "mode of access" or "format" for online publications.

A.5 Other Relevant Standards

National cataloguing rules incorporating the ISBDs need to be taken into account as these standards to a large extent form the basis of the cataloguing function of the national libraries. Where appropriate BIBLINK should take account of any guidance within these standards as regards description of electronic resources. Although the aim of these standards is wider than merely identification of element sets, they may prove helpful in this process.

Identification of minimal data elements as part of the Dublin Core activity may also influence national libraries. Much international and cross-professional effort has gone into identifying the fifteen elements in the Dublin Core data element set.

A.5.1 ISBD(ER)

The following statements and details are taken from the final draft version of the International Standard Bibliographic Description for Electronic Resources document that has been circulated to the appropriate IFLA standing committees for approval. The primary purpose of ISBD is to provide the stipulations for compatible descriptive cataloguing world-wide in order to aid the international exchange of bibliographic records throughout the library and information community, with the objective of being subsumed in national cataloguing codes. ISBD(ER) is the standard for electronic resources. As such, it specifies the elements required for the description and identification of electronic items, assigns an order to the elements of the description and specifies a system of punctuation. The punctuation has not been considered here, as it is not relevant to BIBLINK at this point. The provisions of ISBD(ER) relate principally to the bibliographic records, in their various forms, produced by national bibliographic agencies. ISBDs are not concerned with access points, such as name headings, as these are handled by cataloguing rules.

The following table shows the areas and elements specified in ISBD(ER). Those that are considered optional are shown in italics.

 

Area / Element

1 Title and statement of responsibility
  1.1 Title proper
  1.2 General material designation
  1.3 Parallel title
  1.4 Other title information
  1.5 Statements of responsibility; first, subsequent

2 Edition
  2.1 Edition statement
  2.2 Parallel edition statement
  2.3 Statements of responsibility relating to the edition: first; subsequent
  2.4 Additional edition statement
  2.5 Statements of responsibility following an additional edition statement: first; subsequent

3 Type and extent of resource
  3.1 Designation of resource
  3.2 Extent of resource

4 Publication and distribution etc.
  4.1 Place of publication, production and/or distribution etc.: first; subsequent
  4.2 Name of publisher, producer and/or distributor etc.
  4.3 Statement of function of distributor
  4.4 Date of publication, production and/or distribution etc.
  4.5 Place of manufacture
  4.6 Name of manufacturer
  4.7 Date of manufacture

5 Physical description
  5.1 Specific material designation and extent of item
  5.2 Other physical details
  5.3 Dimensions
  5.4 Accompanying material statement

6 Series
  6.1 Title proper of series or sub-series
  6.2 Parallel title of series or sub-series
  6.3 Other title information
  6.4 Statements of responsibility relating to the series or sub-series: first; subsequent
  6.5 ISSN
  6.6 Numbering within series or sub-series

7 Note(s)
  7. These can cover any ISBD area and are optional for the most part. "System requirements" are mandatory for offline items and "Mode of access" for remote access items

8 Standard number (or alternative) and terms of availability
  8.1 Standard number (or alternative)
  8.2 Key title
  8.3 Terms of availability and/or price

A.5.2 AACR2, level 1 record

AACR stipulates the rules for the formulation of descriptions of library materials which are based on the general framework of ISBD in terms of order of elements and punctuation. The rules set out three recommended levels of description containing those elements that must be given as a minimum by libraries choosing that level of description. The purpose of the catalogue for which the entry is being constructed will govern the choice of level. The simplest is the level 1 standard and may be considered appropriate for the CIP-type of record BIBLINK hopes to create. The elements required for this level are listed below.

Title proper/first statement of responsibility

Edition statement

Material (or type of publication) specific details

First publisher, etc., date of publication etc.

Extent of item

Notes

Standard number

A.5.3 Dublin Core

The Dublin Core Element set is dealt with in detail in WP1 Metadata formats. Since that report, further refinement of the Dublin Core element structure took place at the 4th Dublin Core Workshop in Canberra. The original group of thirteen elements has been expanded to fifteen. Insofar as this is possible in the rapidly changing Internet world, it is now considered to be a stable format. A syntax has been agreed and a small group of qualifiers has been accepted. It is currently being applied in a variety of projects internationally and offers a set of elements appropriate to the objectives of BIBLINK.

It should be noted that the primary criterion for inclusion of the various Dublin Core elements was resource discovery (rather than selection or acquisition). All the elements in DC are optional and repeatable.

The following table shows the revised elements of the Dublin Core:

DC Element | DC Label | DC definition
1. Title | TITLE | The name given to the resource by the CREATOR or PUBLISHER.
2. Author or Creator | CREATOR | The person(s) or organization(s) primarily responsible for the intellectual content of the resource.
3. Subject and Keywords | SUBJECT | The topic of the resource, or keywords or phrases that describe the subject or content of the resource.
4. Description | DESCRIPTION | A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources.
5. Publisher | PUBLISHER | The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.
6. Other Contributors | CONTRIBUTORS | Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element.
7. Date | DATE | The date the resource was made available in its present form.
8. Resource Type | TYPE | The category of the resource, such as home page, novel, working paper, pre-print, technical report, essay, dictionary.
9. Format | FORMAT | The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. To indicate usability.
10. Resource Identifier | IDENTIFIER | String or number used to uniquely identify the resource.
11. Source | SOURCE | The work, either print or electronic, from which this resource is derived, if applicable.
12. Language | LANGUAGE | Language(s) of the intellectual content of the resource.
13. Relation | RELATION | Relationship to other resources - e.g. images in a book.
14. Coverage | COVERAGE | The spatial locations and temporal durations characteristic of the resource.
15. Rights Management | RIGHTS | The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice or a rights-management statement, to allow providers a means to associate terms and conditions or copyright statements with a resource.

It is recommended that the agreed CIP data elements be mapped to Dublin Core. Immediate issues that require thought are the treatment of serials at issue level (for example, statements relating to frequency) and how to deal with edition statements.

A.6 Requirements of National Libraries

As part of WP1 the partner libraries provided details of the data they would like to hold for electronic publications. It is an extensive list which can be found in full in D1.1 Metadata Formats, Section 6. Now that BIBLINK has decided to produce CIP-type records this list has been refined to contain only the elements required to support the functions mentioned in A.3.1 above.

Data Element | Comment
author, personal/corporate | Person or body primarily responsible for the intellectual content.
other contributors | Statements of responsibility for multiple contributions.
definitive title | Variants of the title can appear on boxes, accompanying information and internal sources - these often conflict.
date | Date of publication.
price |
terms and conditions | For online items - free of charge or by account.
language |
edition |
general material designation | e.g. [computer file].
specific material designation | e.g. tape, diskette etc.
type of computer file | e.g. data, program etc.
extent of file | e.g. size, number of records contained.
additional information | e.g. sound, image, text, multimedia etc.
system requirements/mode of access | offline: e.g. 386SX or higher with 4MB RAM etc.; online: Internet via WWW, html etc.
subject keywords | Controlled vocabulary.
unique identifier | e.g. ISBN.
place of publication | ?
publisher | Agency responsible for producing the publication.
URL | for online publications
frequency | How often it will appear - for serials

It is the intention that this proposed set be used as the basis for discussion towards reaching consensus on the minimum BIBLINK CIP data element set.

It is recommended that to assist consensus building the proposed CIP data elements be mapped to Dublin Core.

Annex B - Notes on Mapping of National Libraries' Metadata Requirements to Dublin Core

B.1 Introduction

D1.1 Metadata Formats, Section 6 identified the metadata requirements of the participating national libraries [1]. The metadata requirements were further refined in D4.1 Format Conversion Feasibility to support only a CIP-type function [2]. The resulting table can be found in D4.1 Section 9.

The purpose of this paper is to map these metadata requirements to Dublin Core, and especially to identify where both qualifiers and extensions would have to be made to the Dublin Core elements.

D1.1 suggests that national libraries wish to create records in various flavours of MARC and would intend to apply detailed cataloguing rules to the content of the records. The formats used by the participating libraries vary, but are usually based on ISBD or AACR2. For the purpose of this paper, it is assumed that ISBD or AACR2 style formats would be desired for the national libraries' metadata requirements for electronic publications.

One issue is that ISBD (CF) has recently been revised, and the resulting draft document - ISBD (ER) - has proposed a number of important revisions to the standard [3]. Specifically with regard to Internet resources, the type of file designations (currently limited to "Data" or "Program") have been reworked with considerable changes. If ISBD (ER) is approved, the General Material Designation in ISBD will change from "Computer file" to "Electronic resource".

B.2 DC Qualifiers

The DC-4 meeting in Canberra proposed the formal identification of the structure of elements and possible qualifiers in Dublin Core [4]. In response, Rebecca Guenther has recently produced a proposal for Dublin Core qualifiers/substructure which includes specific proposals for the qualifiers "scheme" and "type" [5]. These proposals are currently under discussion in the Dublin Core community. Guenther reiterates the Canberra meeting's insistence that the "type" qualifier should only be used to refine elements, not to extend their semantics, and that each element should have a default meaning. The qualifiers can be understood as follows:

If "scheme" and "type" can not meet these principles, then an extensibility mechanism should be used.

B.3 Extensibility

Where the national libraries' metadata requirements can not be described using Dublin Core - with or without qualifiers - then new elements can be proposed.

B.4 The Mapping

BIBLINK Data Element

Dublin Core

Author, personal

Creator.Personal.

Possibly with SCHEME: "Library of Congress Name Authority File", if required.

Author, corporate

Creator.Corporate.

Possibly with SCHEME "Library of Congress Name Authority File" if required.

other contributors, personal

Contributor.Personal.

Possibly with SCHEME "Library of Congress Name Authority File" if required

other contributors, corporate

Contributor.Corporate

Possibly with SCHEME "Library of Congress Name Authority File" if required.

definitive title

Title.

date

Date.

Guenther's proposed SCHEME default for DC is ISO 8601 [e.g. 1997-07-22]. There are currently six proposed levels of granularity for dates in DC [6]:

  • Year: YYYY (e.g. 1997)
  • Year and month: YYYY-MM (e.g. 1997-07)
  • Complete date: YYYY-MM-DD (e.g. 1997-07-16)
  • Complete date plus hours and minutes: YYYY-MM-DDThh:mmTZD (e.g. 1997-07-16T19:20+01:00)
  • Complete date plus hours, minutes and seconds: YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00)
  • Complete date plus hours, minutes, seconds and decimal fractions of a second: YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)

price

Extension to DC needed.

It could possibly be included under Rights, although this element is intended to be a link to a URL or similar. A specific extension would be better.

terms and conditions

Rights.

The default for this element in DC is free text, although it is intended for a link to a URL.

language

Language.

Guenther suggests that the content of this DC element should coincide with NISO Z39.53 three-character codes, but the default SCHEME is free text. If USMARC/Library of Congress style language codes are used (as in UNIMARC), then the SCHEME should be "Z39.53".

edition

Extension to DC needed.

general material designation

Extension to DC needed [?].

The relevant GMD is currently "Computer file" for both ISBD and AACR2. It is possible that this could become "Electronic resource" following the draft ISBD (ER). As only one GMD is required for this type of resource, it is possible that it could be generated as a default, and therefore need not be included in the DC elements required.

specific material designation

Format [?].

There is some debate as to how useful SMDs would be for networked electronic resources. Nancy Olson's Cataloging Internet Resources manual [7], which is based on AACR2, omits all of the physical description area "because there is no physical item being cataloged". SMDs would, however, be appropriate for things like CD-ROMs.

More discussion is required on what information is specifically required in this area.

type of computer file

Type [?].

The current types of computer file, "Program" and "Data", have been extended in the proposed ISBD (ER). More discussion is needed on this requirement.

extent of file

Extension to DC needed [?].

As with the SMD (above) there is a need to discuss how this field would be used in relation to networked electronic resources.

additional information

Type [?].

Vague heading, standing for "sound", "image", "text", "multimedia", etc. Guenther suggests that DC Type should possibly come from an enumerated list, but default is likely to be free-text.

system requirements/mode of access

Format [?]

 

subject keywords

Subject.

Keyword is the default for DC Subject.

unique identifier

Identifier.

Guenther proposes URL as default, so other globally-unique identifiers which would be candidates for the element would require a relevant "scheme" (URN, ISBN, ISSN, SICI, etc.).

place of publication

Extension to DC needed.

publisher

Publisher.

URL

Identifier.

URL is the default DC Identifier, so no "scheme" would be required.

frequency

Extension to DC needed.
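
The mapping table above can also be held as a small lookup structure, which makes it easy to list the BIBLINK elements that still need a Dublin Core extension or further discussion. The following is a minimal sketch only: the names `Mapping`, `BIBLINK_TO_DC`, `needs_extension` and `tentative` are illustrative assumptions and are not part of BIBLINK or Dublin Core, and the values are simply transcribed from the table, with `[?]` recorded as a tentative flag.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    """One row of the mapping table (hypothetical structure)."""
    dc_element: Optional[str]  # None means "Extension to DC needed"
    tentative: bool = False    # True where the table marks the row with [?]


# Transcribed from the B.4 table; key spellings follow the table rows.
BIBLINK_TO_DC = {
    "author, personal": Mapping("Creator.Personal"),
    "author, corporate": Mapping("Creator.Corporate"),
    "other contributors, personal": Mapping("Contributor.Personal"),
    "other contributors, corporate": Mapping("Contributor.Corporate"),
    "definitive title": Mapping("Title"),
    "date": Mapping("Date"),
    "price": Mapping(None),
    "terms and conditions": Mapping("Rights"),
    "language": Mapping("Language"),
    "edition": Mapping(None),
    "general material designation": Mapping(None, tentative=True),
    "specific material designation": Mapping("Format", tentative=True),
    "type of computer file": Mapping("Type", tentative=True),
    "extent of file": Mapping(None, tentative=True),
    "additional information": Mapping("Type", tentative=True),
    "system requirements/mode of access": Mapping("Format", tentative=True),
    "subject keywords": Mapping("Subject"),
    "unique identifier": Mapping("Identifier"),
    "place of publication": Mapping(None),
    "publisher": Mapping("Publisher"),
    "URL": Mapping("Identifier"),
    "frequency": Mapping(None),
}


def needs_extension(table):
    """Return the BIBLINK elements with no Dublin Core equivalent."""
    return [name for name, row in table.items() if row.dc_element is None]


def tentative(table):
    """Return the BIBLINK elements whose mapping is marked [?]."""
    return [name for name, row in table.items() if row.tentative]


if __name__ == "__main__":
    print("Extension needed:", ", ".join(needs_extension(BIBLINK_TO_DC)))
    print("Still under discussion:", ", ".join(tentative(BIBLINK_TO_DC)))
```

Keeping the tentative flag separate from the target element means that a row such as "extent of file" can be reported both as needing an extension and as still under discussion.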

B.5 Notes

B.6 Examples

The following examples are intended for illustration and discussion purposes only and are not definitive.

B.6.1 Web Page

Dublin Core:

Title: Taylor-Schechter Unit Home Page

Creator.Corporate: Cambridge University Library

Subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus

Publisher: University of Cambridge

Date: 19970605

Format: text/html

Identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/

Language: eng

BIBLINK metadata requirements:

Author, personal

Not given.

Author, corporate

DC.creator.corporate: Cambridge University Library

contributors, personal

Not given.

contributors, corporate

Not given.

definitive title

DC.title: Taylor-Schechter Unit Home Page

date

DC.date: 19970605

price

Not relevant.

terms and conditions

Not given.

language

DC.language: eng

edition

Not relevant

general material designation

Electronic resource

specific material designation

Not known.

type of computer file

DC.format: text/html

extent of file

Not known.

additional information

Not known.

system requirements/mode of access

Not given.

subject keywords

DC.subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus

unique identifier

DC.identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/

place of publication

Cambridge

publisher

DC.publisher: University of Cambridge

URL

DC.identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/

frequency

Not relevant
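
The rows of the table above that come directly from the Dublin Core record can be filled in mechanically. The following is a minimal sketch, assuming the record is held as a simple dictionary. `FIELD_SOURCES` and `to_biblink` are hypothetical names introduced for illustration. The sketch follows this example in taking "type of computer file" from DC.format, and it leaves out the rows that need cataloguer judgement (for example general material designation and place of publication).

```python
# The Dublin Core record from the web page example above.
dc_record = {
    "Title": "Taylor-Schechter Unit Home Page",
    "Creator.Corporate": "Cambridge University Library",
    "Subject": "Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus",
    "Publisher": "University of Cambridge",
    "Date": "19970605",
    "Format": "text/html",
    "Identifier": "http://www.lib.cam.ac.uk/Taylor-Schechter/",
    "Language": "eng",
}

# (BIBLINK element, Dublin Core element it is read from) - illustrative only.
FIELD_SOURCES = [
    ("Author, personal", "Creator.Personal"),
    ("Author, corporate", "Creator.Corporate"),
    ("contributors, personal", "Contributor.Personal"),
    ("contributors, corporate", "Contributor.Corporate"),
    ("definitive title", "Title"),
    ("date", "Date"),
    ("terms and conditions", "Rights"),
    ("language", "Language"),
    ("type of computer file", "Format"),
    ("subject keywords", "Subject"),
    ("unique identifier", "Identifier"),
    ("publisher", "Publisher"),
    ("URL", "Identifier"),
]


def to_biblink(record):
    """Fill each BIBLINK element from the record, or mark it 'Not given'."""
    return {
        biblink: record.get(dc, "Not given")
        for biblink, dc in FIELD_SOURCES
    }


if __name__ == "__main__":
    for element, value in to_biblink(dc_record).items():
        print(f"{element}: {value}")
```

Because the web page has no identifier other than its URL, "unique identifier" and "URL" both receive the same value, as in the table.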

B.6.2 CD-ROM

Author, personal

Not available.

Author, corporate

Not available.

contributors, personal

DC.contributor SCHEME=Library of Congress Name Authority File: Migne, J.P. (Jacques Paul), 1800-1875

contributors, corporate

Not available.

definitive title

DC.title: Patrologia latina database

date

DC.date: 1993

price

Not available.

terms and conditions

Not available

language

DC.language: lat

edition

Not available.

general material designation

Electronic resource

specific material designation

computer laser optical disks [CD-ROM ?].

type of computer file

data

extent of file

2 computer laser optical disks ; 4 3/4 in

additional information

Not known

system requirements/mode of access

Multimedia PC 486x or higher, 8mb memory, CD-ROM drive, sound card, SVGA 256-colour monitor, Windows 95 or Windows 3.1

subject keywords

DC.subject: Early Christian Literature; Patristics;

unique identifier

Not available.

place of publication

Cambridge

publisher

DC.publisher: Chadwyck-Healey

URL

Not available.

frequency

Not relevant
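
The six date granularities listed under "date" in B.4 can be checked mechanically against the DC.date values used in these examples. The following is a minimal sketch, not a full ISO 8601 validator: it tests only the shape of the string, not whether the month, day or time values are in range. The function name `date_granularity` is an assumption made for illustration, as is the acceptance of "Z" as a time zone designator alongside the "+hh:mm" form shown in B.4.

```python
import re

# Time zone designator: "+hh:mm" / "-hh:mm" as in B.4; "Z" is an assumption.
_TZD = r"(?:Z|[+-]\d{2}:\d{2})"

# The six proposed levels, in the order given in B.4.
_LEVELS = [
    ("year", r"\d{4}"),
    ("year and month", r"\d{4}-\d{2}"),
    ("complete date", r"\d{4}-\d{2}-\d{2}"),
    ("complete date plus hours and minutes",
     r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}" + _TZD),
    ("complete date plus hours, minutes and seconds",
     r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" + _TZD),
    ("complete date plus hours, minutes, seconds and fractions",
     r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+" + _TZD),
]


def date_granularity(value):
    """Return the name of the matching level, or None if no level matches."""
    for name, pattern in _LEVELS:
        if re.fullmatch(pattern, value):
            return name
    return None


if __name__ == "__main__":
    for sample in ("1993", "1997-07-16", "1997-07-16T19:20+01:00", "19970605"):
        print(sample, "->", date_granularity(sample))
```

Under these patterns the CD-ROM date "1993" matches the year level, while the web page date "19970605" matches none of the six levels because it is written without hyphens. It would need to be given as "1997-06-05" to match the complete date level.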

B.7 References

[1] Heery, R., et al. BIBLINK - LB 4034: D1.1 Metadata Formats. 23 December 1996. <URL: http://www.ukoln.ac.uk/metadata/BIBLINK/wp1/>

[2] Heery, R. et al. BIBLINK - LB 4034: D4.1 Format Conversion Feasibility. July 1997.

[3] Byrum, J.D. ISBD (ER) formerly ISBD (CF). SCATNews, No. 7, March 1997. <URL: http://ifla.inist.fr/VII/s13/scatn/news7.htm>

[4] Weibel, S., Iannella, R. and Cathro, W. The 4th Dublin Core Metadata Workshop Report: DC-4, March 3 - 5, 1997, National Library of Australia, Canberra. D-Lib Magazine, June 1997. <URL: http://www.dlib.org/dlib/june97/metadata/06weibel.html>

[5] Guenther, R. Dublin Core qualifiers/substructure: a proposal. 15 April 1997. <URL: http://www.loc.gov/marc/dcqualif.html>

[6] Wolf, M. Re: DATE format. Email to meta2 list <meta2@mrrl.lut.ac.uk> from Misha Wolf <misha.wolf@reuters.com>, Wed, 16 Jul 1997 19:23:20 +0000 (GMT).

[7] Olson, N. Cataloging Internet resources: a manual and practical guide. OCLC, 1995.

<URL:http://www.oclc.org/oclc/man/9256cat/toc.htm>

Also used:

Knight, J. and Hamilton, M. Dublin Core qualifiers. 21 February 1997. <URL: http://www.roads.lut.ac.uk/metadata/DC-SubElements.html>

The Dublin Core Homepage

<URL:http://purl.oclc.org/metadata/dublin_core>