Minimum Data Set
Work Package 3 of Telematics for Libraries project BIBLINK (LB 4034)

Name of Client: European Commission
Distribution List: Pat Manson, European Commission
Author: Robina Clayphan
Authorised by: Ross Bourne
Contractual Date: August 1997
Date of Issue: September 1997
Issue: 1.0
Reference: 5003/del-86
Total Number of Pages:
Contact Details for Level-7: The British Library, Internet: robina.clayphan@bl.uk

| Issue | Date of Issue | Comments |
|---|---|---|
| 0.1 | August 1997 | Internal distribution for comment |
| 1.0 | September 1997 | Final version for delivery to Commission |

This report lists the national bibliographic functions that BIBLINK records will support and presents the list of data elements which the national bibliographic agencies wish to receive from publishers for the BIBLINK demonstrator.
BIBLINK, Library, bibliographic, national library, data set, Dublin Core, MARC, CIP
BIBLINK demonstrator: a multi-national demonstrator, developed as part of the BIBLINK project, which provides an environment for the transmission of bibliographic records between publishers and NBAs.
Bibliographic record: a discrete bibliographic description stored either manually or electronically.
CD-ROM: Compact Disc Read Only Memory.
CIP: Cataloguing-In-Publication records, created using information supplied pre-publication by the publisher.
Deposit of publications: a system in operation in most countries, usually legally enforced, whereby publishers must deposit one or more copies of every publication with nominated libraries. Often referred to as Legal Deposit.
DOI: Digital Object Identifier.
DTD: Document Type Definition.
Dublin Core: a metadata format defined on the basis of international consensus which defines a minimal information resource description for use in a WWW environment. The term 'Dublin' is used as Dublin, Ohio is the location of OCLC's headquarters.
Electronic mail: a means for an originator of information to distribute information to an unlimited number of recipients via a value added network service which mimics the functions of the paper postal services.
Electronic publication: document, file, journal, etc. made available in electronic form.
Electronic publisher: see publisher.
email: see Electronic mail.
Format: in the context of bibliographic control, the formalised structure in which the specific elements of bibliographic description are accommodated.
HTML: Hypertext Mark-up Language. The standard language used for creating Web documents.
HTTP: HyperText Transfer Protocol. The protocol used for communication between Web clients and servers.
IFLA: International Federation of Library Associations and Institutions.
Internet Publisher: an organisation or person who publishes documents on the Internet. These will be on-line documents.
ISBD: International Standard Bibliographic Description. There are seven specific ISBDs as well as the general ISBD(G): monographs -(M), serial publications -(S), cartographic material -(CM), non-book material -(NBM), printed music -(PM), antiquarian publications -(A), computer files -(CF).
ISBN: International Standard Book Number.
ISSN: International Standard Serial Number.
Legal Deposit: see Deposit of Publications.
MARC: MAchine Readable Cataloguing. A family of formats based on ISO 2709 for the exchange of bibliographic and other related information in machine readable form. For example, USMARC and UNIMARC.
Metadata: information about a publication as opposed to the content of the publication; includes not only bibliographic description but also other relevant information such as its subject, price, conditions of use, etc.
National Bibliography: a listing of all national publications. May include all publications produced in that country, or in the language of that country, or sometimes about that country.
NBA: National Bibliographic Agency.
Off-line publication: an electronic document which is bibliographically identifiable and which is stored in machine readable form on an electronic storage medium. For example, a CD-ROM.
On-line publication: see on-line resource.
Serial: a publication in any medium issued in successive parts bearing numeric or chronological designations and intended to be continued indefinitely. Serials include periodicals; newspapers; annuals (reports, yearbooks, etc.); the journals, memoirs, proceedings, transactions etc. of societies; and numbered monographic series.
SGML (ISO 8879): Standard Generalised Mark-up Language. ISO standard for document description, separating contents and structure.
SSSH: Simplified SGML for Serials Headers.
UNIMARC: see MARC.
URL: Uniform Resource Locator. The standard way to give the address of a source of information on the WWW. It contains four different parts: the protocol type, the machine name, the directory path and the file name. For example: http://WWW2.echo.lu/libraries/en/libraries.html
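
The four parts of a URL named in the glossary entry can be illustrated with a short Python sketch. This is only an illustration using the standard library; the function name `url_parts` and the dictionary keys are assumptions introduced here, not terms from the BIBLINK project.

```python
from urllib.parse import urlsplit
import posixpath


def url_parts(url):
    """Split a URL into the four parts named in the glossary entry."""
    parsed = urlsplit(url)
    # The path holds both the directory path and the file name.
    directory, filename = posixpath.split(parsed.path)
    return {
        "protocol": parsed.scheme,   # protocol type, e.g. http
        "machine": parsed.netloc,    # machine name
        "directory": directory,      # directory path
        "file": filename,            # file name
    }


parts = url_parts("http://WWW2.echo.lu/libraries/en/libraries.html")
for name, value in parts.items():
    print(name, "=", value)
```
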
This document is the final report on the minimum data set task in work package 3. It is drawn from a background and issues paper produced for consideration by the partners and the results of the subsequent discussion at the workshop in Mo i Rana on 28 July 1997.
The partners reached consensus on the functions of national bibliographies that will be supported by BIBLINK records and the set of data elements necessary to do so. This report details those functions and presents a listing of the 18 data elements.
It has been agreed that this minimum data set will use the format and syntax of the Dublin Core and a definitive mapping will be produced for inclusion in the final version of D4.1.
WP1 Metadata Formats
D1.1 produced a list of data elements that the national libraries would like to receive for electronic publications. This was a comprehensive list produced to survey the territory early in the project.
WP3 Consensus Building
D3.2 resulted in a refinement of the scope and objectives of the project.
WP4 Format Conversion (running concurrently with WP3)
Interim Report D4.1 contained a discussion paper which proposed a subset of elements from the original list, limiting the elements to those that would support the objectives agreed in D3.2. A preliminary mapping of this proposed set to Dublin Core was produced for discussion. The paper showing this preliminary mapping and outlining related issues is appended as Annex B to this document.
WP3 Minimum Data Set
The discussion paper, appended as Annex A to this report, resulted in agreement as to the functions of a national bibliography and CIP that should be supported by BIBLINK records and the final list of data elements is produced here. A definitive mapping to Dublin Core will be produced to form part of the final version of D4.1.
WP6 Authentication
Consideration of Draft Report D6.1 resulted in agreement that one data element should be included to allow for a simple checksum operation.
In the BIBLINK demonstrator, national bibliographic agencies will receive bibliographic data from publishers via a limited number of interfaces and convert it to MARC formats. In order to produce a usable MARC record a certain minimum number of data elements will be required. One of the tasks of WP3 was to reach consensus between the partners as to the minimum data set to be transmitted.
An initial survey of the data requirements of national libraries for electronic publications was carried out at an early stage in the project. This exercise produced an extensive list of elements which can be found in D1.1 section 6. Since that time the scope and objectives of the project have been refined. BIBLINK now intends to produce CIP-type records. The original list was therefore reduced to contain only those elements that are needed to support the CIP functions of a national bibliography. The revised list was proposed for discussion by the partners.
Due to the timing of, and interdependencies between, work packages, the background work for this report, including the revised list of elements, was interleaved with work on WP4 - Format Conversion Feasibility. To facilitate mappings between the proposed BIBLINK data set and other formats, the paper that would logically form the body of this report was produced in advance and appeared as Section 9 of the interim report D4.1. It is not considered necessary to rewrite that paper, but it is appended as Annex A to this report for ease of reference.
The partners reached consensus on the minimum data set required for BIBLINK on the basis of that initial paper, one other suggested data set and the preliminary mappings produced for work package 4, informed by the results of the publisher consensus meetings reported in D3.2. This report outlines those discussions and presents the agreed data set.
An analysis of the uses made of the national bibliography, the functions of CIP records and consideration of relevant standards can be found in the original document produced to facilitate the mappings in WP4. That paper is appended as Annex A to this document; readers wanting the relevant background to the following matters are advised to read the Annex first.
The discussions at the workshop were based on the proposals in the discussion paper and it was confirmed that the functions listed in Annex A.3.1 were those that should be supported by the BIBLINK demonstrator. As a minimum, therefore, the data obtained from publishers must be sufficient to support the following activities:
It is recognised that it becomes difficult to distinguish between the 'acquisition' and 'access' functions in relation to many online documents. Access is not a traditional function of a national bibliography but this need not be considered a problem for the demonstrator: there is no desire to exclude information simply because it does not fit the traditional pattern.
To expedite agreement on the minimum data set, consideration was given to three suggested sets. The first was the revised set produced by the survey of national libraries (Annex A.6), the second that considered by the British Library (Annex A.4.2). The third set was that already in use on the web forms at the Koninklijke Bibliotheek (KB) for publishers to enter data about their electronic publications. The last two of these three lists were merged and the individual elements discussed, modified and extended. The resulting set of elements is shown below and will be used for BIBLINK.
BIBLINK encompasses both on and off-line publications and both serials and monographic publications so this list is sufficiently comprehensive to allow data applicable to both types to be recorded. It is recognised that the meaning of the element names may appear ambiguous to a publisher generating the data. A comprehensive 'User Manual' explaining the meaning of and appropriate entries for each element will be created to be used in conjunction with the BIBLINK tools.
The following list shows the agreed set of data elements.
| Element | Comment |
|---|---|
| Author | Person or body primarily responsible for the intellectual content of the publication. |
| Title | |
| Publisher | Agency responsible for producing the publication. |
| Place of Publication | |
| Price | Defined as a 'simple, retail price'. Anything more complex, such as price for site licence, numbers of users etc. will form part of the 'Terms and Conditions' element. |
| Extent (Size) | For offline publications - for recording details such as the number of physical objects. |
| Keywords | Possibly from a controlled vocabulary. |
| Description | Possibly an abstract. |
| Edition/version statement | This element is named to accommodate the differences in terminology used by libraries and producers of electronic documents. |
| Date of publication | |
| System requirements | This incorporates 'Mode of Access'. Offline: e.g. 386SX or higher with 4MB RAM etc. Online: Internet via WWW, Adobe etc. |
| Format | HTML, pdf etc. |
| Language | Of the publication |
| Terms and conditions | For online items |
| Frequency | For serials |
| Identifier | URL, ISBN, DOI etc. |
| Contributor | Statement(s) of responsibility for multiple contributions. Can be qualified by DC 'Type' qualifier |
| Checksum | Simple MD5 hash applied either by NBA or publisher. |
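
As a minimal sketch of how a record built from the 18 agreed elements might be held in software, the following Python fragment lists the elements and fills the Checksum element with an MD5 hash. The identifier spellings, the helper names `md5_checksum` and `make_record`, and the choice to hash the bytes of the publication itself are all assumptions made for illustration; this report does not specify what content the hash is applied to.

```python
import hashlib

# Illustrative identifiers for the 18 agreed elements (spellings are assumptions).
ELEMENTS = (
    "author", "title", "publisher", "place_of_publication", "price",
    "extent", "keywords", "description", "edition_version",
    "date_of_publication", "system_requirements", "format", "language",
    "terms_and_conditions", "frequency", "identifier", "contributor",
    "checksum",
)


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def make_record(publication: bytes, **values) -> dict:
    """Build a record holding only agreed elements; absent elements are None."""
    unknown = set(values) - set(ELEMENTS)
    if unknown:
        raise ValueError("not in the agreed data set: %s" % sorted(unknown))
    record = {name: values.get(name) for name in ELEMENTS}
    # Assumption: the checksum is computed over the publication's own bytes.
    record["checksum"] = md5_checksum(publication)
    return record


record = make_record(b"<html>example</html>", title="An Example", format="HTML")
```

The same function could be run by either party (publisher or NBA), which matches the comment that the hash may be applied by either.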
An initial mapping of the revised national libraries element list was carried out as part of work package 4 and presented as a paper which provided information for the workshop discussions. It is appended here as Annex B. Now that the minimum data set has been agreed, additional work will be necessary to map the modified element set to DC elements wherever possible and define extensions to DC to accommodate elements that cannot easily be mapped. This will be produced in the final version of D4.1.
In the Dublin Core all elements are optional and repeatable. Further discussion will be necessary to decide which elements should be regarded as compulsory for which type of publication if the record is to be acceptable in the BIBLINK context.
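
Once such a decision is taken, a check of this kind could be expressed along the following lines. This is a hedged sketch only: the partners have not yet agreed which elements are compulsory, so the contents of `MANDATORY` below are purely hypothetical placeholders, as are the names `MANDATORY` and `missing_elements`. Each element holds a list of values, reflecting the fact that Dublin Core elements are repeatable.

```python
# Hypothetical placeholder sets: the real compulsory elements per
# publication type remain to be decided by the BIBLINK partners.
MANDATORY = {
    "online": {"title", "publisher", "identifier"},
    "offline": {"title", "publisher", "system_requirements"},
}


def missing_elements(record, publication_type):
    """Return the compulsory elements with no value, in alphabetical order.

    `record` maps element names to lists of values (elements are repeatable);
    an absent key or an empty list both count as missing.
    """
    required = MANDATORY[publication_type]
    return sorted(name for name in required if not record.get(name))


example = {"title": ["An Example"], "publisher": []}
print(missing_elements(example, "online"))
```
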
Further discussion of the implementation of DC and the syntax for extensions and modifications can be found in D4.1.
Consensus has been reached that the objective of the BIBLINK demonstrator is to produce a CIP-type record for electronic publications using data supplied by publishers. To do this, national libraries must define a minimum set of data elements that would form a usable record for national bibliographies. In this chapter we first briefly identify the function of a national bibliography and the function of CIP records. Various available 'core' bibliographic descriptions are discussed. The data fields identified by the BIBLINK partners are then considered. These data fields were identified as the requirements of national libraries and listed in the BIBLINK report D1.0 - Metadata Formats but it must be borne in mind that these were specified before the objective of producing a CIP-type record was agreed. The original list of elements is therefore comprehensive rather than minimal, and in this section we propose a reduction to the minimal requirements of CIP.
BIBLINK is concerned with the production of CIP-type records for national bibliographies. The area where the functions of CIP intersect with those of the national bibliography is therefore the legitimate territory where the minimum set of data elements can be identified.
In order to suggest a minimum data set it is necessary to take a step back and identify the functions supported by CIP records in a national bibliography. As a minimum, the records produced by the BIBLINK demonstrator should support those same functions. Whilst not wanting simply to impose traditional patterns on new technology, this will serve as a useful starting point for building consensus.
As discussed in D0.1 - Scoping Document, Section 5, selection criteria for national bibliographies have never been uniform in terms of content or coverage. At the moment most national bibliographic agencies are considering their policies regarding the inclusion of electronic publications but no common practice has yet emerged. In addition, the main tool for bibliographic control, the deposit of publications, may or may not be a legal requirement. National bibliographies can therefore be seen to be very different animals in different countries.
Despite these differences, the primary role of a national bibliography is the same throughout: to create a record of the nation's published output however this may be defined. This record then serves various purposes for different user communities. By extension, a secondary role can be seen: that of serving the future as well as the present by providing a historical record of publications even if the items themselves no longer exist. Patterns of publication will be discernible to the yet unborn researcher.
The main users of national bibliographies are librarians and other information professionals whose role is that of intermediary between sources of information and those seeking to find and use that information.
These information professionals use the national bibliography for the following purposes:
Records for electronic publications are needed for exactly the same purposes. The picture becomes slightly less clear-cut in the case of online documents as technology permits direct access to the publication, thereby blurring the distinction between access and acquisition. More may therefore be demanded of a record than has traditionally been the case.
CIP programmes provide publishers with the opportunity to register the existence of a publication at the earliest possible moment. In addition, they can be seen to support three of the functions listed above. The data supplied by the publisher is used to create records which can be included in the national bibliography and supplied to third parties and, by providing timely information about publications, CIP records assist in the tasks of selection and acquisition. To some extent they act as an alerting service for the library community.
As a minimum, therefore, the data obtained from publishers must be sufficient to enable the BIBLINK demonstrator to support the following activities:
It is worth discussing the acquisition function a little further as this is one area where the use made of information about paper publications will differ considerably from electronic publications. In the sense used by national bibliographies, data relating to the acquisition of a publication can be seen as supporting purchasing: in other words, giving the user the information required to contact the supplier of the publication in order to acquire it. For online publications which have to be paid for, the same may be true. At the moment, however, a commercial model for electronic publications has not fully evolved and, moreover, many publications that NBs may wish to record are available without charge. Information contained in the record data can therefore lead directly to the publication rather than the supplier.
Access for the end-user is not the usual function of a national bibliography, but of a library catalogue. It would obviously be inappropriate to exclude access information from BIBLINK simply because it is not a traditional function of the national bibliography. How BIBLINK chooses to deal with this issue should be discussed by the partners as part of the consensus building process.
In considering the elements appropriate to a 'core' CIP type record it may be useful to consider particular 'core' data element sets proposed at the British Library.
The British Library has recently specified a three layer model for the record structure in the BL catalogue: the first layer is the core containing the standardised description and authority data, the second layer special bibliographic data and the third layer the data needed for stock management. For details of the required basic record for UKMARC see D4.1, Annex B, the full text of which is available as http://portico.bl.uk/nbs/marc/correc.html. This example of a core record requirement is closely related to the as yet unpublished UNIMARC minimal level record (Guideline no. 5). Details of requirements for records describing electronic resources have yet to be finalised for UKMARC and so cannot be included in this Annex.
The BL core data contains a level of detail in the description that closely corresponds to the full second level as defined in AACR2 Chapter 1.0D. It is the level of description to which the BL will aspire for all newly created material. The second level is a fuller record than the AACR2 first level and as such may be considered too extensive for BIBLINK purposes.
BIBLINK may require that other data elements be mandatory or mandatory if applicable. At present the General Material Designation is optional. For records describing electronic resources, the term [electronic resource] may well be necessary, although this could be generated automatically on conversion. The full address of the publisher could be another element which would be mandatory for smaller publishers unless it is held elsewhere in the package. Certainly price statement(s) and projected publication date are candidates.
The absolute minimum CIP data that could be included in the British National Bibliography is informally defined by the British Library as being "enough information to support a buying decision". These data items are listed below.
The British Library has recently experimented with cataloguing CD-ROMs for the British National Bibliography. Although problems were encountered with installation and in finding the information from display screens, satisfactory records could be made using existing UKMARC fields. The records are considerably longer than those produced for paper publications, largely accounted for by the 'system requirements' entry held in a notes field. A final policy decision on which fields to show in BNB has yet to be made. A pilot copy of BNB (for internal use only) has been produced containing the records.
"System requirements" would therefore need to be added to the above list to support a buying decision for a CD and "mode of access" or "format" for online publications.
National cataloguing rules incorporating the ISBDs need to be taken into account as these standards to a large extent form the basis of the cataloguing function of the national libraries. Where appropriate BIBLINK should take account of any guidance within these standards as regards description of electronic resources. Although the aim of these standards is wider than merely identification of element sets, they may prove helpful in this process.
Identification of minimal data elements as part of the Dublin Core activity may also influence national libraries. Much international and cross-professional effort has gone into identifying the fifteen elements in the Dublin Core data element set.
The following statements and details are taken from the final draft version of the International Standard Bibliographic Description for Electronic Resources document that has been circulated to the appropriate IFLA standing committees for approval. The primary purpose of ISBD is to provide the stipulations for compatible descriptive cataloguing world-wide in order to aid the international exchange of bibliographic records throughout the library and information community, with the objective of being subsumed in national cataloguing codes. ISBD(ER) is the standard for electronic resources. As such, it specifies the elements required for the description and identification of electronic items, assigns an order to the elements of the description and specifies a system of punctuation. The punctuation has not been considered here as it is not of relevance to BIBLINK at this point. The provisions of ISBD(ER) relate principally to the bibliographic records, in their various forms, produced by national bibliographic agencies. ISBDs are not concerned with access points, such as name headings, as these are handled by cataloguing rules.
The following table shows the areas and elements specified in ISBD(ER). Those that are considered optional are shown in italics.
| Area | Element |
|---|---|
| 1 Title and statement of responsibility | 1.1 Title proper |
| | 1.2 General material designation |
| | 1.3 Parallel title |
| | 1.4 Other title information |
| | 1.5 Statements of responsibility; first, subsequent |
| 2 Edition | 2.1 Edition statement |
| | 2.2 Parallel edition statement |
| | 2.3 Statements of responsibility relating to the edition: first; subsequent |
| | 2.4 Additional edition statement |
| | 2.5 Statements of responsibility following an additional edition statement: first; subsequent |
| 3 Type and extent of resource | 3.1 Designation of resource |
| | 3.2 Extent of resource |
| 4 Publication and distribution etc. | 4.1 Place of publication, production and/or distribution etc.: first; subsequent |
| | 4.2 Name of publisher, producer and/or distributor etc. |
| | 4.3 Statement of function of distributor |
| | 4.4 Date of publication, production and/or distribution etc. |
| | 4.5 Place of manufacture |
| | 4.6 Name of manufacturer |
| | 4.7 Date of manufacture |
| 5 Physical description | 5.1 Specific material designation and extent of item |
| | 5.2 Other physical details |
| | 5.3 Dimensions |
| | 5.4 Accompanying material statement |
| 6 Series | 6.1 Title proper of series or sub-series |
| | 6.2 Parallel title of series or sub-series |
| | 6.3 Other title information |
| | 6.4 Statements of responsibility relating to the series or sub-series: first; subsequent |
| | 6.5 ISSN |
| | 6.6 Numbering within series or sub-series |
| 7 Note(s) | 7. These can cover any ISBD area and are optional for the most part. "System requirements" are mandatory for offline items and "Mode of access" for remote access items |
| 8 Standard number (or alternative) and terms of availability | 8.1 Standard number (or alternative) |
| | 8.2 Key title |
| | 8.3 Terms of availability and/or price |

AACR stipulates the rules for the formulation of descriptions of library materials which are based on the general framework of ISBD in terms of order of elements and punctuation. The rules set out three recommended levels of description containing those elements that must be given as a minimum by libraries choosing that level of description. The purpose of the catalogue for which the entry is being constructed will govern the choice of level. The simplest is the level 1 standard and may be considered appropriate for the CIP-type of record BIBLINK hopes to create. The elements required for this level are listed below.
Title proper/first statement of responsibility
Edition statement
Material (or type of publication) specific details
First publisher, etc., date of publication etc.
Extent of item
Notes
Standard number
The Dublin Core Element set is dealt with in detail in WP1 Metadata formats. Since that report, further refinement of the Dublin Core element structure took place at the 4th Dublin Core Workshop in Canberra. The original group of thirteen elements has been expanded to fifteen. Insofar as it is possible in the rapidly changing Internet world, it is now considered to be a stable format. A syntax has been agreed and a small group of qualifiers has been accepted. It is currently being applied in a variety of projects internationally and offers a set of elements appropriate to the objectives of BIBLINK.
It should be noted that the primary criterion for inclusion of the various Dublin Core elements was resource discovery (rather than selection or acquisition). All the elements in DC are optional and repeatable.
The following table shows the revised elements of the Dublin Core:
| DC Element | DC Label | DC definition |
|---|---|---|
| 1. Title | TITLE | The name given to the resource by the CREATOR or PUBLISHER. |
| 2. Author or Creator | CREATOR | The person(s) or organization(s) primarily responsible for the intellectual content of the resource |
| 3. Subject and Keywords | SUBJECT | The topic of the resource, or keywords or phrases that describe the subject or content of the resource. |
| 4. Description | DESCRIPTION | A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. |
| 5. Publisher | PUBLISHER | The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource. |
| 6. Other Contributors | CONTRIBUTORS | Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element |
| 7. Date | DATE | The date the resource was made available in its present form. |
| 8. Resource Type | TYPE | The category of the resource, such as home page, novel, working paper, pre-print, technical report, essay, dictionary. |
| 9. Format | FORMAT | The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. To indicate usability. |
| 10. Resource Identifier | IDENTIFIER | String or number used to uniquely identify the resource. |
| 11. Source | SOURCE | The work, either print or electronic, from which this resource is derived, if applicable. |
| 12. Language | LANGUAGE | Language(s) of the intellectual content of the resource. |
| 13. Relation | RELATION | Relationship to other resources - e.g. images in a book |
| 14. Coverage | COVERAGE | The spatial locations and temporal durations characteristic of the resource |
| 15. Rights Management | RIGHTS | The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement to allow providers a means to associate terms and conditions or copyright statements with a resource. |

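
To make the "optional and repeatable" property of the fifteen elements concrete, the sketch below renders a record as HTML META elements, one per value. The `DC.` name prefix and the META embedding are an assumption based on a widely used convention for carrying Dublin Core in Web pages; they are not the syntax defined for BIBLINK, which is discussed in D4.1. The function name `to_meta_tags` is likewise introduced here for illustration.

```python
from html import escape

# The fifteen labels from the table above, in table order.
DC_LABELS = (
    "TITLE", "CREATOR", "SUBJECT", "DESCRIPTION", "PUBLISHER",
    "CONTRIBUTORS", "DATE", "TYPE", "FORMAT", "IDENTIFIER",
    "SOURCE", "LANGUAGE", "RELATION", "COVERAGE", "RIGHTS",
)


def to_meta_tags(record):
    """Render a record as META lines, one line per value.

    `record` maps DC labels to lists of strings. Absent labels produce no
    output (elements are optional) and each list item produces its own
    line (elements are repeatable). Labels outside DC_LABELS are ignored.
    """
    lines = []
    for label in DC_LABELS:
        for value in record.get(label, []):
            lines.append('<META NAME="DC.%s" CONTENT="%s">'
                         % (label.lower(), escape(value, quote=True)))
    return lines


tags = to_meta_tags({"TITLE": ["A & B"], "CREATOR": ["X", "Y"]})
print("\n".join(tags))
```
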
It is recommended that the agreed CIP data elements be mapped to Dublin Core. Immediate issues that require thought are the treatment of serials at issue level (for example, statements relating to frequency) and how to deal with edition statements.
As part of WP1 the partner libraries provided details of the data they would like to hold for electronic publications. It is an extensive list which can be found in full in D1.1 Metadata Formats, Section 6. Now that BIBLINK has decided to produce CIP-type records this list has been refined to contain only the elements required to support the functions mentioned in A.3.1 above.
| Data Element | Comment |
|---|---|
| author, personal/corporate | Person or body primarily responsible for the intellectual content. |
| other contributors | Statements of responsibility for multiple contributions. |
| definitive title | Variants of the title can appear on boxes, accompanying information and internal sources - these often conflict. |
| date | Date of publication. |
| price | |
| terms and conditions | For online items - free of charge or by account. |
| language | |
| edition | |
| general material designation | e.g. [computer file]. |
| specific material designation | e.g. tape, diskette etc. |
| type of computer file | e.g. data, program etc. |
| extent of file | e.g. size, number of records contained. |
| additional information | e.g. sound, image, text, multimedia etc. |
| system requirements/mode of access | offline: e.g. 386SX or higher with 4MB RAM etc. online: Internet via WWW, html etc. |
| subject keywords | Controlled vocabulary. |
| unique identifier | e.g. ISBN. |
| place of publication | ? |
| publisher | Agency responsible for producing the publication. |
| URL | for online publications |
| frequency | How often it will appear - for serials |

It is the intention that this proposed set be used as the basis for discussion towards reaching consensus on the minimum BIBLINK CIP data element set.
It is recommended that to assist consensus building the proposed CIP data elements be mapped to Dublin Core.
D1.1 Metadata Formats, Section 6 identified the metadata requirements of the participating national libraries [1]. The metadata requirements were further refined in D4.1 Format Conversion Feasibility to support only a CIP-type function [2]. The resulting table can be found in D4.1 Section 9.
The purpose of this paper is to map these metadata requirements to Dublin Core, and especially to identify where both qualifiers and extensions would have to be made to the Dublin Core elements.
D1.1 suggests that national libraries wish to create records in various flavours of MARC and would intend to apply detailed cataloguing rules to the content of the records. The formats used by the participating libraries vary, but are usually based on ISBD or AACR2. For the purpose of this paper, it is assumed that ISBD or AACR2 style formats would be desired for the national libraries' metadata requirements for electronic publications.
One issue is that ISBD (CF) has recently been revised, and the resulting draft document - ISBD (ER) - proposes a number of important revisions to the standard [3]. Specifically with regard to Internet resources, the type of file designations (currently limited to "Data" or "Program") have been considerably reworked. If ISBD (ER) is approved, the General Material Designation in ISBD will change from "Computer file" to "Electronic resource".
The DC-4 meeting in Canberra proposed the formal identification of the structure of elements and possible qualifiers in Dublin Core [4]. In response, Rebecca Guenther has recently produced a proposal for Dublin Core qualifiers/substructure which includes specific proposals for the qualifiers "scheme" and "type" [5]. These proposals are currently under discussion in the Dublin Core community. Guenther reiterates the Canberra meeting's insistence that the "type" qualifier should only be used to refine elements, not to extend their semantics, and that each element should have a default meaning. The qualifiers can be understood as follows:
If "scheme" and "type" cannot meet these principles, then an extensibility mechanism should be used.
Where the national libraries' metadata requirements cannot be described using Dublin Core - with or without qualifiers - then new elements can be proposed.
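As a minimal sketch of the element/qualifier structure discussed above, the following Python models one Dublin Core element with optional "type" and "scheme" qualifiers and prints it in the dotted notation used in the examples later in this paper. The class name `QualifiedElement`, its field names and the `render` method are hypothetical and introduced only for illustration; they are not part of Dublin Core or of Guenther's proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QualifiedElement:
    """One Dublin Core element with optional qualifiers (illustrative only)."""
    element: str                          # e.g. "creator", "language"
    value: str
    type_qualifier: Optional[str] = None  # refines the element, e.g. "corporate"
    scheme: Optional[str] = None          # scheme the value follows, e.g. "Z39.53"

    def render(self) -> str:
        # Dotted notation as in this paper's examples: DC.element[.type]
        name = "DC." + self.element
        if self.type_qualifier:
            name += "." + self.type_qualifier
        if self.scheme:
            name += " SCHEME=" + self.scheme
        return f"{name}: {self.value}"


corporate = QualifiedElement("creator", "Cambridge University Library",
                             type_qualifier="corporate")
language = QualifiedElement("language", "lat", scheme="Z39.53")
print(corporate.render())
print(language.render())
```

Keeping the qualifiers as optional fields reflects the principle that every element has a default meaning: an element with neither qualifier set is still a complete statement.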
| BIBLINK Data Element | Dublin Core |
| Author, personal | Creator.Personal. Possibly with SCHEME "Library of Congress Name Authority File", if required. |
| Author, corporate | Creator.Corporate. Possibly with SCHEME "Library of Congress Name Authority File", if required. |
| other contributors, personal | Contributor.Personal. Possibly with SCHEME "Library of Congress Name Authority File", if required. |
| other contributors, corporate | Contributor.Corporate. Possibly with SCHEME "Library of Congress Name Authority File", if required. |
| definitive title | Title. |
| date | Date. Guenther's proposed SCHEME default for DC is ISO 8601 [e.g. 1997-07-22]. There are currently six proposed levels of granularity for dates in DC [6]: |
| price | Extension to DC needed. It could possibly be included under Rights, although this element is intended to be a link to a URL or similar. A specific extension would be better. |
| terms and conditions | Rights. The default for this element in DC is free text, although it is intended for a link to a URL. |
| language | Language. Guenther suggests that the content of this DC element should coincide with NISO Z39.53 three-character codes, but the default SCHEME is free text. If USMARC/Library of Congress style language codes are used (as in UNIMARC), then the SCHEME should be "Z39.53". |
| edition | Extension to DC needed. |
| general material designation | Extension to DC needed [?]. The relevant GMD is currently "Computer file" for both ISBD and AACR2. It is possible that this could become "Electronic resource" following the draft ISBD (ER). As only one GMD is required for this type of resource, it could be generated as a default and therefore need not be included in the DC elements required. |
| specific material designation | Format [?]. There is some debate as to how useful SMDs would be for networked electronic resources. Nancy Olson's Cataloging Internet Resources manual [7], which is based on AACR2, omits all of the physical description area "because there is no physical item being cataloged". SMDs would, however, be appropriate for things like CD-ROMs. More discussion is required on what information is specifically required in this area. |
| type of computer file | Type [?]. The current types of computer file, "Program" and "Data", have been extended in the proposed ISBD (ER). More discussion is needed on this requirement. |
| extent of file | Extension to DC needed [?]. As with the SMD (above), there is a need to discuss how this field would be used in relation to networked electronic resources. |
| additional information | Type [?]. Vague heading, standing for "sound", "image", "text", "multimedia", etc. Guenther suggests that DC Type should possibly come from an enumerated list, but the default is likely to be free text. |
| system requirements/mode of access | Format [?] |
| subject keywords | Subject. Keyword is the default for DC Subject. |
| unique identifier | Identifier. Guenther proposes URL as default, so other globally unique identifiers that are candidates for the element (URN, ISBN, ISSN, SICI, etc.) would require a relevant "scheme". |
| place of publication | Extension to DC needed. |
| publisher | Publisher. |
| URL | Identifier. URL is the default DC Identifier, so no "scheme" would be required. |
| frequency | Extension to DC needed. |
Edition or Version
Place of Publication
Frequency of Publication
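The mapping table above can also be held as a simple lookup structure, which makes it easy to list the BIBLINK elements that have no Dublin Core equivalent. The following is a minimal sketch transcribed from the table; the names `BIBLINK_TO_DC`, `UNDER_DISCUSSION` and `needs_extension` are hypothetical, and the use of `None` to stand for "Extension to DC needed" is an assumption made for illustration.

```python
# BIBLINK data element -> proposed Dublin Core element, transcribed from the
# mapping table. None stands for "Extension to DC needed" (an assumption of
# this sketch, not a Dublin Core convention).
BIBLINK_TO_DC = {
    "author, personal": "Creator.Personal",
    "author, corporate": "Creator.Corporate",
    "other contributors, personal": "Contributor.Personal",
    "other contributors, corporate": "Contributor.Corporate",
    "definitive title": "Title",
    "date": "Date",
    "price": None,
    "terms and conditions": "Rights",
    "language": "Language",
    "edition": None,
    "general material designation": None,
    "specific material designation": "Format",
    "type of computer file": "Type",
    "extent of file": None,
    "additional information": "Type",
    "system requirements/mode of access": "Format",
    "subject keywords": "Subject",
    "unique identifier": "Identifier",
    "place of publication": None,
    "publisher": "Publisher",
    "URL": "Identifier",
    "frequency": None,
}

# Entries the table marks with "[?]", i.e. still open to discussion.
UNDER_DISCUSSION = {
    "general material designation",
    "specific material designation",
    "type of computer file",
    "extent of file",
    "additional information",
    "system requirements/mode of access",
}


def needs_extension(mapping):
    """Return the BIBLINK elements with no Dublin Core equivalent in the table."""
    return [name for name, dc in mapping.items() if dc is None]


for name in needs_extension(BIBLINK_TO_DC):
    note = " (under discussion)" if name in UNDER_DISCUSSION else ""
    print(name + note)
```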
The following examples are intended for illustration and discussion purposes only and are not definitive.
Title: Taylor-Schechter Unit Home Page
Creator.Corporate: Cambridge University Library
Subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus
Publisher: University of Cambridge
Date: 19970605
Format: text/html
Identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/
Language: eng
BIBLINK metadata requirements:
| Author, personal | Not given. |
| Author, corporate | DC.creator.corporate: Cambridge University Library |
| contributors, personal | Not given. |
| contributors, corporate | Not given. |
| definitive title | DC.title: Taylor-Schechter Unit Home Page |
| date | DC.date: 19970605 |
| price | Not relevant. |
| terms and conditions | Not given. |
| language | DC.language: eng |
| edition | Not relevant. |
| general material designation | Electronic resource |
| specific material designation | Not known. |
| type of computer file | DC.format: text/html |
| extent of file | Not known. |
| additional information | Not known. |
| system requirements/mode of access | Not given. |
| subject keywords | DC.subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus |
| unique identifier | DC.identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/ |
| place of publication | Cambridge |
| publisher | DC.publisher: University of Cambridge |
| URL | DC.identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/ |
| frequency | Not relevant. |
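The correspondence between the Dublin Core record and the BIBLINK table above can be sketched in a few lines of Python. The sketch parses the "Element: value" lines of the example record and reports "Not given." for any BIBLINK element whose Dublin Core source is absent. The names `parse_record`, `LOOKUP` and `biblink_view` are hypothetical, and `LOOKUP` covers only the elements that are read directly from a Dublin Core element in this example.

```python
RECORD = """\
Title: Taylor-Schechter Unit Home Page
Creator.Corporate: Cambridge University Library
Subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus
Publisher: University of Cambridge
Date: 19970605
Format: text/html
Identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/
Language: eng
"""


def parse_record(text):
    """Split 'Element: value' lines into a dict keyed by lower-cased element name."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first ": " only, so the colon inside a URL is preserved.
        name, _, value = line.partition(": ")
        fields[name.strip().lower()] = value.strip()
    return fields


# BIBLINK element -> Dublin Core element to read (illustrative subset).
LOOKUP = [
    ("Author, personal", "creator.personal"),
    ("Author, corporate", "creator.corporate"),
    ("definitive title", "title"),
    ("date", "date"),
    ("terms and conditions", "rights"),
    ("language", "language"),
    ("subject keywords", "subject"),
    ("unique identifier", "identifier"),
    ("publisher", "publisher"),
]


def biblink_view(fields):
    """Return (BIBLINK element, value) pairs, marking absent elements."""
    rows = []
    for biblink_name, dc_name in LOOKUP:
        if dc_name in fields:
            rows.append((biblink_name, f"DC.{dc_name}: {fields[dc_name]}"))
        else:
            rows.append((biblink_name, "Not given."))
    return rows


for name, value in biblink_view(parse_record(RECORD)):
    print(f"{name}: {value}")
```

Elements such as the general material designation and place of publication are left out deliberately, since in the table they are supplied from outside the Dublin Core record.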

| Author, personal | Not available. |
| Author, corporate | Not available. |
| contributors, personal | DC.contributor SCHEME=Library of Congress Name Authority File: Migne, J.P. (Jacques Paul), 1800-1875 |
| contributors, corporate | Not available. |
| definitive title | DC.title: Patrologia latina database |
| date | DC.date: 1993 |
| price | Not available. |
| terms and conditions | Not available. |
| language | DC.language: lat |
| edition | Not available. |
| general material designation | Electronic resource |
| specific material designation | computer laser optical disks [CD-ROM ?]. |
| type of computer file | data |
| extent of file | 2 computer laser optical disks ; 4 3/4 in |
| additional information | Not known. |
| system requirements/mode of access | Multimedia PC 486x or higher, 8mb memory, CD-ROM drive, sound card, SVGA 256-colour monitor, Windows 95 or Windows 3.1 |
| subject keywords | DC.subject: Early Christian Literature; Patristics; |
| unique identifier | Not available. |
| place of publication | Cambridge |
| publisher | DC.publisher: Chadwyck-Healey |
| URL | Not available. |
| frequency | Not relevant. |
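The two examples give dates at different levels of granularity (19970605 and 1993). As a rough sketch of how such values might be brought into the hyphenated ISO 8601 form used elsewhere in this paper (e.g. 1997-07-22), the following Python handles only the forms that appear in this paper: a four-digit year, and a full date with or without hyphens. The function name `normalise_date` is hypothetical, and the other proposed levels of granularity are not covered.

```python
import re
from datetime import date


def normalise_date(value):
    """Return the hyphenated ISO 8601 form for the date forms used in this paper."""
    # Year only, e.g. "1993": already a valid reduced-precision form.
    if re.fullmatch(r"\d{4}", value):
        return value
    # Full date, either "19970605" or "1997-06-05". The backreference \2
    # requires the separators to be consistent (both hyphens or both absent).
    match = re.fullmatch(r"(\d{4})(-?)(\d{2})\2(\d{2})", value)
    if match:
        year, _, month, day = match.groups()
        # date() raises ValueError for impossible dates such as month 13.
        return date(int(year), int(month), int(day)).isoformat()
    raise ValueError(f"unrecognised date form: {value!r}")


for raw in ("19970605", "1993", "1997-07-22"):
    print(raw, "->", normalise_date(raw))
```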
[1] Heery, R., et al. BIBLINK - LB 4034: D1.1 Metadata Formats. 23 December 1996. <URL: http://www.ukoln.ac.uk/metadata/BIBLINK/wp1/>
[2] Heery, R., et al. BIBLINK - LB 4034: D4.1 Format Conversion Feasibility. July 1997.
[3] Byrum, J.D. ISBD (ER) formerly ISBD (CF). SCATNews, No. 7, March 1997. <URL: http://ifla.inist.fr/VII/s13/scatn/news7.htm>
[4] Weibel, S., Iannella, R. and Cathro, W. The 4th Dublin Core Metadata Workshop Report: DC-4, March 3 - 5, 1997, National Library of Australia, Canberra. D-Lib Magazine, June 1997. <URL: http://www.dlib.org/dlib/june97/metadata/06weibel.html>
[5] Guenther, R. Dublin Core qualifiers/substructure: a proposal. 15 April 1997. <URL: http://www.loc.gov/marc/dcqualif.html>
[6] Wolf, M. Re: DATE format. Email to meta2 list <meta2@mrrl.lut.ac.uk> from Misha Wolf <misha.wolf@reuters.com>, Wed, 16 Jul 1997 19:23:20 +0000 (GMT).
[7] Olson, N. Cataloging Internet resources: a manual and practical guide. OCLC, 1995. <URL: http://www.oclc.org/oclc/man/9256cat/toc.htm>
Also used:
Knight, J. and Hamilton, M. Dublin Core qualifiers. 21 February 1997. <URL: http://www.roads.lut.ac.uk/metadata/DC-SubElements.html>
The Dublin Core Homepage. <URL: http://purl.oclc.org/metadata/dublin_core>