Authentication
Work Package 6 of Telematics for Libraries project BIBLINK (LB 4034)
According to the Technical Annex, the objectives of this work package are to:
According to a first scoping memo on workpackage 6 distributed in April 1996 by Titia van der Werf, related but more specific objectives are to:
3.2.1 Introduction
It is not our intention to replicate existing studies and research on authentication methods and techniques. This report will consider the needs for authentication in the BIBLINK context and examine the suitability of various existing authentication procedures and methods. A key objective of this workpackage is therefore to determine how authentication can best be defined within the BIBLINK scope. This was discussed extensively at the BIBLINK workshop on authentication, held in Madrid on January 10th, 1997. The workshop results delimit the scope of this workpackage and define what authentication means in the BIBLINK context.
3.2.2 Authentication Workshop
At the workshop, held in Madrid in January 1997, several authentication models were discussed. The following models are an adaptation of those rendered in the minutes [ref. 1].
1) Authentication of metadata during the transfer of metadata
One model applies authentication to the metadata, during the transfer process between publisher and national bibliographic agency. This model addresses issues like data integrity and security and the correct identification of the sender and receiver of data. This type of authentication is part of the work of WP5 (transmission of data) because authentication during transmission is taken care of by the communication protocol used.
| Publisher | Transfer process | Library |
| Publisher produces metadata and electronically transfers metadata to the library. | A mechanism is required here to authenticate the metadata exchanged between publisher, library and third parties. This type of authentication is part of the work of WP5, transmission of data, and will probably come 'free' from the chosen communication software. | The national bibliographic agency produces metadata and electronically transfers metadata to the publisher. |
2) Authentication of metadata after the cataloguing process
Another model applies authentication to the metadata after it has been processed by the national bibliographic agency. After the exchange of raw metadata elements with publishers, the national bibliographic agencies proceed to produce full bibliographic records. These are based on authoritative data provided by the publishers and on added-value data. These records may be re-used, manipulated and adapted to other requirements by third parties (other libraries, trade bibliographic agencies, Internet services). A mechanism to assure that the identity and integrity of the original record are maintained may be necessary. Authentication of the metadata itself, once processed as a finished product, is not, however, within the scope of BIBLINK.
| Library | Cataloguing process | Outside World |
| The library collects authenticated metadata and produces metadata to describe a publication. | The metadata is processed into a full bibliographic record, according to documented cataloguing rules. The enhanced record is added to the National Bibliography database. | The library may choose to provide full bibliographic records from the NB to third parties for re-use. The authentication of the NB record is outside the scope of BIBLINK. |
Yet another model applies authentication to the electronic publication itself, as a means to ensure that a copy of this publication is an authentic copy, identical to the original publication. Two different situations arise, depending on whether the publication is held in a controlled environment (publishing server or deposit server) or in an uncontrolled environment (Internet public domain fileserver).
3) Authentication of a publication within the controlled environment
This model applies authentication within a controlled domain, in particular that of the national bibliographic agency. It could, however, be extended to apply within a controlled library network (Pica, for example) or a controlled network of libraries and publishers with distributed document servers. It applies, a priori, to all types of electronic documents stored in a controlled environment (i.e. off-line publications or snap-shots of on-line publications). The authentication ensures that an item held in an electronic document store is the same item as identified and described by a specific set of metadata managed by the national bibliographic agency.
| Library | One-to-one relationship | Library |
| The library produces metadata to describe an item. | A mechanism is required here to link the metadata at the time of creation with the item it describes. The link should ensure that any update or change in the content of the item (migration to another format for preservation) entails an update of the metadata. This mechanism may also require some form of version control. | The item is stored physically within the library's controlled environment (deposit). The storage procedure is documented. |
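One way such a link could be realised is to store a cryptographic digest of the item in its metadata record at the time of creation, and to recompute the digest whenever the item is checked. The following is a minimal sketch under that assumption, not a mechanism specified by BIBLINK; the record fields (`identifier`, `sha256`, `version`) and the function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata record; the field names are assumptions."""
    identifier: str   # local identifier of the deposited item
    title: str
    sha256: str       # digest of the item's bytes when it was described
    version: int = 1  # incremented whenever the stored item changes


def digest(content: bytes) -> str:
    """Return the SHA-256 digest of the item as a hexadecimal string."""
    return hashlib.sha256(content).hexdigest()


def describe(identifier: str, title: str, content: bytes) -> MetadataRecord:
    """Create a record that is bound to the item's current content."""
    return MetadataRecord(identifier, title, digest(content))


def is_authentic(record: MetadataRecord, content: bytes) -> bool:
    """Check that the stored item is the one the record describes."""
    return record.sha256 == digest(content)


def migrate(record: MetadataRecord, new_content: bytes) -> MetadataRecord:
    """Return an updated record after the item has changed (e.g. format migration)."""
    return replace(record, sha256=digest(new_content),
                   version=record.version + 1)


original = b"<html>Annual report 1997</html>"
record = describe("dep-0001", "Annual report 1997", original)
print(is_authentic(record, original))       # the deposited item matches

migrated = b"%PDF- Annual report 1997"
print(is_authentic(record, migrated))       # old record no longer matches
record_v2 = migrate(record, migrated)
print(is_authentic(record_v2, migrated), record_v2.version)
```

Keeping the record immutable and producing a new version on every change gives a simple form of the version control mentioned above: the old record still identifies the old content, and the new record identifies the migrated content.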
4) Authentication of a publication outside the controlled environment
This model attempts to apply authentication when the item is not within the domain of a library (i.e. on-line publications), and applies when items are not deposited or collected by the library. This model derives from the BIBLINK scoping statement (see D0.1) that the National Bibliography need not necessarily describe only items held in the deposit collection. This model applies a priori to all types of material, but with special emphasis on online material. It may indeed be fairly safely assumed that the bulk of the off-line material will be deposited at the library and thereby fall under the regime of authentication model 3). The authentication process in model 4) would then have to verify, at any given time, that metadata sets from the National Bibliography actually match the copies of the items they identify outside the domain of the library.
| Library | One-to-one relationship | Outside World |
| The library produces metadata to describe an item which is available outside the controlled environment of the library (e.g. an external Web publication). | A mechanism is required here to ensure that at any time, the metadata managed by the library describes the item available on-line. This mechanism may also require some form of version control. | The item is held on external storage media with unknown document management rules. In the public domain, the item may be duplicated and altered without notice. |
This latter model may require bilateral agreements with managers of electronic document stores in order to attain clear procedural rules. If such electronic archive managers (on the Internet, for example) are willing to participate in BIBLINK, it should be possible to show, during the demonstration stage of the project, the feasibility of this type of authentication model.
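A check of this kind could, for instance, periodically compare a digest recorded in the metadata with the digest of each external copy. The sketch below is only an illustration under that assumption and is not a procedure defined by BIBLINK. The function names, the status labels and the example URLs are hypothetical, and the retrieval step is passed in as a function so that no particular protocol is implied.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of a copy as a hexadecimal string."""
    return hashlib.sha256(content).hexdigest()


def check_copies(expected_digest: str,
                 locations: Iterable[str],
                 fetch: Callable[[str], Optional[bytes]]) -> Dict[str, str]:
    """Classify each external copy as 'match', 'altered' or 'unavailable'.

    `fetch` is a caller-supplied function that returns the bytes found
    at a location, or None when nothing can be retrieved.
    """
    report = {}
    for location in locations:
        content = fetch(location)
        if content is None:
            report[location] = "unavailable"
        elif sha256_hex(content) == expected_digest:
            report[location] = "match"
        else:
            report[location] = "altered"
    return report


published = b"<html>Working paper, version 1</html>"
expected = sha256_hex(published)  # digest recorded with the metadata

# In-memory stand-in for external document stores (hypothetical URLs).
stores = {
    "http://example.org/paper.html": published,
    "http://mirror.example.net/paper.html": b"<html>Working paper, edited</html>",
}
locations = [*stores, "http://gone.example.com/paper.html"]

for location, status in check_copies(expected, locations, stores.get).items():
    print(location, status)
```

An 'altered' result does not by itself say whether the external copy is a legitimate new version or an unauthorised change, which is why procedural agreements with the managers of the document store would still be needed.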
3.2.3 Working definition of authentication:
During the workshop on authentication the following definition was adopted for BIBLINK purposes:
BIBLINK shall take 'authentication' to mean the guarantee that a piece of metadata actually describes a given electronic publication, and only that publication. In other words, there is a one-to-one relationship between an electronic publication and its metadata and this relationship can be authenticated.
3.2.4 Conclusion
For BIBLINK purposes, authentication is defined by two models: model 3), authentication of a publication within the controlled environment, and model 4), authentication of a publication outside the controlled environment.
For the reasons set out above, special emphasis lies on online material.
Because the crux of BIBLINK is to involve publishers, model 3) will not be restricted to the controlled environment of the national library only (deposit). In fact, the deposit is a particular instance of model 3) which involves neither publishers nor any other party and is therefore, strictly speaking, not within the scope of BIBLINK. The NEDlib project is intended to develop a generic deposit library infrastructure for electronic publications and will deal with model 3) in detail. However, instances where publishers actively take part in a controlled environment (as in the WebDOC project, for example) may provide interesting input for model 3) and yet stay within the BIBLINK scope.
Electronic document delivery projects may provide interesting illustrations of model 3) and model 4) implementations. A short review of document delivery projects is therefore required to study both models in more detail - with a focus on authentication procedures and techniques applied in those projects.
This Glossary describes terms which are most relevant in the context of the BIBLINK project and this deliverable.
Archival Records: Archival documents
ATHENS: ATHENS is an authentication service. It enables controlled access to subscription services provided by third party suppliers. It is made available to the UK education & research community via the NISS and BIDS services.
Authentication: 1. (Archives) The act of determining that a document, or a reproduction of a document, is what it purports to be. 2. (Databases) Confirmation that a record entered on a database is of the approved standard. 3. (Security) Process whereby the receiver of a digital message can be confident of the identity of the sender and the integrity of the message.
Bibliographic control: the creation, development, organisation, management and exploitation of records prepared firstly to describe items held in libraries or on databases, and secondly to facilitate user access to such items.
Bibliographic description: a set of formalised data elements describing a publication.
Bibliographic record: a discrete bibliographic description stored either manually or electronically.
Bibliographic verification: a process whereby the accuracy and completeness of a given set of bibliographic data on a publication is verified with "verification tools". Typically, these tools are library catalogues, bibliographic services and in-print lists. In the context of inter-library loan, bibliographic verification is also a means to determine the existence, imprint, location and the access data needed to order a specific title.
CIP: Cataloguing-In-Publication records, created using information supplied pre-publication by the publisher.
CD-ROM: Compact Disc Read Only Memory.
CoBRA+: CoBRA+ is a "concerted action" amongst European national libraries funded by the European Commission. Its aim is to discuss issues and identify actions needed to promote R & D work which will stimulate resource sharing and service developments at a European level.
Copyright: Legal protection of the author of an original work with intellectual content. It gives the owner of copyright exclusive rights concerning the reproduction, distribution, display or performance of the copyrighted work.
Database (DB): a computer program for entering, storing and retrieving items of information in a structured fashion.
Decomate: Document delivery project aiming to provide end-user access to copyright materials in electronic form.
Deposit of publications: a system in operation in most countries, usually legally enforced, whereby publishers must deposit one or more copies of every publication with nominated libraries. Often referred to as Legal Deposit.
DESIRE: Development of a European Service for Information and Research.
Digital signature: Unique codes derived by an algorithm from the document. They function like handwritten signatures for printed documents.
Digital time stamping: Digital time stamping is an extension of the digital signature technique. It involves sending the digital signature to a secure authentication server. The server then returns a new signature which is a combination of the original signature, the signature of the previous document the server has authenticated, and a date-time stamp. This signature, also called a certificate, is included in the bibliographic description in order to allow other copies of the document to be checked against the authenticated version.
DOI: Digital Object Identifier. The Association of American Publishers has designed a system for marking digital objects in order to facilitate electronic commerce and enable copyright management systems. This system is called the DOI-system.
DC: Dublin Core Metadata Initiative.
Electronic publication: document, file, journal, etc. made available in electronic form.
eLib: Electronic Libraries Programme of the UK.
Embedded metadata: Metadata contained within the body of the resource.
Encryption: The mechanism of coding data transmitted by various telecommunication systems so that only authorised users may have access to it: this may be relevant for sensitive information or to ensure that only those paying for a certain service can obtain it. Encryption tools are increasingly sold as utilities for use in computer security to prevent unauthorised access to data.
Fingerprint: Digital identification system and method for electronic documents.
Format: In the context of bibliographic control, the formalised structure in which the specific elements of bibliographic description are accommodated.
Hashing: Hashing is a cryptographic technique to uniquely identify an electronic document.
HTML: Hypertext Mark-up Language. The standard language used for creating Web documents.
HTTP: HyperText Transfer Protocol. The protocol used for communication between client and server applications on the Internet.
IETF: Internet Engineering Task Force.
ILL: Inter-Library Loan.
Imprint: This reflects the year in which the copyright was gained, which may not be the year the title was released.
Infobike: electronic document delivery project from the eLib Programme
Internet: the world wide network of computer systems connected to each other.
ISBD: International Standard Bibliographic Description. There are seven specific ISBDs as well as the general ISBD(G): monographs -(M), serial publications -(S), cartographic material -(CM), non-book material -(NBM), printed music -(PM), antiquarian publications -(A), computer files -(CF).
ISSN: International Standard Serial Number.
Legal Deposit: see Deposit of Publications.
MARC: Machine Readable Cataloguing. A family of formats based on ISO 2709 for the exchange of bibliographic and other related information in machine readable form. For example USMARC and UNIMARC.
Metadata: information about a publication as opposed to the content of the publication; includes not only bibliographic description but also other relevant information such as its subject, price, conditions of use, etc.
Monograph: a publication either complete in one part or complete, or intended to be completed, in a finite number of separate parts. A non-serial publication.
Multimedia: a publication in which images, sound and text are integrated.
National Bibliography: a listing of all national publications. May include all publications produced in that country, or in the language of that country, or sometimes about that country.
NEDlib: Networked European Deposit Library (NEDlib) is a project proposal drafted under the auspices of CoBRA+. Its goal is the joint development by national libraries in Europe of a generic deposit system for electronic publications with a view to achieve long term storage, preservation and use.
On-line publication: an on-line publication is an electronic document which is bibliographically identifiable, which is stored in machine readable form on an electronic storage medium and which is available on-line. For example - a Web page.
Off-line publication: an off-line publication is an electronic document which is bibliographically identifiable, which is stored in machine readable form on an electronic storage medium. For example a CD-ROM.
Pica: Pica is the library automation centre in the Netherlands. It develops systems and services for libraries and other information providing institutions. Its objective is to improve the efficiency of information supply by promoting co-operation between the participating institutions.
PICS: Platform for Internet Content Selection, an infrastructure for associating labels with Internet content.
Publications: documents containing either text or sound or images, or combinations of these, packaged for wider distribution, whether off-line (e.g. printed book, CD-ROM) or on-line (e.g. Web, database for information retrieval).
PKI: The public key infrastructure (PKI) is a system of digital certificates, certificate authorities, registration authorities, certificate management service, and X.500 directories that verify the identity and authority of each party involved in any transaction over the Internet.
Publisher: a person or organisation that produces documents and makes them available. Newly emerging publishers may produce and distribute documents electronically - for instance, on the Web.
RFC: Request For Comments, a method by which standards (sic) are proposed and agreed, usually with reference to the Internet.
Serial: a publication in any medium issued in successive parts bearing numeric or chronological designations and intended to be continued indefinitely. Serials include periodicals; newspapers; annuals (reports, yearbooks, etc.); the journals, memoirs, proceedings, transactions etc. of societies; and numbered monographic series.
Time stamp: Document certifying the fact that a certain file existed at a certain time.
Voluntary Deposit: see Deposit of Publications.
W3C: World Wide Web Consortium.
WebDOC: WEBDOC is a project in which Pica, university libraries and (commercial) publishers co-operate in building a central catalogue of electronic documents accessible via the Web. Document delivery via the central catalogue WebCAT is a major goal of this project.
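As an illustration of the "Digital time stamping" entry in this glossary, the combination step it describes can be sketched as follows. This is a simplified sketch under stated assumptions and not the scheme of any particular service: a plain SHA-256 digest stands in for a real digital signature, the server keeps its state in memory, and all class, field and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sign(content: bytes) -> str:
    """Stand-in for a digital signature: a SHA-256 digest (assumption)."""
    return hashlib.sha256(content).hexdigest()


def combine(document_signature: str, previous_signature: str,
            timestamp: str) -> str:
    """Combine the three elements named in the glossary entry."""
    payload = "|".join((document_signature, previous_signature, timestamp))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Certificate:
    document_signature: str
    previous_signature: str  # signature of the previously stamped document
    timestamp: str           # ISO 8601 date-time stamp
    combined: str            # value returned by the server


class TimeStampServer:
    """Hypothetical in-memory time stamping server."""

    def __init__(self) -> None:
        self._previous = ""  # nothing has been stamped yet

    def stamp(self, document_signature: str, when: datetime) -> Certificate:
        timestamp = when.isoformat()
        certificate = Certificate(
            document_signature, self._previous, timestamp,
            combine(document_signature, self._previous, timestamp))
        self._previous = document_signature
        return certificate


def verify(certificate: Certificate, content: bytes) -> bool:
    """Check a copy of a document against its certificate."""
    return (sign(content) == certificate.document_signature
            and combine(certificate.document_signature,
                        certificate.previous_signature,
                        certificate.timestamp) == certificate.combined)


server = TimeStampServer()
first = b"First deposited document"
second = b"Second deposited document"
cert1 = server.stamp(sign(first), datetime(1997, 1, 10, 9, 0, tzinfo=timezone.utc))
cert2 = server.stamp(sign(second), datetime(1997, 1, 10, 9, 5, tzinfo=timezone.utc))
print(verify(cert2, second), verify(cert2, b"Altered copy"))
```

Because each certificate includes the signature of the previously stamped document, the certificates form a chain, which makes it harder to back-date a single entry without affecting the ones that follow.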
1. Transmission and Authentication Workshop Minutes. BIBLINK project document. 5003/DEL/61.