Identification
Work Package 2 of Telematics for Libraries project BIBLINK (LB 4034)
This section gives a brief overview and comparison of
the identification schemes described in section 5. Each of
the schemes is then evaluated separately against
the requirements, and a conclusion is drawn for each identification
scheme.
The comparison of the different identification schemes
can be found in table 2. Each identification scheme is compared
to the requirements set up in section 4.3.
For explanation of the columns and a complete listing
of the requirements, see section 4.3:
| Id-scheme | 1. Coverage | 2. Authorised | 3. Standard | 4. Uniqueness | 5. Persistence | 6. Extensibility | 7. Human readable | 8. Transportable | 9. Validation |
| ISSN | Only serial items, medium independent. | Under control of a central agency. | Yes. ISO 3297, ANSI/NISO Z39.9 | Yes. | Yes. | No. | Yes. | Yes. | Yes. |
| ISBN | All monographs, medium independent. | Under control of a central agency. | Yes. ISO 2108 | Yes in principle. Depending on the publishers. | Yes. | No. | Yes. | Yes. | Yes. |
| SICI | Serial issues and articles, medium independent. | Under control of a central agency. | Yes. ANSI/NISO Z39.56 | Yes in principle. | Yes. | Yes. | Yes. | Yes. | Yes. |
| PII | Items within serial or monographic titles, medium independent. | No responsible authority for assignment. Assigned by the publishers. | No. The ISSN or ISBN included are both ISO standards. | No. | Yes. | Yes. | Yes. | Yes. | Yes. |
| DOI | On-line and off-line(?) documents. | Under control of a central agency. | No. Under development. | Yes. | Yes. | Yes. | Yes. | Yes. | Depending on external resolution. |
| URN | All Internet documents. | Under control of a central agency. | No. Under development. | Yes. | Yes. | Yes. | Yes. | Yes. | Depending on external resolution. |
| PURL | All World Wide Web documents. | No. | No. | Yes. | Uncertain. Depending on the resolver service. | Yes. | Yes. | Yes. | Depending on external resolution. |
Table 2: Comparison of identification schemes.
Each of the identification schemes is evaluated individually
on the basis of the investigations made in section 5 and a conclusion
is drawn on whether or not the identification scheme meets the
requirements set up for a BIBLINK identifier.
The ISSN can only be used for serials, including
electronic serials, both off-line and on-line. There might be
restrictions regarding "the size" of the serial, or
rather the importance of the publisher. World-wide, ISSNs are not
assigned to small pamphlets or leaflets published by, for instance,
small special interest groups, or to electronic serials that are
regarded as an advertisement (e.g. that consist only of abstracts
or a table of contents) for the serial on paper. This is not a
major problem within the BIBLINK scope, since most of these documents
would be excluded from the national bibliography anyway.
The ISSN is an ISO standard (ISO 3297) and is under
the control of both an international agency and national or regional agencies.
An ISSN will be both globally unique and persistent.
The same content issued on different media (e.g. paper, CD-ROM
and World Wide Web) will, according to the guidelines, be assigned
different ISSNs. The identification scheme cannot be extended,
but a new medium code has recently been implemented in the ISSN
Register and a new linking field has been added to the ISSN format
to link the different medium editions.
The ISSN may easily be transcribed for citation purposes
and transported by Internet protocols. ISSNs could easily be validated
locally.
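Local validation of an ISSN amounts to recomputing its check character. The sketch below assumes the modulus 11 scheme with weights 8 down to 2 over the first seven digits; the function names are illustrative and are not taken from any BIBLINK software.

```python
import re


def issn_check_digit(first_seven: str) -> str:
    """Return the check character for the first seven digits of an ISSN."""
    # Weights run from 8 down to 2 across the seven digits.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    # A check value of 10 is written as the letter X.
    return "X" if check == 10 else str(check)


def is_valid_issn(issn: str) -> bool:
    """Validate an ISSN written as NNNN-NNNC (the hyphen is optional)."""
    compact = issn.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{7}[0-9X]", compact):
        return False
    return issn_check_digit(compact[:7]) == compact[7]
```

A test of this kind needs no network access, which is why the table lists validation for the ISSN as an unqualified "Yes".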
The ISSN is widely used on all paper documents and
is, to an increasing extent, also used on electronic documents.
6.3.1.1 ISSN conclusion
The ISSN meets all the requirements set up for a BIBLINK identifier except that it does not cover all documents in the BIBLINK scope. Section 4.1 Requirement introduction stated that it will not be possible to recommend only one identification scheme for BIBLINK. Based on this premise, it is not a significant problem that the ISSN does not cover all documents in the scope. The ISSN could be used for serials and other identifiers for non-serial documents.
Although the ISSN can be assigned to all electronic
serials, the practice has varied widely in the past according
to the policies of (the institutions hosting) ISSN centres in
different countries. Some ISSN agencies will assign ISSNs to electronic
serials only at the request of a publisher, while other agencies
will go looking for new serials. Journals published, for instance,
on paper, CD-ROM and on the Internet may be given three different
ISSNs by some agencies, while other agencies will assign an ISSN
only to the paper issue.
This might be a question of resources, not necessarily
of principle. Basically the ISSN network has adopted a policy
of assigning ISSNs also to all electronic serial publications.
However several ISSN centres have less technical knowledge or
resources or are part of institutions which do not yet have definite
policies concerning this issue. Some policies can, for example,
be closely related to the national depository policies or laws.
The ISSN is widely used and is a very well known
identifier to all traditional publishers. The same may not be
the case for «new publishers» of electronic publications.
Many of the «new publishers» are not true publishers
but organisations who only publish, for instance, a news bulletin.
They are not necessarily acquainted with the ISSN system.
It might be argued that it is irrelevant to assign
ISSNs to on-line serials, as an ISSN is primarily seen as a control
number for ordering systems or as a means to other advantages for
physical media, for instance postal reductions.
This will change when trading of on-line publications via the
Internet becomes more widespread. On the other hand some publishers,
for example in the academic sector, may want to use ISSN for prestige
reasons - recognised paper journals have ISSNs, so should electronic
journals.
In the new guidelines from the International ISSN/ISBN
agency a more general need for a unique identifier is emphasised,
e.g. securing bibliographic control and using identifiers as
a control number in databases.
The new guidelines from the international ISBN/ISSN
agencies, which include all electronic documents (both on-line
and off-line), might encourage further use. With one common practice
for all electronic documents, for both ISSN and ISBN, it might
be easier for the national agencies to establish and enforce a
national policy and practice. The spread of the ISSN for electronic
publications will depend on the extent to which national practice
can promote the use of the ISSN on new electronic serials, especially
when it comes to spreading the use among «new publishers».
The ISSN is recommended as an identifier for BIBLINK.
The ISBN was established to cover all printed monographs,
but is now also used on «monographs» issued on microforms,
audio- and videocassettes and on CD-ROMs and floppy disks. According
to the new guidelines from the international ISBN agency on-line
documents will also be included.
The ISBN is a recognised standard and the assignment
of group and publisher identifiers is under the control of international
and national agencies, while the assignment of ISBNs to individual
documents is done by the publishers themselves. Assignment is
under control of international and national guidelines.
In principle an ISBN is unique. The same content
issued in different versions, e.g. on paper and on CD-ROM will,
according to the guidelines, be assigned different ISBNs. The
ISBN is also, in principle, persistent but ultimately it is the
individual publisher who is responsible for ensuring that the
same ISBN is not assigned to several documents and for not reusing
the numbers. ISBNs could be misused by publishers and the only
sanction against this breach of the rules is for the ISBN agency
to refuse to issue any more ISBNs to the publisher or to refuse
to take any books with reused numbers into any databases the agency
may be responsible for (e.g. Whitakers Books in Print).
The identification scheme cannot be extended, but
an ISBN may easily be transcribed for citation purposes and be transported
by Internet protocols. ISBNs could easily be validated locally.
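As an illustration of local validation, the following sketch checks the ten-character ISBN form in use at the time of this report. It assumes the modulus 11 rule with weights 10 down to 1, where a final X stands for the value 10; the function name is hypothetical.

```python
import re


def is_valid_isbn10(isbn: str) -> bool:
    """Validate a ten-character ISBN; hyphens and spaces are ignored."""
    compact = re.sub(r"[\s-]", "", isbn).upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        return False
    # X is only allowed in the last position and counts as 10.
    values = [10 if ch == "X" else int(ch) for ch in compact]
    # The weighted sum of all ten positions must be divisible by 11.
    total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
    return total % 11 == 0
```

Note that the check only confirms that a number is well formed. It cannot detect a publisher reusing a valid ISBN for a second document, which is the uniqueness concern raised above.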
The ISBN is widely used on all paper documents and
has also, to some extent, been assigned to off-line publications
but, to date, not to on-line documents (though there are some
exceptions).
The ISBN meets most of the requirements for a BIBLINK
identifier. There might be a problem with the reuse of numbers,
but this is not regarded as a major problem. The extensibility
could become a problem if the large number of electronic documents
on the Internet were to be allocated ISBNs.
To date, the ISBN has not been assigned to electronic
documents to a significant degree. ISBNs are allocated to off-line
multimedia and composite products that are sold in the book trade
(e.g. CD-ROMs with manual). When traditional publishers issue
multimedia they tend to give them ISBNs; the same is not the case
when the publisher is a «new publisher» or when the
document is published on the Internet.
The purpose of the ISBN has been to facilitate transactions,
e.g. ordering, and this has, up until now, not been appropriate
for on-line products. Online monographic publications on the Internet
are still mainly public domain publications and are not taking
part in the traditional trading and distribution chains. One problem
is that, up until now, Internet technology has not offered the
possibility of secured online transactions. Some of the ISBN agencies
are strategically situated in the centre of the trading and distribution
chain, and this is why Internet publications might not even come
to the attention of ISBN agencies. The ISBN agencies will probably
become more active on the Internet when the trading of online
publications via the Internet becomes more widespread.
In the new guidelines from the International ISSN/ISBN
agency a more general need for a unique identifier is emphasised,
e.g. securing bibliographic control and using identifiers as
a control number in databases.
The new guidelines from the international ISBN/ISSN
agencies, which include all electronic documents (both on-line
and off-line), might encourage their use further. With a single
common practice for all electronic documents, for both the ISSN
and the ISBN, it might be easier for the national agencies to
establish and enforce a national policy and practice. The spread
of the ISBN on electronic publications will depend on the extent
to which national practice can promote the use of ISBN on new
electronic documents, especially when it comes to spreading the
use among «new publishers».
In some cases, on-line documents are likely to change
on a fairly regular basis and consequently ISSN may be more appropriate.
The ISBN is recommended as an
identifier for BIBLINK.
The use of the SICI is limited to serials. A SICI
can be used to identify both a complete document and items in
a document, e.g. articles. The SICI may also be used for electronic
documents, as long as they contain a location number or an enumeration.
This will exclude a large proportion of the documents in the BIBLINK
scope - all non-serial documents. The recent completion of the
BICI will obviously solve this problem.
The SICI is a recognised international standard.
The assignment of a SICI to a document is done by the individual
publishers or others in need of a SICI identifier for a document.
The SICI will be both unique and persistent. There
is a small theoretical possibility that two contributions
can have identical values, but tests indicate that the duplicate
values occur only once per million contributions. Even so the
SICI, as a whole, will be unique. There is also a possibility
that a contribution might be defined by more than one SICI. A
SICI may be derived from different sources and depending on the
information (more or less complete) available when the SICI is
constructed, the same item might be given different contribution
identifiers. This is not regarded as a problem and is not very
likely to occur.
The latest version of the scheme is extended to include
contributions other than articles, e.g. tables of contents. In
principle the SICI could be extended similarly for other reasons,
though this is not a proposition at the moment.
The SICI may easily be transcribed by humans for
citation purposes and be transported by Internet protocols. SICIs
can easily be validated locally; however, local comparison may
be more difficult.
The SICI is widely used, not only by publishers but
also by other members of the bibliographic community. SICIs are
primarily an aid to finding existing published articles or issues.
The SICI meets most of the requirements set up for
a BIBLINK identifier. SICI does not cover all documents in the
BIBLINK scope.
Section 4.1 Requirement introduction stated
that it will not be possible to recommend only one identification
scheme for BIBLINK. Based on this premise, it is not a problem
that the SICI does not cover all documents in the scope. The SICI
could be used on all serial documents and other identifiers on
non-serial documents, for instance the BICI.
The SICI includes an ISSN and can consequently only
be assigned to documents which are already assigned (or can be
assigned) an ISSN. The use of SICIs on electronic documents will
therefore depend on the policy for assigning ISSN to documents
(See 6.3.1 ISSN). This is not regarded as a problem in the BIBLINK
context, since all documents in the scope could be assigned an
ISSN.
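The dependency of the SICI on the ISSN can be made concrete with a small sketch. It assumes that a SICI begins with a hyphenated ISSN followed directly by a parenthesised chronology; the function name and the sample string in the usage note are illustrative only, and the rest of the SICI is not parsed.

```python
import re
from typing import Optional

# Leading ISSN (NNNN-NNNC) followed by the opening parenthesis of the chronology.
_SICI_ISSN = re.compile(r"^([0-9]{4}-[0-9]{3}[0-9X])\(")


def issn_from_sici(sici: str) -> Optional[str]:
    """Return the ISSN embedded at the start of a SICI, or None if absent."""
    match = _SICI_ISSN.match(sici.strip())
    return match.group(1) if match else None
```

For a string such as `0095-4403(199502/03)21:3<12:WATIIB>2.0.TX;2-J` the function returns the leading ISSN. A document without an ISSN therefore cannot receive a SICI of this form.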
The SICI is recommended as an identifier for BIBLINK.
The PII may be used on all electronic documents,
both serial and non-serial. The identifier is designed to be used
at the item level, i.e. articles in a serial or chapters in a
book and not at the medium level, i.e. on complete documents.
The PII is therefore only appropriate for identifying documents
in the BIBLINK context when single articles or chapters are presented
to the national library for bibliographic recording and/or deposit.
The PII is not under the control of any responsible authority and is not a recognised international standard. The identifier is assigned to the documents by the publishers themselves. This implies that the uniqueness and the persistence of the identifier depend on the individual publishers.
Even so, the PII is «guaranteed» to be
both globally unique and persistent. However, it is not unique
in the way described in section 4.3: «A document published
in several versions (e.g. on CD-ROM and World Wide Web) should
have separate unique identifiers». An article published
on, for example, paper and the Internet will carry the same PII
(carrying the ISSN of the original document). An important point
about the PII is that it identifies articles independently from
their packaging unit. Consequently a PII may have one ISSN prefix
and appear in a publication with a different ISSN.
The identifier may, in principle, be extended to
identify components (tables, graphics) and manifestations (e.g.
SGML) of an item. The identifier may easily be transcribed by
humans for citation purposes and be transported by the common
Internet protocols. The PII can be validated locally.
The PII is a «new» identifier, adopted
for all articles published by the publishers involved in the PII
initiative from 1996 onwards, by Springer and some other primary
publishers as well as by their secondary databases (e.g. Chemical
Abstracts). The use of PII is encouraged by the initiators and
is expected to spread. The PII is primarily aimed at documents
of interest to scientific publishers.
The PII fails to meet the most important requirements
set up for a BIBLINK identifier. The fact that the same identifier
will be assigned to items (e.g. articles) issued on different
media is a major objection to the identification scheme. An important
demand of a BIBLINK identifier is that it will identify different
versions of a publication separately.
Another objection is that it is assigned at the item
level and not at the medium level. The PII could therefore not
be used to identify complete documents, which will be the majority
of the documents dealt with in the BIBLINK context. In the main
it will be complete books or serials that will be presented to
the national library for bibliographic recording and/or deposit.
Within a serial contents database an identification scheme like
PII could be interesting, but this interest is outside the BIBLINK
scope. If and when publishers start to publish articles on demand
or to provide a database with full-text articles, an identifier
at the item level might be useful.
The PII is not recommended as an identifier for BIBLINK.
The DOI can be used on all types of electronic documents.
The DOI is designed for on-line documents and there are, at the
time of writing, some concerns about its use for off-line products.
DOIs can also be used for new products beyond traditional print
and online equivalents to print, for example software plug-ins
which create chemical models that can be analysed and rotated.
The DOI may be used to identify any electronic publication
at any level of granularity, both at the item level and at the
medium level. A journal can have a DOI, each of the articles
it contains may also have a DOI and each of the pictures in those
articles may have a DOI.
The DOI is not as yet a recognised standard and is
still under development. According to the plans, the identifier
will be under control of a central agency, which will provide
identifiers to publishers and quality control for the entire DOI
system. The DOI is a «new» initiative, the project
only started in 1996, but the underlying Handle System is already
available and it seems likely that the DOI system will be widely
adopted by the publishing industry. A demonstrator system is now
available and it is planned that a working prototype will be unveiled
at the Frankfurt Book Fair in October 1997.
The DOI is guaranteed to be globally unique, persistent
and extensible. The identifier may easily be transcribed by
humans for citation purposes and be transported by the common
Internet protocols.
The DOI will depend on an external resolution service
for validation. At present most library systems are programmed
to validate and compare identifiers like ISBN and ISSN. If a DOI
is used the library system could not perform these controls locally
but would have to make contact with an external resolution service
for comparison and validation. This would require adaptation of
the existing library systems and well functioning and stable resolution
services.
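The contrast with local check-digit validation can be sketched as follows. The resolver interface here is an assumption made for illustration: a callable that returns the current location of an identifier, returns `None` for an unknown identifier, and raises `ConnectionError` when the service cannot be reached. No real resolution service or API is implied.

```python
from typing import Callable, Optional

# Hypothetical resolver signature, see the assumptions stated above.
Resolver = Callable[[str], Optional[str]]


def validate_externally(identifier: str, resolver: Resolver) -> str:
    """Classify an identifier as 'valid', 'invalid' or 'unverified'."""
    try:
        location = resolver(identifier)
    except ConnectionError:
        # Unlike a check-digit test, the outcome depends on service availability.
        return "unverified"
    return "valid" if location is not None else "invalid"


# Stand-in resolvers for illustration only.
_KNOWN = {"example-id-1": "http://publisher.example/doc1"}


def demo_resolver(identifier: str) -> Optional[str]:
    return _KNOWN.get(identifier)


def offline_resolver(identifier: str) -> Optional[str]:
    raise ConnectionError("resolution service unreachable")
```

The third outcome, "unverified", is the practical consequence of depending on an external service: a library system must decide how to treat records whose identifiers could not be checked at the time of entry.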
Except for not being a standard, the DOI meets most
of the requirements set up for a BIBLINK identifier and also,
importantly, the DOI is specifically developed to identify electronic
documents. In addition, the DOI will act as a resolution service
for electronic documents, in a similar way to the PURL system
and what is planned in URN systems. Therefore DOIs may also solve
the «moving URLs» problem. Presumably the DOI system
may eventually become part of a wider URN system.
The DOI is recommended as an identifier for BIBLINK.
An important question is how widespread the use of the identifier
will be, particularly in Europe. Since the DOI will be a very
useful identifier for BIBLINK, the national libraries should encourage
the development of the identifier.
The URN is being developed to identify any electronic
resources on the Internet. Off-line documents, e.g. CD-ROMs or
floppy disks could not be assigned a URN.
The URN will be under the control of an authorised
partner, a «Namespace identifier» who will be responsible
for a correct assignment of URNs.
According to the requirements, the URN will be both
globally unique and persistent, and any scheme defined as a «Namespace»
must be extensible. A URN can be transported by Internet protocols
and easily be transcribed by humans for citation purposes.
The URN will depend on an external resolution service
for validation. Currently most library systems are programmed
to validate and compare identifiers like ISBN and ISSN as they
are entered into the system. If a URN is used, the library system
could not perform these functions locally, but would have to make
contact with an external resolution service for comparison and
validation. This would require adaptation of the existing library
systems and well functioning and stable resolution services.
The URN is not a standard as yet and is still under
development.
The development of the URN is closely connected
to the use of URLs. A URL (Uniform Resource Locator) is the most
common way of providing access to an Internet resource and is
used in the hypertext links in World Wide Web documents. URLs
are also recorded in bibliographic catalogues, sometimes as
hypertext links to the document itself. A URL is not a unique
and persistent identification for an electronic document. The
URL is both the name and the address (location) of the document.
If the document is moved, the URL will change and the document
may be difficult to locate. Duplicate copies of the same content
will have different URLs. This is a general problem for all users
of Internet resources.
There is reason to believe that the URN will be a
part of a resolution service, which will link persistent and unique
URNs to one or more current URLs. A URC (Uniform Resource Characteristics)
is one way of developing a resolution service. A URC could contain
the URN and at least one URL. In addition, a URC might contain
bibliographic and other information about the resource, so-called
metadata. This would solve the «moving URLs» problem
and makes it very likely that URNs will be widely adopted on the
Internet.
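The relationship between URN, URL and URC described above can be sketched as a simple data structure. This is a minimal in-memory illustration, not a description of any actual resolution service; the class names, the `relocate` method and the `urn:example:` names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class URC:
    """One resolution record: a persistent name plus its current locations."""
    urn: str
    urls: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)


class ResolutionService:
    """In-memory stand-in for a resolution service (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[str, URC] = {}

    def register(self, urc: URC) -> None:
        if not urc.urls:
            raise ValueError("a URC needs at least one URL")
        self._records[urc.urn] = urc

    def relocate(self, urn: str, old_url: str, new_url: str) -> None:
        """Record that a document moved; the URN itself never changes."""
        urls = self._records[urn].urls
        urls[urls.index(old_url)] = new_url

    def resolve(self, urn: str) -> List[str]:
        """Return the current URLs for a URN, or an empty list if unknown."""
        record = self._records.get(urn)
        return list(record.urls) if record else []


service = ResolutionService()
service.register(URC("urn:example:report-1",
                     ["http://old.example/report.html"],
                     {"title": "Sample report"}))
service.relocate("urn:example:report-1",
                 "http://old.example/report.html",
                 "http://new.example/report.html")
```

A catalogue record that stores only the URN keeps working after the move, because the change of location is recorded once in the resolution service rather than in every record that cites the document.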
The fact that other identification schemes, for instance
ISSN or SICI, may be recognised as a «Namespace» could
make the URN easy to adopt for publishers. They can continue to
use an identification scheme which is useful for reasons other
than location, e.g. ordering or rights management, and at the same
time have a URN assigned to their documents.
The URN meets most of the requirements for a BIBLINK
identifier, except that it can not be used to identify off-line
documents. Section 4.1 Requirement introduction stated
that it will not be possible to recommend only one identification
scheme for BIBLINK. Based on this premise, it is not a significant
problem that the URN does not cover all documents in the scope.
The URN could be used for all on-line documents and other identifiers
could be used for off-line documents.
There is the concern that the identifier does not
«exist» as yet, but when (if) it does the identifier
will be very suitable for identifying all on-line documents in
the BIBLINK scope. The success of the URNs may be dependent on
a well organised resolution service.
The URN is recommended as an identifier for BIBLINK.
The PURL may be used to identify any World Wide Web
document, but can not be assigned to off-line documents. A PURL
can be assigned at both the document level and at the item level,
i.e. to individual parts of the document.
The PURL is not a standard and will not become one.
The PURL service is set up as a short term solution and will be
replaced by URNs when this identification scheme is available.
There is no international or national authority responsible
for assigning PURLs. The resolver service (OCLC or other) will
be responsible for the linkage between PURLs and URLs, but publishers
may allocate PURLs for their own documents.
PURLs are unique and easily extensible. Their persistence
is dependent on the lifetime of the resolver service. However
it is expected that organisations that are willing to set up a
PURL resolution service show commitment on a long term basis.
PURLs can be easily transcribed by humans for citation purposes
and transported by the common Internet protocols.
PURLs depend on an external resolution service for
validation. Currently most library systems are programmed to validate
and compare identifiers like ISBN and ISSN locally, as they are
entered into the system. If a PURL is used, the library system
could not perform these functions locally, but would have to make
contact with an external resolution service for comparison and
validation. This would require adaptation of existing library
systems and well functioning and stable resolution services.
Important requirements for a BIBLINK identifier are
not met by the PURL. The PURL initiative is a short term solution
and is not recommended as an identifier for BIBLINK.
A PURL may still be of interest to the national libraries as a temporary solution for keeping track of Internet resources until the URN is available. If documents recorded in the national bibliography are assigned a PURL, many of the problems concerning electronic document and catalogue maintenance will be solved.