INTERNET-DRAFT A. Van Kerckhoven
Document: draft-avk-bib-music-rec-00.txt Fibonacci
June 21, 1999
Expires December 21, 1999
Music Records Markup Language (MuReML)
1. Status of this memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
To view the list of Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html.
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
2. Abstract
Many music libraries, music centers, authors' societies, music
publishers, music shops, broadcasting companies and members of the
public need to share bibliographic musical records. No standard
format exists, and exchanging musical records involves substantial
pre- and/or post-processing of the data.
Searching, sorting and cataloging music bibliographic records does
not currently follow any standard. In each library and each music
information center, the researcher needs to use a different
procedure. In some cases, it is simply impossible to obtain the
targeted result.
This document defines the requirements for a standard musical
bibliographic format.
3. Introduction
Many music libraries, music centers, authors' societies, music
publishers, music shops, broadcasting companies and members of the
public need to share bibliographic musical records. No standard
format exists, and exchanging musical records involves substantial
pre- and/or post-processing of the data.
Searching, sorting and cataloging music bibliographic records does
not currently follow any standard. In each library and each music
information center, the researcher needs to use a different
procedure. In some cases, it is simply impossible to obtain the
targeted result.
The following format may serve as the basis of a standard for
musical bibliographic records.
It is designed as an XML application, as defined by W3C in
REC-xml-19980210, accessible at http://www.w3.org/TR/REC-xml.html.
It features all the properties of the XML metalanguage: structural
extensibility, validity controls, independence of data and
formatting, and support for heterogeneous data sources and targets.
XML will likely be supported by the main web browsers in the near
future.
This format fits the goals and recommendations of RFC-2413 (Dublin
Core Metadata for Resource Discovery):
- Simplicity of creation and maintenance
- Commonly understood semantics
- Conformance to existing and emerging standards
- International scope and applicability
- Extensibility
- Interoperability among collections and indexing systems
Organizations may automate to any degree (or not at all) both the
creation of these records (about their own publications) and the
handling of the records received from other organizations.
This format is designed to be simple, for people and for machines,
to be easy to read ("human readable") and create without any special
programs.
This format takes into account many aspects of digital libraries,
including searching and accessing techniques that do not
necessarily use bibliographic records (for example, natural language
processing, automatic and full-text indexing). However,
bibliographic records are expected to remain an important part of
the library system environment of the future, and they form an
important link between the servers of records and the clients on
site, on line or using distributed media.
The use of this format is free and encouraged. There are no
limitations on its use.
3. Formal Syntax
The following syntax specification uses the Extensible Markup
Language (XML) 1.0.
3.1. Character set
The character set used is defined by ISO-10646 (which codes
characters on 32 bits) and permits the use of symbols and non-Latin
alphabets. It is preferable, but not mandatory, to declare it
explicitly.
All characters defined by ISO-10646 are permitted except "&" and
"<". These two can be used, like any other character, by giving
their ISO-10646 code as a numeric character reference or by using
the predefined XML entities, with the standard XML syntax:
&#38; or &amp; for "&"
&#60; or &lt; for "<"
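The escaping rule above can be illustrated with a minimal Python
sketch. It relies only on the standard library; the tag name
"media_title" is an assumption made for the illustration and is not
confirmed by this document.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def encode_field(tag: str, value: str) -> str:
    """Wrap a raw value in an element, escaping "&" and "<" (and ">")."""
    return f"<{tag}>{escape(value)}</{tag}>"


# "media_title" is a hypothetical tag name used only for this sketch.
encoded = encode_field("media_title", "Rock & Roll <live>")
print(encoded)

# Numeric character references decode to the same characters on parsing.
decoded = ET.fromstring(
    "<media_title>Rock &#38; Roll &#60;live></media_title>"
).text
print(decoded)
```

Escaping at write time and letting the parser decode at read time
keeps the stored value itself free of any markup concerns.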
3.2. Languages
XML's support for multiple human languages, using the "xml:lang"
attribute, handles cases where the same character set is employed by
multiple human languages. In consequence, MuReML is a multi-language
format. It makes it possible to label the language chosen for each
field, and the default language of the record. XML syntax applied
to ISO-639 (for language) and optionally to ISO-3166 (for regional
linguistic particularities) may be used.
My Foot!
Mon Œil!
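As a hedged illustration of per-field language labels, the sketch
below reads "xml:lang" from a small record. The tag names "work" and
"work_title" and the language codes "en-US" and "fr-BE" are
assumptions chosen for the example; only the two title strings come
from this document.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record: tag names and language codes are assumptions.
SAMPLE = """
<work xml:lang="en-US">
  <work_title>My Foot!</work_title>
  <work_title xml:lang="fr-BE">Mon Œil!</work_title>
</work>
"""


def titles_by_language(xml_text: str) -> dict:
    """Map each title's language to its text, using the record default."""
    root = ET.fromstring(xml_text.strip())
    default_lang = root.get(XML_LANG, "")
    return {
        title.get(XML_LANG, default_lang): title.text
        for title in root.findall("work_title")
    }


print(titles_by_language(SAMPLE))
```

The first title carries no label of its own and therefore takes the
default language declared on the enclosing record.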
3.3. Specific formattings
The data of each field may follow any appropriate ISO standard for
its representation. In accordance with XML syntax, this standard is
declared in the opening tag.
F
1999-10-02T20:30:00Z
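The sketch below shows one way a reader might honor a standard
declared in the opening tag. The attribute name "iso" and the tag
name "media_recording_date" are assumptions made for the
illustration; this document does not preserve the actual markup.
Only the timestamp value is taken from the text above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical markup: the "iso" attribute and the tag name are assumptions.
SAMPLE = (
    '<media_recording_date iso="8601">'
    "1999-10-02T20:30:00Z"
    "</media_recording_date>"
)


def parse_iso8601_field(xml_text: str) -> datetime:
    """Parse a UTC timestamp field whose opening tag declares ISO 8601."""
    element = ET.fromstring(xml_text)
    if element.get("iso") != "8601":
        raise ValueError("field does not declare ISO 8601 formatting")
    parsed = datetime.strptime(element.text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


print(parse_iso8601_field(SAMPLE).isoformat())
```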
3.4. Cases
Data are case-sensitive.
3.5. White space and End-of-line handling
The Music Records Markup Language, as an application of XML, has the
same white space handling and end-of-line handling as specified in
Extensible Markup Language (XML) 1.0 (W3C Recommendation
10-February-1998).
3.6. Grammar
XML has been chosen because it is a flexible, self-describing,
structured data format that supports rich schema definitions, and
because of its support for multiple character sets. XML's self-
describing nature allows any property's value to be extended by
adding new elements.
This format is a "tagged" format with self-explaining alphabetic
tags. It should be possible to prepare and to read bibliographic
records using any text editor, without any special programs.
It is very easy to adapt any database for reading and writing this
format. Converters may be developed to transform such data from this
format to plain text or HTML for example.
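A converter of this kind could be sketched as follows. This is a
minimal illustration and not a reference implementation; the tag
names in the sample record are assumptions, while the values are
borrowed from the example given later in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the tag names are assumptions, not tags
# confirmed by this document.
SAMPLE = """
<media>
  <media_id>BS542187935</media_id>
  <media_title>Two works for four hands</media_title>
  <media_type>music sheet</media_type>
</media>
"""


def to_plain_text(xml_text: str) -> str:
    """Render each child element of a record as a "name: value" line."""
    root = ET.fromstring(xml_text.strip())
    lines = [root.tag.upper()]
    for field in root:
        value = " ".join((field.text or "").split())  # collapse line wraps
        lines.append(f"  {field.tag}: {value}")
    return "\n".join(lines)


print(to_plain_text(SAMPLE))
```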
As an XML application, the layout and design of the formatted data
may be freely chosen through standard style sheet mechanisms such as
Cascading Style Sheets (CSS1, CSS2) or the Extensible Stylesheet
Language (XSL).
Since linear records are unable to efficiently manage the relation
between the different kinds of information involved in music records
management, the relational aspect of cataloguing must be maintained.
Each element has a descriptive name intended to convey a common
semantic understanding.
Each packet of data considered in this format contains all
significant information regarding a specific aspect of a record.
This implies that several linked tables with several fields are
necessary. Some fields are mandatory to ensure the integrity of the
records and the consistency of the links.
Each packet starts with an identifier (possibly random). This
identifier serves to check the relative identity of each packet and
to make it traceable. A community of users may use it to identify
its own packets.
3.7. The tables
The various tables must follow the format described below. This
diagram constitutes the minimum requirement of the format. It can be
extended with other tables for particular uses. To fit the needs of
musical records management (for example: highest hierarchy of
tables, unnecessary differentiation of the various contributors...),
this structure diverges slightly from the recommendations of the
Dublin Core Metadata Element Set.
Some tables have a one-to-many relationship with others. This
implies that some tables may be repeated as needed, for example for
works with several rights-holders (composer, author, arranger,
publisher, sub-publisher...) or for media including several
versions.
Tables are also optional. They may appear in any order inside a
particular packet.
MEDIA
Records relating to the media (physical supports) carrying the
versions.
OCCURRENCE
Records relating to the occurrence of a particular version in a
particular format.
RAPPORT
Records relating to the relationship between a particular version
and a rights-holder.
RIGHTS-HOLDER
Records relating to the rights-holders of the works
(composers, librettists, arrangers, publishers, sub-publishers...).
VERSION
Records relating to the instrumental versions of the works.
WORK
Records relating to the original works.
3.8. The fields
The various fields should follow the format described below. These
diagrams constitute the minimum requirement of the format. They can
be extended with other fields for particular uses. The names (or
tags) of these additional fields have to be built in accordance
with XML requirements.
These fields are repeatable. A missing mandatory field invalidates
the packet.
Each field tag name begins with the parent table name, followed
by an underscore. For example : Monochrone
[M] means Mandatory; a record without it is invalid.
[O] means Optional (listed here to illustrate the extensibility of
MuReML).
[l:table] designates the table targeted by a link. Only the fields
are part of this format. Links may optionally be used in database
systems to optimize data management and the consultation of the
records.
PACKET
------
[M] id
MEDIA
-----
[M] id
[M] title
[M] type
[O] producer
[O] localization
[O] keywords
[O] notes
OCCURRENCE
----------
[M] id
[M] id_version [l:version]
[M] id_media [l:media]
[O] keywords
[O] notes
VERSION
-------
[M] id
[M] id_work [l:work]
[M] specificity
[O] opus
[O] start_composition_date
[O] start_composition_place
[O] end_composition_date
[O] end_composition_place
[O] keywords
[O] notes
WORK
----
[M] id
[M] title
[O] original language title
[O] US-english title
[O] start_composition_date
[O] start_composition_place
[O] end_composition_date
[O] end_composition_place
[O] duration
[O] citations
[O] keywords
[O] notes
RAPPORT
-------
[M] id
[M] id_rights-holder [l:rights-holder]
[M] id_version [l:version]
[M] status
[O] keywords
[O] notes
RIGHTS-HOLDER
-------------
[M] id
[M] name
[O] type
[O] contact
[O] keywords
[O] notes
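The rule that a missing mandatory field invalidates a record can be
checked mechanically. The following sketch transcribes the [M]
fields from the listing above; representing a record as a plain
Python dict of field names to values is an assumption made for the
illustration.

```python
# Mandatory fields per table, transcribed from the listing above.
MANDATORY = {
    "MEDIA": {"id", "title", "type"},
    "OCCURRENCE": {"id", "id_version", "id_media"},
    "VERSION": {"id", "id_work", "specificity"},
    "WORK": {"id", "title"},
    "RAPPORT": {"id", "id_rights-holder", "id_version", "status"},
    "RIGHTS-HOLDER": {"id", "name"},
}


def missing_mandatory(table: str, record: dict) -> set:
    """Return the mandatory fields that are absent or empty in a record."""
    return {name for name in MANDATORY[table] if not record.get(name)}


# A record is a dict of field name -> value (an assumed representation).
work = {"id": "00000001", "title": "La bella Postina", "duration": "00:12:30"}
broken_version = {"id": "PF2H0001", "specificity": "piano four-hand"}

print(missing_mandatory("WORK", work))               # nothing missing
print(missing_mandatory("VERSION", broken_version))  # id_work is missing
```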
3.9. Meta Format and DTD
MuReML is an open format. Communities of users may extend it to
suit their own needs. The minimal elements needed in a DTD to fit
the MuReML specifications are:
4. Example
---------------------- Begin of Example -------------------
AVK990127223015
BS542187935
Two works for four hands
music sheet
Big Deal Publishing
Produced by the publisher
a12354879654-12
PF2H0001
BS542187935
a12354879655-13
PF2H0002
BS542187935
PF2H0001
00000001
piano four-hand
102
ordered by the publisher and dedicated
to AmF
PF2H0002
00000002
piano four-hand
102
ordered by the publisher
00000001
La bella Postina
1999-02-05
00:12:30
modules, rehearsals, repetitive, composer's
introduction
00000002
Jazz
1999-01-15
1999-01-30
00:09:00
5478985251454117
BE_ED001
PF2H0001
original publisher
5478985251454117
BE_CP001
PF2H0001
composer
5478985251454117
BE_CP002
PF2H0002
composer
BE_ED001
Big Deal Publishing
publisher
Alain Van Kerckhoven
post-modernism, classical, Devreese,
Lachert, Lysight, Mendes, Pelecis
Founded in 1989
BE_CP001
Lachert, Piotr
composer
Lachert, Piotr
post-modernism, computer music,
letters music
BE_CP002
Lysight, Michel
composer
Lysight, Michel
post-modernism, repetitive music,
minimalism
------------------------- End of Example -------------------
Indentations and line-breaks are for convenient visualization.
This example illustrates a music sheet (MEDIA BS542187935) titled
"Two works for four hands". It includes one version each (PF2H0001
and PF2H0002) of two different works: "La bella Postina" (00000001)
and "Jazz" (00000002). The first one is published and has two
rights-holders: the publisher Alain Van Kerckhoven (BE_ED001) and
the composer Piotr Lachert (BE_CP001). The second one is unpublished
and has been reproduced with a simple agreement of the composer,
Michel Lysight (BE_CP002), who holds all the rights.
For reference, the above example contains 3,405 characters.
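To show how the links in this example can be followed, the sketch
below holds the mandatory fields of its records in plain Python
structures and resolves them. The in-memory representation and the
function names are assumptions made for the illustration; the
identifiers and values are those of the example.

```python
# Mandatory fields of the example's records (assumed representation).
OCCURRENCES = [
    {"id_version": "PF2H0001", "id_media": "BS542187935"},
    {"id_version": "PF2H0002", "id_media": "BS542187935"},
]
VERSIONS = {
    "PF2H0001": {"id_work": "00000001"},
    "PF2H0002": {"id_work": "00000002"},
}
WORKS = {"00000001": "La bella Postina", "00000002": "Jazz"}
RAPPORTS = [
    {"id_rights-holder": "BE_ED001", "id_version": "PF2H0001",
     "status": "original publisher"},
    {"id_rights-holder": "BE_CP001", "id_version": "PF2H0001",
     "status": "composer"},
    {"id_rights-holder": "BE_CP002", "id_version": "PF2H0002",
     "status": "composer"},
]
RIGHTS_HOLDERS = {
    "BE_ED001": "Big Deal Publishing",
    "BE_CP001": "Lachert, Piotr",
    "BE_CP002": "Lysight, Michel",
}


def works_on_media(media_id: str) -> list:
    """Follow OCCURRENCE -> VERSION -> WORK links to list work titles."""
    return [
        WORKS[VERSIONS[occ["id_version"]]["id_work"]]
        for occ in OCCURRENCES
        if occ["id_media"] == media_id
    ]


def rights_holders_of(version_id: str) -> list:
    """Follow RAPPORT -> RIGHTS-HOLDER links for one version."""
    return [
        (RIGHTS_HOLDERS[r["id_rights-holder"]], r["status"])
        for r in RAPPORTS
        if r["id_version"] == version_id
    ]


print(works_on_media("BS542187935"))
print(rights_holders_of("PF2H0001"))
```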
5. Mandatory fields description
PACKET_ID
Any value (random, sequential or inductive) marking the beginning
and the end of each packet, in order to avoid merging of packets
in case of a media defect.
MEDIA_ID
Identifies the media record and is used in management of these
records.
MEDIA_TITLE
Main title of the media. If necessary, sub-titles or translations of
the title go in other fields.
MEDIA_TYPE
Type of medium: music sheet, CD, CD-ROM, DVD... The formats of the
information can be described in other fields (encoding, file type,
standard, compression...).
OCCURRENCE_ID
Identifies the occurrence of a version in a media.
OCCURRENCE_ID_VERSION
Points to a specific version.
OCCURRENCE_ID_MEDIA
Points to a specific media.
VERSION_ID
Identifies the version record and is used in management of these
records.
VERSION_ID_WORK
Points to a specific work.
VERSION_SPECIFICITY
Main information making this version different from the other
versions of the same work. It will often describe the
instrumentation: clarinet and piano, flute and piano, etc.
WORK_ID
Identifies the work record and is used in management of these
records.
WORK_TITLE
Main title of the work. If necessary, sub-titles or translations
of the title go in other fields.
RAPPORT_ID
Identifies the rapport between a rights-holder and a version.
RAPPORT_ID_RIGHTS-HOLDER
Points to a specific rights-holder.
RAPPORT_ID_VERSION
Points to a specific version.
RAPPORT_STATUS
Describes the status of the rights-holder regarding the pointed
version. A composer may be an arranger, and a publisher may be
a librettist.
RIGHTS-HOLDER_ID
Identifies a rights-holder record and is used in management of these
records.
RIGHTS-HOLDER_NAME
Name of the rights-holder (person or company). This includes
composers, publishers, sub-publishers, librettists, transcribers,
illustrators, arrangers, orchestrators etc.
6. Security Considerations
The Music Records Markup Language, as an application of XML, has the
same security considerations as specified in [RFC-2376].
7. Acknowledgments
This document has benefited greatly from the luminous suggestion by
Mark Needlaman to turn my first format proposal
(draft-avk-bib-music-rec-01.txt) into an XML application.
Thanks to John Stracke for introducing the Dublin Core to me.
Thanks to Steve Coya and to the IESG for their criticism of the
first release of this memo.
Thanks to the "lazy bits" of Brussels. You know who you are.
Thanks to Mireille.
8. References
[1] Alvestrand, H., "Tags for the identification of languages", RFC
1766, UNINETT, March 1995.
[2] Berners-Lee, T., and D. Connolly, "HyperText Markup Language
Specification - 2.0", RFC 1866, MIT/LCS, November 1995.
[3] W3C XML Working Group (WG), "Extensible Markup Language (XML)
1.0", W3C Recommendation, February 1998.
[4] Weibel, S., Kunze, J., Lagoze, C., Wolf, M., "Dublin Core
Metadata Element Set"
[5] Weibel, S., Kunze, J., Lagoze, C., Wolf, M., "Dublin Core
Metadata for Resource Discovery" RFC-2413
[6] Date and Time Formats (based on ISO 8601), W3C Technical Note,
[7] Ohta, M., "Character Sets ISO-10646 and ISO-10646-J-1", RFC
1815, Tokyo Institute of Technology, July 1995.
[8] ISO-639, "Code for the representation of names of languages.",
International Standards Organization, 1988
[9] ISO-3166, "Codes for the Representation of Names of Countries",
International Standards Organization, May 1981.
9. Author's Address
Alain Van Kerckhoven
avenue Broustin 110
B-1083 Brussels
Belgium
Phone: +32 2 420.21.21
Fax : +32 2 420.05.05
EMail: alain@avk.org
10. Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.