Matroska Tags
slhomme@matroska.org
moritz@bunkus.org
dave@dericed.com
art
cellar
This document defines the Matroska tags, namely the tag names and their respective semantic meaning.
Matroska aims to become THE standard of multimedia container formats. It can store timestamped multimedia data but also chapters and tags. The Tag Elements add important metadata to identify and classify the information found in a Matroska Segment. It can tag a whole Segment, separate Track Elements, individual Chapter Elements or Attachment Elements.
While the Matroska tagging framework allows anyone to create their own custom tags, it's important to have a common set of values for interoperability. This document intends to define a set of common tag names used in Matroska.
This document is a work-in-progress specification defining the Matroska file format as part of the IETF Cellar working group. It uses basic elements and concepts already defined in the Matroska specifications defined by this workgroup.
Tag values can be either strings or binary blobs. This document inherits security considerations from the EBML and Matroska documents.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
When a Tag is nested within another Tag, the nested Tag becomes an attribute of the base tag. For instance, if you wanted to store the dates that a singer used certain addresses for, that singer being the lead singer for a track that included multiple bands simultaneously, then your tag tree would look something like this:
Targets
  TrackUID
BAND
  LEADPERFORMER
    ADDRESS
      DATE
      DATEEND
    ADDRESS
      DATE
In this way, it becomes possible to store any Tag as attributes of another tag.
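The nesting described above can be modelled as a small tree. The following Python sketch is illustrative only: the `SimpleTag` class, its field names, and the `dump` helper are hypothetical stand-ins for the SimpleTag element, and all values are invented sample data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SimpleTag:
    """Hypothetical in-memory stand-in for a Matroska SimpleTag element."""
    name: str
    value: Optional[str] = None
    children: List["SimpleTag"] = field(default_factory=list)


def dump(tag: SimpleTag, depth: int = 0) -> List[str]:
    """Return one indented line per tag, with children under their parent."""
    label = tag.name if tag.value is None else f"{tag.name}: {tag.value}"
    lines = ["  " * depth + label]
    for child in tag.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Invented sample data mirroring the BAND example above.
band = SimpleTag("BAND", "Example Band", [
    SimpleTag("LEADPERFORMER", "Example Singer", [
        SimpleTag("ADDRESS", "1 Example Street", [
            SimpleTag("DATE", "2001"),
            SimpleTag("DATEEND", "2003"),
        ]),
        SimpleTag("ADDRESS", "2 Sample Avenue", [
            SimpleTag("DATE", "2003"),
        ]),
    ]),
])

print("\n".join(dump(band)))
```

Note that the two addresses are kept as two separate ADDRESS tags, each with its own nested dates, rather than as one list-valued string.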
Multiple items SHOULD NOT be stored as a list in a single TagString. If there is more than one tag of a certain type to be stored, then more than one SimpleTag SHOULD be used.
For authoring Tags outside of EBML, the following XML syntax, as used in mkvmerge, is proposed. Binary data SHOULD be stored using BASE64 encoding if it is being stored at authoring time.
There is a debate between people who think all tags SHOULD be free and those who think all tags SHOULD be strict. If you look at this page you will realise we are in between.
Advanced-user applications might let you put any tag in your file, while most other applications give you a basic list of tags you can use. Both have their needs. But it's usually a bad idea to use custom/exotic tags because you will probably be the only person to use this information even though everyone else could benefit from it. So hopefully when someone wants to put information in their file, they will find an official tag that fits and use it. If it's not in the list, this person can contact us at any time to request the addition of the missing tag. But that doesn't mean it will be accepted. Matroska files are not meant to become a whole database of the people who made costumes for a film; a website would be better for that. It's hard to define what SHOULD be in and what doesn't make sense in a file, so we'll treat each request carefully.
We also need an official list simply for developers to be able to display relevant information in their own design (if they choose to support a list of meta-information they SHOULD know which tag has the wanted meaning so that other apps could understand the same meaning).
To be able to save tags from other systems to Matroska we need to translate them to our system. There is a translation table on our site.
The TagName SHOULD always be written in all capital letters and contain no space.
The fields with dates SHOULD have the following format: YYYY-MM-DD hh:mm:ss.mss, where YYYY = Year, MM = Month, DD = Day, hh = Hours, mm = Minutes, ss = Seconds, mss = Milliseconds. To store less accuracy, remove items starting from the right. To store only the year, you would use "2004". To store a specific day such as May 1st, 2003, you would use "2003-05-01".
Fields that require a Float SHOULD use the "." mark instead of the "," mark. To display it differently for another locale, applications SHOULD support auto replacement on display. Also, a thousands separator SHOULD NOT be used.
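As a rough illustration of the "remove items starting from the right" rule, the Python sketch below accepts the full date format and every right-truncated prefix of it. The function name `is_tag_date` is hypothetical, the three-digit millisecond field is an assumption drawn from the "mss" placeholder, and the check only covers the shape of the string, not whether the month, day, or time values are in range.

```python
import re

# Each optional group can only appear if everything to its left is present,
# which mirrors truncating the date from the right.
_TAG_DATE = re.compile(
    r"[0-9]{4}"              # YYYY
    r"(?:-[0-9]{2}"          # -MM
    r"(?:-[0-9]{2}"          # -DD
    r"(?: [0-9]{2}"          # hh
    r"(?::[0-9]{2}"          # :mm
    r"(?::[0-9]{2}"          # :ss
    r"(?:\.[0-9]{3})?"       # .mss (assumed to be exactly three digits)
    r")?)?)?)?)?"
)


def is_tag_date(value: str) -> bool:
    """Return True if value has the shape of a (possibly truncated) tag date."""
    return _TAG_DATE.fullmatch(value) is not None


for sample in ("2004", "2003-05-01", "2003-05-01 12:30:00.250", "05-01"):
    print(sample, is_tag_date(sample))
```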
For currency amounts, there SHOULD only be a numeric value in the Tag. Only numbers, no letters or symbols other than ".". For instance, you would store "15.59" instead of "$15.59USD".
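The numeric conventions above can be checked and applied with a few lines of code. This Python sketch is one possible reading, not a normative parser: the helper names `is_plain_decimal` and `for_display` are hypothetical, and restricting values to unsigned digits with at most one "." is an assumption based on the currency rule.

```python
import re

# Digits with an optional fractional part; no sign, no currency symbols,
# no "," and no thousands separators (assumed reading of the rules above).
_PLAIN_DECIMAL = re.compile(r"[0-9]+(?:\.[0-9]+)?")


def is_plain_decimal(value: str) -> bool:
    """Return True if value contains only digits and at most one '.'."""
    return _PLAIN_DECIMAL.fullmatch(value) is not None


def for_display(value: str, decimal_mark: str = ",") -> str:
    """Swap the stored '.' for a locale-specific mark at display time only."""
    return value.replace(".", decimal_mark)


print(is_plain_decimal("15.59"))       # stored form
print(is_plain_decimal("$15.59USD"))   # rejected: symbols and letters
print(for_display("15.59"))            # shown with a "," decimal mark
```

The stored value stays in the "." form; only the string handed to the user interface is changed.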
The TargetType element allows tagging of different parts that are inside or outside a given file. For example in an audio file with one song you could have information about the album it comes from and even the CD set even if it's not found in the file.
For an application to know what kind of information (like TITLE) relates to a certain level (CD title or track title), we also need a set of official TargetType names. For now audio and video will have different values and names. That also means the same tag name can have different meanings depending on where it is (otherwise we would end up with 15 TITLE_ tags).
| TargetTypeValue | Audio strings | Video strings | Comment |
|---|---|---|---|
| 70 | COLLECTION | COLLECTION | the high hierarchy consisting of many different lower items |
| 60 | EDITION / ISSUE / VOLUME / OPUS | SEASON / SEQUEL / VOLUME | a list of lower levels grouped together |
| 50 | ALBUM / OPERA / CONCERT | MOVIE / EPISODE / CONCERT | the most common grouping level of music and video (equals to an episode for TV series) |
| 40 | PART / SESSION | PART / SESSION | when an album or episode has different logical parts |
| 30 | TRACK / SONG | CHAPTER | the common parts of an album or a movie |
| 20 | SUBTRACK / PART / MOVEMENT | SCENE | corresponds to parts of a track for audio (like a movement) |
| 10 | - | SHOT | the lowest hierarchy found in music or movies |
A tag set at an upper level applies to the lower levels. That means if a CD has the same artist for all tracks, you just need to set the ARTIST tag at level 50 (ALBUM) and not on each TRACK (but you can). That also means that if some parts of the CD have no known ARTIST, the value MUST be set to nothing (an empty string "") on those parts.
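The inheritance rule can be sketched as a lookup that walks from the current level up through the higher TargetTypeValue levels. This is a simplified Python illustration under stated assumptions: `resolve_tag` is a hypothetical helper, the dictionary layout (one tag set per level) ignores the fact that a real file has many tracks distinguished by their UIDs, and the sample values are invented.

```python
from typing import Dict, Optional


def resolve_tag(tags_by_level: Dict[int, Dict[str, str]],
                name: str, level: int) -> Optional[str]:
    """Return the value of `name` seen from `level`.

    The closest level at or above `level` that defines the tag wins.
    An empty string is a real value ("not known here") and stops the search.
    """
    for candidate in sorted(lvl for lvl in tags_by_level if lvl >= level):
        tags = tags_by_level[candidate]
        if name in tags:
            return tags[name]
    return None


# ARTIST is set once on the ALBUM (50) and inherited by the TRACK (30).
inherited = {50: {"ARTIST": "Example Artist"}, 30: {"TITLE": "Example Song"}}
print(resolve_tag(inherited, "ARTIST", 30))

# A track with no known artist carries an explicit empty string.
unknown = {50: {"ARTIST": "Example Artist"}, 30: {"ARTIST": ""}}
print(repr(resolve_tag(unknown, "ARTIST", 30)))
```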
When a level doesn't exist it MUST NOT be specified in the files, so that the TOTAL_PARTS and PART_NUMBER elements match the same levels.
Here is an example of how these organizational tags work: If you set TOTAL_PARTS to 10 at the ALBUM level (50), it means the album contains 10 lower parts. The lower part in question is the first lower level that is specified in the file. So if it's TRACK (30), then that means it contains 10 tracks. If it's MOVEMENT (20), that means it's 10 movements, etc.
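Finding "the first lower level that is specified in the file" amounts to picking the highest TargetTypeValue below the level that carries TOTAL_PARTS. The Python sketch below shows that selection; `counted_level` is a hypothetical helper, and `AUDIO_NAMES` copies only the first audio string of each row of the TargetTypeValue table.

```python
from typing import Iterable, Optional

# First audio string for each TargetTypeValue, taken from the table above.
AUDIO_NAMES = {70: "COLLECTION", 60: "EDITION", 50: "ALBUM",
               40: "PART", 30: "TRACK", 20: "SUBTRACK"}


def counted_level(specified: Iterable[int], level: int) -> Optional[int]:
    """Return the level that TOTAL_PARTS at `level` is counting.

    `specified` holds the TargetTypeValue levels present in the file.
    The result is the closest specified level below `level`, or None
    when the file specifies nothing lower.
    """
    lower = [value for value in specified if value < level]
    return max(lower) if lower else None


# ALBUM (50) and TRACK (30) are present: TOTAL_PARTS on the album counts tracks.
print(AUDIO_NAMES[counted_level([50, 30], 50)])

# Only ALBUM (50) and SUBTRACK/MOVEMENT (20) are present: it counts movements.
print(AUDIO_NAMES[counted_level([50, 20], 50)])
```

This is also why a level that doesn't exist must not appear in the file: an unused intermediate level would change which level TOTAL_PARTS and PART_NUMBER refer to.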
The following is a complete list of the supported Matroska Tags. While it is possible to use Tag names that are not listed below, this is not recommended as compatibility will be compromised. If you find that there is a Tag missing that you would like to use, then please contact the Matroska team for its inclusion in the specifications before the format reaches 1.0.
Nesting Information tags are intended to contain other tags.
| Tag Name | Type | Description |
|---|---|---|
| ORIGINAL | - | A special tag that is meant to have other tags inside (using nested tags) to describe the original work of art that this item is based on. All tags in this list can be used "under" the ORIGINAL tag like LYRICIST, PERFORMER, etc. |
| SAMPLE | - | A tag that contains other tags to describe a sample used in the targeted item taken from another work of art. All tags in this list can be used "under" the SAMPLE tag like TITLE, ARTIST, DATE_RELEASED, etc. |
| COUNTRY | UTF-8 | The name of the country (biblio ISO-639-2) that is meant to have other tags inside (using nested tags) to hold country-specific information about the item. All tags in this list can be used "under" the COUNTRY tag like LABEL, PUBLISH_RATING, etc. |
| Tag Name | Type | Description |
|---|---|---|
| TOTAL_PARTS | UTF-8 | Total number of parts defined at the first lower level. (e.g. if TargetType is ALBUM, the total number of tracks of an audio CD) |
| PART_NUMBER | UTF-8 | Number of the current part of the current level. (e.g. if TargetType is TRACK, the track number of an audio CD) |
| PART_OFFSET | UTF-8 | A number to add to PART_NUMBER when the parts at that level don't start at 1. (e.g. if TargetType is TRACK, the track number of the second audio CD) |
| Tag Name | Type | Description |
|---|---|---|
| TITLE | UTF-8 | The title of this item. For example, for music you might label this "Canon in D", or for a video's audio track you might use "English 5.1". This is akin to the TIT2 tag in ID3. |
| SUBTITLE | UTF-8 | Subtitle of the entity. |
Nested Information includes tags contained in other tags.
| Tag Name | Type | Description |
|---|---|---|
| URL | UTF-8 | URL corresponding to the tag it's included in. |
| SORT_WITH | UTF-8 | A child element to indicate what alternative value the parent tag can have to be sorted, for example "Pet Shop Boys" instead of "The Pet Shop Boys". Or "Marley Bob" and "Marley Ziggy" (no comma needed). |
| INSTRUMENTS | UTF-8 | The instruments that are being used/played, separated by a comma. It SHOULD be a child of the following tags: ARTIST, LEAD_PERFORMER or ACCOMPANIMENT. |
| EMAIL | UTF-8 | Email corresponding to the tag it's included in. |
| ADDRESS | UTF-8 | The physical address of the entity. The address SHOULD include a country code. It can be useful for a recording label. |
| FAX | UTF-8 | The fax number corresponding to the tag it's included in. It can be useful for a recording label. |
| PHONE | UTF-8 | The phone number corresponding to the tag it's included in. It can be useful for a recording label. |
| Tag Name | Type | Description |
|---|---|---|
| ARTIST | UTF-8 | A person or band/collective generally considered responsible for the work. This is akin to the TPE1 tag in ID3. |
| LEAD_PERFORMER | UTF-8 | Lead Performer/Soloist(s). This can sometimes be the same as ARTIST. |
| ACCOMPANIMENT | UTF-8 | Band/orchestra/accompaniment/musician. This is akin to the TPE2 tag in ID3. |
| COMPOSER | UTF-8 | The name of the composer of this item. This is akin to the TCOM tag in ID3. |
| ARRANGER | UTF-8 | The person who arranged the piece, e.g., Ravel. |
| LYRICS | UTF-8 | The lyrics corresponding to a song (in case audio synchronization is not known or as a duplicate of a subtitle track). Editing this value when subtitles are found SHOULD also result in editing the subtitle track for more consistency. |
| LYRICIST | UTF-8 | The person who wrote the lyrics for a musical item. This is akin to the TEXT tag in ID3. |
| CONDUCTOR | UTF-8 | Conductor/performer refinement. This is akin to the TPE3 tag in ID3. |
| DIRECTOR | UTF-8 | This is akin to the IART tag in RIFF. |
| ASSISTANT_DIRECTOR | UTF-8 | The name of the assistant director. |
| DIRECTOR_OF_PHOTOGRAPHY | UTF-8 | The name of the director of photography, also known as cinematographer. This is akin to the ICNM tag in Extended RIFF. |
| SOUND_ENGINEER | UTF-8 | The name of the sound engineer or sound recordist. |
| ART_DIRECTOR | UTF-8 | The person who oversees the artists and craftspeople who build the sets. |
| PRODUCTION_DESIGNER | UTF-8 | Artist responsible for designing the overall visual appearance of a movie. |
| CHOREGRAPHER | UTF-8 | The name of the choreographer. |
| COSTUME_DESIGNER | UTF-8 | The name of the costume designer. |
| ACTOR | UTF-8 | An actor or actress playing a role in this movie. This is the person's real name, not the name of the character the person is playing. |
| CHARACTER | UTF-8 | The name of the character an actor or actress plays in this movie. This SHOULD be a sub-tag of an ACTOR tag in order not to cause ambiguities. |
| WRITTEN_BY | UTF-8 | The author of the story or script (used for movies and TV shows). |
| SCREENPLAY_BY | UTF-8 | The author of the screenplay or scenario (used for movies and TV shows). |
| EDITED_BY | UTF-8 | This is akin to the IEDT tag in Extended RIFF. |
| PRODUCER | UTF-8 | Produced by. This is akin to the IPRO tag in Extended RIFF. |
| COPRODUCER | UTF-8 | The name of a co-producer. |
| EXECUTIVE_PRODUCER | UTF-8 | The name of an executive producer. |
| DISTRIBUTED_BY | UTF-8 | This is akin to the IDST tag in Extended RIFF. |
| MASTERED_BY | UTF-8 | The engineer who mastered the content for a physical medium or for digital distribution. |
| ENCODED_BY | UTF-8 | This is akin to the TENC tag in ID3. |
| MIXED_BY | UTF-8 | DJ mix by the artist specified. |
| REMIXED_BY | UTF-8 | Interpreted, remixed, or otherwise modified by. This is akin to the TPE4 tag in ID3. |
| PRODUCTION_STUDIO | UTF-8 | This is akin to the ISTD tag in Extended RIFF. |
| THANKS_TO | UTF-8 | A very general tag for everyone else that wants to be listed. |
| PUBLISHER | UTF-8 | This is akin to the TPUB tag in ID3. |
| LABEL | UTF-8 | The record label or imprint on the disc. |
| Tag Name | Type | Description |
|---|---|---|
| GENRE | UTF-8 | The main genre (classical, ambient-house, synthpop, sci-fi, drama, etc). The format follows the infamous TCON tag in ID3. |
| MOOD | UTF-8 | Intended to reflect the mood of the item with a few keywords, e.g. "Romantic", "Sad" or "Uplifting". The format follows that of the TMOO tag in ID3. |
| ORIGINAL_MEDIA_TYPE | UTF-8 | Describes the original type of the media, such as "DVD", "CD", "computer image", "drawing", "lithograph", and so forth. This is akin to the TMED tag in ID3. |
| CONTENT_TYPE | UTF-8 | The type of the item. e.g. Documentary, Feature Film, Cartoon, Music Video, Music, Sound FX, ... |
| SUBJECT | UTF-8 | Describes the topic of the file, such as "Aerial view of Seattle." |
| DESCRIPTION | UTF-8 | A short description of the content, such as "Two birds flying." |
| KEYWORDS | UTF-8 | Keywords to the item separated by a comma, used for searching. |
| SUMMARY | UTF-8 | A plot outline or a summary of the story. |
| SYNOPSIS | UTF-8 | A description of the story line of the item. |
| INITIAL_KEY | UTF-8 | The initial key that a musical track starts in. The format is identical to ID3. |
| PERIOD | UTF-8 | Describes the period that the piece is from or about. For example, "Renaissance". |
| LAW_RATING | UTF-8 | Depending on the COUNTRY it's the format of the rating of a movie (P, R, X in the USA, an age in other countries or a URI defining a logo). |
| ICRA | binary | The ICRA content rating for parental control. (Previously RSACi) |
| Tag Name | Type | Description |
|---|---|---|
| DATE_RELEASED | UTF-8 | The time that the item was originally released. This is akin to the TDRL tag in ID3. |
| DATE_RECORDED | UTF-8 | The time that the recording began. This is akin to the TDRC tag in ID3. |
| DATE_ENCODED | UTF-8 | The time that the encoding of this item was completed. This is akin to the TDEN tag in ID3. |
| DATE_TAGGED | UTF-8 | The time that the tags were done for this item. This is akin to the TDTG tag in ID3. |
| DATE_DIGITIZED | UTF-8 | The time that the item was transferred to a digital medium. This is akin to the IDIT tag in RIFF. |
| DATE_WRITTEN | UTF-8 | The time that the writing of the music/script began. |
| DATE_PURCHASED | UTF-8 | Information on when the file was purchased (see also ). |
| Tag Name | Type | Description |
|---|---|---|
| RECORDING_LOCATION | UTF-8 | The location where the item was recorded. The countries corresponding to the string, same 2 octets as in Internet domains, or possibly ISO-3166. This code is followed by a comma, then more detailed information such as state/province, another comma, and then city. For example, "US, Texas, Austin". This will allow for easy sorting. It is okay to only store the country, or the country and the state/province. More detailed information can be added after the city through the use of additional commas. In cases where the province/state is unknown, but you want to store the city, simply leave a space between the two commas. For example, "US, , Austin". |
| COMPOSITION_LOCATION | UTF-8 | Location that the item was originally designed/written. The countries corresponding to the string, same 2 octets as in Internet domains, or possibly ISO-3166. This code is followed by a comma, then more detailed information such as state/province, another comma, and then city. For example, "US, Texas, Austin". This will allow for easy sorting. It is okay to only store the country, or the country and the state/province. More detailed information can be added after the city through the use of additional commas. In cases where the province/state is unknown, but you want to store the city, simply leave a space between the two commas. For example, "US, , Austin". |
| COMPOSER_NATIONALITY | UTF-8 | Nationality of the main composer of the item, mostly for classical music. The countries corresponding to the string, same 2 octets as in Internet domains, or possibly ISO-3166. |
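The comma-separated location layout used by RECORDING_LOCATION and COMPOSITION_LOCATION can be split into its parts as follows. This Python sketch is illustrative: the `Location` type and `parse_location` function are hypothetical names, and representing a blank field (as in "US, , Austin") as `None` is a design choice of the sketch rather than something the text prescribes.

```python
from typing import List, NamedTuple, Optional


class Location(NamedTuple):
    """Hypothetical parsed form of a location tag value."""
    country: str
    region: Optional[str]   # state/province, None when left blank or absent
    city: Optional[str]     # None when absent
    extra: List[str]        # any further detail after the city


def parse_location(value: str) -> Location:
    """Split 'country, state/province, city, ...' into named fields."""
    parts = [part.strip() for part in value.split(",")]
    parts += [""] * (3 - len(parts))  # pad so country-only values still unpack
    country, region, city = parts[:3]
    return Location(country, region or None, city or None, parts[3:])


print(parse_location("US, Texas, Austin"))
print(parse_location("US, , Austin"))
print(parse_location("US"))
```

Because the country always comes first, sorting the raw strings already groups items by country, then by state/province, then by city.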
| Tag Name | Type | Description |
|---|---|---|
| COMMENT | UTF-8 | Any comment related to the content. |
| PLAY_COUNTER | UTF-8 | The number of times the item has been played. |
| RATING | UTF-8 | A numeric value defining how much a person likes the song/movie. The number is between 0 and 5 with decimal values possible (e.g. 2.7), 5(.0) being the highest possible rating. Other rating systems with different ranges will have to be scaled. |
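The text does not say how ratings from other systems are to be scaled into the 0 to 5 range. One plausible approach, shown here purely as an assumption, is a linear mapping from the source range. The function name `scale_rating` and its parameters are hypothetical.

```python
def scale_rating(value: float, src_min: float, src_max: float) -> float:
    """Linearly map a rating from [src_min, src_max] onto the 0..5 range.

    Linear scaling is an assumption of this sketch; the tag description
    only requires that other rating systems be scaled somehow.
    """
    if src_max <= src_min:
        raise ValueError("src_max must be greater than src_min")
    if not src_min <= value <= src_max:
        raise ValueError("value lies outside the source range")
    return (value - src_min) * 5 / (src_max - src_min)


# A 7 on a 0..10 scale, and 3 stars on a 1..5 star scale.
print(scale_rating(7, 0, 10))
print(scale_rating(3, 1, 5))
```

The result would then be written to the tag as a string using "." as the decimal mark.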
| Tag Name | Type | Description |
|---|---|---|
| ENCODER | UTF-8 | The software or hardware used to encode this item. ("LAME" or "XviD") |
| ENCODER_SETTINGS | UTF-8 | A list of the settings used for encoding this item. No specific format. |
| BPS | UTF-8 | The average bits per second of the specified item. This is only the data in the Blocks, and excludes headers and any container overhead. |
| FPS | UTF-8 | The average frames per second of the specified item. This is typically the average number of Blocks per second. In the event that lacing is used, each laced chunk is to be counted as a separate frame. |
| BPM | UTF-8 | Average number of beats per minute in the complete target (e.g. a chapter). Usually a decimal number. |
| MEASURE | UTF-8 | In Western music, a measure is a unit of time, such as "4/4". It represents a regular grouping of beats, a meter, as indicated in musical notation by the time signature. The majority of the contemporary rock and pop music you hear on the radio these days is written in the 4/4 time signature. |
| TUNING | UTF-8 | It is saved as a frequency in hertz to allow near-perfect tuning of instruments to the same tone as the musical piece (e.g. "441.34" in Hertz). The default value is 440.0 Hz. |
| REPLAYGAIN_GAIN | binary | The gain to apply to reach 89dB SPL on playback. This is based on the Replay Gain standard. Note that ReplayGain information can be found at all TargetType levels (track, album, etc). |
| REPLAYGAIN_PEAK | binary | The maximum absolute peak value of the item. This is based on the Replay Gain standard. |
| Tag Name | Type | Description |
|---|---|---|
| ISRC | UTF-8 | The International Standard Recording Code, excluding the "ISRC" prefix and including hyphens. |
| MCDI | binary | This is a binary dump of the TOC of the CDROM that this item was taken from. This holds the same information as the MCDI in ID3. |
| ISBN | UTF-8 | International Standard Book Number. |
| BARCODE | UTF-8 | EAN-13 (European Article Numbering) or UPC-A (Universal Product Code) bar code identifier. |
| CATALOG_NUMBER | UTF-8 | A label-specific string used to identify the release (TIC 01 for example). |
| LABEL_CODE | UTF-8 | A 4-digit or 5-digit number to identify the record label, typically printed as (LC) xxxx or (LC) 0xxxx on CD media or covers (only the number is stored). |
| LCCN | UTF-8 | Library of Congress Control Number. |
| Tag Name | Type | Description |
|---|---|---|
| PURCHASE_ITEM | UTF-8 | URL to purchase this file. This is akin to the WPAY tag in ID3. |
| PURCHASE_INFO | UTF-8 | Information on where to purchase this album. This is akin to the WCOM tag in ID3. |
| PURCHASE_OWNER | UTF-8 | Information on the person who purchased the file. This is akin to the TOWN tag in ID3. |
| PURCHASE_PRICE | UTF-8 | The amount paid for the entity. There SHOULD only be a numeric value in here. Only numbers, no letters or symbols other than ".". For instance, you would store "15.59" instead of "$15.59USD". |
| PURCHASE_CURRENCY | UTF-8 | The currency type used to pay for the entity. Use ISO-4217 for the 3 letter currency code. |
| Tag Name | Type | Description |
|---|---|---|
| COPYRIGHT | UTF-8 | The copyright information as per the copyright holder. This is akin to the TCOP tag in ID3. |
| PRODUCTION_COPYRIGHT | UTF-8 | The copyright information as per the production copyright holder. This is akin to the TPRO tag in ID3. |
| LICENSE | UTF-8 | The license applied to the content (like Creative Commons variants). |
| TERMS_OF_USE | UTF-8 | The terms of use for this item. This is akin to the USER tag in ID3. |
In the Target list, a logical OR is applied across all tracks and a logical OR is applied across all chapters. A logical AND is then applied between the Tracks list and the Chapters list to determine whether an element belongs to this Target.
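The OR-then-AND rule can be written as a short predicate. This Python sketch makes several assumptions that are not stated in the text above: the function name `target_applies` and its parameters are hypothetical, UIDs are modelled as plain integers, and an empty list is treated as "no restriction" for that kind of element.

```python
from typing import Collection


def target_applies(target_tracks: Collection[int],
                   target_chapters: Collection[int],
                   track_uid: int, chapter_uid: int) -> bool:
    """Decide whether a (track, chapter) pair belongs to a Target.

    Matching any listed track is enough (logical OR), and likewise for
    chapters; both conditions must then hold together (logical AND).
    An empty list is assumed to place no restriction.
    """
    track_ok = not target_tracks or track_uid in target_tracks
    chapter_ok = not target_chapters or chapter_uid in target_chapters
    return track_ok and chapter_ok


# Target lists tracks 1 and 2 and chapter 10 (invented UIDs).
print(target_applies([1, 2], [10], track_uid=2, chapter_uid=10))
print(target_applies([1, 2], [10], track_uid=2, chapter_uid=11))
print(target_applies([1, 2], [10], track_uid=3, chapter_uid=10))
```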