Playlist Ad-Hoc Group | L. Gonze |
M. Friedrich | |
R. Kaye | |
D. Brown | |
January 2005 |
XSPF Version 1
XML Shareable Playlist Format ("spiff")
We describe an XML playlist format which is open, moderately simple, and carefully engineered.
1 Introduction
1.1 Example
2 Administration
2.1 History
2.2 Acknowledgements
2.3 Terminology
2.3.1 URI, URLs and URNs
2.3.2 Requirements notation
3 Abstractions
3.1 Defining playlists
3.2 What a playlist is not
3.3 Shareability
3.4 Content resolver
3.5 Fuzzy names
4 Element definitions
4.1 elements
4.1.1 playlist
4.1.1.1 attributes
4.1.1.1.1 xmlns
4.1.1.1.2 version
4.1.1.2 elements
4.1.1.2.1 title
4.1.1.2.2 creator
4.1.1.2.3 annotation
4.1.1.2.4 info
4.1.1.2.5 location
4.1.1.2.6 identifier
4.1.1.2.7 image
4.1.1.2.8 date
4.1.1.2.9 license
4.1.1.2.10 attribution
4.1.1.2.11 link
4.1.1.2.11.1 attributes
4.1.1.2.11.1.1 rel
4.1.1.2.11.2 content
4.1.1.2.12 meta
4.1.1.2.12.1 attributes
4.1.1.2.12.1.1 rel
4.1.1.2.12.2 content
4.1.1.2.13 extension
4.1.1.2.13.1 attributes
4.1.1.2.13.1.1 application
4.1.1.2.13.2 content
4.1.1.2.14 trackList
4.1.1.2.14.1 elements
4.1.1.2.14.1.1 track
4.1.1.2.14.1.1.1 elements
4.1.1.2.14.1.1.1.1 location
4.1.1.2.14.1.1.1.2 identifier
4.1.1.2.14.1.1.1.3 title
4.1.1.2.14.1.1.1.4 creator
4.1.1.2.14.1.1.1.5 annotation
4.1.1.2.14.1.1.1.6 info
4.1.1.2.14.1.1.1.7 image
4.1.1.2.14.1.1.1.8 album
4.1.1.2.14.1.1.1.9 trackNum
4.1.1.2.14.1.1.1.10 duration
4.1.1.2.14.1.1.1.11 link
4.1.1.2.14.1.1.1.11.1 attributes
4.1.1.2.14.1.1.1.11.1.1 rel
4.1.1.2.14.1.1.1.11.2 content
4.1.1.2.14.1.1.1.12 meta
4.1.1.2.14.1.1.1.12.1 attributes
4.1.1.2.14.1.1.1.12.1.1 rel
4.1.1.2.14.1.1.1.12.2 content
4.1.1.2.14.1.1.1.13 extension
4.1.1.2.14.1.1.1.13.1 attributes
4.1.1.2.14.1.1.1.13.1.1 application
4.1.1.2.14.1.1.1.13.2 content
5 Requirements for XSPF generators
6 Requirements for XSPF players
6.1 Graceful failure
6.2 Relative paths
6.3 Extension URIs
7 Use cases for playlists
7.1 Flag player application
7.2 Allow streaming
7.3 Collecting fragmented resources
7.4 Alternate media types
7.5 Caching derived info
7.6 Metadata storage
7.7 Authoring compilations for expressive reasons
8 Recipes
8.1 How do I set relative paths in an XSPF playlist, for example if I want to use it as a file manifest?
8.2 How do I convert XSPF to M3U?
8.3 How do I convert XSPF to HTML?
8.4 How do I convert XSPF to SMIL?
8.5 How do I convert XSPF to Soundblox?
8.6 How do I customize XSPF? Should I use namespaces?
8.7 How do I validate XSPF?
8.8 How do I use MusicBrainz metadata?
8.9 How do I refer to a BitTorrent?
8.10 How do I refer to a Magnet or sha1: URI?
§ References
§ Author's Addresses
A IANA Considerations
A.1 MIME media type name
A.2 MIME subtype name
A.3 Mandatory parameters
A.4 Optional parameters
A.5 Translated into plain English
A.6 File extension
A.7 Security Considerations
§ Intellectual Property and Copyright Statements
There is no XML format for playlists that can measure up to the standards of the formats for web pages (HTML), weblogs (RSS), and web graphs (RDF/XML). It is evident that there is a need, because XML is the preferred data description language of the moment and as a result the tools and skills to use it are ubiquitous.
It is also evident that existing playlist formats fall short. ASX (for Windows Media Player) and the iTunes library format are proprietary. ASX resembles XML in that it uses angle brackets, but is not XML by any means. M3U, RAM, and M4U are flat files; QuickTime is binary; Pls is in the Windows .ini format; Gnomoradio RDF is RDF, not XML. SMIL addresses a much larger problem space than the average MP3 player. The timing model of RSS doesn't fit audio and video. Forcing timing models into HTML, as HTML+Time does, creates an unintelligible feature set. Few of these formats are well documented. None of these formats make simple features easy to code and hard features possible. Only one is an open standard. Not one offers playlist interoperability across major vendors.
The question for software developers is why should I support this new XML playlist format? The choice is mainly between M3U and SMIL. Almost every MP3 player accepts M3U but also invents an XML playlist format. Inventing a format creates work, for example to study related formats; you should use XSPF to avoid the work. SMIL, on the other hand, is a prescription for a kind of application that is different from an MP3 player -- it describes layouts in time, while XSPF describes concepts common among MP3 players. Given a song with the comment "danceable!", SMIL might have an instruction to write that text in the upper left in a bold sans-serif font, while XSPF would tell an MP3 player that the text is a comment and say nothing about formatting.
A very simple document looks like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <playlist version="1" xmlns="http://xspf.org/ns/0/">
      <trackList>
        <track><location>file:///mp3s/song_1.mp3</location></track>
        <track><location>file:///mp3s/song_2.mp3</location></track>
        <track><location>file:///mp3s/song_3.mp3</location></track>
      </trackList>
    </playlist>
or this:
    <?xml version="1.0" encoding="UTF-8"?>
    <playlist version="1" xmlns="http://xspf.org/ns/0/">
      <trackList>
        <track><location>http://example.com/song_1.mp3</location></track>
        <track><location>http://example.com/song_2.mp3</location></track>
        <track><location>http://example.com/song_3.mp3</location></track>
      </trackList>
    </playlist>
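As a rough illustration of how little code a consumer needs for documents like these, the following Python sketch reads the first location of each track using only the standard library. The function name `track_locations` and the `SAMPLE` document are illustrative assumptions, not part of XSPF.

```python
import xml.etree.ElementTree as ET

# Namespace used by the example documents above.
XSPF_NS = {"x": "http://xspf.org/ns/0/"}

def track_locations(xspf_bytes):
    """Return the first location of every track, in document order."""
    root = ET.fromstring(xspf_bytes)
    locations = []
    for track in root.findall("x:trackList/x:track", XSPF_NS):
        location = track.find("x:location", XSPF_NS)
        if location is not None and location.text:
            locations.append(location.text.strip())
    return locations

# Hypothetical sample, shortened from the example above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track><location>http://example.com/song_1.mp3</location></track>
    <track><location>http://example.com/song_2.mp3</location></track>
  </trackList>
</playlist>"""

print(track_locations(SAMPLE))
```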
Our group started work in February 2004, achieved rough consensus on version 0 in April 2004, did implementations and fine tuning throughout summer and fall 2004, and declared the tuned version to be version 1 in January 2005. Version 1 is not far from being frozen and code-ready.
This document describes version 1, which is not ready for implementation. Version 0, the previous one, is stable and frozen -- developers can assume that it will not change.
The home of our working group on the web is http://xspf.org.
We have benefitted a great deal from the contributions of Dan Brickley, Kevin Marks and Ian C. Rogers, each of whom strongly influenced the shape of the format. We are grateful for comments and feedback from Ryan Shaw, Alf Eaton, Steve Gedikian, Russell Garrett, and Ben Tesch. Special thanks to the developers Tomas Franzén (who participated in our work from the very beginning), Jim Garrison, Brander Lien, and Fabricio Zuardi, and to everyone who contributed their time and skill on the mailing list and wiki.
The terms URI, URL, and URN should be interpreted here as follows: a URL is an address of something that can be fetched by a computer; a URN is a name of something which may be purely an abstraction; a URI is either. In this document, //playlist[@xmlns], //playlist/identifier, meta[@rel], link[@rel], and //playlist/trackList/track/identifier are URNs; all other elements are URLs.
See also: RFC2396bis; URIs, URLs, and URNs: Clarifications and Recommendations 1.0
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document MUST NOT be interpreted as described in [RFC2119]. In this document these should be interpreted to mean that something shouted is important. XSPF is not a standards-track document; it is an ad-hoc project by a group of individuals. Developers may, however, find that [RFC2119] is a useful source anyway.
An XSPF playlist describes a sequence of objects to be rendered. Objects might be audio, video, text, playlists, or any other media type. The function of a playlist is to identify the objects and communicate their order.
The function of a playlist is not to communicate metadata about the composer, song title, etc. Metadata is hard and there are many providers already. We decided that we couldn't compete, and that there was no need for us to try. Moreover, good metadata does not travel well -- every user has to recreate it. Metadata should come from external sources and namespaces like MusicBrainz or Gracenote; this is what the XSPF link and meta elements are for.
The function of a playlist is not to store derived information about objects that a user has a copy of. A playlist is not a catalog. A catalog is computed across hard data like files; it stores information like filesystem paths and the contents of ID3 tags. This data has no value on any machine but the one on which it originated. Sharing this data would be a privacy and security violation. Software which needs access to this data has no reason to maintain it in a standard format, because it has no reason to allow access to it. Standardizing this data would be fruitless, because there are an endless number of measurements that software might take and store. Derived information belongs in a catalog.
Things a playlist is not, then, are a metadata format or a catalog. We took care to enable these features, but also to avoid duplicating their functionality poorly.
If there is no reason for a playlist to be shared, there is no need for a new format. Even a buggy format does no damage if it is created and consumed by the same software on the same machine. The need for a new format only comes up when a playlist travels from one machine to another, for example when it is published on the internet.
One type of shareability is between different pieces of software on the same machine. It is common for playlists created with one application to not be usable by another application on the same machine because of different or conflicting interpretations of the playlist format. M3U suffers from this very badly, because M3U playlists often reference files according to a base path which changes from application to application. The XSPF group aimed to fix this by providing unambiguous definitions.
The other type of shareability is between different machines. For playlists to be meaningful on different machines, they must be able to identify network resources. Audio and video objects are often abstractions like "movie X by director Y" rather than computer-friendly objects like "whatever file can be gotten from the URL http://foo/x/y". To handle this problem, we have provided support for media objects to be found via queries; XSPF identifiers are fuzzy names.
On a surface level you can use XSPF like any other playlist format. Drop a bunch of filenames into an XSPF document, prepend "file://" to each, and you're ready to go. Under the surface there is much more.
The guiding design principle was to separate the functionality of a catalog of files from the functionality of a list of songs. Most MP3 players have some sort of cache for file information. This cache stores a list, or catalog, of available files and metadata from ID3 tags and other sources. XSPF is not a catalog format. XSPF exists only to say which songs to play. Almost everything in XSPF is for the purpose of answering the question which resource, rather than the question what is this resource.
If XSPF is not a catalog format, what is it? XSPF is an intermediate format. We expected a new kind of software called a content resolver to do the job of converting XSPF to a plain old list of files or URLs. A content resolver would be smart enough to keep your playlists from breaking when you move your MP3s from /mp3s to /music/mp3. It would be able to figure out that a playlist entry by the artist "Hank Williams" with the title "Your Cheating Heart" could be satisfied by the file /mp3s/hankwilliams/yourcheatingheart.mp3. It might even know how to query the iTunes music store or another online provider to locate and download a missing song.
The content resolver maintains the catalog of your songs in whatever format it prefers. It might use a flatfile, a file in the Berkeley DB format, or a SQL database. It might use only ID3 metadata, but it might also know how to query MusicBrainz or another metadata service.
Any given track can be identified in a number of ways. We provided means for absolute identifiers like URLs, filesystem paths and secure hashes, but also for query-based identifiers -- free text fields like artist and work title and numeric fields for song length, all of which together should be enough for a good content resolver to turn into files.
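To make the idea of a query-based identifier concrete, here is a minimal Python sketch of how a content resolver might turn free-text fields into a file. It is only an assumption about one possible matching strategy (case- and punctuation-insensitive comparison of creator and title); the catalog layout and the names `normalize` and `resolve` are hypothetical, and a real resolver would likely be far more tolerant.

```python
def normalize(text):
    """Lower-case the text and drop everything but letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())

def resolve(track, catalog):
    """Return the path of the first catalog entry matching the track's fuzzy fields.

    track:   dict with optional "creator" and "title" keys (from an XSPF track)
    catalog: list of dicts with "creator", "title" and "path" keys (hypothetical layout)
    """
    wanted = {key: normalize(track[key]) for key in ("creator", "title") if track.get(key)}
    if not wanted:
        return None  # nothing to query with
    for entry in catalog:
        if all(normalize(entry.get(key, "")) == value for key, value in wanted.items()):
            return entry["path"]
    return None

# Hypothetical catalog maintained by the resolver, not by the playlist.
CATALOG = [
    {"creator": "HANK WILLIAMS", "title": "Your Cheating Heart!",
     "path": "/mp3s/hankwilliams/yourcheatingheart.mp3"},
]

print(resolve({"creator": "Hank Williams", "title": "Your Cheating Heart"}, CATALOG))
```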
Notice that the namespace is 0 but the version is 1. This is because version 1 playlists are backwards compatible with version 0 parsers.
Human-readable name of the entity (author, authors, group, company, etc) that authored the playlist. xspf:playlist elements MAY contain exactly one.
A human-readable comment on the playlist. This is character data, not HTML, and it may not contain markup. xspf:playlist elements MAY contain exactly one.
URL of a web page to find out more about this playlist. Likely to be homepage of the author, and would be used to find out more about the author and to find more playlists by the author. xspf:playlist elements MAY contain exactly one.
Canonical ID for this playlist. Likely to be a hash or other location-independent name. MUST be a legal URN. xspf:playlist elements MAY contain exactly one.
URL of an image to display in the absence of a //playlist/trackList/track/image element. xspf:playlist elements MAY contain exactly one.
Creation date (not last-modified date) of the playlist, formatted as an XML Schema dateTime. xspf:playlist elements MAY contain exactly one.
A sample date is "2005-01-08T17:10:47-05:00". PHP to produce such a string from a unix timestamp is:
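    $main_date = date("Y-m-d\TH:i:s", $timestamp);  // date and time, e.g. 2005-01-08T17:10:47
    $tz = date("O", $timestamp);                    // UTC offset without a colon, e.g. -0500
    $tz = substr_replace($tz, ':', 3, 0);           // insert the colon: -05:00
    $date = $main_date . $tz;                       // join the two parts into the final string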
Note: in version 0 of XSPF, this was specified as an ISO 8601 date. xsd:dateTime is the same thing (with better documentation) for almost every date in history, and there are no playlist creation dates that might be different.
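For generators not written in PHP, the same string can be produced along these lines in Python. This is a sketch under the assumption that the generator already knows the UTC offset it wants to express; the function name `xsd_datetime` and its `utc_offset_hours` parameter are illustrative only.

```python
from datetime import datetime, timedelta, timezone

def xsd_datetime(timestamp, utc_offset_hours=0):
    """Format a Unix timestamp as an xsd:dateTime string with an explicit UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # isoformat() already writes the offset with a colon, e.g. -05:00.
    return datetime.fromtimestamp(timestamp, tz).replace(microsecond=0).isoformat()

# The sample date from the text, expressed as a Unix timestamp.
print(xsd_datetime(1105222247, -5))  # 2005-01-08T17:10:47-05:00
```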
URL of a resource that describes the license under which this playlist was released. xspf:playlist elements may contain zero or one license element.
An ordered list of URIs. The purpose is to satisfy licenses allowing modification but requiring attribution. If you modify such a playlist, move its //playlist/location or //playlist/identifier element to the top of the items in the //playlist/attribution element. xspf:playlist elements MAY contain exactly one xspf:attribution element.
Such a list can grow without limit, so as a practical matter we suggest deleting ancestors more than ten generations back.
    <attribution>
      <location>http://bar.com/modified_version_of_original_playlist.xspf</location>
      <identifier>somescheme:original_playlist.xspf</identifier>
    </attribution>
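The bookkeeping described above (move the playlist's own location or identifier to the top of the attribution list, then drop ancestors beyond ten generations) could be sketched in Python as follows. The helper name `record_ancestor`, the constant `MAX_ANCESTORS`, and the choice to prefer location over identifier when both exist are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"
MAX_ANCESTORS = 10  # the practical limit suggested in the text

def q(name):
    """Qualify an element name with the XSPF namespace."""
    return f"{{{XSPF_NS}}}{name}"

def record_ancestor(playlist):
    """Move the playlist's own location (or identifier) to the top of attribution."""
    own = playlist.find(q("location"))
    if own is None:
        own = playlist.find(q("identifier"))
    if own is None:
        return  # nothing to attribute
    attribution = playlist.find(q("attribution"))
    if attribution is None:
        attribution = ET.SubElement(playlist, q("attribution"))
    playlist.remove(own)
    attribution.insert(0, own)
    # Trim ancestors more than ten generations back.
    for stale in list(attribution)[MAX_ANCESTORS:]:
        attribution.remove(stale)

# Hypothetical playlist that is about to be modified and republished.
doc = ET.fromstring(
    '<playlist version="1" xmlns="http://xspf.org/ns/0/">'
    '<location>http://example.com/mine.xspf</location>'
    '<attribution><location>http://example.com/parent.xspf</location></attribution>'
    '<trackList/></playlist>'
)
record_ancestor(doc)
print([child.text for child in doc.find(q("attribution"))])
```

After this step the modified playlist would be given its own new location or identifier by whoever republishes it.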
The link element allows non-XSPF web resources to be included in XSPF documents without breaking XSPF validation. xspf:playlist elements MAY contain zero or more link elements.
<link rel="http://foaf.example.org/namespace/version1">http://socialnetwork.example.org/foaf/mary.rdfs</link>
The meta element allows non-XSPF metadata to be included in XSPF documents without breaking XSPF validation. xspf:playlist elements MAY contain zero or more meta elements.
<meta rel="http://example.org/key">value</meta>
The extension element allows non-XSPF XML to be included in XSPF documents without breaking XSPF validation. The purpose is to allow nested XML, which the meta and link elements do not. xspf:playlist elements MAY contain zero or more extension elements.
    <playlist xmlns:cl="http://example.com">
      <extension application="http://example.com">
        <cl:clip start="25000" end="34500"/>
      </extension>
    </playlist>
Ordered list of xspf:track elements to be rendered. The sequence is a hint, not a requirement; renderers are advised to play tracks from top to bottom unless there is an indication otherwise.
If an xspf:track element cannot be rendered, a user-agent MUST skip to the next xspf:track element and MUST NOT interrupt the sequence.
xspf:playlist elements MUST contain one and only one trackList element. The trackList element may be empty.
URL of resource to be rendered. Probably an audio resource, but MAY be any type of resource with a well-known duration, such as video, a SMIL document, or an XSPF document. The duration of the resource defined in this element defines the duration of rendering. xspf:track elements MAY contain zero or more location elements, but a user-agent MUST NOT render more than one of the named resources.
Canonical ID for this resource. Likely to be a hash or other location-independent name, such as a MusicBrainz identifier or isbn URN (if there existed isbn numbers for audio). MUST be a legal URN. xspf:track elements MAY contain zero or more identifier elements.
Human-readable title of the resource which defines the duration of track rendering. This value is primarily for fuzzy lookups, though a user-agent may display it. xspf:track elements MAY contain exactly one.
Human-readable name of the entity (author, authors, group, company, etc) that authored the resource which defines the duration of track rendering. This value is primarily for fuzzy lookups, though a user-agent may display it. xspf:track elements MAY contain exactly one.
A human-readable comment on the track. This is character data, not HTML, and it may not contain markup. xspf:track elements MAY contain exactly one.
URL of an image to display for the duration of the track. xspf:track elements MAY contain exactly one.
Human-readable name of the collection from which the resource which defines the duration of track rendering comes. For a song originally published as a part of a CD or LP, this would be the title of the original release. This value is primarily for fuzzy lookups, though a user-agent may display it. xspf:track elements MAY contain exactly one.
Integer with value greater than zero giving the ordinal position of the media on the xspf:album. This value is primarily for fuzzy lookups, though a user-agent may display it. xspf:track elements MAY contain exactly one. It MUST be a valid XML Schema nonNegativeInteger.
The time to render a resource, in milliseconds. It MUST be a valid XML Schema nonNegativeInteger. This value is only a hint -- different XSPF generators will generate slightly different values. A user-agent MUST NOT use this value to determine the rendering duration, since the data will likely be low quality. xspf:track elements MAY contain exactly one duration element.
The link element allows non-XSPF web resources to be included in xspf:track elements without breaking XSPF validation.
<link rel="http://foaf.org/namespace/version1">http://socialnetwork.org/foaf/mary.rdfs</link>
The meta element allows non-XSPF metadata to be included in xspf:track elements without breaking XSPF validation.
<meta rel="http://example.org/key">value</meta>
The extension element allows non-XSPF XML to be included in xspf:track elements without breaking XSPF validation. The purpose is to allow nested XML, which the meta and link elements do not. xspf:track elements MAY contain zero or more extension elements.
    <playlist xmlns:cl="http://example.com">
      <trackList>
        <track>
          <extension application="http://example.com">
            <cl:clip start="25000" end="34500"/>
          </extension>
        </track>
      </trackList>
    </playlist>
To ensure interoperability, conforming applications MUST generate playlists that follow the definitions listed in section 4 (element descriptions). A Relax NG schema has been provided to test for syntactic conformance.
If a media player is unable to render a resource, the show MUST go on. Playlists exist in time; a player that stops processing when it encounters an error is considered broken; it is not conformant with the standard; it must be shunned by the community and made an outcast. Players will frequently encounter resources that they cannot render -- this is not a fatal error unless the player stops processing the playlist.
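In code, this requirement amounts to treating a rendering failure as a reason to move on rather than a reason to stop. The Python sketch below shows one way a player loop might do that; `play_all`, `fake_render`, and the track names are hypothetical stand-ins for whatever a real player uses.

```python
def play_all(tracks, render):
    """Render every track in order, skipping any that fail instead of stopping."""
    played, skipped = [], []
    for track in tracks:
        try:
            render(track)
        except Exception:
            # Not a fatal error: note the failure and continue with the next track.
            skipped.append(track)
            continue
        played.append(track)
    return played, skipped

def fake_render(track):
    """Stand-in renderer that cannot handle one made-up media type."""
    if track.endswith(".xyz"):
        raise ValueError("unsupported media type")

played, skipped = play_all(["a.mp3", "b.xyz", "c.mp3"], fake_render)
print(played, skipped)
```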
Relative paths MUST be resolved according to the XML Base specification or IETF RFC 2396:
The rules for determining the base URI can be summarized as follows (highest priority to lowest):
1. The base URI is embedded in the document's content.
2. The base URI is that of the encapsulating entity (message, document, or none).
3. The base URI is the URI used to retrieve the entity.
4. The base URI is defined by the context of the application.
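A simplified Python sketch of these rules follows. It assumes that only an xml:base attribute on the root playlist element is honoured (rule 1) and that the retrieval URI is the fallback (rule 3); a complete implementation would also honour xml:base on nested elements. The function name `resolved_locations` and the sample URLs are illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XSPF_NS = {"x": "http://xspf.org/ns/0/"}
# ElementTree exposes the xml:base attribute under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def resolved_locations(xspf_bytes, retrieval_uri):
    """Resolve every track location against xml:base, else the retrieval URI."""
    root = ET.fromstring(xspf_bytes)
    # An embedded base (itself possibly relative) takes priority over the retrieval URI.
    base = urljoin(retrieval_uri, root.get(XML_BASE, ""))
    return [
        urljoin(base, location.text.strip())
        for location in root.findall("x:trackList/x:track/x:location", XSPF_NS)
        if location.text
    ]

# Hypothetical manifest-style playlist with one relative and one absolute location.
SAMPLE = b"""<playlist version="1" xmlns="http://xspf.org/ns/0/"
    xml:base="http://example.com/media/">
  <trackList>
    <track><location>song_1.mp3</location></track>
    <track><location>http://other.example.org/song_2.mp3</location></track>
  </trackList>
</playlist>"""

print(resolved_locations(SAMPLE, "http://example.com/lists/mix.xspf"))
```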
Scenario: A user clicks on a link to an audio or video object in their browser. The browser needs to hand the object off to a helper application like an MP3 player. If there is an intermediate playlist object between the browser and helper application, and the browser needs to ensure that the right helper is launched, the playlist needs to be of a type which is mapped to the same helper application.
Typical solution: Use a dedicated playlist format for almost every media subtype. For Real audio there is RAM; for MP4 video there is M4U; for MP3 there is M3U; even though RAM, M4U and M3U are almost identical in syntax. The QuickTime format is able to avoid this problem only because the container format and media format are integrated -- a QuickTime file is both a playlist and a media object.
XSPF's solution: The XSPF format does not yet have a solution to this problem, because the working group has not yet tackled it. (Though I can speculate that a content resolver in between the browser and helper application would have the means to do it.)
Scenario: A user clicks on an audio or video link. Before handing off control to the helper application, the browser must download whatever the link points to. For streaming media this makes no sense; either the download will never finish or waiting for a complete download defeats the purpose.
Typical solution: rather than linking to an audio or video document, link to a playlist containing a URL of an audio or video document. Playlists used for this purpose often contain only a single URL. The Pls format, which is used for MP3-based webcasting, and which contains a single URL of a never-ending stream, takes this approach.
XSPF's solution: any reasonably compact playlist format supports this equally well. This rules out iTunes library format and sometimes QuickTime, but allows XSPF along with M3U, Pls and other relatively terse formats.
Scenario: There is a very large object like a DVD rip. The likelihood of downloading the entire object in one shot is low, so the object has been split into pieces. The object then needs to be reassembled on the client side.
Typical solution: Create a zip file or tarball, which uses checksums to ensure integrity of the download; start by sending a playlist which acts as a file manifest and allows a user agent to download sub-objects in digestible chunks. However, a manifest has to express paths to related objects according to a filesystem which does not exist on the client, so there has to be agreement between the client and server on how to interpret relative paths in a playlist. The problem is that few playlist formats -- only SMIL, to my knowledge -- define the meaning of relative paths in a playlist.
XSPF's solution: XSPF clearly defines the meaning of relative paths according to the rule that a client must interpret relative paths in a playlist according to the XML Base specification or IETF RFC 2396.
Scenario: There is a renderer which is capable of rendering one form of a media object but not another. The server is able to deliver the object in either format, but it needs to communicate URLs for both. Though HTTP content negotiation can be used for instances where the renderer contacts the server directly, it doesn't support protocol negotiation, and it can't be used in non-HTTP protocols.
Typical solution: This is particularly a problem for Real, which has a large installed base of obsolete software to be babied. The solution is to deliver alternate URLs within the same playlist and allow the client to choose. The RAM format allows both a pnm: and a rtsp: URL within the same playlist, separated by a line containing the keyword "--stop--".
XSPF's solution: An XSPF track object can contain multiple identifiers or locations for the same media object.
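On the client side, choosing among alternate locations can be as simple as taking the first one whose URL scheme the player supports. The Python sketch below assumes that document order expresses the generator's preference, which is an assumption of this example; `pick_location` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit

def pick_location(locations, supported_schemes):
    """Return the first location whose URL scheme the player can handle, else None."""
    for url in locations:
        if urlsplit(url).scheme.lower() in supported_schemes:
            return url
    return None

# Two alternate locations for the same media object, as a RAM file might offer.
alternates = ["pnm://example.com/talk.rm", "rtsp://example.com/talk.rm"]
print(pick_location(alternates, {"rtsp", "http"}))
```

Only one of the alternates is rendered, in keeping with the rule that a user-agent must not render more than one location of a track.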
Scenario: An MP3 player needs to access information about media objects which is too expensive to compute in real time. For a large number of files, a user can't wait for the player to re-read ID3 tags, compute SHA1 hashes, or perform a Fourier transform for each.
Typical solution: An MP3 player computes the information once, the first time it encounters an object, then caches the data. The iTunes library format stores computed information like ID3 data in the global catalog and playlist.
XSPF's solution: XSPF defers this information to an external module called the content resolver, and mandates that the information not be included in shared playlists.
Scenario: A user needs information about high level concepts like artist and song title rather than machine-level concepts like file name and bit rate. How should artist and song title be communicated, and how should they be stored?
Typical solution: Derive the metadata according to an application-defined process like extracting ID3 tags, then store a copy of the metadata in any playlists that reference a media object. The EXTINF property of the extended M3U format is used in this way.
XSPF's solution: XSPF defers this functionality to other sources. Metadata is hard; there are already many projects to deal with it, some of which are very good. Metadata is attached to an XSPF track according to whatever syntax an imported vocabulary defines. XML namespaces may be used, but the preferred syntax is the XSPF link and meta elements. (These elements allow us to validate metadata from external sources, while namespaces don't.)
Scenario: A businessperson wants to make a batch of videos of related talks from a conference because watching them in a shared context gives a deeper understanding of the subject as a whole.
Typical solution: A user compiles copies of the videos and puts them in the same location, maybe in the same directory on a web server, maybe in the same directory on a hard drive. The user then puts the locations, whether paths or URLs, into a file in the M3U format.
XSPF's solution: The XSPF trackList element contains a sequence of track elements, each of which points to one of the objects.
See the XML Base specification or IETF RFC 2396.
Use the xspf2m3u.xsl stylesheet.
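Where an XSLT processor is not available, the same conversion can be approximated in a few lines of Python. This sketch is not the xspf2m3u.xsl stylesheet and may differ from it in detail; it assumes extended M3U output with an EXTINF line whenever a title is present, and the name `xspf_to_m3u` is illustrative.

```python
import xml.etree.ElementTree as ET

XSPF_NS = {"x": "http://xspf.org/ns/0/"}

def xspf_to_m3u(xspf_bytes):
    """Convert an XSPF document to extended M3U text, one entry per track."""
    root = ET.fromstring(xspf_bytes)
    lines = ["#EXTM3U"]
    for track in root.findall("x:trackList/x:track", XSPF_NS):
        location = (track.findtext("x:location", "", XSPF_NS) or "").strip()
        if not location:
            continue  # M3U can only express tracks that have a location
        title = (track.findtext("x:title", "", XSPF_NS) or "").strip()
        creator = (track.findtext("x:creator", "", XSPF_NS) or "").strip()
        duration_ms = (track.findtext("x:duration", "", XSPF_NS) or "").strip()
        if title:
            # XSPF durations are milliseconds; EXTINF uses whole seconds, -1 if unknown.
            seconds = int(duration_ms) // 1000 if duration_ms.isdigit() else -1
            label = f"{creator} - {title}" if creator else title
            lines.append(f"#EXTINF:{seconds},{label}")
        lines.append(location)
    return "\n".join(lines) + "\n"

# Hypothetical input; the locations are made up for the example.
SAMPLE = b"""<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track>
      <location>http://example.com/smoke.mp3</location>
      <title>Smoke Two Joints</title>
      <creator>Sublime</creator>
      <duration>175466</duration>
    </track>
    <track><location>http://example.com/song_2.mp3</location></track>
  </trackList>
</playlist>"""

print(xspf_to_m3u(SAMPLE), end="")
```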
Use the xspf2html.xsl stylesheet.
Use the xspf2smil.xsl stylesheet.
Use the xspf2soundblox.xsl stylesheet.
Use the meta or link elements. Use meta if the element contains a single value, like "blue" or "rock"; use link if the element contents are a URL. Try to avoid using namespaces to add fields, because namespaced items cannot be validated by an XSPF validator.
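A generator could apply this rule mechanically, as in the following Python sketch. The test for "is a URL" (a scheme from a small fixed set) is an assumption made for the example, and `add_custom_field`, `URL_SCHEMES`, and the rel values are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

XSPF_NS = "http://xspf.org/ns/0/"
URL_SCHEMES = {"http", "https", "ftp", "file"}  # assumption: what counts as a URL here

def add_custom_field(parent, rel, value):
    """Append a link element for URL values and a meta element for anything else."""
    kind = "link" if urlsplit(value).scheme.lower() in URL_SCHEMES else "meta"
    element = ET.SubElement(parent, f"{{{XSPF_NS}}}{kind}", {"rel": rel})
    element.text = value
    return element

track = ET.Element(f"{{{XSPF_NS}}}track")
mood = add_custom_field(track, "http://example.org/mood", "blue")
review = add_custom_field(track, "http://example.org/review", "http://example.org/reviews/42")
print(mood.tag, review.tag)
```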
Matthias Friedrich has created an XML schema for XSPF version 1 at http://www.stud.uni-karlsruhe.de/~uy7l/xspf-1.xsd.
Robert Kaye has created a Relax NG schema for XSPF version 0 draft 8 at http://mayhem-chaos.net/stuff/xspf-draft8.rng. You can use Jing to invoke it.
For users of Emacs nxml-mode, Ryan Shaw has posted a .rnc version of Robert's schema at http://lists.musicbrainz.org/pipermail/playlist/2004-October/000429.html. This is just a matter of putting the .rnc file in the schema/ subdirectory of your nxml-mode installation. nxml-mode will find it automatically and add it to the list of available schemas; if you begin authoring an XSPF playlist, nxml-mode will choose the correct schema by examining the root element name.
Rather than include the literal artist name, song duration, etc, for a track within a playlist, MusicBrainz gives the URL of an XML file containing these items. Assume that the MusicBrainz definition of what a track listing means is at http://musicbrainz.org/track. (There is nothing at that URL, which is fine -- the URL in an XSPF meta[@rel] attribute works the same way as the URL in an XML namespace declaration). A typical track listing has a URL like http://musicbrainz.org/mm-2.1/track/bdc846e7-6c26-4193-82a6-8d1b5a4d3429.
    <track>
      <identifier>bdc846e7-6c26-4193-82a6-8d1b5a4d3429</identifier>
      <title>Smoke Two Joints</title>
      <creator>Sublime</creator>
      <duration>175466</duration>
      <meta rel="http://musicbrainz.org/track">http://musicbrainz.org/mm-2.1/track/bdc846e7-6c26-4193-82a6-8d1b5a4d3429</meta>
    </track>
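A consumer that understands a given vocabulary can pick out its entries by comparing the rel attribute, much as it would compare a namespace name. The Python sketch below assumes the track sits inside a playlist that declares the XSPF namespace; `meta_values` is a hypothetical helper, and no request is made to MusicBrainz.

```python
import xml.etree.ElementTree as ET

XSPF_NS = {"x": "http://xspf.org/ns/0/"}
MUSICBRAINZ_TRACK = "http://musicbrainz.org/track"  # the rel value assumed in the text

def meta_values(track, rel):
    """Return the content of every meta element on the track with the given rel."""
    return [
        meta.text.strip()
        for meta in track.findall("x:meta", XSPF_NS)
        if meta.get("rel") == rel and meta.text
    ]

SAMPLE = b"""<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track>
      <title>Smoke Two Joints</title>
      <creator>Sublime</creator>
      <meta rel="http://musicbrainz.org/track">http://musicbrainz.org/mm-2.1/track/bdc846e7-6c26-4193-82a6-8d1b5a4d3429</meta>
      <meta rel="http://example.org/key">value</meta>
    </track>
  </trackList>
</playlist>"""

first_track = ET.fromstring(SAMPLE).find("x:trackList/x:track", XSPF_NS)
print(meta_values(first_track, MUSICBRAINZ_TRACK))
```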
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
Lucas Gonze | |
EMail: | [email protected] |
Matthias Friedrich | |
EMail: | [email protected] |
Robert Kaye | |
EMail: | [email protected] |
Dave Brown | |
EMail: | [email protected] |
xspf+xml
(This name is provisional, meaning that we have not yet found a volunteer to steer it through the name-granting process).
"charset", per RFC3023.