Efficiently Reading ID3v2 Tags Over HTTP?

Posted by Cliff on Tuesday May 18, 2004 @02:11AM from the can-we-not-swallow-the-whole-file dept.

Paul Crowley asks: "Given an HTTP URL for an MP3 file, what's the best way to read its ID3 tags on a GNU/Linux system? It shouldn't be necessary to fetch the whole file: HTTP byteranges should make it possible to fetch only the tiny fraction that's needed, for a big saving in network bandwidth. However, existing ID3v2 libraries are designed to read local files. Extending these libraries for this purpose, or implementing a new one, would be a big job. What's the clean solution - is FUSE the best way, or is there a simpler way that doesn't require root privs? Can I do it using the existing id3lib binary?"

You'd have to extend the API by Ayanami+Rei · 2004-05-18 02:15 · Score: 4, Interesting

You'd better be prepared to extend the API with a URL handler...

There's no point adding http:// support without also adding ftp:// URL support. FTP supports range fetching as well.

So you have handlers for http:// URLs, ftp:// URLs, and file:// URLs.

Then you'd have to map all the old (compatibility) file-oriented APIs into the new function handlers for file://. (Or maybe the opposite, map file:// into the old API, leaving the old implementation intact)
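
A rough Python sketch of that handler table, not taken from any existing ID3 library. The names `read_range`, `read_file_range`, `read_http_range`, and `HANDLERS` are made up for illustration, and an ftp:// handler is left out; it would be registered in the same table.

```python
from urllib.parse import urlsplit
from urllib.request import Request, url2pathname, urlopen


def read_file_range(url, start, length):
    # file:// handler: this is just the old seek-and-read code path.
    path = url2pathname(urlsplit(url).path)
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length)


def read_http_range(url, start, length):
    # http:// handler: ask the server for only the bytes we want.
    last = start + length - 1
    req = Request(url, headers={"Range": f"bytes={start}-{last}"})
    with urlopen(req) as resp:
        # A server that ignores Range answers 200 and sends the whole file.
        if resp.status != 206:
            raise OSError("server did not honour the Range header")
        return resp.read(length)


# Hypothetical dispatch table: URL scheme -> range reader.
HANDLERS = {
    "file": read_file_range,
    "http": read_http_range,
    "https": read_http_range,
}


def read_range(url, start, length):
    # A bare path with no scheme is treated as a local file.
    scheme = urlsplit(url).scheme or "file"
    if scheme not in HANDLERS:
        raise ValueError(f"no handler for scheme {scheme!r}")
    return HANDLERS[scheme](url, start, length)
```

With this shape the old file-oriented entry points can simply call `read_range` with a plain path, so the compatibility API stays intact.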

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
HTTP 499 by cryptor3 · 2004-05-18 02:28 · Score: 4, Interesting

It seems like it shouldn't be that hard. You just initiate the HTTP transfer and then cancel it as soon as you have as much data as you need.

I haven't actually done it, but speaking as a server operator, when I look through my server logs I see some hits that end with status code 499, meaning that the transfer was aborted. So you just have the client software you're writing close the HTTP connection after it locates the end of the ID3 tag. It's probably not 100% efficient, but obviously a lot better than reading the whole MP3 file.

I'm assuming you're doing this in C/C++, but I'll try to do a prototype in perl.
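
For what it's worth, the abort-early idea can be sketched in Python instead of Perl. This is only a sketch under assumptions: the tag is at the front of the file, the function names are invented, and any optional 10-byte footer after the tag is not read. It takes any binary file-like object, so an HTTP response object would do.

```python
def synchsafe_to_int(b):
    # ID3v2 sizes carry 7 bits per byte; the top bit of each byte is zero.
    if any(x & 0x80 for x in b):
        raise ValueError("not a synchsafe integer")
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def read_exact(stream, n):
    # Loop, because a network stream may hand back short reads.
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("stream ended before the tag did")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_leading_tag(stream):
    """Return the raw ID3v2 tag at the start of stream, or None."""
    header = read_exact(stream, 10)
    if header[:3] != b"ID3":
        return None
    # The size field counts the bytes after the 10-byte header.
    size = synchsafe_to_int(header[6:10])
    return header + read_exact(stream, size)
```

Used as `with urllib.request.urlopen(url) as resp: tag = read_leading_tag(resp)`, leaving the `with` block closes the connection once the tag has been read. The server will usually have pushed some extra bytes by then, which is the "not 100% efficient" part.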
1. Re:HTTP 499 by eyeball · 2004-05-18 03:37 · Score: 4, Informative
  
  From the ID3v2 FAQ:
  
  Q: Where is an ID3v2 tag located in an MP3 file?
  
  It is most likely located at the beginning of the file. Look for the marker "ID3" in the first 3 bytes of the file.
  
  If it's not there, it could be at the end of the file (if the tag is ID3v2.4). Look for the marker "3DI" 10 bytes from the end of the file, or 10 bytes before the beginning of an ID3v1 tag.
  
  That's the problem -- it could be at the end, requiring you to spin through all x bytes (most likely megs) until you get to the end.
  
  --
  
  _______
  2B1ASK1
2. Re:HTTP 499 by pbox · 2004-05-18 05:18 · Score: 4, Insightful
  
  This is why
  
  1. read first 3 bytes with an HTTP byte range
  2. if ID3, process tag from byte 0
  3. else read last 10 bytes
  4. if 3DI, process tag backwards
  5. else, see if there is an ID3v1 tag at the end
  6. if yes, read last 10 bytes before the ID3v1 tag
  7. if 3DI, then process backwards
  
  So it is possible. He just needs to read the fricking id3 tag definitions.
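
The steps above could look roughly like this in Python. It is a sketch under assumptions: `fetch(start, length)` is a hypothetical callable that returns that byte range (over HTTP it would send a `Range: bytes=start-end` header), and `file_size` is assumed known, for example from the `Content-Range` header of an earlier ranged response. It grabs the whole 10-byte header in step 1 rather than 3 bytes, which saves a round trip.

```python
def synchsafe_to_int(b):
    # 4 bytes, 7 significant bits each; the top bit must be clear.
    if any(x & 0x80 for x in b):
        raise ValueError("not a synchsafe integer")
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def locate_id3v2(fetch, file_size):
    """Return (offset, length) of the ID3v2 tag, or None if there is none."""
    # Steps 1-2: a tag at the front starts with "ID3".
    if file_size >= 10:
        head = fetch(0, 10)
        if head[:3] == b"ID3":
            length = 10 + synchsafe_to_int(head[6:10])
            if head[3] == 4 and head[5] & 0x10:
                length += 10  # ID3v2.4 footer flag: 10 more bytes follow
            return 0, length

    def footer_at(end):
        # An appended ID3v2.4 tag ends with a 10-byte footer marked "3DI".
        if end < 20:
            return None
        foot = fetch(end - 10, 10)
        if foot[:3] != b"3DI":
            return None
        # The size field excludes both the header and the footer.
        length = 20 + synchsafe_to_int(foot[6:10])
        if length > end:
            return None  # claims to start before the file does
        return end - length, length

    # Steps 3-4: footer in the last 10 bytes of the file.
    found = footer_at(file_size)
    if found is not None:
        return found
    # Steps 5-7: an ID3v1 tag is the last 128 bytes and starts with "TAG".
    if file_size >= 128 and fetch(file_size - 128, 3) == b"TAG":
        return footer_at(file_size - 128)
    return None
```

In the worst case that is four small range requests, after which one more request for `(offset, length)` pulls down the tag itself.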
  
  --
  Code poet, espresso fiend, starter upper.
ID3v2 Sucks by DeadSea · 2004-05-18 02:38 · Score: 5, Informative

As somebody who has tried to write libraries that read ID3v2 tags, I'd have to say I hate them. The standard is clear and well documented, but the chosen format is horrible. It is very hard to write a parser correctly. It would have been so much better to embed an XML document at the front of the MP3 file. Instead they decided to make each field in a special binary format prepended by a length field.
The number of checks you have to do is phenomenal. The biggest worry is buffer overflow where the length given is greater than the actual length of the tag and you read more than is in the file. There are just hundreds of such edge cases. Libraries for ID3v2 are likely to be buggy, crashy, and just no fun.
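
To give a flavour of the length checking, here is a small Python sketch of a frame walker. It makes several simplifying assumptions: only ID3v2.3 and v2.4 frame headers (4-byte ID, 4-byte size, 2 flag bytes), no extended header, no unsynchronisation, and `tag_body` is the data after the 10-byte tag header. The function name is invented for the example.

```python
def iter_frames(tag_body, version):
    """Yield (frame_id, data) pairs from an ID3v2.3 or v2.4 tag body."""
    if version not in (3, 4):
        raise ValueError("only ID3v2.3 and v2.4 are handled here")
    pos = 0
    while pos + 10 <= len(tag_body):
        frame_id = tag_body[pos:pos + 4]
        if frame_id[0] == 0:
            break  # zero bytes after the last frame are padding
        # Frame IDs are made of A-Z and 0-9 only.
        if not all(65 <= c <= 90 or 48 <= c <= 57 for c in frame_id):
            raise ValueError("garbage where a frame ID should be")
        raw = tag_body[pos + 4:pos + 8]
        if version == 4:
            # v2.4 frame sizes are synchsafe: 7 bits per byte.
            if any(b & 0x80 for b in raw):
                raise ValueError("bad synchsafe frame size")
            size = (raw[0] << 21) | (raw[1] << 14) | (raw[2] << 7) | raw[3]
        else:
            # v2.3 frame sizes are plain big-endian 32-bit integers.
            size = int.from_bytes(raw, "big")
        start = pos + 10
        end = start + size
        # The check that matters: never trust the length field.
        if end > len(tag_body):
            raise ValueError("frame claims more bytes than the tag contains")
        yield frame_id.decode("ascii"), tag_body[start:end]
        pos = end
```

Even this leaves out text encodings, compression, and per-frame flags, each of which brings its own length fields to distrust.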
1. Re:ID3v2 Sucks by Fweeky · 2004-05-18 07:30 · Score: 4, Informative
  
  foobar2000 uses APEv2 tags on MP3s by default. The standard is just as flexible (well, as much as anyone wants anyway), but you only need to compare the sizes of the handlers: an ID3v2 reader/writer I saw was ~150k of code, while the APEv2 one was 15k. The tags are always at the end, but since fb2k is the only player I'm aware of that supports them, the appeal may be limited. You can at least mix them with ID3v1, which should be good enough for portables.
  
  And before anyone goes off on one because it's non-standard, I'll point out that MP3 has *no* provision for metadata. ID3v1 and v2 are just as arbitrary add-ons as APEv2; they're just older (and lamer, either in big limitations or extreme overcomplication).
  
  I believe the recommended *standard* way of attaching metadata to an MP3 now is to put it in an MP4 container, which has its own more sensible format. Again, I'm pretty sure foobar2000 (maybe with some plugin in the Special Installer) can put them in, and I think they should play on anything which knows about MP4. Fully reversible too.
Vorbis comments UTF-8 by rillian · 2004-05-18 09:47 · Score: 4, Informative

Vorbis-comments are ASCII only, right?

No. The field names are ASCII only (actually a printable subset minus '=') but the contents of the fields are specified as UTF-8.

The intention was you could put arbitrary binary data in there too, but there's no general mechanism for marking it as anything else. So any non-UTF-8 use would be application specific.
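
As an illustration, a minimal Python sketch of reading a comment block: little-endian 32-bit lengths, a vendor string, then a list of `NAME=value` entries. It assumes `block` starts at the vendor length, with the Vorbis packet header already stripped, and the function name is made up for the example.

```python
import struct


def parse_vorbis_comments(block):
    """Return (vendor, [(NAME, value), ...]) from a Vorbis comment block."""
    pos = 0

    def take(n):
        # Bounds-checked slice; every length field is checked against the data.
        nonlocal pos
        if pos + n > len(block):
            raise ValueError("comment block is truncated")
        chunk = block[pos:pos + n]
        pos += n
        return chunk

    def take_u32():
        return struct.unpack("<I", take(4))[0]

    vendor = take(take_u32()).decode("utf-8")
    comments = []
    for _ in range(take_u32()):
        entry = take(take_u32())
        name, sep, value = entry.partition(b"=")
        if not sep:
            raise ValueError("comment entry has no '='")
        # Field names: ASCII 0x20 to 0x7D ('=' is already split off).
        if any(not 0x20 <= c <= 0x7D for c in name):
            raise ValueError("field name is not in the allowed ASCII range")
        # Names are case-insensitive, so normalise them; values are UTF-8.
        comments.append((name.decode("ascii").upper(), value.decode("utf-8")))
    return vendor, comments
```

Anything that stores non-UTF-8 data in a value would fail the `decode("utf-8")` step here, which matches the point that such use is application specific.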