Efficiently Reading ID3v2 Tags Over HTTP?
Paul Crowley asks: "Given an HTTP URL for an MP3 file, what's the best way to read its ID3 tags on a GNU/Linux system? It shouldn't be necessary to fetch the whole file: HTTP byteranges should make it possible to fetch only the tiny fraction that's needed, for a big saving in network bandwidth. However, existing ID3v2 libraries are designed to read local files. Extending these libraries for this purpose, or implementing a new one, would be a big job. What's the clean solution - is FUSE the best way, or is there a simpler way that doesn't require root privs? Can I do it using the existing id3lib binary?"
Why couldn't you save the result of the remote HTTP access to a temporary local file and allow the libraries access to that file?
You'd better be prepared to extend the API with a URL handler...
There's no point adding http:// support without also adding ftp:// URL support. FTP supports range fetching as well.
So you have handlers for http:// URLs, ftp:// URLs, and file:// URLs.
Then you'd have to map all the old (compatibility) file-oriented APIs into the new function handlers for file://. (Or maybe the opposite, map file:// into the old API, leaving the old implementation intact)
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
ID3v2 data *IS* at the start of the file, normally takes the first 4kB or so (depends on the padding settings).
It seems like it shouldn't be that hard. You just initiate the HTTP transfer and then cancel it as soon as you have as much data as you need.
I haven't actually done it, but speaking as a server operator, when I look through my server logs, you see some hits that end with status code 499, meaning that the transfer was aborted. So you just have the client software you're writing close the HTTP connection after it locates the end of the ID3 tag. It's probably not 100% efficient, but obviously a lot better than reading the whole MP3 file.
I'm assuming you're doing this in C/C++, but I'll try to do a prototype in perl.
The number of checks you have to do is phenominal. The biggest worry is buffer overflow where the length given is greater than the actual length of the tag and you read more than is in the file. There are just hundreds of such edge cases. Libraries for ID3v2 are likely to be buggy, crashy, and just no fun.
From your question, it sounds like you already have figured out how to use http to grab the relevant byte range. I don't know anything about ID3 tags but if they are at the head of the file, then you just need to get the head, if they are at the tail, then just get the tail. Save the relevant byte ranges as files locally on a directory structure that is based on the URL so that http:/mess-o-mp3.com/content/this.mp3 would map to /mess-o-mp3.com/content/this.mp3. If necessary append or prepend dummy mp3 file so that existing ID3v2 libraries that are designed to read local files can read the ID3 tags. Then just run the "existing ID3v2 library" against the file tree that you have just built, and translate the output from describing the contents of the file system to describing the contents of the Internet.
This doesn't see so complicated to me
Well, *actually* from what I understand, the id3v2 tags *can* be put anywhere in the file, so you could have (for instance) an hour-long mp3 of a radio stream and have the tags change with each song. I'm not sure if anything actually supports that, though. Regardless, if you tag with id3v2, it'll be right at the beginning.
Al Qaeda has ninjas!
Look at the MP3::Info Perl module, you might recognize the author's handle. It reads (and writes) tag info. It's used by the "jukebox" module Apache::MP3 (sample site) to generate pages with track info.
Basically every web jukebox out there does something like this so I'm sure there's plenty of other code available to work from. The mod_perl way is to put SetHandler perl-script then PerlHandler [name of module] in your httpd.conf file so when a URL request falls within that Location or Directory, the perl module handles returning whatever you want it to return.
If he needs to scan hundreds or thousands of files, that WILL add up in a hurry. Also, if he's clever, he can take advantage persistant HTTP connections, diable Nagle's Algorithm and really get a performance boost. Especially over a "slow" link.
This is a boring sig
Here is Netscape's JWZ hilariously sad-but-true rant about the ID3 header format:
CDDB: Feel the Pain
In case you didn't know, the file format that CDDB (and FreeDB) use is complete garbage. In addition to random idiotic crap like it being impossible to unambiguously represent a song title that has a slash in it, it's rocket science to figure out how long a song is supposed to be. I need this info not only to display it in Gronk (my MP3 jukebox software), but also for some error-checking that my CD-ripping scripts do, so that I don't end up with truncated files if there was a crash or a full disk or something.
So get this. CDDB files contain junk like this:
# Track frame offsets:
# 150
# 18265
# 32945
# 49812
# Disc length: 3603 seconds
#
DISCID=...
DTITLE=...
(You'd think that the fact that it's in a comment would mean something, but no: you have to parse both comments and non-comments, begging the question of what they thought "comment" means.)
Those numbers are the starting sectors of each track on the disc. There are 75 sectors per second. So you convert those to seconds by dividing, and then find the length of each track by subtracting each from the previous. Oh, but wait, they don't give you the sector address of the end of the last track: for that one, it's expressed in seconds instead of sectors, for no sensible reason. Still, the info is there, right?
Uh, almost.
It turns out that if the last track on a CD is a data track (an ISO9660 file system) then there is a gap between the last track (the data track) and the second-to-last track (the last audio track.) This gap is exactly 11400 sectors (152 seconds, 2:32.) On some discs, you can actually see this track, it's a differently-shiny ring. Why's it there? I don't know. Why's it that size? I don't know. What if the data track is not the last track on the CD? (Does that even work?) I don't know.
So what this means is, when computing the length that a track should be, you have to subtract 152 seconds from the length of the second-to-last track, only if the last track is a data track.
How do you tell whether the last track is a data track, without having the CD in question physically in your computer? By hoping that the CDDB file contains the words "data track" in the title of that track, I guess. Yeah, that's reliable.
And, just to keep things interesting, it turns out that older versions of grip and cdparanoia didn't skip over this gap when ripping: instead, they would append 152 seconds of silence onto the end of the second-to-last track. So now my script that sanity-checks the lengths of the files has to consider two different lengths to be "right", since I now have CDs that were ripped both ways.
Whee. I love love love supporting standards invented by 12-year-olds.
Of course the reason that I use CDDB files at all in Gronk is because of the mind-blowing worthlessness of ID3 tags (32 character limits on titles, etc.) Yay more standards invented by 12-year-olds. (Please don't even mention ID3v2 or Ogg. I laugh at you, you silly person. Those are universally-unsupported fantasies that simply trade one set of problems for a whole new set of problems.)
And as if CDDB wasn't bad enough, FreeDB has taken the CDDB braindeadness and layered even more braindeadness on top of it: it is truly a thing of wonder.
For example, go ahead and try to ever have the "genre" field be something approaching reality -- oops! The first person who ripped this CD said it was "folk" because that's genre number zero! So fix it and resubmit it to the database? Sorry! You can't ever change the genre of an entry in the database after creation, since the genre dictates what directory the file goes in on their server. And so on.
It's a wonder anything works at all.
cpeterso
Vorbis-comments are ASCII only, right?
No. The field names are ACSII only (actually a printable subset minus '=') but the contents of the fields are specified as UTF-8.
The intention was you could put arbitrary binary data in there too, but there's no general mechanism for marking it as anything else. So any non-UTF-8 use would be application specific.
This is an interesting general problem. I'm sorry that so few people seem to have taken the time to understand it before replying. The general approach seems to be "read the first sentence, assume the poster is an idiot, hit reply".
The problems are these:
1) Reading ID3v2 tags on an MP3 file is a complex business. I have no desire to re-implement the libraries that do that, or even to wade deep into the existing codebases, if I can avoid that. And it should be possible to avoid that.
2) Even knowing the size and location of ID3v2 tags is complex. Contrary to popular belief here, those tags can appear at either the beginning or the end of a file, and can be arbitrary size. I already implemented the "fetch some stuff at the beginning and some stuff at the end and feed that to the library" approach, and it sort-of works, but you have to guess the size of the tag. Guess too big, you fetch lots of data unnecessarily. Guess too small, you get breakage or wrong results. By contrast, the libraries that read ID3v2 tags know exactly where and how much to read to glean the appropriate data, and it should be possible to make use of that.
3) I want to read existing data - changing the format of that data is not an option.
So that's why I was suggesting solutions like "FUSE". With FUSE, when the library does a seek and a read, I can arrange for just the relevant portion of the file to be fetched. I don't have to include any knowledge about ID3 in my application - the library does all the work. But the library doesn't have to worry about HTTP byte ranges - FUSE handles that. And the code will always be correct.
The only trouble is that FUSE requires a kernel patch and root privs. The question is, is there a way to do the same trick without those limitations? Or is there a library for reading ID3v2 tags in an object-oriented language that will let me put an efficient back-end for fetching data on request using HTTP byteranges in place of the file?
The best information I've got out of this is that there's a pure-Python implementation of ID3v2 (most implementations appear to be built on top of the C library). This may be hackable to solve my problem.
Those of you who didn't think reading or thinking was necessary before posting - please don't do the next "Ask Slashdot" post the same discourtesy. Thanks.
Xenu loves you!
No it wouldn't. The tag is at the beginning of the file, so why not just fetch the first 512 bytes or so (more if you expect cover art in the tag) and save it into /tmp/.
If you really -must- download only the tag and not a byte more, then clearly you'd have to (A.) know the offset in each file where the tag ends. This is not possible without storing that in some sort of database. Which won't work if you aren't the person in control of the server. Or (B.) download the file and scan it as you download looking for the end of the tag and when you see it abort the download. Seems more trouble than it's worth to bother using those methods, though.
This is where you tell us how you expect to find out which sections of the files contain the tag. The "libraries that read ID3v2 tags know exactly where and how much to read" because of the fact that they have access to the whole file! If you want that kind of perfect efficiency, then you'll have to download the whole file!
If you want a solution that will allow you to escape downloading the whole file, just check for ID3 in the first 3 bytes and 3DI @ 10 bytes back from the end. Download a couple K in the appropriate direction (more if you expect more than a minority to have images). Neither means no tag, move on. Then parse the tags with your libraries, catch errors, and try to grab more on the few that will throw errors, in case someone put a huge image or War and Peace in a tag.
It's not that hard, and I'm not sure what sorcerer's magic you expected Ask Slashdot to come up with to help you with this.
I think we can all agree on one thing though. Whatever asshole who decided that a tag at the end of the file is a good idea needs to be smacked in the head.