Spotify Is Writing Massive Amounts of Junk Data To Storage Drives (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: For almost five months -- possibly longer -- the Spotify music streaming app has been assaulting users' storage devices with enough data to potentially take years off their expected lifespans. Reports of tens or in some cases hundreds of gigabytes being written in an hour aren't uncommon, and occasionally the recorded amounts are measured in terabytes. The overload happens even when Spotify is idle and isn't storing any songs locally. The behavior poses an unnecessary burden on users' storage devices, particularly solid state drives, which come with a finite amount of write capacity. Continuously writing hundreds of gigabytes of needless data to a drive every day for months or years on end has the potential to cause an SSD to die years earlier than it otherwise would. And yet, Spotify apps for Windows, Mac, and Linux have engaged in this data assault since at least the middle of June, when multiple users reported the problem in the company's official support forum. Three Ars reporters who ran Spotify on Macs and PCs had no trouble reproducing the problem reported not only in the above-mentioned Spotify forum but also on Reddit, Hacker News, and elsewhere. Typically, the app wrote from 5 to 10 GB of data in less than an hour on Ars reporters' machines, even when the app was idle. Leaving Spotify running for periods longer than a day resulted in amounts as high as 700 GB. According to comments left in the Spotify forum in the past 24 hours, the bug has been fixed in version 1.0.42, which is in the process of being rolled out.
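To put the "years off their expected lifespans" claim in rough numbers, here is a back-of-the-envelope sketch. SSD endurance is commonly quoted as a terabytes-written (TBW) rating; the 150 TB rating and the 10 GB/day baseline below are assumed figures for illustration and are not from the report. Only the 700 GB figure comes from the summary above, and it is treated as a per-day rate even though Ars measured it over periods somewhat longer than a day.

```python
def years_until_rated_writes(tbw_rating_tb: float, gb_written_per_day: float) -> float:
    """Years of continuous writing until a drive's rated TBW figure is reached."""
    days = (tbw_rating_tb * 1000) / gb_written_per_day  # 1 TB = 1000 GB
    return days / 365


ASSUMED_TBW_TB = 150        # hypothetical consumer SSD endurance rating
ASSUMED_NORMAL_GB_DAY = 10  # hypothetical everyday write load
REPORTED_HIGH_GB_DAY = 700  # high figure from the Ars report, taken as a daily rate

normal = years_until_rated_writes(ASSUMED_TBW_TB, ASSUMED_NORMAL_GB_DAY)
heavy = years_until_rated_writes(ASSUMED_TBW_TB, REPORTED_HIGH_GB_DAY)
print(f"normal use: {normal:.1f} years, with the bug: {heavy:.2f} years")
```

Under these assumptions the rated write budget goes from decades to well under a year. Real drives often outlive their rating, so treat this as an order-of-magnitude estimate only.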
Bandwidth, memory, clock cycles....don't matter. Use more shitty layers of abstraction over layers built into high level languages, then kick it out the door.
I'm not sure it's a problem solved by that.
I think the gist of it is that for every small change to the data they store on your device, they are rewriting the entire dataset they are keeping. So, for instance, they are logging a record that says "didn't play music this minute" but are rewriting the entire multi-year log.
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
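If that guess is right, the cost grows quadratically with the number of records. A minimal sketch of the difference, assuming a plain line-oriented log file (the file layout and function names here are made up for illustration and say nothing about how Spotify actually stores its data):

```python
import os
import tempfile


def rewrite_whole_log(path, records, new_record):
    """Add one record, then rewrite the entire file. Returns bytes written."""
    records.append(new_record)
    data = "".join(r + "\n" for r in records).encode()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


def append_only(path, new_record):
    """Append just the new record. Returns bytes written."""
    data = (new_record + "\n").encode()
    with open(path, "ab") as f:
        f.write(data)
    return len(data)


def compare(n_updates=1000, record="idle: no music played this minute"):
    """Total bytes written by each strategy after n_updates one-line updates."""
    with tempfile.TemporaryDirectory() as tmp:
        records = []
        full_path = os.path.join(tmp, "full.log")
        append_path = os.path.join(tmp, "append.log")
        rewritten = sum(rewrite_whole_log(full_path, records, record)
                        for _ in range(n_updates))
        appended = sum(append_only(append_path, record)
                       for _ in range(n_updates))
    return rewritten, appended


rewritten, appended = compare()
print(f"rewrite-everything: {rewritten} bytes, append-only: {appended} bytes")
```

Both files end up with identical contents, but the rewrite-everything version writes n(n+1)/2 records in total instead of n.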
"His name was James Damore."
From the comments on Ars, it seems pretty clear that there is a bug in the app causing it to repeatedly compact the sqlite database it uses. I'm sure we all know that that is something which should be done only when actually needed, so that's clearly a bug, not inefficiency.
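For anyone wondering why repeated compaction is so expensive: SQLite's VACUUM rebuilds the database by copying its contents into a temporary file and then writing the result back, so every run costs writes on the order of the whole database size. Below is a minimal sketch of gating it on how much free space the file actually holds. The 25% threshold, the table layout, and the `needs_vacuum` helper are assumptions for illustration; only the `PRAGMA` statements and `VACUUM` are real SQLite features.

```python
import os
import sqlite3
import tempfile


def needs_vacuum(conn, min_free_fraction=0.25):
    """True when at least min_free_fraction of the file's pages are unused."""
    free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    total_pages = conn.execute("PRAGMA page_count").fetchone()[0]
    return total_pages > 0 and free_pages / total_pages >= min_free_fraction


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # isolation_level=None gives autocommit; VACUUM cannot run inside a transaction.
        conn = sqlite3.connect(os.path.join(tmp, "cache.db"), isolation_level=None)
        try:
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload BLOB)")
            conn.execute("BEGIN")
            conn.executemany("INSERT INTO items (payload) VALUES (?)",
                             [(b"x" * 1000,) for _ in range(2000)])
            conn.execute("COMMIT")
            fresh = needs_vacuum(conn)          # nothing deleted yet

            # Deleting a contiguous range empties whole pages onto the freelist.
            conn.execute("DELETE FROM items WHERE id > 500")
            after_delete = needs_vacuum(conn)

            pages_before = conn.execute("PRAGMA page_count").fetchone()[0]
            if after_delete:
                conn.execute("VACUUM")          # rewrites the whole database file
            pages_after = conn.execute("PRAGMA page_count").fetchone()[0]
        finally:
            conn.close()
    return fresh, after_delete, pages_before, pages_after


print(demo())
```

The point of the check is that a freshly compacted database reports no free pages, so a gated VACUUM becomes a no-op instead of a full rewrite on every pass.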
This sounds like some smart software architect took the abstraction of the persistence/storage layer of the Spotify stack too far, whilst at the same time storing too many minuscule data points in Spotify's objects. Because once abstracted properly, adding attributes to your objects and the entire stack is trivial.
Think of it:
If your stack's ORM neatly abstracts everything concerning persistence and syncs on the back end whenever it has the opportunity, all you need is app-side developers and software designers storing every little piece of data they can find, including data that changes every millisecond, and then you have your bandwidth/load disaster as described.
If something like this is the case with Spotify, which I do strongly suspect, it is a good example that goes to show that you can take clean-room design too far. And that a haphazard duct-tape-and-chicken-wire approach to product development can have significant advantages, as you build around unforeseen roadblocks on a daily basis and only add the features really needed.
I see an example of this every day, as I am currently doing WordPress development and building a WordPress pipeline for an agency. Large parts of the WP legacy architecture are an abysmally convoluted mess built by people who shouldn't have been let near a keyboard 15 years ago. But having a non-developer build a production-capable demo of a website in WP is significantly faster than starting with an actual UX prototype, which quickly leads our team into real-world problems that we often haven't suspected. And suddenly a proper ORM and clean-room design would cause hassle at one end or the other.
My two eurocents.
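To make that scenario concrete: the usual remedy when a persistence layer syncs on every attribute change is to coalesce changes and flush them at a bounded rate. This is a rough sketch only. `CoalescingStore`, its parameters, and the one-minute interval are all hypothetical, and nothing here is known to resemble Spotify's code.

```python
class CoalescingStore:
    """Hypothetical persistence layer that batches attribute changes."""

    def __init__(self, write_fn, min_interval_s=60):
        self._write_fn = write_fn            # called with a dict of pending changes
        self._min_interval_s = min_interval_s
        self._pending = {}
        self._last_flush = None              # time of the most recent flush
        self.flush_count = 0

    def set(self, key, value, now):
        """Record a change; write to storage only if the interval has elapsed."""
        self._pending[key] = value           # later values overwrite earlier ones
        if self._last_flush is None or now - self._last_flush >= self._min_interval_s:
            self.flush(now)

    def flush(self, now):
        """Write all pending changes as one batch, if there are any."""
        if self._pending:
            self._write_fn(dict(self._pending))
            self._pending.clear()
            self.flush_count += 1
        self._last_flush = now


writes = []
store = CoalescingStore(writes.append, min_interval_s=60)
for second in range(180):                    # one attribute change per second
    store.set("playback_position", second, now=second)
store.flush(now=180)                         # final flush, e.g. on shutdown
print(f"180 changes, {store.flush_count} writes")
```

The time is passed in explicitly so the behaviour is easy to test; a real implementation would more likely read a monotonic clock and flush from a timer.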
We suffer more in our imagination than in reality. - Seneca
Rust spinners wear out too. This can be a particular problem if it's constantly bringing the drive out of power-down.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
I use Google Play Music. Not only can it cache songs, you can also upload your own collection. And now that Google has acquired and integrated Songza, their playlists are awesome.
lucm, indeed.
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
XML is like violence - if it doesn't solve your problem, you aren't using enough of it.
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
Here is a possibly related complaint from almost three years ago.
If you're using XML to solve a problem, you actually have two problems.
That is all.