Spotify Is Writing Massive Amounts of Junk Data To Storage Drives (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: For almost five months -- possibly longer -- the Spotify music streaming app has been assaulting users' storage devices with enough data to potentially take years off their expected lifespans. Reports of tens or in some cases hundreds of gigabytes being written in an hour aren't uncommon, and occasionally the recorded amounts are measured in terabytes. The overload happens even when Spotify is idle and isn't storing any songs locally. The behavior poses an unnecessary burden on users' storage devices, particularly solid state drives, which come with a finite amount of write capacity. Continuously writing hundreds of gigabytes of needless data to a drive every day for months or years on end has the potential to cause an SSD to die years earlier than it otherwise would. And yet, Spotify apps for Windows, Mac, and Linux have engaged in this data assault since at least the middle of June, when multiple users reported the problem in the company's official support forum. Three Ars reporters who ran Spotify on Macs and PCs had no trouble reproducing the problem reported not only in the above-mentioned Spotify forum but also on Reddit, Hacker News, and elsewhere. Typically, the app wrote from 5 to 10 GB of data in less than an hour on Ars reporters' machines, even when the app was idle. Leaving Spotify running for periods longer than a day resulted in amounts as high as 700 GB. According to comments left in the Spotify forum in the past 24 hours, the bug has been fixed in version 1.0.42, which is in the process of being rolled out.
If you're writing enough to pagefiles, you need more RAM anyway.
If you're writing a lot to temporary areas, you need to stop doing so.
That said, I'm on an SSD machine at the moment that has been running for 6 months, with absolutely no special treatment, imaged from a years-old working PC without changing anything, and it's written 1.5TB. 1TB of that was the initial imaging process.
It's the main workhorse in an IT Office in a school, use for 10+ hours every single day for everything imaginable. Client machines rarely use much.
It has a write-life of 100TB. If it dies, I just hit F12 and re-image cleanly.
At current usage (not including the initial image), I count that as 1TB of write a year, which gives longer the expected lifetime of the PC itself, however far out I am.
There's no need for special treatment, no need to use special SSD transfer software, no need to over-provision, or increase RAM cache or anything else. Just have a PC that isn't slogging itself to death, and slap an SSD in.
Don't expect it to last forever, but you shouldn't need to adjust ANYTHING at all.
And I've done this on all the staff work machines earlier this year - zero failures so far and it has made much more of a performance difference than doubling the amount of RAM. In fact, where machines had motherboards that were limited in RAM, we SSD'd and saw HUGE performance increases better than those clients whose RAM we doubled but are running on traditional hard disks.
At home I have a 1TB EVO 850 and that's the same. Literally imaged byte-for-byte, and is stupendously fast and no need for any software changes whatsoever, and the write numbers are predicting 20+ years of life despite a similar 10+ hours a day of usage.
Don't RELY on it never failing. But they are going to be in warranty (whether that's by number of years, or data written) for the life of your machine, under even heavy usage, unless you're doing something incredibly stupid (like use in NVR, RAID, or similar without buying a high-write-endurance model).
From the comments on Ars, it seems pretty clear that there is a bug in the app causing it to repeatedly compact the sqlite database it uses. I'm sure we all know that that is something which should be done only when actually needed, so that's clearly a bug, not inefficiency.
This sounds like some smart software architect to the abstraction of the persistance/storage layer of the Spotify stack too far whilst at the same time storing to much of miniscule datapoints in Spotifys objects. Because once abstracted properly, adding attributes to your objects and the entire stack is trivial.
Think of it:
If your stacks ORM neatly abstracts everything concerning persistance and on the backside syncs on neatly whenever it has the opportunity, all you need is app-side developers and software designers storing every little piece of data they can find and that changes evers millisecond and then you have your bandwidth/load disaster as described.
If something like this is the case with Spotify, which I do strongly suspect, it is a good example that goes to show that you can take clean-room design too far. And that a haphazard duct-tape and chickenwire approach to product development can have significant advantages, as you build around unforseen roadblocks on a daily basis and only add the features really needed.
I see an example of this every day, as I am currently doing WordPress development and building a WordPress pipeline for an agency. Large parts of the WP legacy architecture are an abysmally convoluted mess built by people who shouldn't have been let near a keyboard 15 years ago. But having a non-developer build a production capable demo of a website in WP is significantly faster than starting with an actual UX prototype, which quickly leads our team into real-world problems that we often haven't suspected. And suddenly a proper ORM and cleanroom design would cause hassle at one end or the other.
My to eurocents.
We suffer more in our imagination than in reality. - Seneca
I use Google Play Music. Not only can it cache songs, you can also upload your own collection. And now that Google has acquired and integrated Songza, their playlists are awesome.
lucm, indeed.
Here is a possibly related complaint from almost three years ago.