Spotify Is Writing Massive Amounts of Junk Data To Storage Drives (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: For almost five months -- possibly longer -- the Spotify music streaming app has been assaulting users' storage devices with enough data to potentially take years off their expected lifespans. Reports of tens or in some cases hundreds of gigabytes being written in an hour aren't uncommon, and occasionally the recorded amounts are measured in terabytes. The overload happens even when Spotify is idle and isn't storing any songs locally. The behavior poses an unnecessary burden on users' storage devices, particularly solid state drives, which come with a finite amount of write capacity. Continuously writing hundreds of gigabytes of needless data to a drive every day for months or years on end has the potential to cause an SSD to die years earlier than it otherwise would. And yet, Spotify apps for Windows, Mac, and Linux have engaged in this data assault since at least the middle of June, when multiple users reported the problem in the company's official support forum. Three Ars reporters who ran Spotify on Macs and PCs had no trouble reproducing the problem reported not only in the above-mentioned Spotify forum but also on Reddit, Hacker News, and elsewhere. Typically, the app wrote from 5 to 10 GB of data in less than an hour on Ars reporters' machines, even when the app was idle. Leaving Spotify running for periods longer than a day resulted in amounts as high as 700 GB. According to comments left in the Spotify forum in the past 24 hours, the bug has been fixed in version 1.0.42, which is in the process of being rolled out.
Bandwidth, memory, clock cycles... don't matter. Use more shitty layers of abstraction over layers built into high level languages, then kick it out the door.
... for highlighting the potential for damage as news, don't ya think?
Nah, then you'd see an increased network usage. This is probably just Firefox's fsync bug repeated: in order to ensure data integrity, SQLite has a mode that fsyncs on commit. (After all, if the data isn't written to storage, it isn't really committed.) If you combine that with autocommit after every minor transaction, you get a ton of fsyncs and massive data usage.
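To make the autocommit point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `events` and the helper names are made up for illustration, and nothing here is taken from Spotify's or Firefox's code. Whether each commit really triggers an fsync depends on the `PRAGMA synchronous` setting and the journal mode, so treat the commit count as a rough proxy for sync pressure.

```python
import sqlite3


def insert_autocommit(path, rows):
    """Commit after every single row: one durable commit per tiny write."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
    commits = 0
    for msg in rows:
        conn.execute("INSERT INTO events (msg) VALUES (?)", (msg,))
        conn.commit()  # each commit may force a sync to storage
        commits += 1
    conn.close()
    return commits


def insert_batched(path, rows):
    """Wrap all rows in one transaction: a single commit for the whole batch."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
    with conn:  # commits once when the block exits without error
        conn.executemany(
            "INSERT INTO events (msg) VALUES (?)", [(m,) for m in rows]
        )
    conn.close()
    return 1


def count_rows(path):
    """Return how many rows ended up in the events table."""
    conn = sqlite3.connect(path)
    n = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return n
```

Both functions store the same rows; the only difference is how many commits the storage layer has to make durable.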
I'm not sure it's a problem solved by that.
I think the gist of it is that for every small change to the data they store on your device, they are re-writing the entirety of the dataset they are keeping. So for instance they are logging a record that says "didn't play music this minute" but are re-writing the entire multi-year log.
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
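A rough sketch of why that pattern hurts, assuming a hypothetical JSON log (the file format and function names are invented here, not taken from Spotify): rewriting the whole file for each new record makes the total bytes written grow roughly with the square of the record count, while appending grows linearly.

```python
import json


def log_by_rewrite(path, records):
    """Rewrite the entire log file for every new record; return bytes written."""
    written = 0
    log = []
    for rec in records:
        log.append(rec)
        data = json.dumps(log).encode("utf-8")
        with open(path, "wb") as f:  # truncates and rewrites everything
            f.write(data)
        written += len(data)
    return written


def log_by_append(path, records):
    """Append one line per record; return bytes written."""
    written = 0
    for rec in records:
        line = (json.dumps(rec) + "\n").encode("utf-8")
        with open(path, "ab") as f:  # only the new record hits the file
            f.write(line)
        written += len(line)
    return written
```

With a few hundred small records, the rewrite variant already writes far more data than the append variant, and the gap keeps widening as the log grows.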
"His name was James Damore."
This is a non issue on desktops, really.
Correction: I think you meant to say this is a non-issue on desktops that are not using solid state drives.
Why the hell would you put a pagefile in a ramdisk? "Yo dawg, I heard you love pages?"
What part of "shall not be infringed" is so hard to understand?
If you're writing enough to pagefiles, you need more RAM anyway.
If you're writing a lot to temporary areas, you need to stop doing so.
That said, I'm on an SSD machine at the moment that has been running for 6 months, with absolutely no special treatment, imaged from a years-old working PC without changing anything, and it's written 1.5TB. 1TB of that was the initial imaging process.
It's the main workhorse in an IT Office in a school, used for 10+ hours every single day for everything imaginable. Client machines rarely use much.
It has a write-life of 100TB. If it dies, I just hit F12 and re-image cleanly.
At current usage (not including the initial image), I count that as 1TB of writes a year, which gives a lifetime longer than the expected lifetime of the PC itself, however far out I am.
There's no need for special treatment, no need to use special SSD transfer software, no need to over-provision, or increase RAM cache or anything else. Just have a PC that isn't slogging itself to death, and slap an SSD in.
Don't expect it to last forever, but you shouldn't need to adjust ANYTHING at all.
And I've done this on all the staff work machines earlier this year - zero failures so far and it has made much more of a performance difference than doubling the amount of RAM. In fact, where machines had motherboards that were limited in RAM, we SSD'd and saw HUGE performance increases better than those clients whose RAM we doubled but are running on traditional hard disks.
At home I have a 1TB EVO 850 and that's the same. Literally imaged byte-for-byte, and is stupendously fast and no need for any software changes whatsoever, and the write numbers are predicting 20+ years of life despite a similar 10+ hours a day of usage.
Don't RELY on it never failing. But they are going to be in warranty (whether that's by number of years, or data written) for the life of your machine, under even heavy usage, unless you're doing something incredibly stupid (like use in NVR, RAID, or similar without buying a high-write-endurance model).
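The endurance arithmetic in the comment above can be sketched as follows. The 1.5 TB total, 1 TB imaging, 6 months, and 100 TB rating come from that comment. The "heavy" case is an assumption for comparison only: it takes the upper 10 GB/hour figure from the summary, assumes the app runs around the clock, and uses decimal units (1 TB = 1000 GB).

```python
def tb_per_year(tb_written, years):
    """Average write rate in TB per year."""
    return tb_written / years


def years_left(rated_tb, written_tb, rate_tb_per_year):
    """Years until the rated write endurance is used up at a constant rate."""
    return (rated_tb - written_tb) / rate_tb_per_year


# Office machine: 1.5 TB written in 6 months, 1 TB of which was the imaging.
office_rate = tb_per_year(1.5 - 1.0, 0.5)
office_years = years_left(100.0, 1.5, office_rate)

# Assumed worst case: 10 GB/hour, 24 hours a day, all year.
heavy_rate = 10 / 1000 * 24 * 365
heavy_years = years_left(100.0, 1.5, heavy_rate)

print(f"office: {office_rate:.1f} TB/year, about {office_years:.0f} years left")
print(f"heavy:  {heavy_rate:.1f} TB/year, about {heavy_years:.1f} years left")
```

The rated figure is a warranty number rather than a hard failure point, so these are order-of-magnitude estimates at best.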
From the comments on Ars, it seems pretty clear that there is a bug in the app causing it to repeatedly compact the sqlite database it uses. I'm sure we all know that that is something which should be done only when actually needed, so that's clearly a bug, not inefficiency.
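For what "only when actually needed" could look like, here is a minimal sketch with Python's sqlite3 module. It checks how much of the file is free pages before compacting. The 25% threshold and the function names are arbitrary assumptions, and this is not how the Spotify client decides anything as far as the thread shows.

```python
import sqlite3


def free_fraction(conn):
    """Fraction of database pages that sit unused on the freelist."""
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    return free / total if total else 0.0


def vacuum_if_needed(conn, threshold=0.25):
    """Run VACUUM only when the free fraction exceeds the threshold.

    Returns True if a VACUUM was performed, False otherwise.
    """
    if free_fraction(conn) <= threshold:
        return False
    conn.commit()  # VACUUM cannot run inside an open transaction
    conn.execute("VACUUM")  # rewrites the database file without free pages
    return True
```

Since VACUUM rewrites the whole database file, calling it unconditionally on a timer costs a full-file write each time even when there is nothing to reclaim.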
Pagefiles I don't put on software ramdisk (had to clarify that), but on HDD instead
So you put the thing that benefits the most from fast I/O on your slowest storage device instead of your SSD? Why not put it on a floppy drive, or a mounted network share connected to a VPS hosted on the other side of the country if you like to slow things down?
Or maybe you just love that spinning HDD sound.
lucm, indeed.
Should I also move my HOSTS file to a ram disk?
pagefile on a ramdisk is awesome because if you don't have enough ram all you have to do is either add ram to need less paging or add ram to increase the size of your ramdisk - you can't go wrong! It also saves a lot of expensive hard disk storage, especially when you put the computer to sleep.
lucm, indeed.
This sounds like some smart software architect took the abstraction of the persistence/storage layer of the Spotify stack too far, while at the same time storing too many minuscule data points in Spotify's objects. Because once abstracted properly, adding attributes to your objects and the entire stack is trivial.
Think of it:
If your stack's ORM neatly abstracts everything concerning persistence and syncs on the back end whenever it has the opportunity, all you need is app-side developers and software designers storing every little piece of data they can find, including data that changes every millisecond, and then you have your bandwidth/load disaster as described.
If something like this is the case with Spotify, which I do strongly suspect, it is a good example that goes to show that you can take clean-room design too far. And that a haphazard duct-tape and chickenwire approach to product development can have significant advantages, as you build around unforeseen roadblocks on a daily basis and only add the features really needed.
I see an example of this every day, as I am currently doing WordPress development and building a WordPress pipeline for an agency. Large parts of the WP legacy architecture are an abysmally convoluted mess built by people who shouldn't have been let near a keyboard 15 years ago. But having a non-developer build a production capable demo of a website in WP is significantly faster than starting with an actual UX prototype, which quickly leads our team into real-world problems that we often haven't suspected. And suddenly a proper ORM and cleanroom design would cause hassle at one end or the other.
My two eurocents.
We suffer more in our imagination than in reality. - Seneca
Generally there is no reason to do that, but there are some poorly coded applications that will page memory to disk, even when they don't need to.
Is it just my observation, or are there way too many stupid people in the world?
Rust spinners wear out too. This can be a particular problem if it's constantly bringing the drive out of power-down.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
I use Google Play Music. Not only can it cache songs, you can also upload your own collection. And now that Google has acquired and integrated Songza, their playlists are awesome.
lucm, indeed.
Problem solved.
This is a non issue on desktops, really.
It takes a pretty small worldview to not be able to imagine people on limited bandwidth / unreliable internet connections.
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
XML is like violence - if it doesn't solve your problem, you aren't using enough of it.
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
This isn't a bandwidth issue, nothing is being downloaded. It takes a pretty dense worldview not to read the article you are posting on.
Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
Sata 1?
Wouldn't you be better served with something that isn't bottlenecked by its old connection?
Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
Here is a possibly related complaint from almost three years ago.
If you're using XML to solve a problem, you actually have two problems.
That is all.
A lot depends on the settings. Some WD drives were too "lazy" so they were constantly parking/unparking. Google wdidle3.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
That's the joke.
Cheap storage VM.
If you leave browsers up all the time, they have the same problem. Firefox and Chrome. https://www.grc.com/sn/sn-580....
You can lose something that is loose, so tighten the loose item so you don't lose it.
Wait, there's articles now?
It also helps when you're rebooting: loading data from RAM is a lot faster than loading from a hard drive or even an SSD.
You must be new here
(notices UID) err, now I'm just confused...
"Government is like fire; a handy servant, but a dangerous master." -- George Washington
So this would mostly be small-writes for something which is essentially metadata? Talk about these people not even having a faint clue what they are doing. With the write-amplification you get in an SSD for small writes, this can probably kill a modern SSD in a week or less.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
That would only be a problem on laptops. Desktop drives spin-down far less often, if at all. (Mine do not. No reason for them to.)
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Users in that thread are reporting that even with the new version the issue persists... "I do have 1.0.42 installed. Still writing stupid data."
Developers don't seem to consider optimizing disk I/O. Recently I watched a live event streamed in Firefox (from a not-so-great website; I guess it uses Flash) and it kept my disk at 100% the whole time. Why should a streaming service write all that video data to disk? Can't it just cache in RAM, display, and forget the bits?
Such unnecessary disk I/O wears my disk down, increases power use (if I'm on battery on my laptop, say), and of course creates a kind of internal DoS: it hogs the disk I/O, so the rest of the processes can't get disk I/O or get delayed, resulting in a sluggish OS response even in, say, a file explorer. A well-behaved app should not hog any shared hardware resource (like disk I/O) to the point of system instability.
Apps should be benchmarked not only on their memory footprint or CPU usage (algorithmic complexity, big-O) but also on their external data traffic, such as disk and network I/O.
I regularly watch my disk i/o usage by processes and get rid of any if I suspect they are hitting it unnecessarily hard.
I once tried to use a regex to parse XML and got caught in an infinite recursion of problems.
Chelloveck
I give up on debugging. From now on, SIGSEGV is a feature.
can only post 5x a day or so typically...)
for which we all thank the heavenly host!
Spotify Is Writing Massive Amounts of Junk Data To Storage Drives
Or are they talking about the music files?
It must have been something you assimilated. . . .
Mine do. Most energy efficient or otherwise green drives spin down very frequently. Not burning through 6w continuously spinning something that isn't doing anything constructive is a pretty good reason to.
If it's an APK HOSTS file I would suggest a ramdisk is the perfect place for it. Don't forget to reboot after installing it. Even if software doesn't ask you to it's always a good idea to reboot a windows machine after installing software.
That will teach 'em. It is absurd what programmers can do and get away with, simply because you click on "I agree" on a EULA starting with "No warranty"
Yes, it is going to be expensive, but software is getting worse.
Chrome and Vivaldi also do this, possibly due to some extension I have (Adblock Plus and BankID mostly.)
At least three times they have written away 1 TB of data, thanks to messing with swap I guess.
I wish there was some information when the load got high / lots of data was written. Need some program for that. (Samsung EVO 850.)
I remember XML in one application I worked on, things like <AVeryVeryLongFieldNameThatTakesALotOfCharacters>A</AVeryVeryLongFieldNameThatTakesALotOfCharacters> on a slow and flaky connection.
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
Nobody has mentioned the temp fix yet, so:
On OS X, open /Applications/Spotify.app/Contents/MacOS/Spotify in a hex editor.
Search for "VACUUM;" and replace it with "xxxxxx;".
Once you apply that fix you can manually vacuum with
"~/Library/Application Support/Spotify/PersistentCache/mercury.db"
download sqlite3 from https://sqlite.org/download.ht...
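The hex-editor step in the comment above amounts to a same-length byte replacement, which can be sketched in Python. The function name and the choice to write to a separate copy are assumptions for illustration, and this is untested against the real Spotify binary. Keeping the replacement the same length as the original string matters, because changing the file size would shift every offset after the patch.

```python
def patch_same_length(src, dst, old=b"VACUUM;", new=b"xxxxxx;"):
    """Copy src to dst with every `old` byte string replaced by `new`.

    Returns the number of occurrences replaced. Refuses to run if the two
    byte strings differ in length, since that would change file offsets.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    with open(src, "rb") as f:
        data = f.read()
    count = data.count(old)
    with open(dst, "wb") as f:
        f.write(data.replace(old, new))
    return count
```

Writing to a copy rather than patching in place leaves the original binary available if the patched one misbehaves.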
Pagefiles I don't put on software ramdisk (had to clarify that), but on HDD instead
I place my pagefile on a "True SSD" as I call it based on DDR Ram
Dude you should pick one version and stick with it. Or let one of the voices win.
lucm, indeed.
You seem to have no idea how much power even an idle PC consumes. A 6W change is typically below what you can measure on mains-inlet.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You seem to have no idea how much power even an idle PC consumes.
I may have no idea, but the power meters on my computer do. Around 350W when I'm playing games. Around 270W when just taxing the CPU. Around 90W when the computer is sitting there idle with the screen off, 89W with the screen off and the HDDs powered down too. So the 2 HDDs in my computer use 10% of the total power load of an idle PC. And that's a 5 year old not very energy efficient one.
Now looking at my server at home it uses 47W idle. And just over 110W when serving files from both arrays. So the HDDs on the PC which spends all of its time powered on use up more than half of the power, hence I power them down.
A 6W change is typically below what you can measure on mains-inlet.
A mains inlet measures watt-hours. If you can't measure it, then increase the integration time over which you're measuring. Or you could change the way you measure it: for instance, you could measure it by looking at your bank account. Powering down the drives when not in use (the majority of the time on my NAS) saves me 11EUR per year. Over 120EUR for my server.
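For a sense of scale, the cost of a small continuous load can be worked out like this. The 6 W figure is the one argued about in this sub-thread; the 0.25 EUR/kWh price is purely an assumed example tariff, not a number anyone here quoted.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts, hours=HOURS_PER_YEAR):
    """Energy in kWh used by a constant load over the given number of hours."""
    return watts * hours / 1000.0


def annual_cost(watts, price_per_kwh, hours=HOURS_PER_YEAR):
    """Yearly cost of a constant load at a flat price per kWh."""
    return annual_kwh(watts, hours) * price_per_kwh


print(f"6 W continuous: {annual_kwh(6):.2f} kWh/year")
print(f"at an assumed 0.25 EUR/kWh: {annual_cost(6, 0.25):.2f} EUR/year")
```

The result scales linearly with both the wattage and the tariff, so plugging in your own numbers is straightforward.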
Also if you can't measure 6W then don't buy your measurement equipment from Alibaba.
Also if you can't measure 6W then don't buy your measurement equipment from Alibaba.
And there the discussion stops, as you just failed EE101.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Actually I am an EE, have been for many years. Nice try though.
You try to be legal and buy your music and videos, but there are always things fucking you around. Warnings on DVDs about piracy. You can't just press play. Music services injecting ads and killing off disk space... bizarre limitations on how long TV and movies stay on streaming services. Ads even AFTER you pay, and then they advertise ad-free services that you pay extra for... AND STILL GET ADS.
Piracy will continue while it's more convenient to do so.
yes I have multiple pagefiles (2gb on IRAM & 512mb on a WD 10,000 rpm Raptor driven off of a Promise Ex-8350 128mb ECC ram caching raid sata 1/2 controller)
I see. I guess in your mind this explains why you can tell that you have your pagefile on HDD, not RAM, but also that you have your pagefile on RAM, not HDD. Let's call it a Schrödinger's pagefile.
lucm, indeed.
that's just what "your kind" does, lol... apk
Oh, so you're a racist on top of everything?
lucm, indeed.
Dude take a chill pill. Express yourself more clearly and don't contradict yourself if you want people to take you seriously. Those links you keep posting to other messages in the same thread are not supporting your points, they just make you look like an aspie with a grudge.
lucm, indeed.
I don't take FAKE NAME ONLINE
Yes you did just that in another thread, pretending to be someone else and linking back to this thread. Unfortunately your unique way to express yourself betrayed you. Next time try to write full sentences and don't constantly refer to the titles of your posts if you want to conceal your identity. The fact that you're probably one of the only persons on Slashdot who frequently posts links to other comments also was an obvious tell.
lucm, indeed.
You're no longer fooling me with your torrent of babble and bragging. Skate around it as much as you want, but twice in this thread you've been caught lying, and now you've also been caught pretending to be someone else in other threads while waging your little vengeful campaign.
You're not merely the eccentric techie people assume you are. You're a dishonest, scheming individual that just happens to have a hard time expressing himself succinctly and clearly. I'm disappointed, it's like finding out that the joyful greeter I see every week at the department store is a convicted sex offender.
lucm, indeed.