Slashdot Mirror


Spotify Is Writing Massive Amounts of Junk Data To Storage Drives (arstechnica.com)

An anonymous reader quotes a report from Ars Technica: For almost five months -- possibly longer -- the Spotify music streaming app has been assaulting users' storage devices with enough data to potentially take years off their expected lifespans. Reports of tens or in some cases hundreds of gigabytes being written in an hour aren't uncommon, and occasionally the recorded amounts are measured in terabytes. The overload happens even when Spotify is idle and isn't storing any songs locally. The behavior poses an unnecessary burden on users' storage devices, particularly solid state drives, which come with a finite amount of write capacity. Continuously writing hundreds of gigabytes of needless data to a drive every day for months or years on end has the potential to cause an SSD to die years earlier than it otherwise would. And yet, Spotify apps for Windows, Mac, and Linux have engaged in this data assault since at least the middle of June, when multiple users reported the problem in the company's official support forum. Three Ars reporters who ran Spotify on Macs and PCs had no trouble reproducing the problem reported not only in the above-mentioned Spotify forum but also on Reddit, Hacker News, and elsewhere. Typically, the app wrote from 5 to 10 GB of data in less than an hour on Ars reporters' machines, even when the app was idle. Leaving Spotify running for periods longer than a day resulted in amounts as high as 700 GB. According to comments left in the Spotify forum in the past 24 hours, the bug has been fixed in version 1.0.42, which is in the process of being rolled out.

34 of 196 comments

  1. Typical of today's programmer by rfengr · · Score: 5, Insightful

    Bandwidth, memory, clock cycles....don't matter. Use more shitty layers of abstraction over layers built into high level languages, then kick it out the door.

    1. Re:Typical of today's programmer by MitchDev · · Score: 4, Interesting

      Wonder what this does to people's data plans and consumption of their monthly limits...

      So glad I just use local music files and don't stream. Write once, maybe again to add some more music, then just read many...

    2. Re:Typical of today's programmer by Anonymous Coward · · Score: 2, Interesting

      Today's programmers? It's been rampant since at least the 1990s...

    3. Re:Typical of today's programmer by Anonymous Coward · · Score: 5, Insightful

      If you can't differentiate between bad programming and high-level programming with abstractions, you're part of the problem.

      PS lots of great software is written in higher level languages than you're probably capable of ever reading.

    4. Re:Typical of today's programmer by Anonymous Coward · · Score: 2, Funny

      lots of great software is written in higher level languages than you're probably capable of ever reading

      English isn't apparently among them.

    5. Re:Typical of today's programmer by jareth-0205 · · Score: 3, Insightful

      Bandwidth, memory, clock cycles....don't matter. Use more shitty layers of abstraction over layers built into high level languages, then kick it out the door.

      Well, what do you expect? Everyone expects client programmers to support more devices, more users for less money, cheaper / free apps. The last 3 places I've worked at had no QA department whatsoever.

      I know it's fashionable to shake the fist at 'lazy' programmers, but the fact is we expect more functionality from less dev time, requiring abstractions, libraries that aren't completely controlled or understood, testing skipped, etc. Programmers aren't the problem, relentless competition is.

    6. Re:Typical of today's programmer by Anonymous Coward · · Score: 2, Insightful

      Well, what do you expect? Everyone expects client programmers to support more devices, more users for less money, cheaper / free apps. The last 3 places I've worked at had no QA department whatsoever.

      I know it's fashionable to shake the fist at 'lazy' programmers, but the fact is we expect more functionality from less dev time, requiring abstractions, libraries that aren't completely controlled or understood, testing skipped, etc. Programmers aren't the problem, relentless competition is.

      I certainly expect better. Open source delivers quality - again and again. Any organization with an actual budget ought to do better. And please note that the competition is on quality, not on prettiness, and not on delivery date either.

      Also, this can't be a bug resulting from sloppy programming. Sloppy/quick programming results in apps that crash "occasionally" and a lot of corner cases that aren't quite right. This MASSIVE writing is something else entirely. Fortunately, Spotify is not necessary. In my case, it lost to the "relentless competition": Buying CDs and ripping them myself "just works". No I/O at all, except during playback or ripping. And then it is merely measured in kB/s...

    7. Re:Typical of today's programmer by Anonymous Coward · · Score: 3, Interesting

      It's called Gates Law, because it's the opposite of Moore's Law.

      Every 18 months hardware became[1] twice as fast, and every 18 months software becomes[1] half as fast.

      [1] This trend has mostly stopped for hardware, but software is still becoming slower with each new version, something I can see at the office where everybody is complaining about how slow the PCs are running with Windows 10, whereas mine is running Windows 7 just fine[2].

      [2] Well, fine for Windows anyway. Of course things don't happen instantly, like I'm used to on my home Linux machine.

    8. Re:Typical of today's programmer by MitchDev · · Score: 2

      Remember when the OS fit on a floppy and only did the most basic tasks rather than spying on the user and trying to be everything including the kitchen sink?

    9. Re: Typical of today's programmer by AcerbusNoir · · Score: 2

      It has very little to do with abstraction layers.

      It's poor implementation, lack of appropriate testing and, in a lot of cases the aforementioned is a result of unrealistic deadlines.

    10. Re:Typical of today's programmer by Yvan256 · · Score: 2

      I remember when the "OS" was stored in ROM ICs, computers didn't even have floppy drives and could boot under one second.

    11. Re:Typical of today's programmer by david_thornley · · Score: 2

      Data compaction is easy. Here's the entire NSA archive in compacted form: 1. It's a bit lossy, but with the right expansion program it'll work fine.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    12. Re:Typical of today's programmer by david_thornley · · Score: 2

      If you don't understand an API and what the functions do, it doesn't matter if the code you write has any abstraction. You're still probably going to screw up.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  2. Re:distributed client? by Anonymous Coward · · Score: 2, Interesting

    Nah, then you'd see an increased network usage. This is probably just Firefox's fsync bug repeated: in order to ensure data integrity, SQLite has a mode that fsyncs on commit. (After all, if the data isn't written to storage, it isn't really committed.) If you combine that with autocommit after every minor transaction, you get a ton of fsyncs and massive data usage.
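    The autocommit pattern described above can be sketched as follows. This is a minimal illustration using Python's standard `sqlite3` module, not Spotify's or Firefox's actual code; the `events` table and the helper names are hypothetical, and the in-memory database used here never touches a disk, so it only counts commits rather than measuring real fsync traffic.

```python
import sqlite3

def make_db():
    # In-memory database: enough to count commits, but no real fsync happens.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events(msg TEXT)")
    return conn

def insert_autocommit(conn, rows):
    """Commit after every row: on a disk-backed database with synchronous
    writes enabled, each commit is a point where data must be flushed."""
    commits = 0
    for row in rows:
        conn.execute("INSERT INTO events(msg) VALUES (?)", (row,))
        conn.commit()
        commits += 1
    return commits

def insert_batched(conn, rows):
    """Wrap all rows in one transaction: a single commit for the batch."""
    with conn:  # commits once on success, rolls back on an exception
        conn.executemany("INSERT INTO events(msg) VALUES (?)",
                         [(r,) for r in rows])
    return 1

rows = [f"event {i}" for i in range(1000)]
db_a, db_b = make_db(), make_db()
autocommit_commits = insert_autocommit(db_a, rows)
batched_commits = insert_batched(db_b, rows)
count_a = db_a.execute("SELECT COUNT(*) FROM events").fetchone()[0]
count_b = db_b.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

    Both databases end up with the same rows; the difference is how many commit points (and therefore potential syncs) it took to get there.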

  3. Re:Do not store songs locally by Rockoon · · Score: 4, Insightful

    I'm not sure it's a problem solved by that.

    I think the gist of it is that for every small change to the data they store on your device, they are re-writing the entirety of the dataset they are keeping. So for instance they are logging a record that says "didn't play music this minute" but are re-writing the entire multi-year log.
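    To see why that pattern would hurt, here is a rough back-of-the-envelope model. It is only a sketch of the hypothesis in this comment, not a description of what Spotify actually does; the 100-byte record size and the one-record-per-minute rate are made-up numbers.

```python
def bytes_written_rewrite(record_size, n_records):
    """Rewrite the whole log on every new record.

    After the k-th record the log holds k records, so the total written is
    record_size * (1 + 2 + ... + n) = record_size * n * (n + 1) / 2.
    """
    return record_size * n_records * (n_records + 1) // 2

def bytes_written_append(record_size, n_records):
    """Append each record once: total written grows linearly."""
    return record_size * n_records

# Hypothetical workload: one 100-byte record per minute for a day.
record_size = 100
records_per_day = 24 * 60  # 1440

rewrite_total = bytes_written_rewrite(record_size, records_per_day)
append_total = bytes_written_append(record_size, records_per_day)
amplification = rewrite_total / append_total
```

    The rewrite strategy grows quadratically with the number of records, so the gap widens the longer the log is kept.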

    I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything".

    --
    "His name was James Damore."
  4. Re:SSD finite write capacity help by Zak3056 · · Score: 3, Funny

    Why the hell would you put a pagefile in a ramdisk? "Yo dawg, I heard you love pages?"

    --
    What part of "shall not be infringed" is so hard to understand?
  5. Re:SSD finite write capacity help by ledow · · Score: 3, Informative

    If you're writing enough to pagefiles, you need more RAM anyway.

    If you're writing a lot to temporary areas, you need to stop doing so.

    That said, I'm on an SSD machine at the moment that has been running for 6 months, with absolutely no special treatment, imaged from a years-old working PC without changing anything, and it's written 1.5TB. 1TB of that was the initial imaging process.

    It's the main workhorse in an IT Office in a school, used for 10+ hours every single day for everything imaginable. Client machines rarely use much.

    It has a write-life of 100TB. If it dies, I just hit F12 and re-image cleanly.

    At current usage (not including the initial image), I count that as 1TB of writes a year, which gives a life longer than the expected lifetime of the PC itself, however far out I am.
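    The arithmetic can be written out as below. The 100TB endurance and 1TB/year figures come from this comment; the 0.7TB/day figure is an assumption that takes the worst case reported in the summary (about 700GB over a day) and pretends it is sustained all year, which is a pessimistic simplification.

```python
def years_of_life(endurance_tb, written_tb_per_year):
    """Years until the rated write endurance is used up at a constant rate."""
    return endurance_tb / written_tb_per_year

ENDURANCE_TB = 100.0  # rated write life quoted in the comment

# Normal office workload from the comment: roughly 1 TB written per year.
normal_years = years_of_life(ENDURANCE_TB, 1.0)

# Assumed worst case: ~0.7 TB per day, every day (hypothetical sustained rate).
runaway_tb_per_year = 0.7 * 365
runaway_years = years_of_life(ENDURANCE_TB, runaway_tb_per_year)
```

    Under those assumptions the same drive goes from outliving the PC by decades to exhausting its rated endurance in well under a year.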

    There's no need for special treatment, no need to use special SSD transfer software, no need to over-provision, or increase RAM cache or anything else. Just have a PC that isn't slogging itself to death, and slap an SSD in.

    Don't expect it to last forever, but you shouldn't need to adjust ANYTHING at all.

    And I've done this on all the staff work machines earlier this year - zero failures so far and it has made much more of a performance difference than doubling the amount of RAM. In fact, where machines had motherboards that were limited in RAM, we SSD'd and saw HUGE performance increases better than those clients whose RAM we doubled but are running on traditional hard disks.

    At home I have a 1TB EVO 850 and that's the same. Literally imaged byte-for-byte, and is stupendously fast and no need for any software changes whatsoever, and the write numbers are predicting 20+ years of life despite a similar 10+ hours a day of usage.

    Don't RELY on it never failing. But they are going to be in warranty (whether that's by number of years, or data written) for the life of your machine, under even heavy usage, unless you're doing something incredibly stupid (like use in NVR, RAID, or similar without buying a high-write-endurance model).

  6. Re:Do not store songs locally by Anonymous Coward · · Score: 5, Informative

    From the comments on Ars, it seems pretty clear that there is a bug in the app causing it to repeatedly compact the sqlite database it uses. I'm sure we all know that that is something which should be done only when actually needed, so that's clearly a bug, not inefficiency.
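    One way to compact "only when actually needed" is to check how much of the file is free pages before running `VACUUM`. The sketch below uses Python's standard `sqlite3` module and SQLite's `page_count` and `freelist_count` pragmas; the `should_vacuum` helper, the `blobs` table, and the 25% threshold are all hypothetical, and nothing here is taken from Spotify's actual fix.

```python
import os
import sqlite3
import tempfile

def should_vacuum(conn, min_free_fraction=0.25):
    """Return True only if a meaningful share of the file is unused pages."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return page_count > 0 and free_pages / page_count >= min_free_fraction

path = os.path.join(tempfile.mkdtemp(), "cache.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA auto_vacuum = NONE")  # make freed pages stay on the freelist
conn.execute("CREATE TABLE blobs(data BLOB)")
conn.executemany("INSERT INTO blobs VALUES (?)",
                 [(b"x" * 4096,) for _ in range(200)])
conn.commit()

before_delete = should_vacuum(conn)  # nothing freed yet

conn.execute("DELETE FROM blobs")
conn.commit()                        # VACUUM cannot run inside a transaction
after_delete = should_vacuum(conn)   # most pages are now on the freelist

if after_delete:
    conn.execute("VACUUM")           # rewrites the whole file once
after_vacuum = should_vacuum(conn)
conn.close()
```

    Because `VACUUM` rebuilds the entire database file, calling it unconditionally in a loop writes the full file size each time, which would match the behaviour people are reporting.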

  7. Re:Addendum & small 'correction'... apk by lucm · · Score: 2

    Pagefiles I don't put on software ramdisk (had to clarify that), but on HDD instead

    So you put the things that benefits the most from fast i/o on your slowest storage device instead of your ssd? Why not put it on a floppy drive, or a mounted network share connected to a VPS hosted on the other side of the country if you like to slow things down?

    Or maybe you just love that spinning hdd sound.

    --
    lucm, indeed.
  8. Re: SSD finite write capacity help by Anonymous Coward · · Score: 2, Funny

    Should I also move my HOSTS file to a ram disk?

  9. Persistence abstracted too far? by Qbertino · · Score: 4, Informative

    This sounds like some smart software architect took the abstraction of the persistence/storage layer of the Spotify stack too far whilst at the same time storing too many minuscule datapoints in Spotify's objects. Because once abstracted properly, adding attributes to your objects and the entire stack is trivial.

    Think of it:
    If your stack's ORM neatly abstracts everything concerning persistence and on the backside syncs neatly whenever it has the opportunity, all you need is app-side developers and software designers storing every little piece of data they can find that changes every millisecond, and then you have your bandwidth/load disaster as described.

    If something like this is the case with Spotify, which I do strongly suspect, it is a good example that goes to show that you can take clean-room design too far. And that a haphazard duct-tape and chickenwire approach to product development can have significant advantages, as you build around unforeseen roadblocks on a daily basis and only add the features really needed.

    I see an example of this every day, as I am currently doing WordPress development and building a WordPress pipeline for an agency. Large parts of the WP legacy architecture are an abysmally convoluted mess built by people who shouldn't have been let near a keyboard 15 years ago. But having a non-developer build a production capable demo of a website in WP is significantly faster than starting with an actual UX prototype, which quickly leads our team into real-world problems that we often haven't suspected. And suddenly a proper ORM and cleanroom design would cause hassle at one end or the other.

    My two eurocents.

    --
    We suffer more in our imagination than in reality. - Seneca
  10. FTFY by Overzeetop · · Score: 2

    Generally there is no reason to do that, but there are some poorly coded applications that will page memory to disk, even when they don't need to.

    --
    Is it just my observation, or are there way too many stupid people in the world?
  11. Re:Do not store songs locally by Hognoxious · · Score: 4, Insightful

    Rust spinners wear out too. This can be a particular problem if it's constantly bringing the drive out of power-down.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  12. Spotify, why by lucm · · Score: 4, Informative

    I use Google Play Music. Not only can it cache songs, you can also upload your own collection. And now that Google has acquired and integrated Songza, their playlists are awesome.

    --
    lucm, indeed.
    1. Re:Spotify, why by pnutjam · · Score: 2

      Google won't allow me to use the family plan with my custom domain, thanks for rubbing it in.

  13. Re:Do not store songs locally by jareth-0205 · · Score: 2

    Problem solved.

    This is a non issue on desktops, really.

    It takes a pretty small worldview to not be able to imagine people on limited bandwidth / unreliable internet connections.

  14. Re:Do not store songs locally by ATMAvatar · · Score: 4, Funny

    I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"

    XML is like violence - if it doesn't solve your problem, you aren't using enough of it.

    --
    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
  15. Re:Do not store songs locally by Holi · · Score: 2

    This isn't a bandwidth issue; nothing is being downloaded. It takes a pretty dense worldview not to read the article you are posting on.

    --
    Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
  16. has this been going on for years? by aisaac · · Score: 4, Informative

    Here is a possibly related complaint from almost three years ago.

  17. Re:Do not store songs locally by frank_adrian314159 · · Score: 4, Funny

    If you're using XML to solve a problem, you actually have two problems.

    --
    That is all.
  18. Browsers are doing it too by cjmnews · · Score: 2

    If you leave browsers up all the time, they have the same problem. Firefox and Chrome. https://www.grc.com/sn/sn-580....

    --
    You can lose something that is loose, so tighten the loose item so you don't lose it.
  19. Re: Do not store songs locally by Yvan256 · · Score: 2

    Wait, there's articles now?

  20. Re:Do not store songs locally by Striek · · Score: 2

    You must be new here

    (notices UID) err, now I'm just confused...

    --
    "Government is like fire; a handy servant, but a dangerous master." -- George Washington
  21. disk usage consideration by yes-but-no · · Score: 2

    Seems developers don't consider optimizing disk I/O. Recently I watched a live event streamed in Firefox (from a not so great website, I guess it uses Flash) and it kept my disk at 100% all the time. Why should a streaming service write all that video data to disk? Can't it just cache in RAM, display, and forget the bits?

    Such unnecessary disk I/O wears my disk down, increases power use (if I'm on, say, battery on my laptop) and of course creates a kind of internal DoS as it hogs the disk I/O and the rest of the processes can't get disk I/O or get delayed, resulting in a sluggish OS response even in, say, a file explorer. I.e. a well behaved app should not hog any shared piece of hardware/resource (like disk I/O), leading to system instability.

    Apps should be benchmarked not only on their memory footprint or CPU usage (like algorithmic complexity/big-O) but also on their external data traffic, like disk/network I/O.

    I regularly watch my disk i/o usage by processes and get rid of any if I suspect they are hitting it unnecessarily hard.
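    For anyone who wants to do the same check by hand, here is one possible sketch. It assumes Linux, where the kernel exposes per-process counters in `/proc/<pid>/io`; the `parse_proc_io` and `read_proc_io` helpers and the sample numbers are made up for illustration, and other operating systems need different tools.

```python
def parse_proc_io(text):
    """Parse the 'name: value' lines of a /proc/<pid>/io dump into a dict."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def read_proc_io(pid="self"):
    """Read live counters for a process (Linux only; may need privileges)."""
    with open(f"/proc/{pid}/io") as fh:
        return parse_proc_io(fh.read())

# Sample dump with invented numbers, in the layout Linux uses for this file.
sample = """rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

counters = parse_proc_io(sample)
written_mib = counters["write_bytes"] / (1024 * 1024)
```

    Sampling `write_bytes` twice, some minutes apart, and taking the difference gives a rough write rate for an idle app.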