Slashdot Mirror


Ask Slashdot: Smarter Disk Space Monitoring In the Age of Cheap Storage?

relliker writes In the olden days, when monitoring a file system of a few 100 MB, we would be alerted when it topped 90% or more, with 95% a lot of times considered quite critical. Today, however, with a lot of file systems in the Terabyte range, a 90-95% full file system can still have a considerable amount of free space but we still mostly get bugged by the same alerts as in the days of yore when there really isn't a cause for immediate concern. Apart from increasing thresholds and/or starting to monitor actual free space left instead of a percentage, should it be time for monitoring systems to become a bit more intelligent by taking space usage trends and heuristics into account too and only warn about critical usage when projected thresholds are exceeded? I'd like my system to warn me with something like, 'Hey!, you'll be running out of space in a couple of months if you go on like this!' Or is this already the norm and I'm still living in a digital cave? What do you use, on what operating system?
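
The "you'll be running out of space in a couple of months" warning asked for above can be sketched as a straight-line projection. This is only a minimal sketch, assuming usage is sampled periodically and growth is roughly linear; the names `days_until_full` and `current_usage` are hypothetical and not taken from any monitoring product, while `shutil.disk_usage` is the standard-library call that would supply real readings.

```python
import shutil


def days_until_full(samples, total_bytes):
    """Project days until the volume fills, from (day, used_bytes) samples.

    Fits a least-squares line through the samples to get a growth rate, then
    divides the space left after the most recent sample by that rate.
    Returns None when usage is flat or shrinking, or when there are too few
    samples to fit a line.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # not growing, so no projected fill date
    _, last_used = samples[-1]
    return max(0.0, (total_bytes - last_used) / slope)


def current_usage(path="/"):
    """Return (total_bytes, used_bytes) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.used
```

A monitoring check built on this would alert when the projection drops below some lead time (say, the time it takes to procure more storage) rather than when a fixed percentage is crossed.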

170 comments

  1. I delete things when I'm done using them by Anonymous Coward · · Score: 5, Funny

    I never run out of disk space.

    1. Re:I delete things when I'm done using them by bobbied · · Score: 4, Interesting

      I'll bet that's not true...

      Seems to me that the stuff I work on keeps getting bigger and bigger, as does my collection of digital pictures and videos. While I attempt to pare down what I keep, some of it stays around...

      I expect that most users do the same things and thus data keeps piling up. I don't think it matters how good you are at deleting stuff you don't need anymore.

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    2. Re:I delete things when I'm done using them by jedidiah · · Score: 1

      I never run out of space. As disks get larger and larger, the risk of running out of space seems like the single least significant thing possible. The real issue is corruption.

      Based on the headline, I would have expected this to be about content verification with all of the ZFS fanboys coming out of the woodwork to extol its virtues.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    3. Re:I delete things when I'm done using them by dissy · · Score: 4, Interesting

      I delete things when I'm done using them

      1) Many of my things I either desire to use for many years to come (a video download I paid for), or am required to keep to cover my ass (taxes, logs, most data at work due to policies, etc)

      2a) The cost of more storage space is almost always less than the cost of the time to clean up files that could be deleted. In the context of work this does depend heavily on exactly who made the data and their rate of pay / work load - but I've noted the higher up execs and managers tend to be the worst hoarders as well as of course the highest rates of pay. Most of the lower techs on the shop floor don't even have access above read-only to the network storage here, though that is far from universal everywhere.

      2b) Yes there are other people whose time is not as expensive, but no one other than the data's owner/creator can know 100% what needs to stay vs what can go (and sometimes even the owner/creator chooses wrong.)

      3) After deleting/archiving data, the chances of you needing it in the future are typically higher to much higher than the chances you are really done with it.

      4) For the small number of times you really are done with it (like, totally and fur sure), the amount of data that gets deleted is generally such a small percentage of the whole that, while still a good thing to do, doesn't really help much with the problem at hand - freeing up a lot of space for future needs.

      I never run out of disk space.

      You either have too much free storage space, not enough data, or possibly both :P

    4. Re:I delete things when I'm done using them by Mister+Liberty · · Score: 1

      I suppose you're much like me.
      Both of us are a "Being Digital".

    5. Re:I delete things when I'm done using them by pooh666 · · Score: 1

      Hell yeah :)

    6. Re:I delete things when I'm done using them by rijrunner · · Score: 1

      Generally, if you are the type of organization that needs to monitor 1TB of filespace, you're the kind of company that can fill that 1TB of filespace.

      Where I currently work, it is not unusual to fill 1TB in about the same amount of time it took to fill a 100MB drive back in the day.

    7. Re:I delete things when I'm done using them by donaldm · · Score: 1

      I never run out of space. As disks get larger and larger, the risk of running out of space seems like the single least significant thing possible. The real issue is corruption.

      Based on the headline, I would have expected this to be about content verification with all of the ZFS fanboys coming out of the woodwork to extol its virtues.

      Well if you use a FAT file-system then you are asking for trouble. Most modern file-systems use journalling, which is fairly reliable, although I do recommend you find the best file-system that suits you, which in the case of a Microsoft OS is normally NTFS. If you use Linux then you really have a choice such as Ext3 and 4 (Ext2 does not support journalling) which are supported by Red Hat. Other file-systems are BtrFS (supported by Red Hat), ZFS (not supported on Linux by Oracle), JFS (supported by IBM), XFS (Silicon Graphics) and NTFS (not supported on Linux by Microsoft) - just to name a few.

      Of course you are really asking for trouble if you don't back up your data no matter what file-system you choose. In fact with a journalling file-system there is more chance of wilful or accidental removal of data than you would have of corruption caused by a journalling file-system.

      Personally I really can't see the attraction for ZFS on Linux when BtrFS will pretty much do the same job. Still, if you want it, it is available as OpenZFS.

      --
      There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
    8. Re:I delete things when I'm done using them by ByTor-2112 · · Score: 1

      Don't be a ZFS hater.

  2. Performance issues? by brausch · · Score: 3, Insightful

    How does performance change as the big disks approach full? That was always one reason for the rule of thumb about keeping at least 10% free space on UNIX.

    --
    "Almost every wise saying has an opposite one, no less wise, to balance it." - George Santayana
    1. Re:Performance issues? by Anonymous Coward · · Score: 5, Informative

      Well, ext4 strives to scatter files around disk to avoid fragmentation. Once the disk begins to approach full, it has to use even smaller and smaller holes to place data into, which causes some fragmentation.

    2. Re:Performance issues? by __aaclcg7560 · · Score: 2

      You want to keep the hard drive at 50% or less to maximize performance. If the hard drive is more than 50% full, the read/write head takes longer to reach the data. If the hard drive is 90% full, most OSes will have performance issues.

    3. Re:Performance issues? by gnasher719 · · Score: 4, Insightful

      You want to keep the hard drive at 50% or less to maximize performance. If the hard drive is more than 50% full, the read/write head takes longer to reach the data. If the hard drive is 90% full, most OSes will have performance issues.

      Actually, any OS will have performance issues, because the transfer rate (MB/sec) drops from the outside tracks to the inside tracks. That's why for home use, you just buy the biggest hard drive that you can easily afford (if you need 1TB, you buy 3TB), because that way you use only the parts of the drive with the highest transfer speed, and the average head movement time is also a lot less.

    4. Re: Performance issues? by Anonymous Coward · · Score: 0

      Um performance is not the primary reason to alert on disks being full on SERVERS, and I don't think that was ever true.

      It's because whatever filled up the disks will stop working when it can no longer fill up the disk. The increased fragmentation that happens when it is nearly full won't happen for long enough to matter when something is automatically chugging through space.

      That's why you usually have two disk thresholds instead of one, to gauge how fast you're running through it.
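
A rough illustration of the two-threshold idea in the comment above: the time between the warning and critical crossings gives a fill rate, which in turn suggests how long is left. This is a sketch under the assumption that the fill rate stays constant; the function names and the percentage/time parameters are hypothetical.

```python
def fill_rate_between_thresholds(total_bytes, warn_pct, crit_pct,
                                 warn_time, crit_time):
    """Estimate bytes consumed per second between the two alert crossings.

    warn_time and crit_time are timestamps in seconds at which usage crossed
    the warning and critical percentages respectively.
    """
    elapsed = crit_time - warn_time
    if elapsed <= 0:
        raise ValueError("critical crossing must come after the warning crossing")
    consumed = total_bytes * (crit_pct - warn_pct) / 100
    return consumed / elapsed


def seconds_until_full(total_bytes, crit_pct, rate):
    """Time left after the critical alert if the same fill rate continues."""
    remaining = total_bytes * (100 - crit_pct) / 100
    return remaining / rate
```

For example, a volume that takes a month to go from 90% to 95% leaves roughly another month after the critical alert, while one that covers the same gap in ten minutes needs attention immediately.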

    5. Re:Performance issues? by RenderSeven · · Score: 3, Interesting

      I typically partition the drive into two logical drives. The inner partitions with awful performance are where my media goes (movies, music, photos). The performance falloff is non-linear. Also, performance degradation over time is worse for the inner tracks, so inner tracks are where you put data that is more or less static, or at least written sequentially.

    6. Re:Performance issues? by fuzzyfuzzyfungus · · Score: 1, Insightful

      Unless you are the sort of disconcertingly disciplined and organized person who sets up a monitoring and alerting system for their dinky little desktop, you probably aren't talking about 'the hard drive'. At a minimum, you are probably dealing with some flavor of RAID, or ZFS, or an iSCSI LUN farmed out by some SAN that does its own mysterious thing behind the expensive logo, or some other additional complexity. Flash SSDs are also increasingly likely to be involved, quite possibly along with some RAM caches in various places.

    7. Re:Performance issues? by Nutria · · Score: 1

      Too bad my mod points expired, because that's exactly what I was thinking. Although, 20% was my rule of thumb.

      It probably has a lot to do with usage patterns: is your multi-TB volume used as an IMAP server, and thus chock full of 5-250KB files -- so that the FS can easily find contiguous holes --, or is it a video server full of 1-5GB files so that contiguous holes are much harder to find when the disk is "only" around 70%? Or a DB server whose files are even huger, and so contiguous holes are impossible to find?

      Even then, circumstances can alter the situation, since if you create a bunch of *huge* tablespaces on a virgin FS and they never extend, then you can get up to a high usage percentage without fragmentation.

      --
      "I don't know, therefore Aliens" Wafflebox1
    8. Re:Performance issues? by NoNonAlphaCharsHere · · Score: 1

      Picture this:

      You're pulling into the parking lot at work, and you know that there are only 5% of the spaces free. How long will you have to drive around before you find a place to park?
      Now picture pulling into the parking lot at Disney World, and you know that 5% of the spaces are free. Now how long will you have to drive around?

    9. Re:Performance issues? by Anne+Thwacks · · Score: 0, Troll

      If you use Unix on a server, you should have multiple partitions. If you don't know that, you should not be using Unix. If you have a server, and you are not using Unix, you should not be in charge of a server. The partitions might be on different H/Ds if performance matters. Different partitions get used in different ways. Some grow faster than others.

      If any partition is more than 50% full, you had best have a plan for what to do next, even if it will not need to be done for two years. If you don't plan two years ahead, you should not be running a server. If you use SCSI disks on Unix, it's easy to add more hard disks. If you are not using SCSI hard disks, well, presumably the data was not very important anyway.

      Did I mention the lawn?

      --
      Sent from my ASR33 using ASCII
    10. Re:Performance issues? by Anonymous Coward · · Score: 0

      some fragmentation

      Wish it was only 'some'. I have seen as much as 30k fragments for 1 file. It would be interesting if it would decide to move files to punch a bigger hole.

    11. Re:Performance issues? by Anonymous Coward · · Score: 0

      If the free spots are uniformly distributed, I'm not seeing a difference... Either way you come across an empty spot every 20 spaces.

    12. Re:Performance issues? by kuzb · · Score: 4, Insightful

      That's an interesting idea for the budget-minded, but personally I think if performance is actually an issue I'd use SSDs for things that need to be performant, and store everything else on regular drives.

      --
      BeauHD. Worst editor since kdawson.
    13. Re:Performance issues? by __aaclcg7560 · · Score: 1

      Did I mention the lawn?

      Did I ever mention that I was running Unix and/or a server? Re-read my comment. I'm talking about hard drives in general.

    14. Re:Performance issues? by mysidia · · Score: 1

      You want to keep the hard drive at 50% or less to maximize performance.

      You're talking about short-stroking the drive which is fundamentally a different question --- than what percentage of your space usage is best for performance.

      For the sake of argument: Let's assume you create a single partition on your hard drive that only uses the first 30% of the disk drive, AND your partition's starting cylinder is carefully chosen to be in alignment with your allocation units / stripes down all RAID levels to avoid RAID crossing.

      What amount of filesystem usage is appropriate?

      Is there a point at which you should increase the size of your partition and filesystem for performance reasons, and how do you decide?

    15. Re:Performance issues? by NoNonAlphaCharsHere · · Score: 2

      Given a spherical cow of uniform density...

      That isn't how First Fit works. Ever.

    16. Re:Performance issues? by Anonymous Coward · · Score: 0

      Given a linked list of free spaces, there wouldn't be a difference. Even with a situation where you need different sized spaces (e.g. a hard drive), proper data structures will ensure the difference is pretty minor.

    17. Re:Performance issues? by adri · · Score: 1

      Not entirely - it's an easier problem to solve when you're parking a car that takes up a single spot. Now, imagine you have a trailer count that is between 0 and 1024 parking spots wide, and breaking your trailer up into pieces (and then reassembling it!) is feasible but it takes time to do that.

      That's why it's not that simple.

    18. Re:Performance issues? by afidel · · Score: 4, Insightful

      Inner tracks have better seek times, which is why high performance applications often "short stroke" drives (i.e. artificially restrict the percentage of the drive used so that only the inner tracks are utilized, though with modern drives and transparent sector remapping it's unlikely this practice actually works). Outer tracks have better streaming performance because more sectors move under the head in a given timeframe.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    19. Re:Performance issues? by Opportunist · · Score: 1

      Hey! No "your mom" jokes here!

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    20. Re:Performance issues? by gstoddart · · Score: 2

      Hmmm ... if the goal is to keep all of my disks under 50% to maximize performance ... don't I effectively need twice as much disk? And if it's under RAID I'd need at least 4x as much disk?

      Which kind of defeats the purpose of both having cheaper disk, as well as having monitoring to let me know when it's filling.

      Sorry, but who has the luxury of buying twice as much disk so we can keep them all under 50%??

      What you say might get you a performance boost, but otherwise it doesn't make a lot of sense to me.

      --
      Lost at C:>. Found at C.
    21. Re:Performance issues? by __aaclcg7560 · · Score: 1

      When my vintage MacBook (2006) started slowing down last year, I read that keeping the hard drive to less than 50% full would improve performance. As it was, my hard drive was 60% full and I was able to reduce it down to 40%. Performance improved noticeably. Replacing the hard drive with an SSD improved the overall performance some more.

    22. Re:Performance issues? by gstoddart · · Score: 1

      If you have a server, and you are not using Unix, you should not be in charge of a server.

      Wow, that is quite possibly one of the stupidest things I've seen in a while.

      "Yarg! All servers must run teh unix, because all software runs teh unix, and if it doesn't run teh unix it must be crap".

      Do people actually put you in charge of servers? For real?

      I'm no Microsoft fanboi, but it simply is not possible to run all software a large organization needs on unix.

      And believing otherwise is the sign of someone who either doesn't work for a large organization, or has been relegated so far into the corner as to be out of touch with reality.

      --
      Lost at C:>. Found at C.
    23. Re:Performance issues? by Slashdot+Parent · · Score: 1, Insightful

      That was pretty caustic, wasn't it!

      Anyway, in today's virtualized world, none of what you ranted about really matters anymore. If disk I/O is important to your application, you're using SSD. If your filesystem needs more space, you just grow it using your platform's volume manager. And yes, real work gets done on Windows servers now. It's not my personal cup of tea, but you might as well just acknowledge it.

      And you don't plan 2 years ahead because who knows what your requirements will be in 2 years?

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    24. Re:Performance issues? by unrtst · · Score: 1

      Even then, circumstances can alter the situation, since if you create a bunch of *huge* tablespaces on a virgin FS and they never extend, then you can get up to a high usage percentage without fragmentation.

      This.
      To explain it to users that have no clue what a tablespace is, but may know what a partition is, imagine:

      * setting up a separate partition for every set of similar sized files
      * for very large files, give each its own partition
      * pad every file out to a fixed size

      For example, for mp3's, pad every one of them to 6mb (I'm guessing; do some stats on your archive to determine optimal size).
      Every time you write one, it's 6mb to that one partition.
      If you delete one, there will be a hole in the filesystem, but it will be exactly 6mb.
      Next time you write one, it will fill in the hole = no significant fragmentation.

      In that case, you can allocate 100% of the drive. Planning for growth is left up to you, and the above isn't really a perfect description of tablespaces :-)

      Back to the typical desktop, AFAIK, the 10% rule still applies. It's not for growth, it's for optimal filesystem performance. Less than that and there's not enough free space to avoid significant fragmentation under typical usage.
      If you consider the above, it may be obvious that some files are worse than others at contributing to fragmentation. Log files, for example, can be awful. FS's do a damn good job nonetheless.

    25. Re:Performance issues? by gstoddart · · Score: 1

      Given a spherical cow of uniform density...

      With perfect conductivity, and with no mass ...

      --
      Lost at C:>. Found at C.
    26. Re:Performance issues? by __aaclcg7560 · · Score: 1

      Sorry, but who has the luxury of buying twice as much disk so we can keep them all under 50%??

      I'm planning to replace the 3x80GB hard drives in my FreeNAS file server at home with 3x1TB hard drives, as Newegg has 1TB drives on sale for $50. That will give me 80% free space in a RAID5 configuration for $150.

    27. Re:Performance issues? by Anonymous Coward · · Score: 0

      I did the same thing. But then SSDs happened.

    28. Re:Performance issues? by RabidReindeer · · Score: 3, Insightful

      If you use Unix on a server, you should have multiple partitions.

      I use LVM, you insensitive clod!

      Juggling physical partitions is a royal pain.

    29. Re:Performance issues? by Anonymous Coward · · Score: 0

      Depends which organization...if you're forced to you'll need to grudgingly keep some windows servers around to serve the office/sharepoint fleet, but many large organizations putter along with only *nix-based servers and a few windows desktops just fine (CERN, Munich, many scientific orgs, etc.).

    30. Re:Performance issues? by gstoddart · · Score: 1

      Sure, great ... and those of us in the real world who manage 10s or 100s (or in some cases 1000s) of terabytes?

      We're talking an entirely different price point and quantity.

      I seriously doubt people with NetApp servers and other large storage could even consider keeping 50% of their disk space empty just to make it slightly faster.

      My user account on my personal machine has over 1TB of stuff in it, which gets mirrored to two other drives. That adds up after a while when you're staying under 50%.

      So I'd be looking to double or triple that number.

      And, from the very little I know about RAID 5 ... if you only have 3 drives in it, you're not really getting a whole lot of added security, are you?

      --
      Lost at C:>. Found at C.
    31. Re:Performance issues? by Anonymous Coward · · Score: 0

      >> And if it's under RAID I'd need at least 4x as much disk?

      No, I'll stay 2x.

    32. Re:Performance issues? by Anonymous Coward · · Score: 0

      Hmm, wouldn't it be more likely to find a large contiguous hole in a partition filled to 80% with large files than one 80% filled with 5-150kB files? I really don't know, it needs some thinking...

    33. Re:Performance issues? by Anonymous Coward · · Score: 1

      Actually, hard drive track sizes are radii based and not all the same physical size even though they hold the same amount of data. That means inner tracks take the same amount of spin and time to be read as the outer tracks.

      (http://en.wikipedia.org/wiki/Disk_sector#mediaviewer/File:Disk-structure2.svg)

    34. Re:Performance issues? by gstoddart · · Score: 1

      >> And if it's under RAID I'd need at least 4x as much disk?

      No, I'll stay 2x.

      Depends on your level of mirroring, doesn't it?

      I know people who do storage for a living, and some places use the RAID x+y where you have levels of RAID giving mirroring, combined with striping and parity to get additional redundancy. In those situations, the amount of raw space you need is at least 2x the amount of usable space you want to end up with.

      And, a lot of those places replicate the entire storage to another instance as the redundant backup/failover/DR.

      --
      Lost at C:>. Found at C.
    35. Re:Performance issues? by __aaclcg7560 · · Score: 1

      And, from the very little I know about RAID 5 ... if you only have 3 drives in it, you're not really getting a whole lot of added security, are you?

      RAID5 requires a minimum of three drives. If one drive fails, the other two drives can continue to function in degraded mode. The entire RAID would be lost if you have more than one hard drive failure. You could designate one or more extra hard drives as spares to automatically replace a failed hard drive. For extra security, each hard drive needs to be on a separate controller (which is what I have in my FreeNAS box). I typically have a hard drive crash every five years, which is why I replace my hard drives every five years. They keep getting bigger and cheaper all the time.

      RAID6 may not be the answer for the enterprise.

    36. Re:Performance issues? by JeffAtl · · Score: 1

      many large organizations putter along with only *nix-based servers and a few windows desktops just fine (CERN, Munich, many scientific orgs, etc.).

      I think you might be suffering from confirmation bias or maybe just denial - CERN does use Microsoft server based solutions.

      Here is just one example... CERN Using Microsoft Lync for Collaboration and Mobility

    37. Re:Performance issues? by Anonymous Coward · · Score: 0

      If you use Unix on a server, you should have multiple partitions.

      Speak for yourself. I use Unix on my fileserver, and all my drives are 90-95% full (ranging in size from 2TB to 4TB each), and I have never once noticed a speed issue except in rare cases where I accidentally filled the drive to 99% full.

      It totally depends what you are doing with your data. My data is mostly archival. I mostly write to it once and read back from it regularly (scrubbing or processing files) and most of my files are huge. I have no need for multiple partitions. Even the slow data rate of USB 2 is far in excess of my speed needs on this fileserver.

    38. Re:Performance issues? by dcollins117 · · Score: 1

      The inner partitions with awful performance are where my media goes (movies, music, photos).

      Hmm. I keep my media (movies, music, & photos) on an external USB drive. It's probably the slowest of all my storage devices and it works just fine. I'm sure there are higher latencies than your setup but I certainly never noticed them.

    39. Re:Performance issues? by Anonymous Coward · · Score: 0

      Also - you do realize that "inner tracks" you speak of are on the outer portion of the disc?

    40. Re:Performance issues? by Anonymous Coward · · Score: 0

      Be careful: with a RAID5 set, it is fairly likely that when you go to replace your "1 bad drive" there will be unrecoverable read errors on one or more portions of the remaining disks that gave no indication of failure before the rebuild process started. It is very surprising how few vendors' RAID implementations utilize low-use time on the devices to slowly and consistently verify that a rebuild is possible by reading all N disks and looking for any parity issues. MTBF only really gives you a very wide view of the failure rates for these devices; it gets fairly scary quickly when you start talking about large sets with lots of disks all sitting together. Here is an old article about MTBF @ Google.

      http://storagemojo.com/2007/02/19/googles-disk-failure-experience/

      Lastly, don't think that SSD is the fix here. SSDs have a completely different failure mode than spindles; in many ways they are more scary in large sets.

    41. Re:Performance issues? by Anonymous Coward · · Score: 1

      I work at Disney, Insensitive Claude.

    42. Re:Performance issues? by Anonymous Coward · · Score: 0

      Uhh, they switched to https://en.wikipedia.org/wiki/Zone_bit_recording a long time ago.

    43. Re:Performance issues? by Anonymous Coward · · Score: 0

      " If the hard drive is more than 50% full, the read/write head takes longer to reach the data."

      You write as if 50% were some sort of threshold, which is absurd.

    44. Re:Performance issues? by Anonymous Coward · · Score: 0

      Bzzzt! Wrong! On modern disks, more data is stored on the outside tracks, as you would know if you had actually bothered to read the article rather than just look at the pictures:

      http://en.wikipedia.org/wiki/Disk_sector#Zone_bit_recording

    45. Re:Performance issues? by LinuxIsGarbage · · Score: 1

      With Windows, and NTFS, the MFT (Master File Table) occupies 12.5% of the disk space. Once all other sectors on the disk are full, it will actually store files IN the MFT reserved space, and you run the risk of fragmenting the MFT itself and decreasing performance.

      As well the defrag tool (automatically scheduled or not), requires 15% free space to run.

    46. Re:Performance issues? by gnasher719 · · Score: 1

      Sorry, but who has the luxury of buying twice as much disk so we can keep them all under 50%??

      i just had a look; if you need 1TB in a desktop, I can buy 1TB for £46 and 2TB for £54.

    47. Re: Performance issues? by Anonymous Coward · · Score: 0

      How about you leave, Linux shill? Linux filesystems can and do get fragmented (some more than others). Care to explain why the ext4 devs wrote a dedicated defragment tool if it's impossible to fragment in the first place?

    48. Re: Performance issues? by Anonymous Coward · · Score: 0

      Even with mirroring, it's still only 2x the number of disks to get 2x the capacity. 2 3TB drives mirrored is 3TB. 4 3TB drives mirrored is 6TB. In RAID5 and RAID6 though you get better scaling.
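
The capacity arithmetic in the comment above can be checked with a few lines. This is a sketch using textbook figures for equal-sized drives (two-way mirroring keeps half the raw space, RAID5 gives up one drive to parity, RAID6 gives up two); it ignores filesystem overhead and hot spares, and the function name `usable_capacity_tb` is hypothetical.

```python
def usable_capacity_tb(level, drives, drive_tb):
    """Usable capacity in TB for an array of equal-sized drives."""
    if level == "mirror":
        # Two-way mirroring: every drive has one copy, so half the raw space.
        if drives < 2 or drives % 2:
            raise ValueError("two-way mirroring needs an even number of drives")
        return drives // 2 * drive_tb
    if level == "raid5":
        # One drive's worth of space goes to parity.
        if drives < 3:
            raise ValueError("RAID5 needs at least three drives")
        return (drives - 1) * drive_tb
    if level == "raid6":
        # Two drives' worth of space go to parity.
        if drives < 4:
            raise ValueError("RAID6 needs at least four drives")
        return (drives - 2) * drive_tb
    raise ValueError(f"unknown level: {level}")
```

With four 3TB drives this gives 6TB mirrored, 9TB in RAID5 and 6TB in RAID6, which is the "better scaling" the comment refers to: parity overhead stays fixed as drives are added, while mirroring always costs half.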

    49. Re: Performance issues? by Anonymous Coward · · Score: 0

      If you're padding files, by definition you can't fill the drive to 100%

    50. Re:Performance issues? by Anonymous Coward · · Score: 0

      This could be the stupidest comment ever posted, except for many, many other stupid comments, which after careful review are quite unlike this one but nearly as stupid.
      Never be tricked into believing a stupid comment just for the authoritative manner in which it is presented. Likewise, don't be tricked by ID numbers. They don't mean much either. Kids at dad's computer again.

    51. Re: Performance issues? by Anonymous Coward · · Score: 0

      What kind of shitty OS are you using that performance degrades noticeably at 50%? Do you have like a 10 GB drive or something?

      I've hit 95% on my Windows machines. They only slow to a crawl if I run out of space.

    52. Re:Performance issues? by dbIII · · Score: 1

      It's 2014 - just get a shitload of cheap small drives and stripe across a lot of mirrors if you can't put up with the speed of one drive. Even if they are old and slow laptop (or "green") drives if you have enough of them it's still going to be faster than short stroking a single drive even if it is 10k rpm and SAS.
      If budget is a problem, then yes, get the couple of percent improvement from only using part of the drive instead of doubling the speed or more with mirrors.

    53. Re: Performance issues? by __aaclcg7560 · · Score: 1

      Snow Leopard OS X on a 120GB hard drive in a vintage Black MacBook (2006). RIP 2014.

    54. Re:Performance issues? by __aaclcg7560 · · Score: 1

      Uh, no. I wrote about hard disks in general. Other people started talking about RAID and I went with the flow. Slashdot exists to keep me amused while I console into hurt computers and fix broken users at work. Ten years and counting. :P

    55. Re:Performance issues? by advocate_one · · Score: 1

      on a frictionless surface... and in a vacuum...

      --
      Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
    56. Re:Performance issues? by Anonymous Coward · · Score: 0

      Actually, hard drive track sizes are radii based and not all the same physical size even though they hold the same amount of data. That means inner tracks take the same amount of spin and time to be read as the outer tracks.

      (http://en.wikipedia.org/wiki/Disk_sector#mediaviewer/File:Disk-structure2.svg)

      That was true a long time ago. It hasn't been true for a long time.

      Most hard drives today put more data on the outer tracks (the tracks are longer, so the data density remains about the same).

      And it's entirely irrelevant for SSDs :-)

    57. Re:Performance issues? by Neil+Boekend · · Score: 1

      Then you are just wasting the space that would usually be in fragments.
      Unless I am terribly misinformed modern filesystems figure that out themselves and try to prevent fragmentation in a much better way. Old file systems didn't do that properly so back in the 90's (and beginning 00's for windows) that probably was a good way to work.

      --
      Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
    58. Re:Performance issues? by allo · · Score: 1

      so not only your FS is fragmented, but your partitions, too.

    59. Re:Performance issues? by RabidReindeer · · Score: 1

      so not only your FS is fragmented, but your partitions, too.

      I prefer to think of them as "creatively load-balanced".

      But actually, no, and no. It's not like I'm running NTFS here. Plus, if I do get my logical volumes too fractured, I can jack in another drive, move them over to the other drive, clean up the original physical volume(s) (logical volumes - unlike partitions - can span drives), move the logical volume back again, all without rebooting the system.

  3. Nagios XI by Jawnn · · Score: 1, Informative

    Isn't smart enough to track trends, but it does do graphs so you can easily see where you're headed and how fast.

    1. Re:Nagios XI by Anonymous Coward · · Score: 0

      A plugin could be made smart enough to do this.

    2. Re:Nagios XI by Bob+the+Super+Hamste · · Score: 1

      I have written plugins to do this and variations on this. Also since there isn't a difference between NagiosXI and Nagios Core as far as this is concerned it is just as applicable to both.

      --
      Time to offend someone
    3. Re:Nagios XI by Anonymous Coward · · Score: 0

      Are you permitted to share these?

    4. Re:Nagios XI by Bob+the+Super+Hamste · · Score: 1

      Unfortunately no. They were developed on my employer's time for our customers so they own them.

      --
      Time to offend someone
  4. I use the red bar in Explorer by davidwr · · Score: 1

    Windows 7. :P.

    Seriously though, you do have a good question. Every environment is different. A stable environment with very little fluctuation can be a few hundred MB (plus whatever the OS needs for temporary files) away from capacity for years on end - set the alarm at that level plus 1. A drive that's used for archiving everything-ever-created in a video-editing shop will grow to infinity quite fast - set the alarm so you catch it in time to add more space and consider a second alarm that monitors for increases in the rate of growth. A "temp drive" that fluctuates wildly but has only hit 75% once and probably never will again can probably have the alarm set at 76%.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  5. Bigger question by wiredlogic · · Score: 1

    The bigger question is how to reserve less than 1% for the superuser?

    --
    I am becoming gerund, destroyer of verbs.
    1. Re:Bigger question by Bigbutt · · Score: 5, Informative

      It's a configuration option when you newfs a file system. Man newfs or mkfs.

      [John]

      --
      Shit better not happen!
    2. Re:Bigger question by Anonymous Coward · · Score: 0

      tune2fs -r

      Of course if this isn't a system partition but only for data, then just specify "tune2fs -m 0" for 0%. Personally, I have a relatively small root partition (30GiB) with 1% reserved for root and have everything else partitioned with 0%. Because, honestly, there's probably something wrong if my root partition is near full and I'd like to know about that sooner rather than later. It's the same reason I still have a swap partition. Because that way I'll notice the disk thrashing and investigate. Simple memory indicators are comparatively useless because many programs (java, web browsers, etc) will try to use 100% of RAM but have decently good back-off heuristics to avoid actually dipping into swap when all that RAM is being used for cache (and hence swapping the cache is self-defeating).

    3. Re:Bigger question by DarkOx · · Score: 2

      I don't know; the default 5% might be excessive for really big volumes, but keeping at least 1% free seems 'smart' pretty much no matter how many orders of magnitude the typical volume grows to be. The typical file size has grown with volume size. We now have all kinds of large media files we keep on online storage that previously would have run off to some other sort of media in short order.

      The entire point of the reservation is so that in the event of calamity the superuser retains a little free space to work in; if (s)he is going to be able to shuffle things about, they might well need what we nominally think of as quite a bit of space. Those things today might be a 100GB VM image or something on a 20TB SAN volume, for example.
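      To put rough numbers on the reservation, here is a small Python sketch. The volume sizes are illustrative assumptions and decimal units are used for simplicity; it only does the percentage arithmetic and does not query any real file system.

```python
# Illustrative arithmetic only: how much space a superuser reservation
# sets aside at different percentages. Volume sizes are assumed examples.
GB = 10**9
TB = 10**12

def reserved_bytes(volume_bytes, reserved_percent):
    """Space held back for the superuser at the given percentage."""
    return volume_bytes * reserved_percent / 100

for label, size in [("500 GB", 500 * GB), ("4 TB", 4 * TB), ("20 TB", 20 * TB)]:
    for pct in (5, 1):
        held = reserved_bytes(size, pct) / GB
        print(f"{label} volume, {pct}% reserved: {held:.0f} GB")
```

      On the 20TB example, 1% works out to 200 GB, which would still fit a 100GB VM image, while 5% holds back a full terabyte.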

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    4. Re:Bigger question by jader3rd · · Score: 2

      Create a large file, that the super user then deletes when the super user needs to fix issues.

    5. Re:Bigger question by Anonymous Coward · · Score: 1

      Great idea!

      Hey, there's no sense in letting that large file go to waste, just taking up valuable disk space until the moment when the disk is almost full, so why not do something useful instead, like:

      df -Pk /|awk '{if ($4 ~ /^[0-9]*$/ ) {print "dd if=/dev/zero of=/RemoveMeIfDiskGetsFull file bs=1024 count=" $4 / 2 " && mkswap /RemoveMeIfDiskGetsFull && swapon /RemoveMeIfDiskGetsFull" }}'

      That'll make for loads of fun!

    6. Re:Bigger question by Rob_Bryerton · · Score: 1

      -m0

    7. Re:Bigger question by Anonymous Coward · · Score: 0

      Use tune2fs to change it. (ext2/3/4)

  6. Been suggesting this for years by Anonymous Coward · · Score: 0

    But the 'just monitor percentages' crowd always wins by demanding an across the board standard of specific percentages to alert on

  7. Percentages have a purpose by Anonymous Coward · · Score: 1, Interesting

    Percentages still make sense. Much more sense than absolute numbers.

    It's possible that the alarm thresholds we've chosen might be tweaked, but percentages are still the way to go.

    If you don't understand why we use percentages in the first place, you probably shouldn't be working in IT.

  8. Space monitoring by Anonymous Coward · · Score: 1

    At a previous job, I had set up a cron job that would record nightly to a database the amount of disk space used for each file system. I would then use Excel to chart and project consumption trends. Using Excel I predicted I would run out of space about 2 months after my server refresh. This was pretty close to when I would actually have run out of disk space, but since I had moved to a new server with twice as much space, it was a non-issue.
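    The same kind of projection can be sketched without a spreadsheet. The Python below is a minimal example, not the setup described above: it fits a least-squares line to nightly usage samples and estimates the days remaining. The sample data, the 2 TB capacity, and the function names are all made up for illustration.

```python
# Sketch: fit a straight line to nightly "bytes used" samples and
# estimate how many days remain until the file system is full.
# The sample data and the capacity are made-up assumptions.

def fit_line(ys):
    """Least-squares slope and intercept for samples taken one day apart."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def days_until_full(used_samples, capacity):
    """Days from the last sample until the fitted line reaches capacity.

    Needs at least two samples. Returns None when usage is flat or shrinking.
    """
    slope, intercept = fit_line(used_samples)
    if slope <= 0:
        return None
    day_full = (capacity - intercept) / slope
    return day_full - (len(used_samples) - 1)

GB = 10**9
nightly_used = [1500 * GB + day * 2 * GB for day in range(30)]  # +2 GB per night
print(days_until_full(nightly_used, 2000 * GB))
```

    A straight line is the simplest possible model; it will mislead if growth is bursty or compounding, so treat the result as a rough planning number.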

  9. We have more but we USE more. by pla · · Score: 5, Insightful

    Today, however, with a lot of file systems in the Terabyte range, a 90-95% full file system can still have a considerable amount of free space but we still mostly get bugged by the same alerts as in the days of yore when there really isn't a cause for immediate concern.

    When we had drives in the 100s of MB range, we used a few MB at a time. Now that we have drives in the multi-TB range, we tend to use tens of GB at a time. In my experience, a 90 percent full drive has as much time left before running out as it did a decade ago.

    Perhaps more importantly, running at 90% of capacity kills your performance if you still use spinning glass platters as your primary storage medium (not so much when talking about a SAN of SSDs). In general, when you hit 90% full, you have problems other than just how long you can last before reaching 100%.

    1. Re:We have more but we USE more. by vux984 · · Score: 1

      In my experience, a 90 percent full drive has as much time left before running out as it did a decade ago.

      In your experience maybe. Not in mine.

      I don't use 10s of GB at a time. If I start a new torrent, dump my phone's camera onto my computer, or install a new game, that eats several GB. But everything else is pretty steady state with very slow steady growth. I don't download a lot of torrents on this particular PC, and sometimes remove old ones; I install a few new games a year and sometimes uninstall old ones...

      When I hit 90% full on my current data drive, I'm probably 1 to 2 years out from hitting 95%.

    2. Re:We have more but we USE more. by Vellmont · · Score: 4, Informative

      Exactly. The question is strange (and the attitude of the poster is odd too... 20 years ago is "days of yore", and "olden days"?) Methinks dusting off the word "whippersnapper" might be appropriate here.

      Oddly enough, a similar question fell through a wormhole in the space time continuum from Usenet, circa 1994. "Now that we have massive HDs of 100s of megabytes, and not the dinky little ones of several megabytes from the Reagan era, do we still have to worry about having 95% usage alarms?"

      The truth being, if you got to 95% usage somehow, what makes you think that you're not going to get to 100% sometime soon? Maybe you won't, but you can't know unless you understand how and why your usage increases. That's not going to be solved by a magic algorithm alone, it involves understanding where your data comes from, and who or what is adding to it. This isn't new. The heuristics and usage question, and estimating when action needs to be taken is just as relevant now as it was 20 years ago.

      --
      AccountKiller
    3. Re:We have more but we USE more. by Anonymous Coward · · Score: 0

      But you are four years past the safe lifespan of your disk, and when needed, it could fail.
      Hoarding capacity for a decade is as foolish as running out of space tomorrow.

    4. Re:We have more but we USE more. by nine-times · · Score: 1

      In my experience, a 90 percent full drive has as much time left before running out as it did a decade ago.

      Not in mine. Granted, we're both going off of anecdotal evidence, but in my favor, my experience is based off of managing a few hundred servers and a couple thousand desktops.

      It seems like most workstations/servers that I manage, if they're taking up massive amounts of space, it's very often because they're storing lots of old stuff. Several years ago, when we only had 30 GB drives, people would go back and clear out, delete, and archive old data. Now they just store it, because why not? Storage is cheap. Most of the time, it doesn't seem like the data set is growing faster, but they're just holding on to old stuff longer.

      So yes, I think it's true, if you have a 60 GB drive that's 90% full, it's a more pressing concern than if you have a 10 TB RAID that's 90% full. The RAID may be a bigger problem, but it's a less immediate problem.

    5. Re:We have more but we USE more. by vux984 · · Score: 2

      But you are four years past the safe lifespan of your disk, and when needed, it could fail.

      Hence... backups.

      Hoarding capacity for a decade is as foolish as running out of space tomorrow.

      Hoarding capacity? I don't even really know what that is supposed to mean.

    6. Re:We have more but we USE more. by Anonymous Coward · · Score: 0

      How did you get to 90% if you're only using 2.5-5% per year?

    7. Re:We have more but we USE more. by afidel · · Score: 2

      YOU don't use 10's of GB at a time, but I bet your organization does. My company has expanded their storage by 50% per year compounded for at least the last 10 years (I've been here 8 and I have 2 years of backup reports from before I started), and I don't think we're that unusual if you look at the industry reports for GB shipped per year.
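      For a sense of what 50% per year compounded means, a short Python sketch follows. The 90%-full starting point is an assumed example, and the functions are hypothetical helpers that just apply the standard compound-growth formula.

```python
import math

def growth_factor(annual_rate, years):
    """Total multiplication of stored data after compounding annually."""
    return (1 + annual_rate) ** years

def years_until_full(used, capacity, annual_rate):
    """Years until `used` reaches `capacity` at a compounded annual rate."""
    if used >= capacity:
        return 0.0
    return math.log(capacity / used) / math.log(1 + annual_rate)

print(f"10 years at 50%/year: {growth_factor(0.5, 10):.1f}x")
months = years_until_full(0.9, 1.0, 0.5) * 12
print(f"90% full at 50%/year: {months:.1f} months to full")
```

      At that rate, storage grows by a factor of roughly 58 over ten years, and a volume that is already 90% full has only about three months left.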

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    8. Re:We have more but we USE more. by Anonymous Coward · · Score: 0

      Today, however, with a lot of file systems in the Terabyte range, a 90-95% full file system can still have a considerable amount of free space but we still mostly get bugged by the same alerts as in the days of yore when there really isn't a cause for immediate concern.

      When we had drives in the 100s of MB range, we used a few MB at a time. Now that we have drives in the multi-TB range, we tend to use tens of GB at a time. In my experience, a 90 percent full drive has as much time left before running out as it did a decade ago.

      Perhaps more importantly, running at 90% of capacity kills your performance if you still use spinning glass platters as your primary storage medium (not so much when talking about a SAN of SSDs). In general, when you hit 90% full, you have problems other than just how long you can last before reaching 100%.

      I agree with all you said above, with respect to your primary OS drive. The problem with relliker's question is he did not specify the context.

      If we're talking about your primary boot device, then the same rules apply today that applied before drives got to the GB & TB ranges, for the performance reasons you mentioned. Likewise for things like defragging drives, you do it for performance reasons.

      If we're talking about external drives that are used solely for data, then things get more subjective and drag in the usage and data size caveats.

      The bottom line is, if your drive(s) is getting close to 75%-80% full OR getting close to end of warranty, you should probably start looking for a newer, higher-capacity drive, or back up data that isn't used that much to tape or DVD/BD.

      Should you buy a new drive when the one you own is almost full? This is really a silly question. Kind of like, "Are you a God?" the answer should always be yes.

    9. Re:We have more but we USE more. by magarity · · Score: 1

      running at 90% of capacity kills your performance if you still use spinning glass platters

      A decent SAN will show practically no performance degradation right up to the point it hits 100% full.

    10. Re:We have more but we USE more. by relliker · · Score: 1

      True, but I had a made-up scenario in mind where I get an initial 1.9T data purged out into a 2T archival fs leaving just 100GB free. I'd be at my 90% threshold (using the old % method) at start of operations but would only use a couple more GB at a time after each successive monthly purge. I'd still have a year or two of autonomy in the archive area even with 90+% of space already used up but my alerter would know not to alert me immediately after monitoring a few successive minor additions to the archive. Seems my mind jelly wandered off a bit when I thought of this. I was actually supposed to be studying. meh.

    11. Re:We have more but we USE more. by Rob_Bryerton · · Score: 1

      not so much when talking about a SAN of SSDs

      You mean an array of SSDs.

      Just as you wouldn't call a PC on a local network a "LAN", you don't refer to an array on a storage network as a SAN. The SAN is the network.

      Sorry, but this really bugs me...

    12. Re:We have more but we USE more. by mcrbids · · Score: 1

      With today's 4-8 TB drives, it's easy to keep billions of files on a single disk, so you could potentially keep data for many thousands of customers on a single disk. But if you do that, you quickly run into an entirely new type of constraint: IOPS.

      The dirty secret of the HD industry is that while disks have become far bigger, they haven't really become any faster. 7200 RPM is still par for the course for a "high performance" desktop or NAS drive, and you can only queue up about 150 requests per second at 7200 RPM. Simple physics takes over.

      Spinning disks are already a non-starter for many scenarios, and this is a trend that will only accelerate as HDDs basically become the modern equivalent of tape backup.
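      The "simple physics" can be sketched as a back-of-the-envelope calculation. In the Python below, only the 7200 RPM figure comes from the comment above; the seek times are assumed round numbers, not measurements of any particular drive, and the model ignores transfer time and controller overhead.

```python
# Back-of-the-envelope random IOPS for a single spinning disk.
# The seek times are assumed figures, not measurements of any drive.

def rotational_latency_ms(rpm):
    """Average wait for the sector to come around: half a revolution."""
    return 0.5 * 60_000 / rpm

def random_iops(rpm, avg_seek_ms):
    """Approximate random requests per second for one drive."""
    return 1000 / (avg_seek_ms + rotational_latency_ms(rpm))

for seek in (8.5, 2.5):  # assumed long random seeks vs short seeks
    print(f"7200 RPM, {seek} ms seek: ~{random_iops(7200, seek):.0f} IOPS")
```

      Under those assumptions the estimate lands near 80 IOPS for long random seeks and near 150 for short ones, so the 150 figure looks like a favourable case. Either way the number depends on rotation speed and seek time, not on capacity.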

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    13. Re:We have more but we USE more. by sribe · · Score: 1

      Perhaps more importantly, running at 90% of capacity kills your performance if you still use spinning glass platters as your primary storage medium (not so much when talking about a SAN of SSDs). In general, when you hit 90% full, you have problems other than just how long you can last before reaching 100%.

      Do you have actual experience or data to back up that claim? Because my verified benchmarked experience is the opposite, 90% does NOT "kill" performance. Of course you're using inner tracks and getting lower transfer speeds, but nothing really dramatic like what you'd see with extreme fragmentation.

      I will admit however, that when you get to 0.15% free (on a 4TB disk), performance really sux rox ;-)

    14. Re:We have more but we USE more. by sribe · · Score: 1

      With today's 4-8 TB drives, it's easy to keep billions of files on a single disk...

      Uhhmmm, no, not quite ;-)

    15. Re:We have more but we USE more. by ihtoit · · Score: 1

      I have a 1TB drive with 5.5 million files on it (don't ask). Even scaling to 8TB, that'd still only be 44 million table entries. NTFS on a GPT volume can scale to 2^32-1 files, but I'd hate to think how big that'd end up being with 64KB clusters... 274TB? Grow it for larger files.
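      A quick check of that arithmetic, as a sketch. It assumes every file occupies exactly one 64 KB cluster (an assumption made here; NTFS can keep very small files inside the MFT), and it prints the total in both decimal and binary units, since the answer depends on which one is meant.

```python
# Arithmetic check for the figures above. Assumes one 64 KB cluster
# per file, which is a simplification made for this sketch.
files_per_tb = 5_500_000
print(f"8 TB at the same density: {files_per_tb * 8:,} files")

max_files = 2**32 - 1          # file-count limit quoted in the comment
cluster_bytes = 64 * 1024      # 64 KB cluster, binary kilobytes
total_bytes = max_files * cluster_bytes

print(f"{total_bytes / 10**12:.1f} TB (decimal)")
print(f"{total_bytes / 2**40:.1f} TiB (binary)")
```

      Depending on the unit convention this comes out somewhere between roughly 256 and 281, so the 274TB guess is in the right ballpark.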

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    16. Re:We have more but we USE more. by vux984 · · Score: 1

      How did you get to 90% if you're only using 2.5-5% per year?

      a) I'm not at 90%; as it happens I'm at around 50%. I said when I reach 90% it will take a year or 2 to reach 95%

      b) I didn't start at 0% and then average a couple percent a year. I was at 30-40% within a week of setting up the new home PC.

      I copied my 10,000 track music library. So 50GB or so right there. And another several thousand digital images, scans, and so forth. I have a small library of ISOs I keep on the drive worth another 20-30GB. A handful of movies. A couple dozen games and large applications installed... the Steam folder alone is 300GB. (And I have only a fraction of my library installed, but it's the fraction I always go back to plus what's new that I'm playing now.) So although it was 250GB+ within a week of setting up the PC... it's only grown another 50GB in the last couple years.

      And now that it's all set up, it grows, but not especially quickly. I add a few hundred audio tracks, and a few hundred photos a year, email, documents, tax records, etc... everything else is fairly steady state.

    17. Re:We have more but we USE more. by advocate_one · · Score: 1

      Stacker and DoubleDisk certainly helped back then, but nowadays, most of our media is already compressed as much as it will go unless you want to lose resolution or bitrate...

      --
      Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
  10. Recommend: Hard Drive Sentinel by Bomarc · · Score: 4, Informative

    I install the shareware version of Hard Drive Sentinel on all my Windows systems. It not only will warn you about hard drive usage (%); it will also warn you about errors on the drive -- and in my case I was able to predict that two drives were going to fail (saving data) before they actually failed.

    Their support has been very responsive and courteous, their product can work through (see drives behind) most RAID controllers.

    And no, I don't have any affiliation with HDS.

    1. Re:Recommend: Hard Drive Sentinel by antdude · · Score: 1
      --
      Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
    2. Re:Recommend: Hard Drive Sentinel by ihtoit · · Score: 1

      I got the full version of that a while ago, it's surprisingly useful - it even does SMART monitoring.

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
  11. Whatever is measured is optimized. by QuietLagoon · · Score: 4, Insightful

    ...when there really isn't a cause for immediate concern.

    It all depends what one is concerned about. Is maximizing disk space down to the last possible byte important to you? Or is performance in accessing random data important to you? Or is wanting to keep artificial limits imposed by monitoring systems important to you?

    .
    Once you determine what is actually important to you, then you monitor for that parameter.

    Whatever is measured is optimized.

  12. Monitoring Sucks by Bigbutt · · Score: 1

    The problem is the monitoring group is reluctant to make "custom" changes due to the size of the environment. OS and hardware level alerts are a pretty minor part of the overall monitoring environment in terms of the number of configuration changes required. With mirroring and system/geographic redundancy, we can wait until the morning status reports to identify systems before they get to critical.

    [John]

    --
    Shit better not happen!
  13. It's all about the data production rate by aglider · · Score: 3, Insightful

    You insensitive clod! In the age of MBs, we were producing KBs of data. In the age of GBs we were producing MBs of data. And in the age of TBs we are producing GBs of data. And so on. Thus a 90% full filesystem is as bad as 10 years ago. Unless you are still producing KBs of data.

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
    1. Re:It's all about the data production rate by nine-times · · Score: 1

      Unless you are still producing KBs of data.

      Well yeah, lots of people are. An awful lot of work is still done in Microsoft Word and Microsoft Excel. No need to embed a 5 GB video just because you have the space.

    2. Re:It's all about the data production rate by dissy · · Score: 1

      An awful lot of work is still done in Microsoft Word and Microsoft Excel. No need to embed a 5 GB video just because you have the space.

      *noob voice enable*

      Well no, I take a screenshot of the video, which is then embedded unscalable in an excel file, which I paste into a word document, which I then send in a mime encoded email to the entire company directory.
      I mean, this is the internet after all, it's not like some form of file transfer protocol exists or anything!

    3. Re:It's all about the data production rate by nine-times · · Score: 1

      Some of that kind of nonsense happens in Powerpoint presentations-- embedding images that might be a couple hundred megabytes each. I see that in marketing companies often enough, but it's still been a pretty steady rate of growth for the past few years.

      However, I still don't see multi-gigabyte Word or Excel documents, at least not often enough that I recall it.

    4. Re:It's all about the data production rate by dbIII · · Score: 1

      I've seen PDFs almost that big that were made by printing out large MS Word documents and then scanning them at 600dpi, 24 bit color. For added fun they used full sentences, including punctuation and variable whitespace, as their filenames. Various problems associated with making and opening such things I have been assured are due to a slow gigabit network and "crappy ten year old" i7 machines and not whoever decided to not just save as PDF. A few versions of files done that way and you've got GB and vast amounts of shredded paper before you know it. I'm not sure that some people even get the point of using computers in an office.

  14. Sigh... by Anonymous Coward · · Score: 1

    I went through this years ago (a decade ago probably) at my last job... and I agree while %thresholds are ok for some things, it's not the be-all/end-all of monitoring. I got in an argument with a person who wanted to "automatically add space" to nas/san storage when it hit 95% full... and, of course, my argument was that's somewhat useless - if it's a database volume it might *always* be 95% full, and never growing because it's one big "file" (db area, or maybe a couple but...) that doesn't grow. Or it might be a disk with some log files that some application got an error and started spewing out errors into a log and filled 10% of the disk in an hour before it was noticed... adding disk would be a waste, fix the application, zip/delete the log (maybe save it somewhere to analyze), etc. It's not a "cut and dry" thing.

    If you have a 2TB drive you store data on, it's 90% full and growing at 1%/year, you've got a long time before it's a problem. If it's 90% and growing at 1%/week, you've got a big problem in just a couple months. If it's at 90% and 'historically' has grown at 1%/yr, and suddenly it's at 98% tomorrow, you might want to suspect a problem or something way out of the ordinary - and not just go throwing more space at it.
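    Those three cases can be sketched in a few lines of Python. Growth is expressed here in percentage points of capacity, and the factor of 10 used to call a jump "out of the ordinary" is an arbitrary assumption for illustration, not a recommended setting.

```python
def weeks_until_full(percent_used, growth_points_per_week):
    """Weeks left, with growth given in percentage points of capacity per week."""
    if growth_points_per_week <= 0:
        return float("inf")
    return (100 - percent_used) / growth_points_per_week

def looks_anomalous(previous_percent, current_percent,
                    usual_points_per_day, factor=10):
    """Flag a one-day jump far larger than the historical daily growth.

    `factor` is an assumed tolerance; tune it to the workload.
    """
    jump = current_percent - previous_percent
    return jump > factor * usual_points_per_day

print(weeks_until_full(90, 1 / 52))          # growing 1 point per year
print(weeks_until_full(90, 1))               # growing 1 point per week
print(looks_anomalous(90, 98, 1 / 365))      # 90% to 98% overnight
```

    At one point per year the disk has about 520 weeks (ten years) left, at one point per week it has 10, and the overnight jump is flagged for a human to look at instead of being answered with more space.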

    This is why you hire experienced sysadmins and don't just rely on automation and stupid 'rules' that don't always apply, and why sometimes (as much as we were always pushed to - so they could 'offshore' it all) you can't just 'document how to fix it' when there's lots of possibilities.

    1. Re:Sigh... by afidel · · Score: 1

      Adding 10% space AND notifying the sysadmin that autogrowth has happened is probably the best way IMHO, because it keeps things from crashing/locking up (most apps aren't happy to get an out of space notification) while allowing the intelligent person to investigate the root cause if they suspect an unusual cause (ie if my database server is growing its disk it's likely to be a bad query filling tempdb, I don't want the database to halt but I also want to figure out what the bad query is, but if a file server fills a volume it's almost always just the users adding more documents which I can't really tell them to stop doing).

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    2. Re:Sigh... by tnk1 · · Score: 1

      You need both. Sysadmins are adaptive, but (relatively) slow to respond. Automation is (should be?) much faster, but not usually all that adaptive.

      Automation is used, first and foremost, to trigger anything that you need to do to save your whole application or system which must run faster than a human reaction time. In that case, we consider a disk space alarm to be a signal to automation to step in before it is too late. But how do we know when it is too late?

      My answer to the poster's original question is that you do want to tailor your disk alarms to your specific usage patterns, but you want to use those percentages, or some well-considered standard, initially. That's because setting alarms properly isn't a one size fits all solution, it is a process driven by knowledge of your system or your application.

      Someone pointed out that it may take someone a year to go from 90% to 91%, especially in a personal environment, so they questioned the 90% number as useful. I think that entirely misses the point. For one thing, if you select a number, you are triggering yourself to take some action where you had none before. If, at 90%, you evaluate the situation and determine you need to raise the threshold, then nothing stops you from doing that.

      Many times, people who suggest default percentages are trying to do something important: they want to make sure there is a simple, easy to implement solution by default. Having a default 85-90% alarm apply to all hosts by default allows you to be covered by default while you work out the specific needs of your application. What is the worst thing that can happen? You get annoyed by the alarm a little bit more? Once you hit 90% for the first time, I'm betting you now have good data on your disk usage patterns. You can then make a data-driven decision about your thresholds.

      So, don't over think the percentages. Use a reasonable standard (like percentages), and then refine. At that point, you will know what method works best and you will have some protection in the meantime.

      And this is where you want a sysadmin. They clean up the mess and analyze the data. They tell the automation to trigger at 95% or 10GB remaining, or a growth velocity of greater than 1GB/hour. Or all three. And then he resets the alarms and/or mass deletes the files on the server and brings it back into service on his own terms and not in panic mode. And then hopefully writes a script to improve his log rotation (or whatever).
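      The "or all three" idea can be sketched as one check that reports which conditions fired. This is a hypothetical helper, not part of any monitoring product; the default thresholds simply mirror the numbers above (95%, 10GB remaining, 1GB/hour) and the caller is assumed to supply the measured growth rate.

```python
def disk_alerts(capacity_bytes, used_bytes, growth_bytes_per_hour,
                max_percent=95, min_free_bytes=10 * 10**9,
                max_growth_per_hour=10**9):
    """Return the list of triggered conditions (empty means all clear)."""
    alerts = []
    percent_used = 100 * used_bytes / capacity_bytes
    free = capacity_bytes - used_bytes
    if percent_used >= max_percent:
        alerts.append("percent")       # classic percentage threshold
    if free <= min_free_bytes:
        alerts.append("free_space")    # absolute space remaining
    if growth_bytes_per_hour > max_growth_per_hour:
        alerts.append("velocity")      # growing faster than expected
    return alerts

GB = 10**9
print(disk_alerts(2000 * GB, 1900 * GB, 0))       # big archive volume
print(disk_alerts(100 * GB, 92 * GB, 2 * GB))     # small, fast-growing volume
```

      Returning the names of the conditions, instead of a single yes/no, lets the alert say why it fired, which is what the sysadmin needs when refining the thresholds afterwards.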

    3. Re:Sigh... by tnk1 · · Score: 1

      And for my part, I think that a sustained growth velocity metric is very useful out of the box. You know or you can calculate easily how big your filesystem is, so you can calculate a "time to full" which becomes your "window of opportunity" to fix the issue within. Any time the rate means that the remaining free space will be consumed in under a certain "safety" interval, you know you need to act. You then set an alarm threshold which makes sense with your reaction time.

      If you have automation to deal with the issue, you set the alert and auto-fix interval to some amount of time that is sufficient for it to react. If you are simply alerting a human, then you use a reaction time interval that is more in line with a human reaction time.

      You do have to pick a good sampling interval to reduce false alarms, but it is useful to have an alarm that states: "You now have less than one hour at the current rate of growth to prevent /dev/sda1 from filling up." Even if the rate slows down after a few seconds, you know that your system is doing something that is capable of a rate of growth that could fill your disk up in less than an hour, should it come to pass that the root cause manifests again.
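      A minimal Python sketch of that "time to full" alarm follows. It assumes usage is sampled into a list of (timestamp in seconds, bytes used) pairs covering the chosen window; the function names, the window, and the one-hour reaction time are illustrative assumptions.

```python
def hours_until_full(samples, capacity_bytes):
    """Estimate hours left from (timestamp_seconds, used_bytes) samples.

    Uses the average growth rate between the oldest and newest sample in
    the window, so a short burst is smoothed over the whole window.
    Returns None if usage is not growing.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0 or used1 <= used0:
        return None
    rate_per_hour = (used1 - used0) / ((t1 - t0) / 3600)
    return (capacity_bytes - used1) / rate_per_hour

def should_alarm(samples, capacity_bytes, reaction_hours=1.0):
    """True when the projected time to full is shorter than the reaction time."""
    remaining = hours_until_full(samples, capacity_bytes)
    return remaining is not None and remaining < reaction_hours

GB = 10**9
window = [(0, 80 * GB), (300, 85 * GB), (600, 90 * GB)]  # 10 GB in 10 minutes
print(hours_until_full(window, 100 * GB), should_alarm(window, 100 * GB))
```

      A longer window gives fewer false alarms but reacts more slowly, which is the sampling trade-off described above; `reaction_hours` would be set much lower for automation than for a human.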

  15. Create your own sensor by carlosap · · Score: 1

    Storage is going to be cheaper and cheaper, so percentage is ok for today and the future. You don't need trends and heuristics, just increase the percentage and you are good to go.

  16. ENOSPC by Anonymous Coward · · Score: 0

    On the operating system I use, the only alert I get is errno set to ENOSPC, and programs printing "No space left on device".

    No idea what you're talking about with annoying warnings and alerts at 90%.

    That said, most filesystems perform poorly when they are nearly full, as it is difficult to allocate new extents and it takes more system RAM to construct the mappings to many little extents.

  17. I don't think this analysis is right by seebs · · Score: 1

    While "only 5% of my disk" is now many times larger than it used to be, so are the things I'm moving around, so "95% full" is just as bad now as it used to be.

    Basically, once we got past quotas measured in single or double-digit numbers of kilobytes, this stopped changing for me. 95% full on a 100MB disk and 95% full on a 500GB disk work the same for me.

    --
    My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
  18. monitoring versus reporting by Anonymous Coward · · Score: 0

    Most monitoring consists of taking a snapshot of something at fixed intervals. While you can generate alerts based directly on monitoring, they might not be as helpful as you would like, all those usage spikes and the drone of warnings when the thresholds are too low. When you look at a graph of monitored stats over time this is actually a reporting system, and can allow you to determine change over time, observe trends, and other good stuff. Use the reporting system to create second order monitoring information, and use that for your alerts.

    free_space / (current_usage - minus_1_day_usage) = number_of_days_until_full. Send an alert when this value is under 60.
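    The formula above translates almost directly into Python. This sketch adds one guard the formula leaves implicit: when usage did not grow over the last day, the division is skipped. The function names and example numbers are illustrative assumptions.

```python
def days_left(free_space, current_usage, usage_one_day_ago):
    """free_space / (current_usage - usage_one_day_ago), in days."""
    daily_growth = current_usage - usage_one_day_ago
    if daily_growth <= 0:
        return None  # flat or shrinking: avoid dividing by zero or a negative
    return free_space / daily_growth

def needs_alert(free_space, current_usage, usage_one_day_ago,
                threshold_days=60):
    """Alert when the projected days until full drops under the threshold."""
    days = days_left(free_space, current_usage, usage_one_day_ago)
    return days is not None and days < threshold_days

# Example in GB: 100 free, usage went from 1898 to 1900 since yesterday.
print(days_left(100, 1900, 1898), needs_alert(100, 1900, 1898))
```

    A single day's delta is noisy, so in practice the difference would probably be taken over a longer span from the reporting data and scaled to a per-day rate.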

  19. Synology by krray · · Score: 2, Interesting

    You're living in a digital cave IMHO.
    Don't worry, I was too until recently...

    Always mucked with fast external storage as the "main" solution -- firewire, thunderbolt, etc. This system is the main and had a few externals hooked up, that system had another, another over there for something else. It was a mess all around. How to back it all up??

    Gave them all away -- bought a Synology

    Then bought another (back it up :).

    180-200M/sec throughput is the norm. On the network. Beats out most external drives I've ever come across. Everything ties into / backs up to the array. Home and work now too.

    I use everything but Microsoft products. They're shit.

    My filesystem is 60T w/ under 10T used today. I'll consider plugging in more drives or changing them out in the Synology somewhere between 2017 and 2020...

    1. Re:Synology by nabsltd · · Score: 2

      180-200M/sec throughput is the norm. On the network.

      You have a 10 gigabit network? I ask because a 1 gigabit network can only provide 125MB/sec throughput. I know that some of the Synology units offer link aggregation support, but that also usually requires support in the switch and multiple network cards in each client.

      That said, even 200MB/sec isn't particularly good if you can only provide that total to one client at a time, especially for the cost of a Synology enclosure that can hold enough drives for 60TB of storage.

    2. Re:Synology by Anonymous Coward · · Score: 0

      A Synology is seriously a beast at linear reads. I am currently hitting about 800-900MB/sec on one from 1 link (as I type this) on a 10gig link. However, its random read/write is crap. I mean in the 5-10MB/sec range. The one I have at home with a 1gig link is about like you said, at 100-125MB/sec. However, with that one, if I hit 10-20MB/sec I am happy, as it hits the rate it needs to stream what I need, which just happens to be what 802.11ac can do at my house (more like 30) :)

      To put that 95% in context you need to think about how long it took to get to whatever 95% represents. You also need to think about how much you started with and how long did it take to fill that empty space up to 95%. If it took 3 years you have some time. If it took 3 weeks you may have a problem. For example the synology I use at home took me nearly 4 years to fill up to 95%. I have some time. Which also gives me more choices (6TB-8TB drives to gang up on the problem) and will cost me less.

  20. Only part of the problem by AlanObject · · Score: 1

    I have less concern about the amount of data being stored than I do about the incredible number of files that a typical system stores. Do an ls -lR / on a typical system and you will get tens or even hundreds of thousands of files.

    As recently as the days of Windows/NT 4 I could probably keep the gist of the entire structure in my head -- what each sub-tree is for and in most cases what each directory/file is for. Somewhere since then it has become impossible to do so and that goes for Windows, MacOS X, or almost any Linux distribution.

    1. Re:Only part of the problem by Anonymous Coward · · Score: 0

      I have less concern about the amount of data being stored than I do about the incredible number of files that a typical system stores. Do an ls -lR / on a typical system and you will get tens or even hundreds of thousands of files.

      Typical -- what does that mean? Here at $WORK, it's not uncommon to have tens of thousands of files in a single directory.

      [Pause while I interrogate the database of backup statistics]

      Running ls -R1 | wc -l gives an answer of about 750,000,000. That number can easily change +/- 100K per day. And this really isn't a very large system.

  21. Check_MK by tweak13 · · Score: 3, Informative

    We switched to Check_MK for monitoring. It's basically a collection of software that sits on top of Nagios.

    The default disk monitoring allows alerting based on trends (full in 24 hours, etc.) or thresholds based on a "magic factor." Basically it scales the thresholds so that larger disks alert at a higher percentage, adjustable in quite a few different ways to suit your tastes.
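
    To make the idea of size-scaled thresholds concrete, here is a rough Python sketch. It is not Check_MK's actual code: the function name, the 20 GB reference size and the 0.8 exponent are all assumptions chosen for illustration. The point is only that a disk larger than the reference size gets its percentage threshold pushed closer to 100%.

```python
def scaled_threshold(percent, size_gb, magic=0.8, reference_gb=20.0):
    """Scale a fill-percentage threshold according to disk size.

    percent      -- threshold that applies to a disk of reference_gb
    size_gb      -- size of the disk being checked
    magic        -- exponent in (0, 1]; 1.0 disables scaling (assumed knob)
    reference_gb -- disk size at which the threshold is left unchanged
    """
    relative_size = size_gb / reference_gb
    # Shrink the "felt" size of big disks so the free-space share
    # they must keep shrinks with it.
    felt_size = relative_size ** magic
    scale = felt_size / relative_size
    return 100.0 - (100.0 - percent) * scale
```

    With these assumed defaults, a 90% threshold stays at 90% on a 20 GB disk and rises to roughly 96% on a 2 TB disk, while a disk smaller than the reference alerts somewhat earlier.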

    1. Re:Check_MK by Anonymous Coward · · Score: 0

      Love Check_MK!

      Simple to set up with the OMD dev branch. Literally install Ubuntu Server, grab the package for that version and run through the OMD 4 step install process :)

    2. Re:Check_MK by Anonymous Coward · · Score: 0

      We switched to Check_MK for monitoring. It's basically a collection of software that sits on top of Nagios.

      The default disk monitoring allows alerting based on trends (full in 24hours, etc.) or thresholds based on a "magic factor." Basically it scales the thresholds so that larger disks alert at a higher percentage, adjustable in quite a few different ways to suit your tastes.

      This is the first post that actually answers the original question, and doesn't go into some philosophical debate about whether or not he is wrong to ask the question!

    3. Re:Check_MK by coofercat · · Score: 1

      We're switching to check_mk too. Honestly though, anything with a graph will do - periodically stick something into Graphite or just stick another line onto a CSV. Then draw a graph, draw a rough trend line and there's your answer. Getting a nice email/text message with that information takes a bit more work (where check_mk might help), but so long as you can see it with enough advanced warning, checking the disk graphs weekly (or even monthly) is probably enough.
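
      The "draw a rough trend line" step can also be done numerically. Below is a minimal sketch using an ordinary least-squares fit over (day, used) samples, such as rows read from that CSV. The function names and the sample layout are assumptions for illustration, not part of check_mk or Graphite.

```python
def fit_trend(samples):
    """Least-squares line through (day, used) pairs.

    Returns (slope, intercept), where slope is usage growth per day.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("need samples from at least two different days")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def day_when_full(samples, capacity):
    """Day number at which the trend line reaches capacity, or None
    if usage is flat or shrinking."""
    slope, intercept = fit_trend(samples)
    if slope <= 0:
        return None
    return (capacity - intercept) / slope
```

      For example, samples of 100, 110, 120 and 130 GB on days 0 to 3 against a 200 GB disk project to full on day 10. A straight line is a crude model, but it matches the "rough trend line" being described.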

  22. I share the same thoughts by sentiblue · · Score: 1

    Not only should a monitoring system look at historical growth versus estimated time left until full, it should also keep track of storage additions, so that it can smartly tell us 3, 2, 1 months ahead, each with one email... and then it should also keep track of sudden increases and send an emergency alert if, for example, the past 7 days have climbed more than the whole month before... stuff like that.

    All existing monitoring tools these days are not set up that way and I wish folks would get more creative and diligent with their work.
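
    The "past 7 days climbed more than a whole month before" check is simple to express. This is a sketch under assumptions: the function name is made up, and it expects one usage sample per day, oldest first, with the 7 and 30 day windows taken from the example above.

```python
def sudden_growth(daily_usage, recent_days=7, baseline_days=30):
    """Return True when growth over the last recent_days exceeds the
    growth over the baseline_days immediately before them.

    daily_usage -- one usage sample per day, oldest first
    """
    needed = recent_days + baseline_days + 1
    if len(daily_usage) < needed:
        return False  # not enough history to compare the two windows
    now = daily_usage[-1]
    window_start = daily_usage[-1 - recent_days]
    baseline_start = daily_usage[-1 - recent_days - baseline_days]
    recent_growth = now - window_start
    baseline_growth = window_start - baseline_start
    return recent_growth > baseline_growth
```

    A check like this would sit next to the slower "months until full" projection rather than replace it, since it answers a different question.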

  23. It's a problem regardless by Anonymous Coward · · Score: 0

    If my drive is 90% full, I don't care if it is 2 MB or 100 GB, I have a situation I need to know about.

  24. 90% is still a good rule by azadrozny · · Score: 1

    If you are an enterprise shop, you likely have so many disks spread across so many servers that you probably have an admin team responsible for projecting utilization for the next 12 months, so that procurement and installation costs can be budgeted.

    For the home user, or a small business, 90% is still a good rule of thumb. I would hate to see some additional process running in the background constantly projecting when the disk will be full. Just throw a warning for the user when you reach 80-90% capacity, and let them figure it out. They are probably more likely to fill their thumb drives than they are the local media.

    1. Re:90% is still a good rule by Anonymous Coward · · Score: 0

      What about storage holding virtual machine VHD images that are not sparse? With across-the-board percentage-based monitoring, these tend to alert when it's not a concern.

    2. Re:90% is still a good rule by relliker · · Score: 1

      I think an enterprise shop's bean counters would love to be told by their own machines that they don't need to upgrade to that bigger datastore for a few more months because "there's enough space there for another 8 months boss!" A human-powered utilization projection would have so many "safety margins" added in that the new datastore would have to be bought yesterday. The reality is that most of the time it's the 'counters that dictate stuff and they'd lap up this intelligent monitoring in a jiffy if it helps them justify cutting even a minor expense.

    3. Re:90% is still a good rule by Rob_Bryerton · · Score: 1

      These guys have got better things to do than squeeze a few extra months out of an array that's on a 3 or 4 year lease or support contract.

      If you're doing your sizing projections correctly (very tricky, I'll give you that), space is not an issue, but time is. If you lease, then you need a replacement installed before the lease is up; the leasing company wants their hardware back. There is no stretching it out. If you purchase, you are bound by the length of your support contract, as nobody sane is going to run an enterprise array without a current support contract.

  25. unlimited storage is a right by Anonymous Coward · · Score: 0

    It is ridiculous in 2014 that we have to worry about running out of disk space. We have the technology to solve this problem. The government should provide cheap unlimited disk space to all.

    Oh wait, I thought I was on the broadband thread...

    1. Re:unlimited storage is a right by Anonymous Coward · · Score: 0

      Obummer letting us all down.

  26. It's True by Anonymous Coward · · Score: 0

    One thing that I've noticed.

    Internet browsers tend to allocate cache space in terms of a %, and so does the OS itself (checkpoints, deleted space and so on). Those percentages haven't changed much over the years and allocating 5-10% seems quite reasonable at first.

    However, do the math on a modern HDD. Let's take the latest 8 TB drives: just a 5% allocation for any caching purpose gives that cache roughly 400 GB! Now that's a humongous cache, and many caching algorithms cannot efficiently use such an enormous cache space.

    The lesson is that as storage volumes continue to grow in capacity, optimal configuration requires that the percentage allocated to reserved system uses be scaled back.

  27. Use Splunk by Tuki · · Score: 1

    You can build advanced, predictive analytics with Splunk. It can do exactly what you asked for.

    --
    robots obey what the children say - TMBG
    1. Re:Use Splunk by relliker · · Score: 1

      Does look nice and "trendy" , but a bit overkill just to monitor a file system don't you think? :)

    2. Re:Use Splunk by Tuki · · Score: 1

      Start with advanced disk monitoring -> solve a bunch of problems that no one else can -> get a raise / profit! ;)

      --
      robots obey what the children say - TMBG
    3. Re:Use Splunk by manu0601 · · Score: 1

      You can build advanced, predictive analytics with Splunk. It can do exactly what you asked for.

      But that will generate GB of data, worsening the problem, won't it?

  28. The actual number you are looking for is 85%. by tlambert · · Score: 1

    The actual number you are looking for is 85%.

    Straight out of Donald Knuth volume 3: Sorting and Searching; at 85% fill, a perfect hash starts degrading in performance.

    The basis of the Berkeley Fast File System warn level was an 85% fill on the disk, which the filesystem effectively hashed data allocations onto. As people started getting larger and larger disks, they began to be concerned about "wasted space" in the free reserve, and moved the warnings down to 10%, then 8%, and so on.

    This is what the OP is suggesting (again) for very large disks, but without something like an LFS and a background defragger, fundamentally, most FS implementations' performance still starts to drop off at 85%+ fill. Background defraggers/"cleaner daemons" have their own performance issues (e.g. like Garbage Collectors, they tend to run at the worst possible times, as in when you are putting performance pressure on the system already).

    But as Ken Thompson said: "The steady state of disks is full".

  29. Monitorix by Psicopatico · · Score: 1

    I use http://www.monitorix.org/ both at home and at work.
    It monitors (nearly) everything, filesystem(s) usage included.
    Can trigger a script of your choice when an arbitrary threshold is reached.

    Has nice colored graphs too.

    --
    Mastering the English language is fucking easy: all you have to do is to put an f* word in every fucking sentence.
  30. Smarter storage management by schakrava · · Score: 1

    I've been working on an open source storage solution (http://rockstor.com) to address some of these concerns, which I broadly categorize as "smart storage management". So far we have a few dashboard widgets that give insight into usage patterns and some probes for storage analytics. I think that alert mechanisms should model storage consumption and I/O patterns at the very least. Not only is it important to alert, but also to provide recommendations so the admin/user has a clear action to follow up with. For example, "hey, you are running out of space at rate X, mainly due to files of type Y and you have W weeks until you completely run out of space. You can migrate these Z-set of files to archival storage which gives you M more months of time." We hope to get there with Rockstor.

  31. Predictive Monitoring by Anonymous Coward · · Score: 0

    Most enterprise type monitoring packages (HP OM, IBM Tivoli Monitoring, CA Spectrum) have a predictive feature either installed by default or obtainable as a free bolt-on. As mentioned above, Nagios has Check_MK also.

  32. 80% by Anonymous Coward · · Score: 0

    If you're using a filesystem like ZFS, 80% is a critical threshold. Even if you have 50+ terabytes of storage.

  33. Large sites have long lead times by Anonymous Coward · · Score: 0

    As the title says, for a large site you'll typically need 3-6 months notice to get from desire to delivery.

    You need to allow time for financial approvals, corporate governance, power approvals, floor space and cabling, quarterly forecast budget vs. exemption processes, etc.

    In that type of environment you need to monitor usage and pipeline, and initiate the procurement process at 60-80% capacity. The alternative is to risk running out of capacity before the new kit is operational (and having to explain that to the business).

  34. Yes, you are still living in a digital cave by Anonymous Coward · · Score: 0

    There are commercial monitoring products (BMC, IBM, etc.) that can give you that level of information/alerting out of the box. The open source solutions are usually not that smart; in those cases you usually have to save your metrics to a data warehouse, and from there you can do capacity planning and alert or automate stuff based on that.

    I would suggest reviewing your thresholds and updating them accordingly (and generating a baseline of monitoring thresholds for all your servers; yes, apply the same baseline to all similar servers) to avoid getting alerts when there is no need. Usually only actionable items should be alerted on by default.

  35. No one answer by brausch · · Score: 1

    On the servers I manage, the usage is fairly stable so we have alerting set at various levels for each file system. Some are set above 95% and others as low as 60%. I want to know when disk usage changes abnormally, no matter what the absolute level is.

    Some disks are less important than others so they just send email alerts. The file systems that are critical send text messages since we're a 24x7 shop.

    --
    "Almost every wise saying has an opposite one, no less wise, to balance it." - George Santayana
  36. FreeNAS with ZFS by JPyObjC+Dude · · Score: 1

    I would always shoot for more disk but then issues arise from managing such large disks in the 1+ TB range that we tend to fill up fast.

    For a laptop or desktop, I am targeting two large drives in a RaidZ mirror on Linux. I would do the same for a desktop.

    For more data and centralization for my house or office, I would choose an iXsystems FreeNAS Mini. It has all the features that you need for your data and can be easily configured to send out warning messages on various measurements like disk space, SMART messages and raidz warnings. I think that de-dupe is coming soon if it's not there already. The Mini is super powerful for its size and power footprint.

    With ZFS, it solves the nasty issue of having to recover files on massive disks like those we get today. There is nothing worse than waiting for fsck, surface scans or recovery operations on 1TB+ drives; It takes forever. With a well maintained ZFS system those issues are gone.

    Another really cool thing about ZFS is the ability to maintain a perfect audit on the faults in your drives. Once ZFS starts saying there are issues with the drive, you send it back to the vendor in warranty period with the error messages and you get a brand new drive. I met someone at BSDCan this year who has not purchased a new drive in years because he keeps finding errors before the warranty expires. Pretty sweet.

  37. duh by Anonymous Coward · · Score: 0

    why not just monitor gigabytes/terabytes free rather than a percentage then?

  38. Does it matter anymore? by swb · · Score: 1

    I think most individual server filesystem monitoring for free space is kind of a waste of time anymore, or at least low priority.

    SANs and virtualized storage and modern operating systems can extend filesystems easily. Thin provisioning means you can allocate surpluses to filesystems without actually consuming real disk until you use it. Size your filesystem with surpluses and you won't run out.

    Now you only have to monitor your SAN's actual consumption, and hopefully you bought enough SAN to cover your growth until you can buy another one.

  39. Performance monitoring by rlh100 · · Score: 1

    Interesting things to monitor are I/O rates and read/write latency. More esoteric things might be stats about most active files and directories or percentage of recently accessed data -vs- inactive data. But these are more analysis than monitoring. What other parameters would a sysadmin want to look at?

    RLH

  40. For different values of server by dbIII · · Score: 1

    But does that server use local disc?
    The discussion is a bit closer to the metal here than something in a virtual machine dealing with data on a SAN even though that technically is also a server. It's just not a file server.

  41. They've reset that date from 2005? by dbIII · · Score: 1

    The linked article used to be about how RAID was going to stop working in 2005 or similar.
    It didn't because disks and controllers got much faster as well as dealing with more capacity, while the premise assumed nothing but a change in capacity.
    So now we have arrays 10x larger that rebuild in less than half the time of the old ones. We also have stuff like ZFS that acts like RAID6 in many ways (with raidz2) but can have much shorter rebuild (resilver) times because it only copies data instead of rebuilding the full capacity of the disk like a hardware RAID controller would do.
    I'd expect someone running FreeNAS to know more than a journalist rewarming an old article that was a poor prediction in the first place, but I suppose seeing it in magazine format does make it look more credible.

    1. Re:They've reset that date from 2005? by __aaclcg7560 · · Score: 1

      I'd expect someone running FreeNAS to know more than a journalist rewarming an old article that was a poor prediction in the first place, but I suppose seeing it in magazine format does make it look more credible.

      RAID6 was something I heard about five or six years ago, but have never seen in action or in the field. Supposedly it was the next great thing. I'm still figuring out ZFS on my FreeNAS box. Damn 8GB flash drives keep zapping out every six months, forcing me to install the current version of FreeNAS.

    2. Re:They've reset that date from 2005? by dbIII · · Score: 2

      ZFS raidz2 is pretty well RAID6 with an awareness of what is going on with the files in the array giving a variety of improvements (eg. resilver time normally being vastly shorter than a RAID6 rebuild time). A few years of seeing RAID6 in action was ultimately what drove me to ZFS on hardware that's perfectly capable of doing RAID6.
      Anyway, the "raid only has five more years" article keeps on getting warmed up, and keeps getting disproved by the very reasons given for the RAID use by date. Increasing capacity has only been possible by increasing the data density on the disks which means the heads pick up more information - thus faster read and write speeds. Better controllers also made a massive difference. Now dedicating lots of cycles to many cores of fast CPUs (instead of the processors in the controllers) is once again making a massive difference. It's only three hours to do a scrub on a 12 x 1TB 7200rpm drive system here with an i5 CPU and it would take close to the same to resilver a new drive. That is six mirrors so faster than raidz or raidz2, but still, it's not a huge amount of time to replace drives now even though that's bigger than the 500GB or so that was supposed to take forever to rebuild.

  42. It's a block size vs available space issue by dbIII · · Score: 1

    It's a block size vs available space issue so 90% full kills performance on small drives with big blocks (eg. SSDs from a couple of years back) but at 90% of 4TB you've still got a vast quantity of available blocks so it still performs very well.
    So although I'm not the poster above I've had experience of both - the percent full number is only a rough guide and falls down when the block size is very small compared with the available space.

    1. Re:It's a block size vs available space issue by sribe · · Score: 1

      It's a block size vs available space issue so 90% full kills performance on small drives with big blocks (eg. SSDs from a couple of years back)...

      OK, while I've not experienced that myself (no SSDs deployed), it certainly makes sense--much more so than the "blanket 90%" claim that people repeat mindlessly.

  43. No point nitpicking about no "b" by dbIII · · Score: 1

    No point nitpicking just because the "b" denoting Megabits was forgotten. A speed of 200Mb/s is not huge but it's not too bad either, even though a fairly old machine (6 years) with a few disks in an array can get close to five times that and saturate gigabit (or even twice over if a second connection is going somewhere else).

    1. Re:No point nitpicking about no "b" by nabsltd · · Score: 1

      No point nitpicking just because the "b" denoting Megabits was forgotten.

      It's not nitpicking because there is a vast difference between 200MB/sec and 200Mbps (about 25MB/sec). It's trivial to get 100MB/sec over gigabit Ethernet, and you don't have to spend a lot of money to do it. I've got a case that will hold 15 3-1/2" hard drives, and the guts (case, motherboard, RAM, RAID card, hot swap bays) cost about $1,500, which is a lot less than the Synology case without drives. I use 2TB drives because that's what I needed at the time, but could have the same 60TB as you if I used 4TB drives.

      That machine also has a 10Gbit NIC, and when using that, I get around 600MB/sec transfer to this server. Now, part of that speed comes from the $350 I spent on 480GB of SSD cache for the array, but that's an option I have and the Synology doesn't. Even without the SSDs, 350MB/sec was the speed I saw. So, 25MB/sec for all the money you spent is pretty lame.

      A speed of 200Mb/s is not huge but it's not too bad either, even though a fairly old machine (6 years) with a few disks in an array can get close to five times that and saturate gigabit (or even twice over if a second connection is going somewhere else).

      Again, 200Mbps is terrible for transfer across a gigabit network. Even a single drive source should be able to supply 240Mbps, with any array easily saturating the link.

      Also, you don't need disk arrays anymore...single SSDs will max out gigabit ethernet and give 10Gbit a run for its money. With arrays for the large data, fronted by SSD cache, fronted by RAM (again, lots of RAM is cheap, but not really an option for the Synology), even 10Gbit isn't up to the task.
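
      Since this subthread turns on bits versus bytes, the conversion is worth spelling out. The helper below is only an illustrative sketch (the function name is made up); it uses 8 bits per byte and ignores protocol overhead, so real-world throughput will come out somewhat lower.

```python
def mbps_to_mbytes_per_sec(megabits_per_sec):
    """Convert a link speed in megabits/sec to megabytes/sec.

    Raw conversion only: 8 bits per byte, no allowance for
    Ethernet/TCP overhead.
    """
    return megabits_per_sec / 8


# Gigabit Ethernet: 1000 Mbps is at most 125 MB/sec.
gigabit_limit = mbps_to_mbytes_per_sec(1000)

# 200 Mbps works out to 25 MB/sec, far below 200 MB/sec.
two_hundred_mbps = mbps_to_mbytes_per_sec(200)
```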

    2. Re:No point nitpicking about no "b" by dbIII · · Score: 1

      It appears I have to stop putting multiple points in a sentence in this place.

  44. Xymon by totoroxxx · · Score: 1

    It runs on every machine and on almost every OS. And yes, I have one server running at my home!

  45. I turn the alerts off. by Karmashock · · Score: 1

    I don't need the computer to tell me when a big disk is nearly full. That would be something I was aware of for some time.

    In an enterprise setting where there could be many disks... one would assume the sysadmin has set reasonable alert levels rather than leaving everything on default.

    So... I guess this is relevant to non-power users in residential contexts? But then how is a non-power user filling a terabyte hard drive? I mean... seriously.

    --
    I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
  46. It's still useful by allo · · Score: 1

    The disk fills up at the same relative speed.
    Okay, the OS does not have a big problem with a 99% full disk, but your media collection does. You still need to upgrade your storage when it's getting full, because you will still get new big files.