Slashdot Mirror


Making Use of Terabytes of Unused Storage

kernspaltung writes "I manage a network of roughly a hundred Windows boxes, all of them with hard drives of at least 40GB — many have 80GB drives and larger. Other than what's used by the OS, a few applications, and a smattering of small documents, this space is idle. What would be a productive use for these terabytes of wasted space? Does any software exist that would enable pooling this extra space into one or more large virtual networked drives? Something that could offer the fault-tolerance and ease-of-use of ZFS across a network of PCs would be great for small-to-medium organizations."

448 comments

  1. Porn by Anonymous Coward · · Score: 5, Funny

    It's the obvious choice.

    1. Re:Porn by tristian_was_here · · Score: 1, Funny

      obviously not gay porn... it's not my thing anyway.

    2. Re:Porn by fbjon · · Score: 1

      You jest, but I wonder how much of this "unused" space is really unused? It's not just admins who have a few files of "random OTP data" or "misc dll's", ya know!

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    3. Re:Porn by linzeal · · Score: 2, Funny

      You can save on compression though. Especially if it is hot gay twin porn.

  2. vista? by stillb4llin · · Score: 5, Funny

    install vista on them, that would fill up that space and give you something to manage your time a little better than wondering about what you could manage..

    1. Re:vista? by PolarBearFire · · Score: 1, Informative

      Yeah, according to Apple, Vista takes up 80GB.

    2. Re:vista? by Mantaar · · Score: 4, Funny

      No, Vista is useless. Here's something that makes a lot more sense:

      http://www.uniquepeek.com/viewpage.php?page_id=1517

      :-)

      --
      I'm an infovore...
    3. Re:vista? by nametaken · · Score: 1

      I kinda want to stab anyone that uses 1500 HDD's as dominoes. :(

    4. Re:vista? by Anonymous Coward · · Score: 0

      Note: That crashed my Ubuntu 7.10 system on 5600+ AMD 1 GB RAM. Firefox 2.0.0.12

    5. Re:vista? by Quick+Reply · · Score: 1

      Now they are ready for eBay

    6. Re:vista? by Anonymous Coward · · Score: 0

      If you have any windows distro after win2K, you have the ultimate sharing box right at your desk. It is your windows operating environment itself. Buried in its many megabytes of layered bloatware are the hidden backdoors that microsoft rents out to businesses that want to invade users' machines to data mine. Also there are hidden remote administration routines rented to businesses and/or governments to spy on users en masse. Less known are the programs that, for yet another fee to microsoft, give strangers disk access so that businesses of all types can store their data on your machines; all they have to do is pay microsoft rent for using the space on your machine. Your operating system will never be the wiser as none of this is picked up on the FAT or the VTOC. Secret parts of microsoft's systems know about the hidden data, and if it is threatened with overwriting, it simply moves to another machine. Connection to the internet is key to all this, so computers not connected to the internet are actually free from having illegal chinese espionage, illegal pix of all types, criticism of scientology, whatever stored on them as the machines have no way of getting it from the net.

      No windows box is secure in any way. Just look at the shares reported by the system. For every drive set up in your system, C:, D:, etc, there is a corresponding 'default' share: C$, D$, and so on. You can delete these 'default shares' for which microsoft and only microsoft or its delegated operative know the logons and passwords, but at next reboot they will reappear in all their infamy. You are never secure with windows, Period. Any windows after win98! Even Win2k and Millennium had the habit of logging on to Redmond each and every boot if the internet was available. All those log files that windows keeps......Bill Gates' buddies read and maybe sell every one! Just try to get rid of that pesky 'cookies.dat' file that some of you know about.

      Now about all those hexadecimal named directories in curly brackets and hidden attributes that are in all the 'NT' type systems....heh...heh...heh. So do not ask about sharing your unused storage space! IT IS TAKEN and IS NOT YOURS

    7. Re:vista? by Anonymous Coward · · Score: 0

      So the truth comes out on why there are so many XBox failures!

    8. Re:vista? by blair1q · · Score: 1

      3 down, 1497 to go.

      set 'em up again!

      (oh, and, damn is that a nice house; i wish i'd been modding xboxes the past few years)

  3. easy! by Anonymous Coward · · Score: 5, Funny

    Does any software exist that would enable pooling this extra space into one or more large virtual networked drives?

    Absolutely! Just hook them up directly to the internet before you update the machines, wait a few minutes, and voila! They'll be filled up with extra files in no time! Hey, you didn't say anything about wanting to be in control of what gets put on the machines...

  4. Not without heavy utilization of other resources by Mostly+a+lurker · · Score: 4, Insightful

    If you have a very robust local network with plenty of spare capacity, and can accept a performance hit on the client computers, I am sure some kind of linked filesystem would be possible. In most practical situations, I think this idea would be a non-starter.

  5. Do you really have control of the boxes? by Marc+Rochkind · · Score: 5, Insightful

    If they're in a computer room, then such a scheme might work. But, if they're on users' desks, you don't really have control. They're subject to filling up, being shut off, being knocked about, crashing, etc. I don't think in this case you would really get the reliability that the diversity and independence would suggest.

    --Marc

    1. Re:Do you really have control of the boxes? by McGiraf · · Score: 1

      You just use some kind of distributed raid. I'm sure software for this already exists.

    2. Re:Do you really have control of the boxes? by teslar · · Score: 1

      They're subject to filling up, being shut off, being knocked about, crashing, etc
      Well, filling up is kinda the point of the entire exercise, but you're right - being shut off, crashing, or being otherwise disconnected is enough of a problem to make this a non-starter. We're basically talking about a distributed filesystem in which subparts may fail without notice. I'm sure there are ways to minimise the problems this will create - you can for instance make sure that any one file is always completely located on one physical hard disk, so if that one goes down you're at least not left with half a file which may still be open in an editor somewhere. I guess you can also be clever with redundancy, so that say half the hard disks in your network can go down but you're still left with a working system (provided the right ones go down), but basically, because you cannot guarantee which hard disks will be up at any given time you also cannot guarantee that your system won't break in horrible ways. Hence it's not practical unless you don't particularly care about which files are available at any given time as long as some are there. So basically, that means it'll be alright for porn and mp3s, neither of which you'd particularly want lying around in a corporate environment, but I fail to come up with applications that might actually be useful.

      Besides, with hard disks these days being as cheap as they are, why not just buy another one if you do need more space? Do you even need more space? Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?
    3. Re:Do you really have control of the boxes? by arivanov · · Score: 1

      There are a number of clustered storage apps operating on a P2P basis with an N:M redundancy model. Just do an internet search and choose your poison. None of them offers amazing performance, but the actual availability often exceeds what you get from an average SMB Winhoze server.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    4. Re:Do you really have control of the boxes? by cbart387 · · Score: 2, Insightful

      Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?

      Bingo Bango Bongo! If you read the submitter's question, it simplifies to:
          a) Is there something productive I can be doing?
          b) How to do it?

      Everything else is fluff that tends to lead slashdot readers off on tangents, flamewars, Emacs Vs Vi (emacs), KDE vs GNOME (gnome)
      --
      Lack of planning on your part does not constitute an emergency on mine.
    5. Re:Do you really have control of the boxes? by Firethorn · · Score: 1

      Besides, with hard disks these days being as cheap as they are, why not just buy another one if you do need more space? Do you even need more space? Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?

      Let's say 50GB, on average, is free on each computer. We want some fairly hefty redundancy, so let's knock it down to 10GB of storage per machine.

      Spanned across 1000 work machines - that's 10 Terabytes of storage, with 4X or so redundancy.
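
      The arithmetic above can be checked with a short sketch. The function name and parameters are made up for illustration, and "4X or so redundancy" is read here as five stored copies of every block (the original plus four extras), which is what turns 50GB free into 10GB usable per machine.

```python
def usable_capacity_gb(machines, free_gb_per_machine, copies):
    """Usable pool size when every block is stored `copies` times."""
    raw_gb = machines * free_gb_per_machine
    return raw_gb / copies


# Figures from the comment: 1000 machines, 50 GB free on each, 5 copies of everything.
pool_gb = usable_capacity_gb(machines=1000, free_gb_per_machine=50, copies=5)
print(f"{pool_gb / 1000:.0f} TB usable")

# The submitter's own network is closer to 100 machines.
small_pool_gb = usable_capacity_gb(machines=100, free_gb_per_machine=50, copies=5)
print(f"{small_pool_gb / 1000:.0f} TB usable")
```

      With the same assumptions, the submitter's roughly hundred machines would give about a tenth of that.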

      As long as you don't go replacing huge numbers of machines at once, of course.

      As a bonus, assuming that each client machine has some software to allow it to 'serve' the files directly - you've just decentralized your file sharing. Having that 10GB pipe in your data center is no longer as crucial, the central server simply tells the machine off the requestor's own switch to serve it.

      I'm not going to say that it's ultimately useful, or that the benefits would outweigh the risks, but it IS an interesting proposal.

      --
      I don't read AC A human right
    6. Re:Do you really have control of the boxes? by fbjon · · Score: 1

      Why not a sort of gradual compression redundancy, like a hologram? The more machines are on the network, the more details about the data in storage can be extracted.

      Now, this would only work well for lossy compression like, say... images, but I'm sure that is the intention of the submitter. Incidentally, it may provide an incentive for the workers to keep their computers turned on, and even stay late after work.

      The more, the merrier. Literally.

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    7. Re:Do you really have control of the boxes? by Seth+Kriticos · · Score: 1

      Everything else is fluff that tends to lead slashdot readers off on tangents, flamewars, Emacs Vs Vi (emacs), KDE vs GNOME (gnome)

      vi, gnome is ok..
    8. Re:Do you really have control of the boxes? by 644bd346996 · · Score: 1

      Umm... Google?

      Their stuff runs off distributed computing clusters composed of the same kinds of computers that the submitter has at hand. Google's computers might be a bit more reliable by virtue of being left alone, but that is really irrelevant. The point is that their clustering system treats them as inherently unreliable, and whether it is 5% or 10% doesn't make a difference.

      Also, given that the submitter has the free time to contemplate a project like this, we can assume that he is competent enough to have already implemented things like disk quotas and can prevent the users from shutting down the computers from the OS.

    9. Re:Do you really have control of the boxes? by Anonymous Coward · · Score: 0

      Oooo-weee, has the average Slashdot poster's technical acumen fallen that much since the old daze? The submitter presumed replies would consider anything as staggeringly obvious as storage availability, as did the designers of distributed file systems. Now, I'm not busting your chops in particular but seeing your post moderated +5 is, well, disheartening.

    10. Re:Do you really have control of the boxes? by mpe · · Score: 1

      We're basically talking about a distributed filesystem in which subparts may fail without notice. I'm sure there are ways to minimise the problems this will create - you can for instance make sure that any one file is always completely located on one physical hard disk, so if that one goes down you're at least not left with half a file which may still be open in an editor somewhere.

      Actually you might want each file spread amongst several disks in some multi-layered RAID approach. Which won't leave you with anything like the amount of storage you expect (and will eat up a fair bit of bandwidth.)

      I guess you can also be clever with redundancy, so that say half the hard disks in your network can go down but you're still left with a working system (provided the right ones go down), but basically, because you cannot guarantee which hard disks will be up at any given time you also cannot guarantee that your system won't break in horrible ways.

      Including due to the likes of power failures. Whilst it might be sensible to UPS your network infrastructure, especially if you have PoE switches which power devices such as WAPs and IP phones, it is unlikely that you will have UPSs for every workstation. (Unless they all happen to be laptops).

    11. Re:Do you really have control of the boxes? by Leadmagnet · · Score: 1

      You could partition the unused disk space, install iSCSI target drivers, and then use an iSCSI initiator on a server to stripe across them in a RAID6 / RAID10 / RAID51 fashion. I deploy storage networks, business continuity, virtualization, and disaster recovery solutions for a job, and would never undertake this if these PCs are on the desktops. Your best bet is to use a low-end iSCSI storage array like a Dell AX150i over a dedicated gigabit network just for your servers, and then share it out to the desktops using NFS/CIFS. Or you could just buy an NFS/CIFS appliance for the clients and servers to use, like an EMC Celerra NS20.

      --
      http://www.leadmagnet.50megs.com
    12. Re:Do you really have control of the boxes? by mpe · · Score: 1

      As a bonus, assuming that each client machine has some software to allow it to 'serve' the files directly - you've just decentralized your file sharing. Having that 10GB pipe in your data center is no longer as crucial, the central server simply tells the machine off the requestor's own switch to serve it.

      You'd need this "central server" to know which switch each machine is connected to, which is a single point of failure, so probably you'd need this information to be distributed. The protocol also needs to be capable of handling machines appearing and disappearing randomly.
      You'd need something like a trackerless bittorrent filesystem with an SNMP agent.

    13. Re:Do you really have control of the boxes? by glymph · · Score: 1

      If you do have control of these machines, why not take out the disks (having moved the data, obviously) and have them network-boot off a decentralised cluster of servers - you could use the disks thus freed-up to increase the capacity of the servers, backup user files more efficiently, save on installation/support issues, and potentially make every desk a hotdesk too.

      Oh wait, they're probably windows machines and windows probably doesn't do well at PXE booting off a server - move along, nothing to see here.

    14. Re:Do you really have control of the boxes? by DamnStupidElf · · Score: 1

      I guess you can also be clever with redundancy, so that say half the hard disks in your network can go down but you're still left with a working system (provided the right ones go down), but basically, because you cannot guarantee which hard disks will be up at any given time you also cannot guarantee that your system won't break in horrible ways. Hence it's not practical unless you don't particularly care about which files are available at any given time as long as some are there.

      You can use a redundancy coding scheme so that any failure of M out of N devices is fully recoverable. Reed Solomon coding is a good example. If you have N storage devices out of which up to M may be unavailable at any time, use an (N,N-M) Reed Solomon code to encode all data, then split the codes up so each word of the code is stored in a separate device. Reed Solomon codes can recover from M errors in code words as long as it's known before-hand which words are in error. If you can keep a pool of additional devices available as hot-swap spares, a completely failed device can be rebuilt on another device by just recoding the data after it's recovered.
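
      A minimal sketch of that idea, for illustration only: a Reed-Solomon-style erasure code built from polynomial evaluation over the prime field GF(257), so that any k = N-M surviving symbols rebuild the data. The field choice and the names `encode`, `recover` and `_interpolate` are assumptions made for this example; production implementations usually work over binary fields such as GF(2^8) and are far more optimised. Note that a parity symbol here can take the value 256, so it does not fit in a single byte.

```python
P = 257  # prime modulus (illustrative choice): every byte value 0..255 is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Return n symbols, one per device: the k data symbols followed by n-k parity symbols."""
    k = len(data)
    if not k <= n <= P:
        raise ValueError("need k <= n <= 257")
    points = list(enumerate(data))  # data symbol i is the polynomial's value at x = i
    return list(data) + [_interpolate(points, x) for x in range(k, n)]


def recover(survivors, k):
    """Rebuild the k data symbols from any k surviving (device_index, symbol) pairs."""
    if len(survivors) < k:
        raise ValueError("too many devices unavailable")
    points = survivors[:k]
    return [_interpolate(points, x) for x in range(k)]


# N = 8 devices, up to M = 3 may be missing, so k = N - M = 5 data symbols per codeword.
data = [72, 101, 108, 108, 111]
codeword = encode(data, n=8)
alive = [(i, s) for i, s in enumerate(codeword) if i not in {0, 3, 6}]  # three devices down
assert recover(alive, k=5) == data
```

      As in the comment, recovery depends on knowing which devices are missing: the survivors are passed in together with their device index.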

    15. Re:Do you really have control of the boxes? by drsmithy · · Score: 1

      Well, filling up is kinda the point of the entire exercise, but you're right - being shut off, crashing, or being otherwise disconnected is enough of a problem to make this a non-starter.

      Actually, no, this is a relatively trivial issue easily handled by a) significant redundancy and b) internal policies that instruct users not to power off machines (or implementation of something like WOL to make it irrelevant even if they do).

      The biggest problem in this idea is how to handle writing data. Either you have something that works at the block level (like RAID) and suffer horrible (no, even worse than you think) performance due to latency, or you have redundancy at the file level (multiple copies of the same file) and risk two people modifying different copies simultaneously.

      You also need some sort of centralised authority to co-ordinate all the different machines and redundancy between them, potentially introducing a single point of failure (although this is also fairly trivial to work around).

      With that said, I wouldn't be surprised to see something like this originate out of Microsoft, extended from their work with Windows Home Server and combined with DFS - most all of the groundwork is already laid there.

      The most obvious application for it (at least in a corporate setting) is online backups and highly static data - both of which essentially circumvent the writing problem by very infrequent modification and only allowing writing under carefully controlled circumstances.

    16. Re:Do you really have control of the boxes? by MichaelKaiserProScri · · Score: 1

      Pull them out of the PCs. Network boot the PCs. Build a SAN with the removed drives... Possible, but hardly practical. Weigh the value of the drives vs the value of the time it takes to do this and you will find it WAY easier to just buy a 1TB SATA drive for $399.

    17. Re:Do you really have control of the boxes? by kernspaltung · · Score: 1

      The most obvious application for it (at least in a corporate setting) is online backups and highly static data - both of which essentially circumvent the writing problem by very infrequent modification and only allowing writing under carefully controlled circumstances.

      EXACTLY! Finally somebody who gets why this might be useful. Thank you.
    18. Re:Do you really have control of the boxes? by Anonymous Coward · · Score: 0

      Love your sig. Nothing says "deranged zealot ahead" like that dollar sign.

    19. Re:Do you really have control of the boxes? by Firethorn · · Score: 1

      The protocol also needs to be capable of handling machines appearing and disappearing randomly.

      No doubt, that's part of why I specified such massive redundancy.

      You'd need something like a trackerless bittorrent filesystem with an SNMP agent.

      Good point. We should have enough knowledge now to have each computer act as a part of the tracking system - added complexity, but not necessarily a bad thing.

      --
      I don't read AC A human right
    20. Re:Do you really have control of the boxes? by RedK · · Score: 1

      The question is not the usefulness of network based storage, the question is the cost, both associated with the initial deployment and the support of the chosen solution and the problems it might add due to complexity. In the end, if you really need extra storage, your money is probably better spent on a NAS, either very low-end like Buffalo Technologies or QNAP for small businesses up to big arrays or even SANs for large businesses.

      I think the biggest point everyone is missing here is how do you upgrade the desktops once they're part of your storage network? This is the type of added complexity that makes this solution just not worth it.

      --
      "Not to mention all the idiots who use words like boxen."
      Anonymous Coward on Monday August 04, @06:49PM
    21. Re:Do you really have control of the boxes? by kernspaltung · · Score: 1

      I think the biggest point everyone is missing here is how do you upgrade the desktops once they're part of your storage network? This is the type of added complexity that makes this solution just not worth it.

      The reason I mentioned fault-tolerance in TFQ is because I am not so ignorant as to think you'd just add all this space together and present it as one big share to put files on. Since all this is currently wasted space, there's little penalty for high-order redundancy. I should be able to pull three or four PCs right out of the wall and still see a healthy pool that has started some peer-to-peer activity to bring the replication of any data lost on those nodes back to some safe level.

      I am shocked at utter lack of imagination among many of the posters, and how many of them assume I'm an idiot who hasn't thought about even the simplest aspects of this (not yourself, specifically, but just in the comments as a whole).

    22. Re:Do you really have control of the boxes? by drsmithy · · Score: 1

      I think the biggest point everyone is missing here is how do you upgrade the desktops once they're part of your storage network? This is the type of added complexity that makes this solution just not worth it.

      Again, this is the sort of problem that's so trivial it's strange anyone even brings it up. Any appropriate solution would have multiple copies of all data. If you want to "upgrade" one of these desktops, you just pull it out of the pool, do your work to it, then put it back in. Since the data on it is stored in multiple locations, the only potential impact is to a machine that is actively accessing data on that "server" at the time (and even avoiding that shouldn't be too difficult). Effectively, it's just a RAID array using multiple computers as component devices instead of disks.

      I know exactly what the OP is after and I've contemplated trying to build such a system before (but never had the time to pursue it past block diagrams on paper). He wants to be able to run some "server" on all the client machines which presents some proportion of their unused local disk space into a big pool. A centralised "director" would then portion out any stored data to these various components, always making sure a certain level of redundancy is retained.
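
      As a rough illustration of the "director" idea, here is a minimal sketch that only tracks which node holds which block and restores the copy count when a node disappears. The class name `Director`, its methods and the node names are hypothetical; it models bookkeeping only, with no networking, no actual data transfer, and no protection for the director itself.

```python
class Director:
    """Hypothetical central coordinator that keeps `copies` replicas of each block."""

    def __init__(self, nodes, copies=3):
        self.copies = copies
        self.placement = {}                 # block id -> set of node names holding it
        self.load = {n: 0 for n in nodes}   # node name -> number of blocks stored

    def _least_loaded(self, exclude, count):
        """Pick `count` live nodes with the fewest blocks, skipping `exclude`."""
        candidates = sorted((n for n in self.load if n not in exclude),
                            key=lambda n: (self.load[n], n))
        if len(candidates) < count:
            raise RuntimeError("not enough live nodes for the requested redundancy")
        return candidates[:count]

    def store(self, block_id):
        """Assign a new block to `copies` distinct nodes."""
        targets = self._least_loaded(exclude=set(), count=self.copies)
        self.placement[block_id] = set(targets)
        for node in targets:
            self.load[node] += 1

    def node_lost(self, node):
        """Remove a node from the pool and re-replicate every block it held."""
        del self.load[node]
        for block_id, holders in self.placement.items():
            if node not in holders:
                continue
            holders.discard(node)
            if not holders:
                raise RuntimeError(f"block {block_id!r} lost: no surviving copy")
            # A real system would copy the block from a surviving holder here.
            missing = self.copies - len(holders)
            for target in self._least_loaded(exclude=holders, count=missing):
                holders.add(target)
                self.load[target] += 1


director = Director(nodes=[f"pc{i:02d}" for i in range(1, 7)], copies=3)
for block in range(10):
    director.store(block)
director.node_lost("pc02")   # e.g. a desktop pulled out of the pool for an upgrade
director.node_lost("pc05")
assert all(len(holders) == 3 for holders in director.placement.values())
```

      Pulling nodes one at a time, as described above, never drops a block below one surviving copy; losing every holder of a block at once is the case this sketch can only report, not repair.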

      Like I said elsewhere, combine the data redundancy scheme used in Windows Home Server and DFS, and Microsoft are already 95% of the way to doing this.

    23. Re:Do you really have control of the boxes? by Anonymous Coward · · Score: 0

      Wow, the problems you mention are pretty much the problems that exist with any hard drive, and yet we keep using those. Furthermore, this sounds a lot like the way that GoogleFS works, except that there aren't really users on those systems. So what if you have to keep 10 copies of everything across the enterprise? As long as it works, the degradation in space isn't important. The thing to note here is that you have more space available than you did before. In fact, the scheme sounds similar to the way people have talked about distributed storage across the Internet.

    24. Re:Do you really have control of the boxes? by RedK · · Score: 1

      Just pulling 1 desktop at a time for upgrading is nice and dandy, but what about mass-upgrades ? You can't just pull all nodes from the distributed network and hope all the blank PCs you put in manage to replicate data from thin air. So in the end, you can't just buy a batch of machines and upgrade an entire department in 1 night.

      Also, anytime you run client updates through some automated system like Tivoli or SUS, you just rebooted your network based storage since pretty much the whole fleet is going to reboot at the same time.

      Also, you're now generating much greater network load for writes/reads since parity and mirroring are now occurring over the network instead of over your storage array's internal bus. This will impact desktop network performance or generate some unwanted costs.

      Speaking of networks, if you happen to lose a switch, instead of just those PCs not being able to get on the network, you run the risk of losing too many nodes in your virtual RAID, which could cause a major failure where nobody in the company can access the data, instead of just the people who were on that particular switch/interface card.

      In the end, I don't think it is worth it from a budget perspective. There are all kinds of storage arrays for all kinds of budgets on the market that will provide the needed redundancy and fault tolerance. These will be placed in a controlled environment, namely your datacenter or server room and won't be at the mercy of a user using it as a Desktop PC.

      --
      "Not to mention all the idiots who use words like boxen."
      Anonymous Coward on Monday August 04, @06:49PM
    25. Re:Do you really have control of the boxes? by Daengbo · · Score: 1

      When I read the article, I started thinking back a few years ago when I booted LTSP thin clients off of a Mosix kernel. I think the LTSP guys can do the distributed disk space, but I'm not up on the project anymore.

      Then I realized that I know nothing about how to solve this problem with Windows, and I just started reading the comments to get the answer.

      Depending on requirements, you could start PXE booting the clients via LTSP into an RDP client, connect to a Windows Terminal Server, and pull the hard drives as a last step, but I don't want to think about the power requirements of several RAIDs with a total of 1000 ancient 40GB drives.

    26. Re:Do you really have control of the boxes? by Kent+Recal · · Score: 1

      Well, as I understand it cluster-filesystems such as GFS or GPFS (and likely others) can do that today.
      You basically tell them how many copies you want for redundancy and they make sure that all data
      is replicated at least that often.

      I don't know about a viable windows implementation and I'm quite sure that these systems are not
      designed for nodes constantly entering and leaving the network - but still, the concept seems applicable.

      I, too, find the whole idea intriguing, actually I'm sure there is a product idea hiding in there.
      One thing that immediately springs to my mind would be backup. With a high level of redundancy (5 or so?)
      and some mechanism to make sure that the copies are always distributed to the physically most
      separated hosts (so a burnt down building can not cause data loss) I could very well imagine
      this to become an interesting part of backup infrastructure.
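
      One simple way to approximate "physically most separated" is to spread copies across distinct sites before reusing any site. The sketch below assumes the site of each host is already known; the function name `pick_spread_hosts` and the building and host names are made up for illustration.

```python
from itertools import zip_longest


def pick_spread_hosts(hosts_by_site, copies):
    """Pick `copies` hosts round-robin across sites, so replicas land in as many
    different sites as possible before any site is used a second time."""
    per_site = [sorted(hosts) for _, hosts in sorted(hosts_by_site.items())]
    # Each "column" holds at most one host from every site.
    columns = zip_longest(*per_site)
    ordered = [h for column in columns for h in column if h is not None]
    if len(ordered) < copies:
        raise ValueError("fewer hosts than requested copies")
    return ordered[:copies]


sites = {
    "building-a": ["a1", "a2", "a3"],
    "building-b": ["b1"],
    "building-c": ["c1", "c2"],
}
print(pick_spread_hosts(sites, copies=3))  # one host in each building
print(pick_spread_hosts(sites, copies=5))  # every building used before any is reused
```

      With at least one copy in every building, a single burnt-down building cannot take out all replicas.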

      Having 5 copies of all data may sound like a lot at first but if we're talking about desktop nodes
      then there is not much data in the first place. And even if some disk-only nodes need to be added
      to make the equation work then that would likely still be cheaper than all that highly-redundant
      and supposedly super-reliable backup-hardware that we're using today...

    27. Re:Do you really have control of the boxes? by complete+loony · · Score: 1

      If you build something robust enough I doubt it would be an issue. Do you think Google overly cares if one of the nodes in their cluster goes down?

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    28. Re:Do you really have control of the boxes? by SleepyHappyDoc · · Score: 1

      There's probably a decent peer to peer photo sharing app in what you just said.

      --
      Stasis is death. Embrace change.
    29. Re:Do you really have control of the boxes? by linzeal · · Score: 1

      You can put 3-4 terabytes in an average server and save the thousands of watts of electricity you would waste trying to get this to work.

    30. Re:Do you really have control of the boxes? by geekboybt · · Score: 1

      Couldn't (and shouldn't) a distributed file system be designed for such possibilities? Like, say, only storing redundant, encrypted portions of files on each desktop?

  6. Waste of electricity by Anonymous Coward · · Score: 0

    You'll get more selling everything you have and investing in a storage solution than you will paying for the electricity to run all those crap drives.

    1. Re:Waste of electricity by bconway · · Score: 1

      The engineers might have a tough time doing their work after you take away their computers. Methinks you didn't read the article.

      --
      Interested in open source engine management for your Subaru?
  7. Download and mirror the Internet... by SiegeTank · · Score: 5, Funny

    ...just in case your connection fails.

    1. Re:Download and mirror the Internet... by psychicsword · · Score: 1

      Because he will run out of room very fast I recommend downloading all the pron sites first. That way if the internet is down at least he still has porn.

  8. not enough info by YrWrstNtmr · · Score: 2, Interesting

    Is this a company, college, or just a random collection of boxes in your mom's basement? What function does your organization want to do that it can't because of a lack of a few terabytes? What does the actual owner of these boxes have to say about your little enterprise?

    1. Re:not enough info by BeanThere · · Score: 4, Funny

      100 computers in his mom's basement? That's a big basement.

    2. Re:not enough info by Anonymous Coward · · Score: 1, Funny

      and a big Momma

    3. Re:not enough info by Anonymous Coward · · Score: 0

      and warm... very, very warm...

    4. Re:not enough info by Anonymous Coward · · Score: 0

      My mom has an enormous basement, you insensitive clod. (oh wait...

    5. Re:not enough info by click2005 · · Score: 1

      It's a botnet. Serving trojans, viruses & spam doesn't make full use of the hard drives.

      --
      I am a free slashdotter. I will not be modded, blogged, DRM'd, patented, podcasted or RFID'd. My life is my own.
    6. Re:not enough info by g33k+and+destroy · · Score: 1

      Only 100? Not big enough

    7. Re:not enough info by Anonymous Coward · · Score: 0

      Not really, older Compaq 1U servers are very compact and only 4 72" racks can fit ......

      Oh CRAP! I just let out where I live! DAMMIT DAMMIT DAMMIT!

      no really I'm married, I have had sex, I don't live in my mom's basement! really! WHY DON'T YOU BELIEVE ME!

    8. Re:not enough info by Chas · · Score: 1

      Not really. A hundred mid-towers can fit on a few large three-tier roll-away carts.

      This'll fill a small room. But an entire basement? Not really.

      The REAL problem becomes supplying power for all of them without constantly blowing a breaker, setting the house on fire (through electrical or thermal means), or cooking the systems by producing too much heat. ....

      Not that I've ever tried such a thing...no...not me...

      *Whistle*

      --


      Chas - The one, the only.
      THANK GOD!!!
    9. Re:not enough info by BeanThere · · Score: 1

      Well sure, if you put it that way ... from the submitter's description combined with the comment, I had this mental image of 100 full desktops with monitors and people working away on them doing office-y stuff in his mom's basement :) If this was a private cluster, apart from costing a small fortune, it would solve his problem and make his question irrelevant, as there are solutions available.

      Indeed, power will be a problem: if you assume 300W per system (i.e. with monitors), that's 30kW! Plus a few switches. With headless machines I suppose you might get away with, I don't know, 15kW I guess.

    10. Re:not enough info by kernspaltung · · Score: 1

      It's a company. There's nothing we desperately need the space for, but I was thinking about how to expand our present backup solution once we've exceeded its capacity. Here I am pricing out off-site backup services, tape robots, SAN stuff, more drives for the servers, RAID boxes, etc., and it just dawned on me that we'd purchased 40 new PCs with big hard drives that won't ever have more than a few gigabytes of data on them. I've been encouraged by some of the positive, helpful replies here, but I am overall discouraged at the thought processes of the typical Slashdotter involved with IT. It's always "not worth it", "buy a SAN solution", "stupid idea". These are some of the most think-inside-the-box people I've ever encountered.

      The following is not specifically aimed at you, but at all the naysayers:

      YES, I know each client can be turned off, burn up, be hacked.

      YES, I know this isn't a good idea for high-availability, high-bandwidth storage.

      YES, I know they should all be thin client terminals running Linux and booting off a server. Welcome to the real world.

      YES, I know it sounds crazy when you can just buy yet ANOTHER terabyte drive from Seagate when you need more storage. But why should I do that when I know full well the know-how exists to combine all this unused space in a highly-redundant, fault-tolerant manner? Hell, I could write the software to do it myself if I had six months of nothing else to do.

      Thus, the reason I asked the question, figuring I couldn't be (and some comments prove me right) the only one to have thought of how to use all the wasted storage out there. And if others have thought of it, maybe they've found solutions. I thought that's what "Ask Slashdot" was all about.

      I know that for versioning, backup, and archiving uses, all this space is available and could be made useful with the right software--software that combines the approaches of BitTorrent, ZFS pooling, encryption, etc. I know it's possible and I know it would be useful to lots of companies with the same setups.

      Fin.

    11. Re:not enough info by buysse · · Score: 1
      There's also some insanity involved with the idea of distributed backup -- it's not so much thinking "inside the box," as thinking that backups are something you don't fuck around with. You can call that inside the box if you want to -- I personally don't agree with a large part of how the industry has gone -- I don't want to trust my backups completely to some third-party, not without significant assurances. I want physical copies (tapes, additional disks, etc.) well offsite. I want more than one copy. If at all possible, it's encrypted.

      On a daily basis, I have to think about HIPAA, FERPA, PCI, and other private data, some of it going to international-incident level if it's breached. Someone gets hold of one of our backups, and it's a bloody world of hurt. Certified letters to every student at a major public university. Maybe, based on the state law that requires notification, letters to the population of Bolivia after the raw census data is leaked. Other nightmares...

      I think the people telling you to just buy a SAN or other solutions are thinking of two costs you aren't. It's not just the cost of the software/hardware. It's the time involved -- can you afford for it to be down for an hour? Can you afford to spend hours fixing it, or updating it, with little or no documentation? If you're talking about using it for backups, can you afford either a data leak or data loss when a PC gets stolen by someone breaking in to the office, or a disgruntled employee rooting his own desktop box?

      It might be worth it -- but for most of us, I don't think it would be. It'd be an excellent research project, though... I'd personally start with distributed replicated block devices, not filesystems. Have each block with a rev number and transaction log so that conflicts (and updates) can be handled when a node goes down or comes online. Build a filesystem over the top of it... but like you, I don't have six months or more of free time to do it.
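
      A minimal sketch of the "rev number plus transaction log" idea above, in Python. Everything here (the Replica class, the catch_up helper, the in-memory dict standing in for a disk) is a hypothetical illustration, not an existing tool, and it only covers the easy case where one node was offline and has to replay what it missed. Two nodes writing the same block at the same revision would need real conflict handling.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node's copy of the block device (hypothetical, in-memory)."""
    blocks: dict = field(default_factory=dict)  # block id -> (revision, data)
    log: list = field(default_factory=list)     # (block id, revision, data) in write order

    def write(self, block_id, data):
        """Local write: bump the block's revision and record it in the log."""
        rev = self.blocks.get(block_id, (0, b""))[0] + 1
        self.apply(block_id, rev, data)
        return rev

    def apply(self, block_id, rev, data):
        """Accept an update only if it is newer than what this node holds."""
        current_rev = self.blocks.get(block_id, (0, b""))[0]
        if rev > current_rev:
            self.blocks[block_id] = (rev, data)
            self.log.append((block_id, rev, data))
            return True
        return False


def catch_up(stale, fresh):
    """Replay fresh's log onto a node that was offline; return entries applied."""
    return sum(stale.apply(*entry) for entry in fresh.log)


a, b = Replica(), Replica()
a.write(0, b"first")
b.apply(0, 1, b"first")    # b mirrors the first write, then goes offline
a.write(0, b"second")      # writes that b misses
a.write(1, b"other")
print(catch_up(b, a))      # 2
print(a.blocks == b.blocks)  # True
```

      Replaying the whole log is wasteful; a real design would presumably remember the last log position each peer has seen and send only the tail.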

      --
      -30-
  9. Maybe move with the times? by line-bundle · · Score: 2, Insightful

    You could try to use something like "Localhost Azureus" for distributed data storage. The only problem will be that it will cost you in terms of processor and network hogging.

    Is it cost effective to reclaim that (small) space? Probably not. My suggestion is to realize that no-one tries to save clock cycles any more, and disk storage is probably heading the same way.

    1. Re:Maybe move with the times? by Ciph3rzer0 · · Score: 1

      Just because nobody tries to save clock cycles or disk space, doesn't mean it shouldn't be done. You should always try to optimize.

      There are good uses for that space. It might not be a bad idea to use the free space as backup. That way, your main files are on a reliable server somewhere, but then if something goes wrong you can get the files off the network. If there is a problem with a computer crashing or being shut down, the file transfer can wait since it's just the backup. I would say use it as a third backup, not the only backup.

  10. Space is not that important any longer by eebra82 · · Score: 4, Insightful

    It's a very interesting question, but from my point of view, hard drive space is so ridiculously cheap nowadays that it is utterly pointless to look for a useful application that will fill it up.

    Let's assume that the average computer has 80 GB of storage. Multiply that by 100 and you get 8 TB of space. That's what you can get into one or two computers nowadays without plunging out too much cash.

    What's more interesting is how much processing power you have as well as how fast the internet connection is.

    1. Re:Space is not that important any longer by jaxom · · Score: 5, Insightful

      I disagree with this and face this question all the time in work. Disks are cheap, storage systems aren't. If this is for a business that requires reasonable uptime, then the only solution would be to implement a SAN using Fibre Channel or iSCSI and then take out the drives. With the right array, all of a sudden those drives become superfluous (you decide if boot from SAN is right for you), management is easier and you'll be able to get a lot of reuse out of the drives.

      Now a lot of people will start to question the cost of doing all of this and it isn't cheap, however you have to analyze the data correctly. We migrated 200 servers from DAS to a SAN and had our money back within 12 months. Add on top of that the implementation of VMs, all of a sudden those 200 went to 20. That's a big difference in cost of ownership.

    2. Re:Space is not that important any longer by Mondo1287 · · Score: 1

      Exactly. Storage is relatively cheap these days, and doing something like this doesn't make sense. While you can easily spend a couple million dollars on a large SAN, there would be a massive hit in reliability, redundancy, and performance by using the approach you have described. I know there is a commercially available product to do just what you have described, but I can't remember the vendor. Let's say you have 1000 machines with 80GB drives, and the average machine has 50GB of free space. That will give you 50TB. On such a system, I'd want the data to be redundant on no less than 5 machines, cutting you down to 10TB of useable space. Now imagine the crippling effect this will have on even a network with gigabit to the desktop with 10g switches at the core. Remember every change will have to be replicated across the network to 5 machines. Not to mention the processing overhead each machine will experience. Then there is the problem of securing the data. I can see something like this working in a small network, but it still doesn't make sense as a nice server with a couple TB of space or a NAS device would make more sense.
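
      The arithmetic above, as a rough Python sketch. The function names are made up for illustration, and the 20GB of daily changes is a hypothetical figure, not something measured; the machine count, free space and 5 copies are the numbers from the comment.

```python
def usable_capacity_tb(machines, free_gb_per_machine, copies):
    """Return (raw TB, usable TB) when every block is kept on `copies` machines."""
    raw_tb = machines * free_gb_per_machine / 1000
    return raw_tb, raw_tb / copies


def replication_traffic_gb(changed_gb, copies):
    """Network traffic generated when each change is shipped to every copy."""
    return changed_gb * copies


raw_tb, usable_tb = usable_capacity_tb(machines=1000, free_gb_per_machine=50, copies=5)
print(raw_tb, usable_tb)               # 50.0 10.0

# Hypothetical workload: 20GB of changed data per day across the pool.
print(replication_traffic_gb(20, 5))   # 100
```

      So the pool gives up 80% of its raw space to redundancy, and every gigabyte written turns into five gigabytes on the wire.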

    3. Re:Space is not that important any longer by gladish · · Score: 1

      Disk space is cheap is the biggest misunderstood cliche people always use. Yes, 500 gig sata drives are cheap on newegg. But once you put a bunch of disks together and make a giant filesystem, managing it becomes very costly. What most people fail to realize is that most high-performance NAS systems don't use cheap drives. They use reliable expensive drives that are front-ended by high-performance expensive hardware with high-performance expensive software running on them. Yes disks are cheap, but no large filesystems are not. The other thing to remember is that both cheap and big are relative terms. For a home user a couple hundred bucks for 1TB of SATA space might seem cheap. But a couple hundred grand or a couple million for real NAS that has failover and reliability isn't cheap.

    4. Re:Space is not that important any longer by Vellmont · · Score: 1


      Disks are cheap, storage systems aren't.

      No, SOME storage systems are expensive.

      You can pretty easily put together an inexpensive SATA array with multiple terabytes of storage. To anyone that thinks Fibre Channel or iSCSI is just a million times better or more reliable than SATA, I'd say you're being sold a bill of goods by your vendor. Unless you have very high performance needs like say a database being hit by thousands of people, SATA will serve you just fine.

      --
      AccountKiller
    5. Re:Space is not that important any longer by STrinity · · Score: 2, Insightful

      The solution is obvious -- the company should have just one or two 80 gig hard drives that employees connect to via Unix terminals.

      --
      Les Miserables Volume 1 now up with my reading of
    6. Re:Space is not that important any longer by LWATCDR · · Score: 5, Insightful

      Yep a better question is Why do all these PCs have harddrives?
      If they are really only using it for the OS, a few applications, and a few docs why not use diskless workstations?
      Less power, heat, and fewer things to break.
      In other words don't use all those drives, get rid of all of them.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    7. Re:Space is not that important any longer by scotch · · Score: 1

      iSCSI and FC are SAN, SATA is DAS. Apples and crab-apples. If you can get by with DAS, then fine, but don't pretend like DAS can substitute for SAN for enterprise uses.

      --
      XML causes global warming.
    8. Re:Space is not that important any longer by jpetts · · Score: 2, Insightful

      You're talking about different things: for example, I have just put together a NAS device using SATA disks that offers out volumes as iSCSI targets over GigE.

      SATA is a drive interface spec. NAS is a generic description of a type of storage device. iSCSI is a communication protocol, as is GigE.

      It's being used as storage for an Oracle database server used by around a hundred simultaneous users.

      By buying commodity parts from Fry's I managed to get 3T usable for under $2000.

      Oh, and I had fun building it.

      --
      Call me old fashioned, but I like a dump to be as memorable as it is devastating - Bender
    9. Re:Space is not that important any longer by mpe · · Score: 1

      Disk space is cheap is the biggest misunderstood cliche people always use. Yes, 500 gig sata drives are cheap on newegg. But once you put a bunch of disks together and make a giant filesystem, managing it becomes very costly.

      RAID used to stand for Redundant Array of Inexpensive Disks. Now it stands for Redundant Array of Independent Disks. Thing is that even if the disks are fairly cheap, you need to multiply by how many you need, then add in the cost of cradles, possibly also special enclosures, etc, etc.

    10. Re:Space is not that important any longer by scotch · · Score: 1

      You seem to be confused that I am confused. Not once did I bring up NAS. Re-read the GP post.

      --
      XML causes global warming.
    11. Re:Space is not that important any longer by drsmithy · · Score: 1

      Now a lot of people will start to question the cost of doing all of this and it isn't cheap, however you have to analyze the data correctly. We migrated 200 servers from DAS to a SAN and had our money back within 12 months.

      Difficult to see how that could be true, given how much even a cheap SAN with suitable availability would cost and all you're talking about is replacing system drives that are so cheap they're practically free. How ?

    12. Re:Space is not that important any longer by Anonymous Coward · · Score: 0

      I disagree - Space *IS* important!

      After all, it *IS* the Final Frontier!

      Seriously, I use my 250GB MyBook drive for testing various Linux
      releases. My laptop uses openSUSE 10.2 by default (booting it off
      of the second internal drive; first internal drive still has its
      Windows XP Media Edition Infection on it to keep the warranty valid).
      But, I have a Fedora 8 installation on one partition that I can boot
      into easily, and have Fedora 9 Alpha and openSUSE 11 Alpha on two
      other partitions, as well as a backup copy of my openSUSE 10.2 system
      on yet another partition. And, just for giggles, I have one more
      partition to back up my Windoze bits onto, even though I don't, for
      some odd reason, seem to have my Administrator password handy. (*sigh*).

    13. Re:Space is not that important any longer by ph43thon · · Score: 1

      Why is no one as jazzed about the idea of using all of this extra space up?

      With 100 PCs with 80GB hdds... reserve 40GB on each machine to give 4,000GB of available storage.

      Divide out that 4,000GB by 8 to get 500 GB of storage which is mirrored 8 times over. That's 5GB of 8x mirrored file storage for each PC.

      The hard part.. is having some software genius write out a decentralized program that can keep track of which machines are up and decide how to distribute any changes to the files.

      This program could store the users files (that would normally be on a network file server) on the PC they usually logged onto. It could do a sort of Volume Shadow Copy process to distribute changes in the files to the other 7 mirrors.

      If one HDD on the network goes down, I'm assuming one would have to replace it since the PCs are in use.. once the HDD is replaced and the PC is running again... the mirror would be rebuilt.

      I don't know about you... but on my network of 120 PCs, I haven't had 8 HDDs go out at the same time nor do my end users have control over whether they shut down the machines. Not to mention the fact that I can remotely force machines to wake up anyhoo...

      I'm guessing that we will eventually move to full decentralization... it's the safest way to run things... if you can have all of the data spread out over the LAN... and have a program that manages everything... why is that a bad idea?

      Why do people want to have everything stuck on some big machine in the server closet? What happens when that goes down? What happens when the SAN or NAS or whatever goes out?
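
      For anyone who wants to play with the numbers, here is a back-of-the-envelope sketch in Python. The function names are invented for this example, the 10% offline chance per PC is a made-up figure, and the probability only holds if machines fail independently, which a building-wide power cut would obviously violate.

```python
def mirrored_pool(pcs, reserved_gb, mirrors):
    """Return (raw GB, mirrored pool GB, mirrored GB per PC)."""
    raw_gb = pcs * reserved_gb
    pool_gb = raw_gb / mirrors
    return raw_gb, pool_gb, pool_gb / pcs


def all_mirrors_offline(p_offline, mirrors):
    """Chance that every mirror of a file is down at once (independent failures assumed)."""
    return p_offline ** mirrors


print(mirrored_pool(pcs=100, reserved_gb=40, mirrors=8))  # (4000, 500.0, 5.0)

# Hypothetical: each PC has a 10% chance of being off at any given moment.
print(all_mirrors_offline(p_offline=0.1, mirrors=8))
```

      With those assumptions the odds of losing access to all 8 copies at once are on the order of one in a hundred million, which is why the mirror count matters far more than the reliability of any single desktop.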

    14. Re:Space is not that important any longer by ph43thon · · Score: 1

      ..but all of those drives could be the array.

      That, to me, is the whole point.

      Why set up all this SAN, Fibre Channel, iSCSI stuff when you could have a decentralized program that spreads out over your network and mirrors the data across every machine?

      I thought everyone was on board that decentralization was where everything was headed?

    15. Re:Space is not that important any longer by kernspaltung · · Score: 1

      Trust me, I would love nothing more than to have thin clients on everybody's desk. It would be hard enough to convince upper management to let me replace PCs with thin clients if thin clients cost less to purchase, but when a cheap HP business PC is $375 and their thin client costs just as much if not more, it's impossible. Add that to the licensing costs for Terminal Server or Citrix, and forget it.

    16. Re:Space is not that important any longer by thegrassyknowl · · Score: 1

      Nobody is jazzed about it because it seems a waste of time for VERY little gain. The time to set all this up, test it against failure (l-user error) and so on probably far outweighs the benefits. Failing that, he wants to do it with Windows; an operating system that at its very core is single user (they like to claim otherwise) and horrid.

      RedHat ship a clustered filesystem with RHEL. I don't know how well it works but it can't be that bad if they actually decided to charge for it.

      There are better things to do with the time/space than finding ways to use it. I agree with one of the parent posters - why have disks if they're all running what sounds like a standardised operating environment. That's an awful lot of power being chewed up in his building. Multiply that by the thousands of buildings in a similar boat and you'll soon see significant numbers. The same goes for all those little fans on video cards that generally don't need to be there for joe-user at work (how often do you know people in an office that push their 3d card to its limits?).

      Computers have become so fast, large and cheap that there's a lot of wastage in the computer industry. Best to work on eliminating wastage through efficient solutions rather than using wastage in inefficient, hacky solutions to make you feel better.

      --
      I drink to make other people interesting!
    17. Re:Space is not that important any longer by TooMuchToDo · · Score: 1

      Google uses SATA drives in all their commodity boxes. Seems like it works for their high-performance needs just fine.

    18. Re:Space is not that important any longer by scotch · · Score: 1
      Of course it works fine. SATA is used in many SAN and NAS applications as well. I never said SATA is unacceptable in the enterprise. It's just the SATA v. SAN comparison made by the GP that is stupid. DAS works fine for some uses too, like the example you bring up, but the key to the google architecture probably has more to do with scaling and redundancy at a higher level than at the storage layer. I don't know, I've never seen it.

      Can't you people read?

      --
      XML causes global warming.
    19. Re:Space is not that important any longer by jaxom · · Score: 1

      Someone else replied to this more eloquently than I, however the problem is basically to do with performance. For the majority of real world workloads, a distributed model like the one you are proposing just does not scale. There's a network bandwidth issue. You are just moving the cost from the storage to the network.

      -jax.

    20. Re:Space is not that important any longer by Anonymous Coward · · Score: 0

      Yep a better question is Why do all these PCs have harddrives?
      If they are really only using it for the OS, a few applications, and a few docs why not use diskless workstations?

      Care to point us to the diskless WinXP workstation How-To?
    21. Re:Space is not that important any longer by jaxom · · Score: 1

      We decommissioned a large number of ancient DEC and IBM SSA storage arrays that we just kept adding storage to. It was simply a cost-avoidance situation. As the business grew, they needed more storage and to do that we had a choice: spend huge amounts of money on storage that was old and needed expensive upgrades (more shelves/frames), or consolidate down into a single array with reallocatable storage.

      Talking in rounded, rough numbers (as I can't remember the details -- this was about four years ago), a new SSA frame on the IBM side and additional storage on the DEC side would cost in the order of $100k. The new array came in at around the same cost. During the migration we consolidated filesystems between the various platforms so that, for example, the 100GB needed on node A went down to 50GB since that was all it needed. Repeat that across all filesystems and all SAN-attached systems and all of a sudden you can easily reclaim a huge amount of space.

      Of course, I can't go into exact details due to confidentiality issues, but these types of savings are real in certain situations. As always, your mileage will vary, hence why you need to do a thorough analysis up front.

      -jax.

    22. Re:Space is not that important any longer by jaxom · · Score: 1

      You are correct in what you say, however I don't think you quite mean what you mean. :-)

      Here's the real situation: in most large scale enterprise computing environments, those large arrays from the likes of EMC, HDS, HP, IBM, etc are better and more reliable. The reasons are varied, but fundamentally come down to the fact that there is real money involved with management and uptime. For most businesses the downtime experienced when a server with DAS goes down is inconvenient. In anything like a larger enterprise, any downtime can be easily translated into lost revenue and, in certain circumstances, actual significant losses.

      Let me provide an example. Most financial institutions are connected to a messaging network called SWIFT. Simplistically, systems that interface to the SWIFT network are usually money transfer, dealing and other financial applications. You do not want any of those things on DAS systems, regardless of how reliable the underlying hardware is [1], mainly because of things like clustering and replication. Application level replication in these high-volume environments has strict performance limitations and the best way of dealing with these problems is something like SRDF, PPRC or Continuous Access (distance limitations notwithstanding).

      This is a specific example in the financial services sector, but you can quite easily find other examples in areas such as manufacturing, healthcare, retail operations, etc. If you are doing your due diligence, it becomes apparent very quickly that the cost of building out the infrastructure is actually small compared with the potential losses you may incur if you don't implement such technology. You can also do the same thing (and in fact it is related) with disaster recovery and RTO/RPO calculations.

      In all fairness, this is quite fun stuff. Honestly. :-)

      -jax.

      [1] The exceptions here are obviously things such as Tandem and mainframes, however mostly these are either connected to their own proprietary versions of SAN-based storage or, in the mainframe world, use FICON. The latter is basically Fibre Channel anyway and is integrated almost as such in most environments (eg Cisco MDS switches with FICON interfaces connecting to an EMC DMX).

    23. Re:Space is not that important any longer by ph43thon · · Score: 1

      If individuals are already accessing the data over the network (if it's on a file server), then one already has bandwidth issues.

      For a LAN, most offices are not using up much bandwidth internally... you could have it set to sync the mirrors in a way that didn't monopolize the bandwidth.

      Why would that not work?

    24. Re:Space is not that important any longer by miracle69 · · Score: 1

      Everyone is saying how it is a bad idea, but what if it was used as a second on-site decentralized backup.

      I.E. you have your enterprise level on-site database, perhaps one backed up remotely, and then a third backup that is locally decentralized.

      I.E., at night when everyone is gone, the database is backed up remotely as well as the local decentralized system. This would give your data an extra level of protection in case a fire hit the server room but was contained before hitting the cubicles, and you could rapidly restore your data from your distributed on-site database backup.

      Setting up something like that would make version control much easier and eliminate problems with access during the day, while giving you an extra measure of data security without much extra cost.

      --
      Linux - Because Mommy taught me to Share.
    25. Re:Space is not that important any longer by LWATCDR · · Score: 1

      One you don't backup your enterprise level database to some random pc in a cube.
      You will back up on to tape. You will have multiple backups, some of which will be in a fire resistant safe and some that will be in a safe off-site.
      You will also probably mirror your data at an off-site location as well.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    26. Re:Space is not that important any longer by LWATCDR · · Score: 1

      I wasn't thinking about a thin client. Just a PC that boots off the network. You could use a NAS or even a SAN to provide a local drive for each machine but they would be sitting down in your server room.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    27. Re:Space is not that important any longer by mcrbids · · Score: 1

      I thought everyone was on board that decentralization was where everything was headed?

      And... how did you get that silly idea? Right now, with the Internet, the world is on the most incredible path towards centralization ever. Don't confuse ubiquitous access with decentralization. They are NOT the same.

      Ubiquitous access == you can access your Hotmail account from any computer.

      Decentralized == your email is being hosted/served on YOUR computer.

      With web-based applications being all the rage, the trend is clearly towards centralized, ubiquitous access.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    28. Re:Space is not that important any longer by Anonymous Coward · · Score: 0

      Then take those disks, and place them in a RAID array. Problem solved.

    29. Re:Space is not that important any longer by James+Youngman · · Score: 1

      I disagree with this and face this question all the time in work. Disks are cheap, storage systems aren't. If this is for a business that requires reasonable uptime, then the only solution would be to implement a SAN using Fibre Channel or iSCSI and then take out the drives.
      Maybe the reason you face this question at work all the time is because you're too dogmatic. There are manifestly other ways of doing this that aren't anything like the "only solution" you mention.
    30. Re:Space is not that important any longer by Geekbot · · Score: 1

      I was thinking along those lines as well. You could lose the drives, have less power consumption, one less point of failure, easier support (centralized image that can't be changed by the user), network file storage.

      Actually, aside from the support I doubt the savings in hardware would come to much, but it very well could if he has a large pool of computers.

    31. Re:Space is not that important any longer by turbidostato · · Score: 1

      "One you don't backup your enterprise level database to some random pc in a cube."

      You probably missed the *second* part in "second on-site decentralized backup".

    32. Re:Space is not that important any longer by LWATCDR · · Score: 1

      Not at all. There is just no value in an on-site decentralized backup.
      It would be very complex.
      It would require that the backup data be encrypted not a terrible thing but still an extra layer of complexity.
      It would require that the data be stored in multiple places spread over the workstations of the network.
      It would require that most of the machines be powered up all the time. A real waste of power.
      And you would gain... Nothing.
      You wouldn't want to replace your centralized on-site backup system with it. You wouldn't want to replace your off site back up system with it. So what would you gain but a lot of complexity?

      A decentralized off-site backup system does have value. I could see you mirroring data at multiple sites in a multi-site company. But I wouldn't use the desktops to do it.
      The best solution for a company described would be to get rid of the HDs all together.
      Boot from the network card and use NAS for the user data. Fewer hard drives to fail, fewer hard drives to wipe when you dispose of the computer, fewer hard drives to power, and fewer hard drives to cool.
      Yes you would make the network a single point of failure but that is already the case in most places. But you would have more resources available to make the network as reliable as possible.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    33. Re:Space is not that important any longer by turbidostato · · Score: 1

      "You wouldn't want to replace your centralized on-site backup system with it."

      Provided it is doable I certainly would want to replace my near-line centralized backup with it. For the most part backup incidents are of the kind "I deleted by mistake..." and "can you recover some document I thought I wouldn't need anymore..." or "could you store this gigabunch of data for a while? My team will only need it some few weeks". At least on my enviro, near line is not part of the "enterprisey" backup/restore procedure but a convenience for end users (a *big* convenience, by the way) if (yes, a big "if") we could better manage that dispersed underused storage it certainly would be a very interesting gain (hey, dream is for free, isn't it?).

    34. Re:Space is not that important any longer by LWATCDR · · Score: 1

      "At least on my enviro, near line is not part of the "enterprisey" backup/restore procedure but a convenience for end users (a *big* convenience, by the way) if (yes, a big "if") we could better manage that dispersed underused storage it certainly would be a very interesting gain (hey, dream is for free, isn't it?)."
      Yeah, dreaming is free, but I fear that, for all the reasons I pointed out, it is a really bad idea. Just not the best use of resources or even the right way of looking at the problem.
      The Problem is "I got a bunch of resources that are not being used." The solution is don't keep them around.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    35. Re:Space is not that important any longer by jimicus · · Score: 1

      all you're talking about is replacing system drives that are so cheap they're practically free. How ?

      SATA and PATA drives are so cheap they're practically free.

      SAS and SCSI are substantially less cheap. Not staggeringly oh-my-god-I-will-have-to-sell-my-granny expensive, you understand. Probably about 3 - 5x the cost per gigabyte. That's before you consider that you'll want to RAID them, which adds more to the cost per gigabyte.

      Fibre Channel is dearer still.

      And if you start looking at proprietary interfaces like IBM's SSA (which is basically SCSI commands over a serial cable, but it predates SAS by several years and isn't compatible with it) you're looking at serious cash per drive. And with SSA, you need to put the drives into proprietary frames which only IBM make.

      I can well believe that buying a SAN with enough disk capacity for the whole lot would work out a similar price to throwing a few dozen SSA drives and a new frame into the mix.

    36. Re:Space is not that important any longer by jimicus · · Score: 1

      It's the warm fuzzy factor you need to account for.

      You can provide a server room with a massive SAN backed up with the most fantastic support contract, redundancy at every step of the way and access times that desktop PCs can only dream of - but people seem to like knowing that the facility to use a drive that's physically sat on their desk exists.

      Granted, any significant downtime of the SAN would affect a lot of people. But if you've already got a SAN you'd be using it as the backend infrastructure for a lot of business systems anyway, so significant downtime of the SAN would still affect a lot of people if they did have desktop PCs.

  11. Dumbest question yet... by Aaron32 · · Score: 1, Insightful

    This is the dumbest /. question I've seen. Decentralized network storage pooled together with no means of practical management? Sign me up! Oh yeah, let's rely on the ditzy end users to help make sure it doesn't crash. I'm sure everyone will leave their computers on 100% of the time so you can make use of it. Don't tell anyone at work of your idea, they might not ever stop laughing.

    1. Re:Dumbest question yet... by ZeroPly · · Score: 2, Insightful
      You haven't put any thought into this - it takes about 20 seconds to answer your concerns given an introductory class in OS design.

      Obviously computers will crash or be turned off. We have this wonderful concept in architecture design called "redundancy" which we can use to address problems like that:

      Assume the probability of any computer being online is d(c_n). For some computers d(c) will be very low, such as when the user is often out of town; for others d(c) will be quite high, because either the user leaves the machine on all the time or it has background processing to do.

      Computing and updating d() is fairly easy given any modern management tool. Then create clusters of computers with a required availability, so that you stripe data across the component computers, taking into account d() of each computer. Availability of the cluster would be a function of your modified striping algorithm. When you save data, you just choose what availability you would settle for, and the right cluster is chosen.
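
      As a rough sketch of the idea above (not a real product, and simpler than striping): treat d() as the probability that a machine is online, assume machines fail independently, and keep a full copy on every member of a cluster. The function names and the greedy selection are assumptions made for illustration only.

```python
from math import prod


def replica_availability(online_probs):
    """Probability that at least one replica is online.

    Assumes machines go offline independently of each other, which is
    optimistic for an office where everyone leaves at the same time.
    """
    return 1.0 - prod(1.0 - p for p in online_probs)


def pick_cluster(machines, target):
    """Greedily pick machines until the cluster meets the target availability.

    machines: dict mapping a (hypothetical) machine name to its d() value,
              read here as the probability of being online.
    Returns a list of machine names, or None if the target is unreachable.
    """
    ranked = sorted(machines, key=machines.get, reverse=True)
    chosen = []
    for name in ranked:
        chosen.append(name)
        if replica_availability([machines[n] for n in chosen]) >= target:
            return chosen
    return None


if __name__ == "__main__":
    d = {"desk-a": 0.9, "desk-b": 0.5, "desk-c": 0.8}
    print(pick_cluster(d, 0.97))
```

      Real striping with parity would need less space than full copies, but the availability arithmetic gets more involved; the point is only that the target availability drives how many machines a cluster needs.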

      Let me answer your next question in advance: if this is so obvious why is no one producing a product that's cheap and easy to implement? Because you'd have about 25 patent trolls lined up at the courthouse - too many teeth, not enough ass.

      --
      Support microSD: in a post 9/11 world, it is unwise to carry your data on media that you cannot comfortably swallow.
    2. Re:Dumbest question yet... by Aaron32 · · Score: 1

      I have already thought about what you said; that's why I qualified the response with "practical management". It just isn't realistic to do all that preparation and maintenance for such little gain. It'll increase management infinitely, which is why SANs are so popular for centralized management. It would be best to get a NAS box and throw it on the network somewhere. But then again, if he's bored, implementing this would give him plenty to do.

    3. Re:Dumbest question yet... by drsmithy · · Score: 1

      This is the dumbest /. question I've seen. Decentralized network storage pooled together with no means of practical management? Sign me up! Oh yeah, let's rely on the ditzy end users to help make sure it doesn't crash. I'm sure everyone will leave their computers on 100% of the time so you can make use of it. Don't tell anyone at work of your idea, they might not ever stop laughing.

      Actually, if you have a large amount of static data, it's a perfectly reasonable solution (the availability "problem" is trivial to work around). Given that this requirement is not at all unusual, that makes your post not only childish, but stupid as well.

    4. Re:Dumbest question yet... by jaydubscott · · Score: 0

      And how do you propose to create redundancy and availability when any number of components could fail at any time? The GP is correct, this is a ridiculous question. No matter what level of redundancy you put into a system, there is no way to ensure that even a minimal number of computers will be on at all times. Building in enough redundancy would immediately make this cost prohibitive compared to just purchasing some additional disk drives.

      Also, any company that is in any way interested in reducing energy costs would ask their employees to turn off computers when not in use!! This means that any data you store on these would probably only be sporadically available between 9AM and 4PM (or whenever a quorum of computers would be operational).

      This is not even touching on the other problems, such as the additional network load, increased failure of disk drives, and cost in time or software of managing such a solution.

    5. Re:Dumbest question yet... by kernspaltung · · Score: 1

      Also, any company that is in any way interested in reducing energy costs would ask their employees to turn off computers when not in use!!

      We do. Many of them don't. And I'm not so draconian as to force the PCs to shutdown at a certain time.

      This means that any data you store on these would probably only be sporadically available between 9AM and 4PM (or whenever a quorum of computers would be operational).

      Everyone assumes I wanted to use this pool for live, high-availability storage. What about using it for a versioning backup system? Archiving? There are dozens of other uses which don't require the pool to be available 24/7/365. The other issues I've addressed elsewhere. They're all surmountable with relatively common distributed computing approaches. The problem is, I don't have time to write the software myself, but I guessed other people had thought of this and might suggest a product that already exists. Some did just that. Most just complained about how it didn't make sense and can't be done.

    6. Re:Dumbest question yet... by ZeroPly · · Score: 1

      As I had mentioned, the algorithms are fairly well known - which is the problem. I would imagine there are multiple, possibly overlapping, patents which cover all this from various angles. Anyone trying to make a commercial product would have to deal with a legion of patent trolls and parasitic lawyers. The big players already provide storage systems and the small players can't afford the litigation. It's just not worth the effort.

      --
      Support microSD: in a post 9/11 world, it is unwise to carry your data on media that you cannot comfortably swallow.
  12. GlusterFS by Anonymous Coward · · Score: 3, Informative

    Check out GlusterFS. (http://www.gluster.org)

    You definitely can't run Windows in order to utilize this, but it should be a minimal effort to set up a quick netboot lab to test it with.

    Cheers.

    1. Re:GlusterFS by Insightfill · · Score: 1

      Check out GlusterFS. (http://www.gluster.org)

      You definitely can't run Windows in order to utilize this, but it should be a minimal effort to set up a quick netboot lab to test it with.

      One could envision setting up small VMWare Player instances running under a different account on Windows launch using "Scheduled Tasks" for that account (set to launch on reboot). Or - run VMWare Player as a service. A little beefier would be VMWare Server (free), but a bit more of a hassle (need to also install IIS on each XP Machine). The advantage of either setup is that the VM instance will run without a window, but will be visible as a running task in the Task Manager. The Scheduled Task approach would also let you tinker with scheduling, such as a VM that powers up at 6pm and powers down at 6am.

      Install Debian or distro of choice in VMWare image, giving it a massive virtual drive in a user account directory. Keeping it a specific user account directory will hide it from non-admin eyes. I mention Debian because it's the one I have the most experience with, with good flexibility in image size.

      Admittedly, it wouldn't be the fastest array in the world, but it should work. The bonus is that the Windows machines would continue running as usual, with only slight memory and disk performance drop. That hit would be scattered among the machines at random times based on usage of this virtual array.

      If it works well with one machine, you could duplicate the whole VM and just give it a new machine name on the network and move on.

    2. Re:GlusterFS by Verteiron · · Score: 1

      Just FYI, you don't need IIS installed to use VMWare server. You only need it if you want to use a web-based interface. The VMWare management software quite happily connects to remote VMWare services without IIS being installed.

      --
      End of lesson. You may press the button.
    3. Re:GlusterFS by Insightfill · · Score: 1

      Just FYI, you don't need IIS installed to use VMWare server. You only need it if you want to use a web-based interface. The VMWare management software quite happily connects to remote VMWare services without IIS being installed.

      Way cool tip!! THANK YOU! I spent FOREVER the first time I put it on a Windows XP machine wondering why I couldn't connect, until I found some guy's blog entry on how to disable the default web site. WOO HOO!! Now I get to take IIS off of my kids' Windows XP machine. Ironically, the whole reason I have VMWare Server on it is to host my Debian server, for Apache and FTP. Yeah, I know, sounds backwards, but my kids use the machine for (Windows) games, and I wanted to have a machine that was always on to host my server.

    4. Re:GlusterFS by Verteiron · · Score: 1

      Yeah, just install the VMWare management console (not the whole server) on another computer and you can remotely control your VMWare server, view the sessions, etc. It's network-intensive but it works quite nicely.

      --
      End of lesson. You may press the button.
  13. Short answer... by Anonymous Coward · · Score: 0

    Short answer: No.

    Long answer: Nope.

  14. Send them to our troops in Iraq by kipin · · Score: 3, Funny

    I had a drive fail on me last year and I wanted to take my frustration out on it so naturally I did what any good American would do. I shot the shit out of it. Surprisingly it seemed to make for a pretty good piece of bullet proof armor. It stopped multiple rounds of full metal jacket 9mm rounds and managed to get a couple rounds lodged inside the casing. (None appeared to penetrate fully)

    --
    If I can not smoke in heaven, then I shall not go. -- Mark Twain
    1. Re:Send them to our troops in Iraq by boombasticman · · Score: 1

      This usually means that you need a stronger weapon. Next time, don't use your wife's gun for such scientific demolition tests.

    2. Re:Send them to our troops in Iraq by R2.0 · · Score: 1

      You should have hooked up a garden tractor battery and had it spinning when you shot it - 7200 rpm of centrifugal goodness.

      --
      "As God is my witness, I thought turkeys could fly." A. Carlson
    3. Re:Send them to our troops in Iraq by eagl · · Score: 5, Informative

      The drive survived because the 9mm is weak. Get a better gun using a better round, like .40 cal or even a good old .45.

      I've had a chance to read after-action reports from Iraq and Afghanistan, and the 9mm is pretty much a joke. Most of the forces that really rely on handgun stopping power have obtained emergency authorization to bypass normal procurement processes in order to get better handguns using better ammunition. To my knowledge, a modern .45 is considered one of the best alternatives.

    4. Re:Send them to our troops in Iraq by Firethorn · · Score: 2, Informative

      Nahhh...

      Remember, pistol rounds are pistol rounds, and rifle rounds are rifle rounds.

      Next time he should test it with pretty much any centerfire rifle.

      --
      I don't read AC A human right
    5. Re:Send them to our troops in Iraq by Anonymous Coward · · Score: 0

      I hate it when I don't fully penetrate.

    6. Re:Send them to our troops in Iraq by dlapine · · Score: 2, Funny
      Well, I know that .45 ball ammo won't penetrate a Maxtor 40 GB drive casing; it just makes a nice big dent with a nicely mushroomed round. Fired the round myself. Try it out. We had a guy with a .44 magnum and his shot punched clean through the spindle. We had a tachometer at the range that day, and the .45 was doing about 900fps. No holes in the drive though.



      Now, that doesn't mean that a .45 doesn't have more stopping power than 9mm, just that it wouldn't penetrate the aluminum casing of a hard drive. Fortunately for us, the bad guys don't use old drives as armor.

      --
      The Internet has no garbage collection
    7. Re:Send them to our troops in Iraq by retro128 · · Score: 1

      Thick metal like what you see on HD's will stop most pistol rounds (Except maybe the S&W .50). If you want to punch through, you've got to use a rifle.

      --
      -R
    8. Re:Send them to our troops in Iraq by localman · · Score: 1

      I've always thought the Five-SeveN was pretty sweet, with well-designed light armor piercing rounds. Plus you can swap them into the P90 FTW.

      Yeah, I played too much Counter-Strike :)

      But in all seriousness, it is some well-thought-out weaponry. I particularly like the promo video.

      Cheers.

    9. Re:Send them to our troops in Iraq by ptudor · · Score: 1

      I'll first admit my bias toward NATO ammunition for practical reasons like availability and interoperability, but the .45 vs 9mm argument befuddles me. It's a whole 0.095669291 inches wider, so it's suddenly the be-all-end-all of pistol ammunition? I sometimes think people forget the purpose of a handgun is to keep you in the game long enough to get to your rifle and accuracy matters more than the size of the hole. Two rounds of 5.56 will poke a hole in a cinder block wall with twenty-eight left over for the fight; I wouldn't even consider a handgun for that task.

    10. Re:Send them to our troops in Iraq by david_thornley · · Score: 1

      I've been told that the favorite weapon is the good old .50-caliber machine gun M2. When you fire Ma Deuce at the enemy, it really doesn't matter what sort of cover he's in.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    11. Re:Send them to our troops in Iraq by Anonymous Coward · · Score: 0

      Most of the "modern" .45s are 1911s with a few manufacturer-specific modifications. However, FN has a nice 5.56 pistol with a cut down M-16 cartridge that really does the trick. Damn is it a hot round, can punch through some armor, and isn't much heavier than a .45. I'd suggest looking into it if you have a use for a hand cannon.

    12. Re:Send them to our troops in Iraq by Anonymous Coward · · Score: 0
      Fortunately for us, the bad guys don't use old drives as armor.

      Jesus Christ! Who are the good guys in your world??

  15. Sanmelody by theoverlay · · Score: 4, Informative

    Datacore offers software called Sanmelody to turn servers into a cheap storage network and there are other vendor solutions as well. http://infiniteadmin.com/

  16. Re:Not without heavy utilization of other resource by bostonsoxfan · · Score: 1

    I don't know. Where I work, tons of engineers leave their computers on overnight, so you could do backups or transfers or whatever overnight. Or you could do something similar to SETI@home and run when computers are idle or not using much processing power. I think the big hurdle is partitioning off a part of each hard drive so that the user can't access it; what they don't know about, they can't be angry about losing. The thing I see as a problem, if you use it as a virtual drive or backup or something, is security. Servers are nice because they are locked up, monitored, and generally protected more than user workstations. Where I work the workstations aren't locked up or anything, and I would be very wary of allowing secure company documents to be stored on something that is amorphous by its nature. I work in the aerospace industry and I know that heads would roll if some of the documents we generated were leaked, because lots of the stuff on servers is classified, proprietary or IP.

  17. AFS by arabagast · · Score: 5, Informative

    OpenAFS is a distributed file system. It seems to fit your bill. No personal experience, so don't know how well it actually works.

    --
    Doolittle : ...What is your one purpose in life?
    Bomb no.20 : To explode of course.
    1. Re:AFS by xoundmind · · Score: 1

      My first exposure to Unix (1986) was on the AFS network at CMU. I don't know about using this on Windows, but our disk access was never an issue.
      Thanks for the reference, that really takes me back...

    2. Re:AFS by Monx · · Score: 1

      IBM uses AFS internally. It works. Use it.

    3. Re:AFS by Secure+Endpoints · · Score: 2, Insightful

      AFS would be applicable if you were interested in turning each end user workstation into a centrally managed AFS server and dedicating storage to holding replicated read-only volumes. I wouldn't store single-instance read-write volumes on a machine that is at the mercy of an end user to turn on or off. I would also be resistant to deploying centrally managed storage on end user controlled machines in any case due to the access control issues. Anything that is stored on a machine that the end user has physical control over can be accessed by the end user.

      As others have pointed out, storage is so inexpensive these days. 8TB can be obtained for a few thousand dollars and managed in a much more reliable manner.

    4. Re:AFS by wik · · Score: 1

      You really don't want to put normal users on the AFS file servers. Managing volumes across these machines would also be a headache more painful than you can imagine. AFS also has very poor handling of server outages and volume replication exists only for read-only volumes.

      - A former AFS admin

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
  18. Help TPB! by Anonymous Coward · · Score: 0

    They need secret servers on unknown locations, you know...

  19. Solution for Linux by Anonymous Coward · · Score: 2, Informative

    There's a project dedicated to this on Linux, http://nbd.sourceforge.net/.

    If there's nothing similar for windows, you might be able to run it through cygwin.

    Actually, this claims to run on Windows: http://www.vanheusden.com/Loose/nbdsrvr/

    1. Re:Solution for Linux by mjrauhal · · Score: 1

      nbd is nice for some stuff but lacks fault-tolerance. Of course, you can run RAID, possibly several levels (say, a raid-6 on top of raid-1 or something) on top of nbd devices to trade space for fault-tolerance as much as you want, but you still lack flexibility. The advantage to RAID-over-nbd, on the other hand, is of course that you can do that right now if you want :] (And yes, the nbd server shouldn't be overly hard to run on Windows, one would think; it's rather simple...)

      A better solution would work on a bit higher level, though. If a host goes down, it would be desirable to flexibly duplicate its data (from other mirrors and/or parity data) onto others. Possibly such a system could be created on top of nbd as well. Hell, maybe ZFS with an NBD pool could someday hack that, but seems to me they'd need to work out at least bug 4852783 first.

    2. Re:Solution for Linux by BobTosh · · Score: 2, Interesting

      RAID on top of NBD works (with caveats). I tried a proof of concept once: RAID5 made out of nbd units. The configuration needs to be thought through carefully so that data is striped across sufficient clients to prevent excessive resource use (CPU and network) at the client end. I did it by building one PC with Linux and "mounting" each of the NBD pieces shared by the end-user Windows PCs, then simply building RAID over the top of that. With sufficient planning you can make it quite resilient, just in case a user decides to switch off their PC. I did find that rebuilding the stripes when a PC did get turned off caused the "server" (i.e. the Linux PC) to be heavily utilised, and this caused the clients that mounted/used the shared-out space from the "server" to receive quite poor performance. The only way I could think of really making this a serious possibility would be to beef up the power of the "server" quite significantly and to ensure really fast network connections between it and the nbd hosting machines.
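
      For anyone wondering why the rebuild is so heavy, here is a toy illustration of the single-parity scheme RAID5 relies on. It is only a sketch with made-up block contents, not how the Linux md driver is actually written: the parity block is the XOR of the data blocks, so recovering one lost block means reading every surviving block in the stripe, which over NBD means pulling all of them across the network.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the single missing data block of a stripe.

    Every surviving block plus the parity block has to be read, which is
    why a rebuild touches all the remaining machines at once.
    """
    return xor_blocks(list(surviving_blocks) + [parity_block])


if __name__ == "__main__":
    stripe = [b"abcd", b"efgh", b"ijkl"]  # one block per desktop PC
    parity = make_parity(stripe)
    # Pretend the PC holding the middle block was switched off.
    print(rebuild_missing([stripe[0], stripe[2]], parity))
```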

  20. You read my mind! by Danathar · · Score: 1

    I've been thinking of the same thing of late. Our IT department uses this huge SAN at $$$ money. Why couldn't a distributed, fault-tolerant store (with something like striping with parity) be implemented across a LAN with 100Mb/GigE? The standard drive size being shipped on new PCs is at a minimum about 200GB. For biz users that is WAY overkill.

    Our whole organization is about a 1000 Windoze desktops, but I'd like to try it in our local workgroup first (maybe 20 systems). I looked around but couldn't find anything that would pool unused desktop space.

    1. Re:You read my mind! by BlueF · · Score: 1

      I've always wanted something that would do this (for Windows)! If done with enough parity, polling systems for uptime and cpu/hdd/network utilization, both the client/network impact and "network disk" performance could be managed quite well I think.

      Seems like it's only a matter of time before something comes along to provide this obvious function. Here's hoping it's well thought out and coded, preferably a commercial app. I would have no problem justifying such an expense if such a product existed in polished form.

      Think of all the data archiving (daily/weekly/monthly backup sets, etc) which could be done right in-house. Add a bit of encryption and stripe data in a manner so reconstruction would be near impossible without all the parts...

    2. Re:You read my mind! by jeffmeden · · Score: 1
      reconstruction would be near impossible without all the parts

      You had me up until there. Um, what happens when YOU lose one of the parts?

    3. Re:You read my mind! by fat_mike · · Score: 1

      200GB? I believe whoever purchases your IT stuff must be getting a kickback. The new Core 2 Lenovos we're buying are just now coming with 80GB drives.

      Windoze? Haha, haven't heard that before.

      Fault tolerant distributed storage using employees' computers. Hang on, I'm laughing too hard to type.

      Okay, I'm good. These are the same employees who do things like have portable heaters underneath their desks that point at the PC. They kick them, roll their chairs into them, stack crap on and around them, spill stuff, drop crumbs, reboot whenever they feel like it and generally abuse the hell out of them.

      You made my day.

    4. Re:You read my mind! by BlueF · · Score: 1

      Oops... I meant to say, reconstruction of the data would be impossible without a significant number of nodes (i.e. were a few, or even many, hard drive nodes compromised -- stolen, backdoored, etc -- the network file system would not be recoverable from them). My bad. A significant difference in approach and security of this hypothetical system.

    5. Re:You read my mind! by turbidostato · · Score: 1

      "Fault tolerant distributed storage using employees computers. Hang on, I'm laughing to hard to type."

      You are laughing at it? It must be out of ignorance.

      -Fault tolerant: Of course, they will be desktop computers
      -Distributed: Of course, each employee has his own desk
      -Storage: Of course, it's about all this free space what we are talking about.

      Or how did you think to get some benefit out of all that spare space? Fault-intolerant centralized storage? *That* would be laughable!

  21. use it local by Anonymous Coward · · Score: 1, Interesting

    You could use extensive versioning on each machine individually to get a benefit out of the unused disk space and computing power. Users who accidentally overwrite or delete files could get them back from their own disk space. Some kind of NFS would use a lot of network traffic, and bandwidth is often a limiting factor.

  22. Storage by Genocaust · · Score: 2, Informative

    I tried to tout the merits something like this could have for non-critical regular user backups, but as previous posters mention, it was shot down.

    I was suggesting to run DrFTPD as a backend with NetDrive as an access medium. It looks good on paper, but I've never had the chance to apply it so widescale :)

    With DrFTPD it's easy to set up whatever kind of redundancy you would want, i.e. "at least 3 nodes will mirror all files in /doc" or whatever. NetDrive (and I'm sure there are others) helps take away the learning curve and hassle of "here, use this internal ftp for backups, not a network drive", as it will map the actual FTP to a network drive so that it appears like a normal one.

    Just my 2c.

    --
    It could be that the only purpose of your life is to serve as a warning to others.
  23. the IT guy with time on his hands by westlake · · Score: 1
    What would be a productive use for these terabytes of wasted space?

    The first question to ask is whether what you want to do makes any sense for your employer. Who has to maintain this beast once you build it?

    1. Re:the IT guy with time on his hands by kernspaltung · · Score: 1

      We sure don't have much faith in software do we? I don't have time on my hands, which is why I am envisioning a bulletproof "load it, configure it once, and let 'er rip" solution. If you consider what's accomplished with BitTorrent and other peer-to-peer architectures, along with the fairly hardware transparent storage pooling provided by ZFS, and encryption, all the pieces exist to make this work. This would not be for high-availability data. More like a way to have the network back up to itself, or provide versioning as another poster suggested.

      You're telling me that if you had a service you could load on each client machine that, with a few minutes of setup, would provide a pool of highly-redundant storage, you wouldn't use it?

  24. dCache by Rev+Saxon · · Score: 3, Interesting

    http://www.dcache.org/ You will need a system to act as a master, but otherwise your normal nodes should work great.

    --
    I am that much more enlightened and proportionally disillusioned
    1. Re:dCache by scheme · · Score: 1

      http://www.dcache.org/ You will need a system to act as a master, but otherwise your normal nodes should work great.

      Dcache won't do what he wants it to and would require quite a bit of changes in applications that use it since the dcache filesystem is not posix compliant. Dcache 2.0 with chimera should present things as a nfs4.1 share but chimera has just been completed and I think no one but the developers have it running anywhere.

      --
      "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
  25. Revstor by theoverlay · · Score: 1

    Try Revstor's Sanware which allows you to designate nodes (servers) that will provide resources to create a storage area network. http://infiniteadmin.com/

  26. Slashvertisement for wuala? by jiadran · · Score: 0, Offtopic

    This sounds like somebody is asking for wuala. Possible slashvertisement?

    1. Re:Slashvertisement for wuala? by imsabbel · · Score: 2, Funny

      Actually, this sounds nothing like that thing you link to.
      More like your post being a slashvertisement.

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
  27. I have a similar problem by imsabbel · · Score: 1

    We have a few compute nodes around here. Each of them has an HD, and as those are so cheap we gave them 500Gbyte ones.

    They don't really need lots of space (maybe 30Gbyte for OS and temp-files), otoh without redundancy the other 450Gbyte are worthless.

    As the task is embarrassingly parallel, network traffic wouldn't be a problem.
    If there was a solution to combine all this storage (doesn't even have to be transparent) into a distributed, redundant storage network, I could surely make use of those Tbytes.

    --
    HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
    1. Re:I have a similar problem by imsabbel · · Score: 1

      To add to this:

      What I am imagining doesn't need to be low-level.

      Just a userland-application with container-files would be fine:

      They can listen to each other, and each file gets replicated on every node. If the filling level gets higher, copies are purged up to a minimal redundancy level.

      Even the factor 2 loss of non-parity redundancy would still be a lot better than not using the space at all.
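
      The purge step could look roughly like the sketch below. Everything here is hypothetical (the placement table, the 80% high-water mark, the minimum of two copies); it only shows the rule "replicate everywhere, then drop surplus copies as a node fills up, but never go below the minimum".

```python
def purge_node(placement, sizes, capacity, node, high_water=0.8, min_copies=2):
    """Drop surplus copies from one node until it is below the high-water mark.

    placement: dict mapping file name -> set of node names holding a copy
    sizes:     dict mapping file name -> size (same unit as capacity)
    Returns the space still used on the node after purging.
    """
    held = [f for f, nodes in placement.items() if node in nodes]
    used = sum(sizes[f] for f in held)
    # Give up the most widely replicated files first.
    for f in sorted(held, key=lambda name: len(placement[name]), reverse=True):
        if used <= high_water * capacity:
            break
        if len(placement[f]) > min_copies:
            placement[f].discard(node)
            used -= sizes[f]
    return used


if __name__ == "__main__":
    placement = {"a.dat": {"n1", "n2", "n3"}, "b.dat": {"n1", "n2"}}
    sizes = {"a.dat": 60, "b.dat": 30}
    print(purge_node(placement, sizes, capacity=100, node="n1"))
    print(placement)
```

      A file that is already at the minimum number of copies is never purged, so a node can stay above the mark; at that point the pool is simply full.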

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
  28. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 2, Insightful

    Please stop typing words like "utilization" when you mean "use". You sound like a PHB trying to sound smarter than he really is and you make it a pain for people to read what you write, especially non-Anglophones. Read George Orwell's essay on this topic.

  29. Give them to yahoo... by mecenday · · Score: 1

    ... they'll need them.

    --
    Tautologies, they are what they are.
  30. Backup by m0pher · · Score: 2, Informative

    If you don't already have a backup mechanism for the data that may be on these systems, one way to use all the available storage is for backup. Vembu StoreGrid is a solution designed specifically for this problem. Get more info @ http://www.vembu.com./

  31. Looking at the problem another way... by pedantic+bore · · Score: 4, Insightful

    You might want to ask yourself why, after more than a decade of research and countless papers and prototypes that address this problem, your PCs' storage is still underutilized...

    It's harder than it looks to get something reliable. Your PCs have extra capacity because it's cheap, but mining that capacity is not cheap. As other posters have pointed out, putting together (or just purchasing) a server with a few TB of storage is simpler and cheaper, less prone to getting wiped out by a virus, easier to manage and backup.

    --
    Am I part of the core demographic for Swedish Fish?
    1. Re:Looking at the problem another way... by Libor+Vanek · · Score: 1

      And yet Google experiences show otherwise ;-) (lot of desktop-class PCs with distributed filesystem)

    2. Re:Looking at the problem another way... by maxwell+demon · · Score: 1

      However for Google it works because they don't care much for a single failed PC (that's what the redundancy is for). If someone is sitting in front of the PC, this is very different. It doesn't help him that there are many other PCs still working.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    3. Re:Looking at the problem another way... by PhiberOptix · · Score: 1

      yes, but those desktop class pcs @ google are not used by people as desktops. So they are not turned off and are not used as desktop pcs, like the ones that the poster mention.

    4. Re:Looking at the problem another way... by Doppler00 · · Score: 1

      I think the better question is, if 80GB is too much, what can we use that is cheaper? I think flash memory will definitely start making its way into a lot of lower end business class PC's that don't need more than 16GB or so memory. It could be integrated right onto the motherboards. The only trick would be getting Microsoft to have a smaller/customizable version of the OS that doesn't require so much bloat that you're not going to use for office applications.

  32. I'm not sure that's a good idea... by ralph90009 · · Score: 2, Interesting

    While I was in college, I worked in the IT department. In my experience, your end-users will have a proverbial shit-fit if their computer's HD starts spooling up when they aren't doing anything. While it would be nice to use the spare space for data storage, I'm not sure it would be worth the headache. The volume of user complaints would skyrocket, you'd have to train them to leave the things on all the time, and you'd have a distributed data pool to manage. Changing user behavior is like teaching a two-year-old to say "thank you" (It's possible, but not fun) and your electrical and manpower expenses would probably outstrip the savings.

    1. Re:I'm not sure that's a good idea... by Anonymous Coward · · Score: 0


      > your end-users will have a proverbial shit-fit if their computer's HD
      > starts spooling up when they aren't doing anything

      Not any more dude.
      They are all used to running Norton AV, and that'll swap like a motherfucker :-)

    2. Re:I'm not sure that's a good idea... by dave420 · · Score: 1

      You can't hear the hard disk in most computers, especially in an office. From my experience, office users don't care if the hard disk starts doing cartwheels - as long as they can use their computer, they're fine. As for having them leave their computer on all the time, well, you can use redundancy to minimise the unavailability of files, and then use Wake on LAN to turn specific computers on should both/all computers with the data you need be off at once. The users don't even have to be told what's going on, and this entire process can be automated.

  33. iSCSI + ZFS by Anonymous Coward · · Score: 0

    this is all hypothetical, but you could create a disk image on each client and export it as an iSCSI target, attach all of those targets in your favorite RAIDZ configuration on a network server, and then reshare everything through Samba or even back out as pools of iSCSI volumes.

    That actually sounds like a pretty cool project, and with enough redundancy, it could be fairly robust.

  34. Storage at Desk by phooji · · Score: 2, Informative

    is a project at the University of Virginia that tries to do exactly what you describe: take unused storage on a bunch of machines and turn it into a file system. http://vcgr.cs.virginia.edu/storage_at_desk/index.html

  35. P2P by Anonymous Coward · · Score: 0

    I've often thought a Napster-like P2P network could be the basis for a fault-tolerant distributed storage system. By "Napster-like" I mean a P2P system with a central index. Add access control and versioning software that can push files from peer to peer. Once a document is on, say 5, peers there is no need to back it up.

    Imagine a system like this:
    1. A couple of redundant index servers
    2. An integrated versioning system with push capability
    3. A large chunk of desktop disk space hidden from the user
    4. Appropriate access control at the index level
    5. ???
    6. Profit! (this is /. after all)
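
    A minimal sketch of what the central index in item 1 might track, assuming it only needs to know which peers hold a copy of each document and which documents are still short of the "say 5" copies mentioned above. The class and method names (CentralIndex, register, drop_peer, under_replicated) are made up for illustration; access control and the actual pushing of files are left out.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical Napster-style index: document name -> peers holding a copy."""

    def __init__(self, target_replicas: int = 5):
        self.target_replicas = target_replicas
        self._holders = defaultdict(set)

    def register(self, doc: str, peer: str) -> None:
        # A peer announces that it now stores a copy of doc.
        self._holders[doc].add(peer)

    def drop_peer(self, peer: str) -> None:
        # A peer has been retired or wiped; forget all of its copies.
        for peers in self._holders.values():
            peers.discard(peer)

    def holders(self, doc: str) -> set:
        # Return a copy so callers cannot mutate the index by accident.
        return set(self._holders.get(doc, ()))

    def under_replicated(self) -> list:
        # Documents that still need to be pushed to more peers.
        return sorted(
            doc
            for doc, peers in self._holders.items()
            if len(peers) < self.target_replicas
        )
```

    The push component would poll under_replicated() and copy each listed document to more peers until the list comes back empty.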

    Unfortunately, some powerful corporations are so terrified of P2P that they're doing all they can to kill it in its infancy.

  36. Re:Not without heavy utilization of other resource by AndGodSed · · Score: 1

    Well the average Windows install doesn't recognize an EXT3 filesystem (as a for instance, most Linux filesystems aren't "seen" from windows) so partitioning the drive with a windows and linux partition should be fine, then use these drives for multiple backup mirrors via a small linux apache server...

    You could secure them with passwords and so on.

    Oh go ahead and poke flaming holes in my suggestion *buries face in hands and sobs*

  37. Switched off? by danhuby · · Score: 1

    What if the PCs are switched off?

    Even using something akin to RAID, so you store the same data across several machines, you've still got the risk that switching off PCs will cause data to be temporarily unavailable.

    Leaving a hundred PCs switched on just to get some extra disk space isn't going to be eco-friendly or cost effective. You can build a several terabyte file server very cheaply these days.

    Dan

    1. Re:Switched off? by dave420 · · Score: 1

      Wake on LAN would take care of that. It could even turn the computers off when they're not needed, if it woke them up in the first place (and no-one started using them in the mean time). Having data striped across multiple machines means that it's less likely all machines with the specific data will be off at the same time, and when they are, Wake on LAN will do the trick.
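
      For anyone who hasn't seen it, the wake-up side is simple. A rough Python sketch, assuming the target NIC and BIOS have Wake on LAN enabled and that a UDP broadcast reaches the machine's subnet. The magic packet is six 0xFF bytes followed by the target MAC address repeated 16 times; the function names, the broadcast address and port 9 are illustrative choices, not something from this thread.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the target MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

      A storage coordinator would call wake() for a sleeping machine that holds the needed data and retry the read once the machine answers.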

    2. Re:Switched off? by danhuby · · Score: 1

      Not exactly speedy data access though? Having to wait for a machine to start up...

    3. Re:Switched off? by dave420 · · Score: 1

      That's why you use redundancy. Having a delay is better than having no storage in the first place, and if you have the data on multiple machines, the chance of all of them being turned off is drastically reduced. The delay of turning a computer on and having it start up (about 30 seconds) isn't that bad, considering it wouldn't happen that often.
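
      To put a rough number on "drastically reduced": if each machine is off with probability p, and machines are switched off independently of one another (a big assumption, since most of an office goes home at the same time), the chance that every copy is unavailable is p raised to the number of copies. The 0.5 below is a made-up figure, not a measurement.

```python
def all_replicas_off(p_off: float, replicas: int) -> float:
    """Chance that every replica is offline, assuming independent machines."""
    if not 0.0 <= p_off <= 1.0:
        raise ValueError("p_off must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return p_off ** replicas


# Hypothetical: any given PC is switched off half of the time.
for copies in (1, 2, 3, 5):
    print(copies, "copies ->", all_replicas_off(0.5, copies))
```

      Under those assumptions three copies already cut the chance of needing a wake-up to one in eight. Overnight, when nearly everything is off at once, the independence assumption breaks down and Wake on LAN has to carry the load.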

  38. Re:Not without heavy utilization of other resource by GIL_Dude · · Score: 1

    Wait a minute; if Windows can't see the data, how will it serve the data up to your remote machines? Or are you saying that he should remotely (or on a schedule) reboot the machines into Linux overnight to do this? Because there is no way an OS is going to serve up files from a partition it can't even read.

  39. Re:vista? - DFS by whackco · · Score: 4, Informative

    You know, make fun of Microsoft all you want, but they actually have something for this - DFS - Distributed File System. Just create a share with each of these and POOL IT with a DFS system. Then use and manage it to your heart's content with all the midget-donkey-goatse crap you want.

  40. Versioning Clarification by Anonymous Coward · · Score: 0

    When I wrote "versioning system" I didn't mean a CVS. I mean software with enough brains to know that a document was edited so it can push the new version to all the peers storing the document.
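
    One way such software could "know that a document was edited", sketched under the assumption that each peer reports a content hash for its copy. The helper names and the choice of SHA-256 are illustrative, not part of any existing product.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content fingerprint used to decide whether two copies are identical."""
    return hashlib.sha256(data).hexdigest()


def peers_needing_update(current: bytes, peer_digests: dict) -> list:
    """Return the peers whose stored copy differs from the current document.

    peer_digests maps a peer name to the digest that peer last reported.
    """
    wanted = digest(current)
    return sorted(peer for peer, seen in peer_digests.items() if seen != wanted)
```

    After an edit, the new version would be pushed only to the peers this returns.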

    So if an AC replies to his own post is that an act of brazen cowardice?

  41. Grr! this is what I hate most about sysadmins by Anonymous Coward · · Score: 0

    The user boxen are for the users, not for you.
    The diskspace/CPU cycles/whatever is not idle, it's being kept available for the users' needs.

    Don't be such a prick. Pee in your own sandbox.

    1. Re:Grr! this is what I hate most about sysadmins by Anonymous Coward · · Score: 0

      What an idiot.

  42. botnet by Anonymous Coward · · Score: 0

    i'm sure some p2p botnet could use the space

  43. Re:Not without heavy *use* of other resources by DarrenBaker · · Score: 3, Insightful

    Hrmm... Funny, he didn't come across that way to me at all. You, however, come across as a pompous linguistic Nazi, much like Orwell. If you compose sentences for people who don't have command of the language, then you are really quite delusional.

    As is my understanding, resources are utilised, while tools are used. He was correct in its usage.

  44. My first (serious) thought... by Zocalo · · Score: 1

    Was to use a software driver to export the spare part of the disk as an iSCSI (or iATA, if you prefer) target. For performance and integrity, you'd probably be better having a dedicated partition the OS couldn't easily fiddle with, but it shouldn't be too hard to create an array of ~50GB iSCSI targets that you could then collate into larger volumes. Performance wouldn't be stellar, unless you could use a dedicated NIC/VLAN on the hosts, but should be reasonable enough for use as nearline storage of non-critical data that was already archived to tape. But so much for the pros, what about the cons?

    The big problems with this idea though are going to be MTBF, storage redundancy and power consumption. You're going to be building your storage array using desktop PC rated HDDs, so let's say an MTBF of 50,000 hours, *but* you have about 100 of them so you should be anticipating a fairly frequent drive failure rate. That means both striping and repeatedly mirroring the data across workstations to ensure that it's always available should a drive or two die - or just be powered off overnight, unless you want all your workstations powered up 24/7 ($$$). You'd also need to be able to dynamically rebuild the data set in the event of a drive failure; but how do you detect a drive failure vs someone simply tripping over the power/network cable - that software's not looking so simple now, is it?
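
    To put a figure on "fairly frequent", here is the back-of-envelope version, assuming drives fail independently at a constant rate (the usual simplification behind an MTBF number). The 50,000 hours and 100 drives are the figures from the paragraph above; the function name is made up.

```python
def hours_between_failures(mtbf_hours: float, drives: int) -> float:
    """Expected time between failures somewhere in a fleet of identical drives."""
    if drives < 1:
        raise ValueError("need at least one drive")
    return mtbf_hours / drives


hours = hours_between_failures(50_000, 100)
print(f"one failure roughly every {hours:.0f} hours ({hours / 24:.1f} days)")
```

    That works out to a dead drive about every 500 hours, or roughly every three weeks, before counting machines that are merely switched off.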

    I think it's an interesting idea, but the overheads of maintaining enough copies of each element of data online to survive drives becoming unavailable, intelligently managing the replication of data when a drive is deemed to have failed and not just gone temporarily offline, plus network congestion issues make it non-viable. It'd almost certainly be cheaper and faster to write off the spare HDD capacity in your workstations and buy cheap 1U servers with a couple of GB NICs onboard and cram them full of high capacity SATA drives for storage.

    --
    UNIX? They're not even circumcised! Savages!
  45. Microsoft DFS is an easy answer by whackco · · Score: 1

    Microsoft makes an easy to use utility for this EXACT situation called DFS - Distributed File System.

    1) Simply make a share on all those machines and POOL them with a DFS server and you are good to go.

    2) ????

    3) PROFIT!!1!

  46. Grog likes it simple by upside · · Score: 2, Insightful

    Great, let's all dumb down to the lowest common denominator. English is a rich language and all the better for it. If you're too lazy to learn it, your choice. I'm a non-native speaker but prefer a vibrant, expressive language to some "for-dummies" international pidgin.

    --
    I'm sorry if I haven't offended anyone
    1. Re:Grog likes it simple by cammoblammo · · Score: 2, Funny

      No. You should never use a big word when a diminutive one will suffice.

      --

      Cogito, ergo sig.

    2. Re:Grog likes it simple by gumpish · · Score: 1

      I'm a non-native speaker but prefer a vibrant, expressive language to some "for-dummies" international pidgin.
      I think you fail to grasp the positive impact a truly universal second language would have.
    3. Re:Grog likes it simple by Enderandrew · · Score: 1

      Replace diminutive with small, and perhaps you have a point.

      --
      http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    4. Re:Grog likes it simple by FiloEleven · · Score: 1

      Oh no! You've fallen into the sarchasm!
      >_

    5. Re:Grog likes it simple by volpe · · Score: 1

      > Great, let's all dumb down to the lowest common denominator.

      There's no such thing as a "lowest common denominator". There's a "greatest common denominator" and a "least common multiple".

      (Is being a math Nazi as bad as being a grammar Nazi?) :-)
    6. Re:Grog likes it simple by Blkdeath · · Score: 1

      Great, let's all dumb down to the lowest common denominator. There's no such thing as a "lowest common denominator". There's a "greatest common denominator" and a "least common multiple".

      You'd be absolutely incorrect, both in the mathematical sense and the colloquial sense. Lowest common denominator is a perfectly valid mathematical term and in common usage the lowest common denominator is the most minimal subset of data available that's universally understood. For example, Slashdot's comments are composed primarily of ASCII characters, the lowest common denominator in electronic communications that even the lowly typewriter, screen reader, or teletype machine can understand. If you want a vehicle that anybody can drive you pick up a vehicle with an automatic transmission because it is the lowest common denominator and requires the least amount of specialized skills to drive. In purchase transactions cash in the national currency is the lowest common denominator because the individual or vendor in question may not accept the likes of debit cards, Visa, MasterCard, American Express, Diners Club etc.

      I could go on all day but you get the idea.

      (Is being a math Nazi as bad as being a grammar Nazi?) :-)

      When you're incorrect, they're both devastatingly bad ways to be.

      --
      BD Phone Home!

      Shameless plug. Like you weren't expecting it.

    7. Re:Grog likes it simple by brad-x · · Score: 1

      0.999... != 1. Archimedes says hi. Enjoy.

      --
      // -- http://www.BRAD-X.com/ -- //
    8. Re:Grog likes it simple by turbidostato · · Score: 1

      "There's no such thing as a "lowest common denominator"."

      Yes there is. And it even has the advantage of being truly common.

      It's "1".

    9. Re:Grog likes it simple by volpe · · Score: 1

      As for the mathematical sense, you've provided neither an example nor a citation. As for the colloquial sense, you entirely missed my point. I'm aware, obviously, of the existence of the colloquial sense. My point was that the colloquial sense is nothing more than an egregious abuse of mathematical terminology.

  47. Re:Not without heavy *use* of other resources by fretlessjazz · · Score: 5, Funny

    Well, you sound like a troll. I seriously doubt anybody misunderstood what he meant because he used the word "utilization". Or, should I say he utilized it? UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE UTILIZE Does it hurt yet?

  48. Re:Not without heavy *use* of other resources by Dogtanian · · Score: 5, Funny

    Read George Orwell's essay on this topic. Going by his dislike of overused, cliched phrases expressed in that essay, today's "businessspeak" (mindless repetition of words and phrases that have long since been driven into the ground by thoughtless, banal, stupid repetition) would have him spinning in his grave so much that we could use him as a form of renewable energy.

    The solution is obvious. We need to think outside the box and raise the bar when it comes to language... someone needs to step up to the plate and bring something new to the table. I'm thinking of someone I have synergy with, not just the type that goes for the low-hanging fruit.

    Ooh.... he's spinning nicely. Another couple of Orwells and we'll have enough electricity to power the world :)
    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
  49. A project for Google? by eagl · · Score: 1

    Isn't this something Google either has already done, or *should* do? Google Distributed File System... GDFS. It has the added benefit of also being a curse if it goes wrong. Seriously, isn't this an ideal project for Google? And if they've already done it, is it available for implementation by everyone else?

    I'd like to see some sort of distributed filesystem as a standard installation option in a linux distribution... The question would be something to the effect of "would you like your computer to find unused disk storage space on your network, and use it for managed redundant storage available across your network?"

    It likely wouldn't be very fast (imagine RAID 1 or 5 with each disk connected only by ethernet, and the controller on yet another computer also connected only via ethernet), but for a lot of people, absolute speed isn't really required and having all that free space managed in a usable form would make up for the lack of speed.

  50. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    Well in this case, "utilization"/"utilisation" does mean use. Utilization is actually clearer for non-English-speakers, IMO, because it is always a noun, whereas "use" can also be a verb (perhaps also an adjective). By the way, what is a PHB?

  51. Re:A project for Google? - whoops here it is by eagl · · Score: 1

    Whoops, should have "googled" this first. Here it is, google file system.

    http://labs.google.com/papers/gfs.html

    The big questions of course are is it usable by regular people, and is anyone actually working on implementing and including this in any of the major operating systems?

  52. Circular Backups by PeterJFraser · · Score: 1

    The trick is how. For my machines at home (all 3 of them), I have the first back up to the second, the second to the third, and the third to the first. I have thought for some time that there should be some method of automating that procedure, but keeping track of where things are and which machine has what space would not be easy.
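
    The ring itself is easy to generate for any number of machines. A small sketch, assuming a fixed, ordered list of host names; the names below are placeholders, and tracking free space on each target is not handled.

```python
def ring_backup_plan(hosts: list) -> dict:
    """Map each host to the next one in the list, wrapping around at the end."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to back up to each other")
    if len(set(hosts)) != len(hosts):
        raise ValueError("host names must be unique")
    return {
        source: hosts[(position + 1) % len(hosts)]
        for position, source in enumerate(hosts)
    }


# Placeholder host names for a three-machine home network.
for source, target in ring_backup_plan(["first", "second", "third"]).items():
    print(f"{source} backs up to {target}")
```

    Each source/target pair could then be handed to whatever copy tool is already in use. Adding a machine only means appending it to the list and regenerating the plan.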

    1. Re:Circular Backups by mrbcs · · Score: 1

      Cobian backup. Free. Automatic. I have our personal machines doing an automatic backup to a server. Set it up for 2:00 am. works perfect. Will do incremental backups.

      --
      I'm not anti-social, I'm anti-idiot.
    2. Re:Circular Backups by flyingfsck · · Score: 1

      rsync on Cygwin.

      --
      Excuse me, but please get off my Pennisetum Clandestinum, eh!
  53. Microsoft Farsite (and related topics) by Jered · · Score: 1

    What you're talking about is not a new concept, it just turns out to be really hard to build in a useful way. The most comprehensive discussion of the problems involved can be found at the Microsoft Research project Farsite.

    The short version of the problem is that the level of service you can expect from each system is incredibly variable, so it's hard to offer a meaningful QoS for the system as a whole. It's not quite as bad as the distributed-hash-table problem (a.k.a. P2P file storage), but it's still bad. (Zooko once told me that MojoNation saw an average 50% turnover in nodes in a 24 hour period.) But it's also not as easy as having all your distributed nodes dedicated to just storage, and even that's a really hard problem to solve. (I should know; my company is one of the few vendors doing it.)

    Someone else suggested OpenAFS. OpenAFS is fantastic, but not for unreliable server environments. I really don't think there's a complete solution out there, but not for lack of asking.

    1. Re:Microsoft Farsite (and related topics) by drsmithy · · Score: 1

      The short version of the problem is that the level of service you can expect from each system is incredibly variable, so it's hard to offer a meaningful QoS for the system as a whole. [...] But it's also not as easy as having all your distributed nodes dedicated to just storage, and even that's a really hard problem to solve. (I should know; my company is one of the few vendors doing it.)

      In a centrally-managed environment (like, say, a corporate network) it's so trivial a problem to solve with a combination of policy (always leave your computer on) and technology (Wake-On-LAN), that it hardly even seems worth bringing up.

  54. Re:Not without heavy *use* of other resources by DarrenBaker · · Score: 0, Offtopic

    I'm a troll, and he's modded +5 insightful? Must be a lot of non-English speakers here.

  55. Replace the drives? by LihTox · · Score: 1

    I know little about hardware, so forgive a stupid question: would it make any sense to pull out these computers' drives, replace them with smaller ones, and either sell the lot or assemble them in one place (a RAID?) for easier maintenance? Having your storage spread out through a company becomes a problem if one computer goes down (or is turned off by its user).

    I know the cheapness of drives may make this silly.

    1. Re:Replace the drives? by kernspaltung · · Score: 1

      You can hardly buy smaller drives if you wanted to. That's really what started me thinking about this. I was worried about how to provide backups once our present solution is at capacity, and I was thinking "We just bought 40 new PCs with 80GB drives and I'm here pricing out ways to add another dozen gigabytes to our backup system." We don't have a budget (nor really a need for) big fancy SAN solutions and iSCSI and tape robots and all that junk. Reading some of the comments in this thread makes me realize how stuck in the '80s and '90s most IT folks are as far as philosophies and approaches to problems.

    2. Re:Replace the drives? by milsoRgen · · Score: 1

      Has anyone answered your question yet kern? I couldn't seem to find a straight forward answer... I would like to do the same thing on my home LAN...

      --
      I'm sick of following my dreams. I'm just going to ask where they're goin' and hook up with 'em later.
    3. Re:Replace the drives? by kernspaltung · · Score: 1

      No, not as in "Product XYZ was designed to do just that! Just install it on each machine, set up your pools, and forget about it". Lots of links to projects that sorta-do-that-kind-of-distributed-something-or-other-but-on-linux-or-on-Windows2K3Server-or-, but like you said, no straight-forward answer.

      It's rather baffling how many replies are "why would you want to do that?", "That's stupid.", "Buy an iSCSI SAN thing you idiot!". They just don't get it, and it seems so simple to me. The peer-to-peer nature of BitTorrent, combined with on-the-fly encryption, combined with the virtualized and redundant approach to storage of ZFS, and you've got all the pieces. Just nobody seems to have put it together in one project or product. I'd write it myself if I had six months of free, my-bills-are-paid time.

    4. Re:Replace the drives? by milsoRgen · · Score: 1

      Yeah I couldn't quite grasp why all the people were knocking ya, sounds like a great idea. It wasn't until I started thinking about it that I realized how useful it could really be.

      --
      I'm sick of following my dreams. I'm just going to ask where they're goin' and hook up with 'em later.
  56. Re:vista? - DFS by OnlineAlias · · Score: 4, Interesting


    This is why SAN manufacturers have come up with "thin provisioning". NetApp is quite good at it, read more here.

  57. Re:Not without heavy *use* of other resources by Gewalt · · Score: 0

    Parent is not troll!
    GP is 100% troll.. not insightful.

    --
    Modding Trolls +1 inciteful since 1999
  58. Birth of the Matrix? by TropicalCoder · · Score: 5, Interesting

    What would be a productive use for these terabytes of wasted space?

    Well, I had this idea when I read about some Open Source software that allowed distributed storage (sorry, forgot what that was, but by now I am sure it has already been mentioned in this discussion). The idea was this - suppose we have such software for unlimited distributed storage, so that people can download it and volunteer some unused space on their HD for a storage pool. Then suppose we have some software for distributed computing like we have for the SETI program. Now we have ziggabytes of storage and googleplexflops of processing power, what can we do with that? How about, for one thing, storing the entire internet (using compression, of course) on that endless distributed storage, and then running a decentralized, independent internet via P2P software? The distributed database could be constantly updated from the original sources, and the distributed storage then becomes in effect a giant cache that contains the entire internet. Now we could employ the distributed computing software to datamine that cache and we could have searching independent of Google or Yahoo or M$FT. Beyond that we could develop some AI that uses all that computing power and all that data to do... what? - I'm not sure yet. Just thought I would throw this out there to perhaps maybe get stepped on, or who knows, inspire further thought.

    1. Re:Birth of the Matrix? by 77Punker · · Score: 1

      Sounds a lot like Freenet. http://www.freenetproject.org/ It's still kinda slow, but it's still getting better. The only problem I have with it at the moment is that there's very little content worth seeing on it.

    2. Re:Birth of the Matrix? by Anonymous Coward · · Score: 0

      Just Wait ;)

  59. distributed file systems by Fireshadow · · Score: 1

    I think a better starting point is to define your problem with some additional details. Do you want a separate drive letter to appear to the customers for them to keep their stuff on? Or do you want something that only you can get to, to store backups on? What kind of network is it? 100Mb/s? 1 Gb/s?

    You'd asked two questions: "What would be a productive use for these terabytes of wasted space? " I don't know if I'd ask the slashdot crowd this.

    "Does any software exist that would enable pooling this extra space into one or more large virtual networked drives?" A few. Localhost Azureus http://p2p.cs.mu.oz.au/software/Localhost/faq.html but it hasn't been maintained since 2006. Lustre http://en.wikipedia.org/wiki/Lustre_(file_system)#Networking is a neat read but I don't think is applicable in your situation. It'll give you an idea as to what's out there.

    In theory, you could use MRTG to measure your fileserver's switch port to see how much traffic the desktops pull from the server. Divide it by the number of desktops and that tells you on average how much each requests. Now consider that this average would be distributed across the network, with each desktop seeing an increase. A Gb LAN may be able to take this with no sweat.

    As for how much disk space you are going to practically gain, that is up for debate. Let's say a 20 GB quota from each drive. Doing the math for a hundred drives, that's 2,000 GB, or about 1.95 TB. If you ever have to reload a number of those workstations, a good chunk of that is going to be unavailable. You may be better served with a NAS storage device.
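
    The same estimate as a few lines of Python, assuming the roughly one hundred machines from the original question and treating the number of copies kept of each block as a free parameter. The function name and the copy counts are illustrative only.

```python
def pooled_capacity_gb(machines: int, quota_gb: float, copies: int = 1) -> float:
    """Usable pool size when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return machines * quota_gb / copies


raw_gb = pooled_capacity_gb(100, 20)
print(f"raw pool: {raw_gb:.0f} GB, about {raw_gb / 1024:.2f} TB")
for copies in (2, 3, 5):
    usable = pooled_capacity_gb(100, 20, copies)
    print(f"{copies} copies of everything: {usable:.0f} GB usable")
```

    Any redundancy added to survive reloads and powered-off machines comes straight out of that raw figure, which is part of why a NAS can end up looking better.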

    --
    "It's one thing to talk about the poetry of machines. Quite another to listen to it for yourself."
    1. Re:distributed file systems by ranok · · Score: 1

      Hadoop is an open source DFS that is Java based, so it runs on Windows. It is pretty fault-tolerant so it might work in a workplace environment. We run it in a computer lab where the machines are constantly up and down, and it works pretty well. It also has MapReduce, which lets you distribute IO tasks. http://hadoop.apache.org/core/ Jacob

      --
      (>'.')>
  60. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0, Insightful

    The difference is that I wasn't nasty about it, I explained a problem and gave him a link to an essay about it. You, on the other hand, called Orwell and me names, attacked a straw-man, and said something incorrect about the words that is trivially debunked by glancing at a dictionary.

  61. Re:Not without heavy utilization of other resource by pionzypher · · Score: 1

    Linux can read FAT32 and NTFS partitions just fine. So yes, perhaps have a vm boot the image at night, mount the windows partition and backup the drive.. shutting down after. Or some custom app that just writes to the ext2 partition. As Bostonsoxfan alluded to, security might be an issue. Encrypting the partition the backups were stored on would probably be sufficient for most places.

    Of course the risk of backing up your data on the same physical drive remains. I suppose a VM booting, a secure copy to a peer as well as accepting a copy of the peer's backup would address that well enough. Now you'd just need a secure way of choosing the peer (unless you're going to hardcode all the pairs).

    --
    I'll believe in corporations having personhood when Texas executes one... - advocate_one
  62. Two products you should probably take a look at by NSIM · · Score: 1
    There are two companies out there that may be able to do what you need:

    http://www.seanodes.com/

    http://www.revstor.com/

    Both claim to be able to pool unused storage on desktops and application servers and make it available to hosts on the network.

  63. Re:vista? - DFS by MikeyTheK · · Score: 0

    Yeah, but what happens if the local user needs the space? Does DFS give priority to local storage and move the files? If it has to quickly that could be a pain since the throughput would be poor, right?

    --
    Friends help you move. Real friends help you move bodies.
    Never forget: 2 + 2 = 5 for extremely large values of 2.
  64. NBD + RAID + Truecrypt by Anonymous Coward · · Score: 0

    What I would do is set up a large file on each machine, and export it using nbd - I think they do a Windows version.

    Then, gather all these NBDs together at the server, using RAID to add massive redundancy to cope with users switching off their machines/crashing/whatever.

    Finally, apply strong cryptography (eg. Truecrypt or LUKS) to the RAID volume, so that all the data sent across the network and stored on the machines is unintelligible to anybody except you.

  65. Re:Not without heavy *use* of other resources by OS_Neutral · · Score: 0, Offtopic

    I find it hypocritical and mildly ironic that you use the hyphenate "non-Anglophone" in criticizing someone else for using unnecessarily complex speech that may not be easily understood by non-native speakers. And, for the record, the performance monitors on my Windows systems tell me the "percentage utilization" of a given resource, not the "percentage use".

  66. You could also help SETI by BrendaEM · · Score: 1

    If you have a little extra processor time, you could help SETI. I believe they have more data than they can search through. The client that loads SETI also can do a number of other projects, such as folding. The client can be throttled, and set to only run while the machines are not being used, akin to the time you might be running screensavers. http://setiathome.berkeley.edu/ With the extra space, you could always use Clonezilla to back up one machine on another.

    --
    https://www.youtube.com/c/BrendaEM
  67. Re:A project for Google? - whoops here it is by allenw · · Score: 1

    Google hasn't released anything other than papers on GFS and their implementation of MapReduce. At this point, though, I'm not sure it matters, since we have Hadoop, which (being mainly Java, C, and a little bash) runs perfectly fine on all of the major operating systems, including Windows.

  68. Re:Not without heavy *use* of other resources by DerekLyons · · Score: 1

    "Utilization" is a perfectly good word, and perfectly clear in usage and meaning to any educated person. I can't believe that on Slashdot a comment complaining a word was 'too big' would get modded up.

  69. Grid or Utility computing! by certain+death · · Score: 0

    www.3tera.com This makes use of hard drive space similar to the old Novell NSS volumes. It grabs unused sections from machines and turns it into addressable volumes...too bad it only runs on Linux :o)

    --
    "My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
  70. There is a solution to this by dirvine · · Score: 1

    Be Aware: I work in this company http://www.maidsafe.net/ which has spent a lot of time and money creating such a system for global use. It is getting close to beta testing now. It is basically a DHT with a self authentication mechanism and much more. Totally distributed network (although a commercial version is in the works). There are patents (11) to protect us (product and system patents, but please it's a whole other argument) and it's not yet open source. The reasons are complex but nevertheless well meant (however arguable). We have over 60 investors (mostly local people) and are pretty happy so far with development, but we do need to make some profits to pay investors back. I own most shares and a foundation is being set up to promote innovation and fund inventors to bring good products to the market for the common good. The system will be FREE and eventually open source when we get some traction, we need as many eyes as possible on the code :-). This is merely stage 1 and others will enhance this I hope to become the network of the future. There's too much to explain but a visit to the site may help. Public launch should be March / April.

    1. Re:There is a solution to this by OS_Neutral · · Score: 1

      Watched the video. I've got to say, that seems like a terrible, terrible idea. I'm sure you'll make billions.

    2. Re:There is a solution to this by dirvine · · Score: 1

      There is a lot to it, the video scratches the surface, main thing is an ID you own and control etc. If successful the money goes to innovators not me. Thanks for the feedback though, honestly.

    3. Re:There is a solution to this by OS_Neutral · · Score: 0

      It sounds like a platform for technical advancement, but not anything with a practical/marketable/legal application in the current vision.

      The main uses I can think of would appeal mainly to criminal/terrorist elements.

      Ultimately there are five major problems. The first is legal. I can assure you that I, for one, will not volunteer to share out any of my storage to host anonymous encrypted data. Whether or not the files are stored in whole, the law in most places would not take kindly to the pleas of people who willfully signed up to distribute content that is either illegal or copyrighted.

      Second, the target demographic is just too small. Most people aren't going to go to this kind of an extreme for security. Precisely because storage is so abundant and reasonably good encryption is so readily available, there is neither the motivation to try to "borrow" storage from a massive peer network nor the perception that additional security is needed.

      Third, the weakest component in any security system is attached to the keyboard. I'm in the IT dept of a multi-billion dollar company. I can't get systems administrators and developers to stop writing down passwords, using username=password, shouting passwords across open rooms, and checking 'password never expires'. These are the IT people. By and large, people don't care about security.

      Fourth, network infrastructure is still very fragile. The issues with undersea cables illustrate this. Yes, it's getting better, but reliability is not where it needs to be.

      Fifth, as an enterprise tool (vs. the massive internet peer network), I'm not going to use this, for all of the reasons stated in this thread. Trying to mine unused disk space across distributed systems is just not worth it.

  71. Re:Not without heavy *use* of other resources by dreamchaser · · Score: 5, Funny

    I think what you're saying is we need to leverage a new paradigm in order to take things to the next level. Am I right?

  72. Re:Microsoft DFS =! an easy answer by Thundermace · · Score: 1

    DFS is truly only distributed in the sense you are talking about on Windows 2003 R2; in Windows 2003/SP1/SP2, DFS only publishes a link to a single storage location and replicates the link instead of using the disks in aggregate. Instead, you could do a few things if you want to help your organization.

    1. You could examine the use of VMWare Server (GSX) or VMWare ESX to consolidate the number of physical boxes, essentially freeing up hardware that can then be re-used to provide Volume Shadow Copy services. Be aware that going the VM route will force you to plan carefully the amount of memory each server contains, and you should not exceed 50% of the total physical memory for any hosted machines.

    2. Volume Shadow Copy services will cover users making bonehead mistakes, and with a simple document you can train them to enable Volume Shadow Copy on their machines, giving them the ability to retrieve past revisions without having to dip into slow backups.

    3. You should talk over your concern with your management and discuss any plans so that you have their buy-in.

    4. DFS is a definite option if you have the ability to free up a ton of space that will be dedicated strictly to storage and needs to be replicated to other sites (i.e. network-installed applications, etc.).

    To all of you reading this and suggesting Linux solutions: I love the ideas; however, the reality is that not everyone has the freedom to introduce OSS into their environment. I tried and was successful for a short period of time; however, it was deemed to be a non-starter since all of the applications are designed for and run on WinBlows. AH to dream......

  73. The better question: by grumling · · Score: 1

    Why do desktops in a work environment need local hard drives anyway? My Windows folder (created Sunday Nov 10, 2002) is about 4GB. A 4GB SD card is about $30, and a lot of RAM would eliminate the need for a swap file. Basically the only thing that is a bottleneck is the \temp folder and there may be a way to do that with a ramdrive as well. My company requires all user storage to be on a network server, although not really enforced.

    The answer, of course, is that there are a lot of business applications that only install themselves on the C: drive and don't play nice without a \temp folder. The standard model PC is a motherboard, RAM, hard disk drive, graphics card and KB/mouse. Add to that Microsoft licensing agreements that discourage virtual machines and other lightweight desktops, remote offices with less than ideal network connections, and "power users" who have real/perceived needs for local storage, laptops, etc. and we can't seem to shake the hard disk.

    --
    "Well, good luck finding a judge that doesn't run a bestiality site."
  74. Large, reliable storage pool? by effzee25 · · Score: 1
    Take the drives out of the machines and make the machines diskless workstations. Move the drives into an arrangement of NAS / RAID storage to taste (depending on whether you want performance or redundancy). Partition to taste. Install copies of preferred OS to taste.

    Obviously, this is a good deal of work, so the decision to go forward really depends on how much value is placed on the size, speed, and manageability of 8TB of storage. With the cost of drives as it is these days, it would probably be more effective to buy 16 500GB or 8 1TB drives and achieve the same thing that way.

    1. Re:Large, reliable storage pool? by strul · · Score: 1

      My thoughts exactly.
      The obvious solution for a Un*x setup would be some kind of single-image NFS diskless setup, and the Windows one would be iSCSI or AoE.
      For more info on iSCSI/AoE booting on Windows, check out the Etherboot/gPXE project.

  75. Rainbow tables by drozofil · · Score: 1

    Build yourself a huge set of rainbow tables, and show to your users how weak their passwords are :)

  76. Re:Not without heavy *use* of other resources by jozmala · · Score: 1

    Actually after reading the exact meaning of the word utilization it fits the context perfectly.
    But what he might have meant by such a choice of words, could be restated as "Not without increase in utilization".
    As we all know the queuing theory by heart, everything else becomes redundant in his post.

    Please, bare my lousy English. English is my second language and its been nearly a decade from last time someone taught me it.

    --
    ©God :Copyright is exclusive right for creator to determine the use of his creation.
  77. Why is parent modded down? by BotnetZombie · · Score: 1

    I completely fail to see how the OP is flamebaiting. I'd rather mod it as interesting if I had any points.

  78. Do nothing. by ivan256 · · Score: 1

    "Waste" the space. It's not worth it. Once you start doing this, the increased load on cheap desktop drives is going to lead to a several percent per year failure rate increase. It's probably not worth your time. If you want to store a few terabytes of data at much higher performance than this, spend a few hundred bucks on two or three modern drives and a SATA multiplexer.

    Unless you like re-building machines with dead disks it's just not worth it.

  79. Re:Not without heavy *use* of other resources by zen-theorist · · Score: 1

    Hrmm... Funny, he didn't come across that way to me at all. You, however, come across as a pompous linguistic Nazi, much like Orwell. If you compose sentences for people who don't have command of the language, then you are really quite delusional. As is my understanding, resources are utilised, while tools are used. He was correct in its usage.
    this is outrageous, shouldnt that be "He was correct in its utilization"?
  80. Re:vista? - DFS by Penguin+Follower · · Score: 1

    AFAIK, DFS is for consolidating shares on servers, not clients. Which is what this article is asking.

  81. Re:Not without heavy *use* of other resources by cypherz · · Score: 1, Offtopic

    "By the way, what is a PHB?"

    It's a Dilbert reference. http://en.wikipedia.org/wiki/Dilbert

    "PHB" is the short form of "Pointy Haired Boss".

    --
    This sig kills fascists.
  82. JFGI by SEMW · · Score: 1, Funny

    Now, I could go and read through Microsoft's webpage about DFS, and spend a few minutes paraphrasing it into a post for your edification; or maybe you could, I don't know, go do it yourself...?

    --
    What's purple and commutes? An Abelian grape.
  83. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    Utilization is actually clearer for non-English-speakers

    Perhaps if you have an extensive vocabulary, but for most people this isn't the case. "Use" is a ubiquitous word you learn in the first few months of learning English and you need it in everyday conversation. There is no question anybody even remotely competent with the English language is very familiar with it. "Utilize", on the other hand, is hardly ever used and there's no value in using it when "use" would express the meaning of the sentence just as well.

    By the way, what is a PHB?

    "Pointy-Haired Boss", it's a term originally from Dilbert that refers to the type of manager that is completely clueless yet says and does stupid things to give the impression that he's smart and knows what he's talking about. If you have an alternative term that expresses that concept, I'd appreciate it. It used to be the case that the term was in wide use on Slashdot and everybody knew what it meant, but now I think perhaps it's fallen out of favour.

  84. Re:Not without heavy *use* of other resources by BrentH · · Score: 1

    And that's bad? I'm sorry we're not all white Anglo-Saxons.

  85. Re:vista? - DFS by airedalez · · Score: 1

    DFS distributes the files, meaning that it copies them to the servers that are in the link. This doesn't sound like it is going to be exactly what he is looking for unless he wants to be limited in every share.

    If you think of it this way:

    Server 1: 40 GB Space
    Server 2: 30 GB Space
    Server 3: 50 GB Space

    Well your DFS Root could combine this space theoretically into one root, but each share would technically only have the space that is available on that server and not one big pool. So each share would be able to look like this:

    Share 1: 40 GB Space
    Share 2: 30 GB Space
    Share 3: 50 GB Space

    Accessing it would be easy though:

    \\dfsroot\Share 1
    \\dfsroot\Share 2
    \\dfsroot\Share 3

    Also DFS is used a lot of the time for replication. So find your lowest common denominator and you can replicate all of that data across all of your servers!
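
    A rough way to see the difference between a namespace and a real pool, using the three example servers above. This is only a sketch of the arithmetic, not anything DFS itself exposes; the function names are made up for illustration.

```python
# Free space per server in GB, taken from the example above.
free_gb = {"Server 1": 40, "Server 2": 30, "Server 3": 50}

def largest_single_file_gb(servers):
    """Under a namespace a file lives on one share, so the biggest share is the limit."""
    return max(servers.values())

def pooled_capacity_gb(servers):
    """What a true pooled volume would offer (not what the namespace gives you)."""
    return sum(servers.values())

def fully_replicated_capacity_gb(servers):
    """If everything is replicated to every server, the smallest server is the limit."""
    return min(servers.values())

print(largest_single_file_gb(free_gb))
print(pooled_capacity_gb(free_gb))
print(fully_replicated_capacity_gb(free_gb))
```

    The last function is the "lowest common denominator" case: replicate everything everywhere and the 30 GB server caps the whole arrangement.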

  86. Re:Not without heavy *use* of other resources by tony1343 · · Score: 1

    I was always under the impression that use and utilize were synonymous. Maybe you are right that utilize is unnecessary; maybe the Orwell article explains, but I'm not going to read it. For one, that's the great thing about English (and I guess languages in general), you have choice. I'll use whichever one I damn please. Also, language evolves so deal with it. Finally, you want us to dumb down our speech for non-native speakers? That's absurd. I understand acquiring and maintaining multiple languages is difficult. I'm pretty much a failure in that respect, but this isn't a language learning site. People should be free to write however they want. I wouldn't care if people write in non-English (not sure if that is allowed by the rules or not, but it should be).

  87. Re:Not without heavy *use* of other resources by smallpaul · · Score: 1

    Utilization is a well-defined technical term. http://en.wikipedia.org/wiki/Utilization

  88. Highly Impractical by OS_Neutral · · Score: 1

    As someone who designs storage systems for a fair-sized business, I consider this an impractical use of resources for a number of reasons.

    First, as has been pointed out, you are competing for other resources. Storage on clients is highly variable. Just because somebody has 20GB of free space today doesn't mean he's not going to go out and download a couple of DVDs' worth of data. Your system would have to take this possibility into account and be prepared to cope with a space issue on the disk in real time. This would usually mean self-deleting when free space is too low, since you can't count on there being enough time to move the data over the network before the disk runs out of space, at which point your clever scheme has impacted the productivity of someone else.

    This is a less likely scenario in a server farm, but still a consideration. Further, in a server environment, you've seriously got to consider the function of each device before you go farming out spare resources. If you've got a web server with highly static data and overall low utilization, then this would actually be a pretty good candidate to participate in a distributed file system. A DB server, however, not so much. Some applications also want a certain amount of free space on their volumes to ensure optimal efficiency. You certainly don't want to do this on any system where disk I/O is a major factor in the performance of an application.

    What this really comes down to is a simple cost/benefit analysis. Before you decide if this is a good idea or not, you need to establish that there's actually a value to the business. Answering these questions will help:

    • Is there a business need to provide additional storage at this time? What is it?
    • How much storage do you need?
    • What is your current overall efficiency on storage today?
    • Given the technologies available to create a distributed file system, what is a practical amount of usable space you can gain by doing this? Does this amount meet the stated need?
    • What is the approximate cost in time and cash spent to implement?
    • What is the management overhead of such a system in hours over a period of time?
    • What are the risks of doing this? Do the risks outweigh the benefits?
    • What is the cost (upfront and ongoing in cash and labor) of alternative solutions that meet the need (like a new file server)?

    Once you answer those questions, you'll have a pretty good idea of whether or not this is something you should even be considering.

    As a trend, the problem you describe is exactly why people are so enamored with virtualization. In a server environment, virtualization makes absolute sense. The overall efficiency of virtual servers is an order of magnitude (or more) higher than most physical server farms. Desktops are a different picture. The same principles apply, and vendors are pushing virtual desktops. The problem is that since the advent of the PC, end users have never preferred thin clients. With the majority of PC purchases now being laptops, it's even more so than it was in the past. Users want to have their data close, and to be able to take it with them.
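
    To make the "self-deleting when free space is too low" point concrete, here is a minimal sketch of the check such an agent might run on each client. The 20 percent reserve, the function names, and the idea of a per-machine pooled byte count are all assumptions for illustration; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil

def bytes_to_evict(total, free, pooled_bytes, min_free_percent=20):
    """Return how many pooled bytes to drop so the user keeps the reserved free space.

    total, free  -- size and free space of the volume, in bytes
    pooled_bytes -- bytes of pooled (non-user) data currently held on this volume
    """
    reserve = total * min_free_percent // 100
    shortfall = reserve - free
    if shortfall <= 0:
        return 0  # the user still has enough room; leave the pooled data alone
    # We can never free more than the pooled data we actually hold.
    return min(shortfall, pooled_bytes)

def check_volume(path, pooled_bytes):
    """Apply the policy to a real volume using the OS-reported usage."""
    usage = shutil.disk_usage(path)
    return bytes_to_evict(usage.total, usage.free, pooled_bytes)

# With no pooled data on the volume there is never anything to evict.
print(check_volume(".", pooled_bytes=0))
```

    Note that eviction here is purely local and immediate, which is exactly why the pool needs copies elsewhere before any single client can be trusted with data.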

  89. Re:Not without heavy *use* of other resources by LS · · Score: 1

    Of course he's hypocritical - the guy's a skilled troll and everyone is biting. Please ignore him.

    --
    There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
  90. There's "full" and then there's "full" by karlandtanya · · Score: 1
    There's "full" as in "If you put any more crap in this box, I can't get to the crap that's already in there."
    And then there's "full" as in "Hey, I can cram more crap into that box!"

    I need some of that space to defrag the HDDs on my windows box.
    Now, if only there was some filesystem whose performance didn't degrade over time due to fragmentation...

    --
    "Reality is that which, when you stop believing in it, it doesn't go away." - Philip K. Dick
    1. Re:There's "full" and then there's "full" by argent · · Score: 1

      Now, if only there was some filesystem whose performance didn't degrade over time due to fragmentation...

      You mean, just about all of them?

  91. Re:Not without heavy *use* of other resources by fbjon · · Score: 1

    No, language is a tool.

    --
    True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
  92. Solution in search of a problem... by bmcent1 · · Score: 1
    So... you've got unused space. And you're looking for a reason to fill it up?

    People (especially PHBs) think a resource that is not 100% utilized is going to waste.

    However, consider which is more efficient: extra unused disk space, that is spare capacity with no strings attached ... or, an elaborate pooling and sharing mechanism with much more management overhead and electricity use?

    It's so true, disks are incredibly cheap these days. The unused space costs nothing (except if you could have gotten away with an even cheaper, smaller disk.)

    Look at the amount of redundancy and resources that would be required if you were going to try to pool those PCs' storage together into some sort of hive:

    • How much redundancy is needed so that random computers can be powered down, or crashed, and the storage pool doesn't go offline? Even if most are left on at night, would you need a redundancy factor of 3 or more?
    • Electricity use goes up / the bill goes up, if you have to require most PCs to stay on to utilize this "free/spare" capacity
    • Do things like Service Pack updates or other patches that get pushed from a central server become much more complicated now because you cannot reboot more than a couple computers at once without the hive store going offline?
    If you need a large pool of storage and a way to manage it centrally, use SAN or even NAS and consider VMs.
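
    The redundancy question above can be roughed out with a few lines. This is a back-of-the-envelope sketch that assumes each PC is up independently with the same probability, which real offices will not satisfy (everybody shuts down at 6pm together); the 70 percent uptime and 99 percent target are made-up inputs.

```python
def pool_availability(p_online, replicas):
    """Probability that at least one replica of a block sits on a machine that is up."""
    return 1.0 - (1.0 - p_online) ** replicas

def replicas_needed(p_online, target):
    """Smallest replica count whose availability meets the target."""
    if not (0.0 < p_online <= 1.0) or not (0.0 <= target < 1.0):
        raise ValueError("need 0 < p_online <= 1 and 0 <= target < 1")
    replicas = 1
    while pool_availability(p_online, replicas) < target:
        replicas += 1
    return replicas

# Hypothetical inputs: each PC is on 70% of the time, and we want 99% availability.
needed = replicas_needed(0.7, 0.99)
print(needed)
```

    Whatever factor comes out, usable capacity is the raw spare space divided by it, so the "free" terabytes shrink quickly.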

    OTOH, if you want to keep it simple, bask in the cheapness of disks and high powered PCs these days and don't fret about "unused" capacity.

    --

    "Hey Albert, Good luck exploring the infinite abyss."

  93. Other vendor solutions by Anonymous Coward · · Score: 0

    I have not worked with either product, but an eWeek article about new storage technologies mentioned Seanodes' Exanodes and RevStor's SANware.

  94. Re:Not without heavy *use* of other resources by 19thNervousBreakdown · · Score: 1

    The double irony of making the central joke of your post the tired old "Orwell spinning in his grave" is delicious.

    --
    <xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
  95. The fundamental flaw of this is..... by 3seas · · Score: 1

    Users trying to be productive at their workstation really don't need additional slowdowns happening because someone elsewhere is accessing their hard-drive.

    Not so long ago MS did an upgrade that brought my system to a halt in my usability of the system as it was using all additional drive space to cache my system, on my system.

    When it stopped, even the cache didn't show this usage, but I was able to determine it was all in the cache and had to learn about clearing the cache, not by delete but by ctrl-del to reinitialize the cache, and of course rebooting.

    I was not as unfortunate as many others in the amount of drive space this took up, as I only have around 34 gigs total on that partition. Others, I have learned, have lost over 100 gigs.

    The point is that on windows, disk access can slow a system down quite a bit.

    If you really need additional drive space to figure out what to productively do with it, terabyte drives are rather inexpensive today. But trying to make use of available drive space on user systems on a network is very anti-productive, not to mention how it will slow the network down too.

    1. Re:The fundamental flaw of this is..... by BeanThere · · Score: 1

      Honestly, any modern computer *should* be able to handle basic office apps and have some hard disk / network accessing in the background now and then and barely feel it. Heck, a ten-year old computer should. Maybe I'm old-fashioned, but it seems like nobody knows how to write software anymore that isn't insanely bloated.

      Somebody else posed the question of why all these computers even need hard disks at all. I think that's an excellent point; I don't see any reason why most office computers shouldn't just be thin clients or at least running apps straight off a single server. I've been mulling over trying this out in our office, e.g. letting a new admin person use a Linux desktop running off the Linux server with OpenOffice etc. It just feels to me more like 'how the world should be', not this crazy idea of every single desktop requiring 2GB RAM, beefy graphics hardware, big hard disks etc. for e.g. bloated crap Vista + Office 2007, the inefficiency of it all if you look at the scale of this, globally, I'm sure makes baby Jesus cry. The majority of office desktops shouldn't *need* to be anything more than something like the Asus EEE these days - cheap little disposable pieces of crap, with localised Linux servers doing the work and storing the real documents. This has many other advantages, you don't need to do all this admin work replicating these software setups, updating them all, everyone's documents are on the server in neat little home folders so they're easy to backup, harder to lose via viruses etc. or 'my hard disk crashed' or other nonsense, harder for people to 'lose' their documents somewhere on their local system with a trillion silly folders on it to host the bloated OS etc.

    2. Re:The fundamental flaw of this is..... by BeanThere · · Score: 1

      ... other advantages, if a client fails, you just replace it and in literally five minutes the user can be up and running exactly as before with their entire desktop exactly as it was - no hours or days of downtime as a new system has to be set up. Furthermore they can also move from any machine to another in the office, log into their profile and voila there's their desktop. Throw VPN into that, and mobile users can access their desktops anywhere.

      Linux/Unix have so much *potential* to blow Windows out the water as an office desktop system that it's sad that its powerful capabilities are not being marketed better.

      What's more, none of these features are new, all this capability is more than ten years old. Yet MS slowly continues to implement their own bloated crappy "equivalent" functionality, markets the hell out of it as something amazing and mind-blowingly new and innovative and visionary (roaming blah blah access your e-mail anywhere this-and-that), and everyone thinks MS invented something fantastic. Unix has little hope, it's incredibly powerful, years ahead of its time almost, but hardly anyone knows it exists.

  96. Re:Not without heavy *use* of other resources by smallfries · · Score: 1

    Just because it has become a well-defined technical term does not stop it being a bastardisation of the English language. I actually read the essay that the AC linked to to find that Orwell makes some good points. In particular the term utilization is one of a number of phrases used to dress up a meaning and attempt to make it sound more impressive and scientific. Just because the bad authors who do this have succeeded in making the term utilization standard does not change the AC's point in any way.

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  97. File Server Virtualization by forq · · Score: 1

    Take a look at F5 Acopia. Create shares on all your servers with extra space, ensure you set the permissions and quota correctly, so as not to affect the core function of the server, and then use Acopia's product to merge all the servers into a single name space and present it to clients. Much like DFS, but it will also do NFS as well as CIFS, and has a very flexible policy engine to allow you to live migrate files and load balance between many servers. You'll be able to claw back all the unused space and any standard NAS client can make use of it for whatever application needs it, with no special client software required.

  98. Re:Not without heavy *use* of other resources by smallfries · · Score: 1

    You have completely misunderstood the AC's point. Nothing that you have written adds to this discussion in any way. I can believe that your comment would get posted on slashdot.

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  99. Re:vista? - DFS by felipekk · · Score: 5, Informative

    Running DFS (to serve files) on Windows XP clients? What are you smoking?

    From Microsoft TechNet:

    The servers that will participate in DFS Replication must run Windows Server 2003 R2.

    It is possible to use DFS Namespaces when domain controllers and namespace servers run a mix of Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 without SP1, and Windows 2000 Server, but some functionality is disabled or available inconsistently, depending on the operating systems on the servers.

    From: http://technet2.microsoft.com/WindowsServer/en/library/1aa249c0-40f3-4974-b67f-e650b602415e1033.mspx?mfr=true

  100. Re:vista? - DFS by dmsuperman · · Score: 1

    Once you've got that set up, use a tool like this: http://projectdistributor.net/Releases/Release.aspx?releaseId=404 And fill it with random dummy data ;)

    --
    :(){ :|:& };: Go!
  101. Re:Not without heavy *use* of other resources by Matthias+Wiesmann · · Score: 1

    Actually, for a french speaker, "utilisation" is very easy to understand, as it is a french-word...

  102. Freenet plus samba? by Cheerio+Boy · · Score: 1

    Couldn't you use something like a localized version of freenet+samba to do this?

    It would allow the local drive connections to not necessarily see what anybody else is storing on the nodes AND it can be locally throttled to keep from interfering with local apps.

    (I'm sure this post will now garner a bunch of "Only if he wanted to store warez and kiddie porn on it!" replies.)

    --

    "Bah!" - Dogbert
  103. I'm not convinced by 32771 · · Score: 1

    You might want to switch the work place pcs off occasionally and then you don't have any access to the data. If you want to have people work over the weekend you must switch all the machines on. This is a more energy inefficient solution than having a large file server and Flash drives in your work PC for instance.

    --
    Je me souviens.
  104. It's not unused by niceone · · Score: 1

    It's just storing random data.

  105. Re:you need a better gun by Anonymous Coward · · Score: 0, Funny

    gun people are creepy... technophile gun people are super creepy.

  106. Feasible , but economical ? by oh2 · · Score: 1

    In order for all of the interesting ideas in here to work, the computers need to be on. Now that's fine as long as you only need access to these distributed storage bits during office hours. For 24/7 availability and reasonable performance, all the networked computers would need to be powered and online always. The power use would more than double, from 1.5 kWh for 10 hours (assuming a power consumption of 150 W per computer) to 3.6 kWh per day. Assuming a power cost of 15 cents US per kWh, that's an extra 31.5 cents a day per computer, or roughly $9.50 per month. Probably cheaper to lease an additional storage server or an offsite backup in the long run, not to mention that the extra power is mostly wasted and unnecessarily contributes to pollution...
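
    For anyone who wants to plug in their own numbers, the arithmetic is easy to script. The 150 W draw and 15 cents per kWh come from the assumptions above; the 30-day month and the function names are just choices made for this sketch.

```python
WATTS_PER_PC = 150     # assumed draw per computer
PRICE_PER_KWH = 0.15   # assumed electricity price in USD

def daily_kwh(hours_on, watts=WATTS_PER_PC):
    """Energy used per day by one PC that is on for hours_on hours."""
    return watts * hours_on / 1000

def daily_cost(hours_on, price=PRICE_PER_KWH):
    """Electricity cost per day for one PC, in USD."""
    return daily_kwh(hours_on) * price

office_hours = daily_cost(10)   # on during the working day only
always_on = daily_cost(24)      # left on around the clock
extra_per_month = (always_on - office_hours) * 30  # 30-day month assumed

print(f"{daily_kwh(10):.1f} kWh vs {daily_kwh(24):.1f} kWh per day")
print(f"extra per PC per month: ${extra_per_month:.2f}")
```

    Multiply the monthly figure by the number of desktops and compare it with the price of a file server before deciding anything.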

    --

    Now the world has gone to bed, Darkness won't engulf my head, I can see by infra-red, How I hate the night.

  107. I have a similar problem by brusk · · Score: 0, Troll

    I run a large network with thousands of computers, and I recently noticed that there are thousands of keyboards sitting there unused most of the time. And even when they are being used, typically only one or two keys are being depressed, leaving over 100 keys unpressed. Can anyone smart think of a way to put those hundreds of thousands of underutilized keys to work? It just seems like such a waste.

    --
    .sig withheld by request
  108. hdfs by phatsphere · · Score: 1

    hadoop's hdfs is the only thing which comes to my mind, but i don't know if it would be of any use at all.
    http://wiki.apache.org/hadoop/DFS

    another project is wuala ( http://wua.la/ ), but that's not for internal use...

  109. It's been done by Microsoft: DFS NameSpaces by Anonymous Coward · · Score: 1, Informative

    WoW... it appears you Penguins are just 'reinventing the Microsoft Wheel' (same w/ ZFS fans really) - Microsoft's already been there, & DONE that, & it works.

    Imagine SQLServer 2005 blazing away on a Distributed Namespace, spreading its db-devices across 100's/1000's (whatever) of systems, using their idle time for it, & diskdrive read-write heads + RAM & CPU, etc. et al + using a high-speed interconnect, & maybe toss in a few dozen Solid State Drives (placing critical devices onto them, for the clients that use those tables/files/devices the most, you place them locally onto THEIR machine node, etc.), well...

    YOU GET THE PICTURE!

    So... Hey Penguins, new NEWS:

    "It's been DONE (& works + is called DFS NameSpaces)"

    By Microsoft, already.

    1. Re:It's been done by Microsoft: DFS NameSpaces by Spokehedz · · Score: 4, Funny

      Even a blind squirrel will find a nut every once in a while.

    2. Re:It's been done by Microsoft: DFS NameSpaces by crazed+gremlin · · Score: 1

      A broken clock is right twice a day.

    3. Re:It's been done by Microsoft: DFS NameSpaces by cyber-dragon.net · · Score: 0

      Having seen examples of other Microsoft software that "works" I am skeptical to say the least.

      Have you actually implemented this? Seen the performance? Personally use it for mission critical services?

      If not... do so before you post obvious flamebait like this.

    4. Re:It's been done by Microsoft: DFS NameSpaces by RobFlynn · · Score: 1, Funny

      A rolling stone gathers two birds in a bush.

      --

      ---
      Rob Flynn
      Pidgin
    5. Re:It's been done by Microsoft: DFS NameSpaces by darrenkw · · Score: 3, Insightful

      Been there also and I disagree on the "just works" part. We're using it successfully but we've run into issues with losing files from some of the computers. Let's say that the admin changes permissions on somebody's directory so that they can write to it also. DFS will think that the file with the changed permissions is the newer one and blow the other one away. I hesitate to call that "just working".
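
      A toy model of that failure, to show why it happens. This is not Microsoft's actual replication algorithm, just a naive "newest copy wins" rule with a hypothetical change stamp that moves on any change, including a permissions edit; all the names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    server: str
    content: str
    changed_at: int  # hypothetical stamp, bumped by ANY change, ACL edits included

def resolve(a, b):
    """Naive last-writer-wins: keep whichever copy carries the later stamp."""
    return a if a.changed_at >= b.changed_at else b

stale = Replica("server-a", "old draft", changed_at=100)
fresh = Replica("server-b", "finished report", changed_at=200)

# The admin edits permissions on the stale copy: content untouched, stamp moves.
stale.changed_at = 300

winner = resolve(stale, fresh)
print(winner.content)
```

      The stale copy wins because the rule only compares stamps and never asks whether the content actually changed.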

    6. Re:It's been done by Microsoft: DFS NameSpaces by tzot · · Score: 1

      WoW... it appears you Penguins are just 'reinventing the Microsoft Wheel' (same w/ ZFS fans really) - Microsoft's already been there, & DONE that, & it works. [DFS, that is]
      A great post. Now, help me to understand its relevancy in relation to TFA:

      In a network, there are 100 desktop computers, each with 30 GiB spare disk space. Using DFS, how can I:

      • create a device / network share / thingie with 3 TiB free space
      • store a 1 TiB single file on it.
      Thanks in advance, Anonymous Coward. Your posts has been most useful so far. Be a good chap and keep up the good work.
      --
      I speak England very best
    7. Re:It's been done by Microsoft: DFS NameSpaces by Anonymous Coward · · Score: 0

      I've seen Ms' stuff work more completely, and in industrial environs for decades (inclusive of NASDAQ in fact, where SQLServer 2005 & Windows Server 2003 do the job, non-stop, 24x7 disseminating trade information & acting as the OFFICIAL RECORD of the trade data no less) than the 1/2 baked stuff that goes out as "freeware" (FOSS) for Linux, and for many purposes...

      How's that? 1/2 of that stuff for Linux, that is supposed to do the job that commercial software products do for the same data, let's use Foxit Adobe .pdf tool - it does 90% of what it has to, perfectly. The other 10%? Well... you know! BUGS!!!

    8. Re:It's been done by Microsoft: DFS NameSpaces by Anonymous Coward · · Score: 0

      Then, this admin. you mention (who makes those changes to file rights/acl's that are w/in a DFS namespace) ought to NOT make them... especially IF he is aware of this occurring... right?

      E.G.-> If you know a fire burns you, why stick your hand into it??

      I.E.-> Well, that said & aside (pure common sense) - this admin of yours ought to NOT be making changes to file level ACL's that are part of the DFS Namespaces then... (OR, next time, he ought to plan ahead better/more accordingly, prior to making changes).

      (In other words: He ought to architect better initially, planning for various users' ACL's & planning better ahead of time is all, to account for possible changes being made to files, & to assign R/W rights to said users when needed, ahead of time... he must have been a fool to assume data & documents remain static & unchanging imo!)

      Pretty simple.

    9. Re:It's been done by Microsoft: DFS NameSpaces by Anonymous Coward · · Score: 0

      I know you're trolling and I should just let it slide, but...

      Foxit is not open source, artard, and it does not even run on Linux. There is no way you can pin that abortion on the OSS community.

      Oh, and the fact that NASDAQ uses Windows is a terrible argument against Linux. NASDAQ's big brother, NYSE, uses Linux and Oracle.

      In other words, fuck off you ill-informed shrill.

    10. Re:It's been done by Microsoft: DFS NameSpaces by peccary · · Score: 1

      Hahahaha. Sorry, Microsoft DFS was a quick and dirty rip-off of the highlights of DCE DFS so they could claim "yeah, we do replication". To the back of the class with you.

    11. Re:It's been done by Microsoft: DFS NameSpaces by solid_liq · · Score: 1

      OH I see, you think this is something that microsoft can do, but Linux can't? Well, don't claim to know anything about Linux then.

      You can do this in Linux very easily, in multiple ways.

      a) You can partition the harddrives on the desktops to have a partition available for this (best way), which you then share over either ATA over Ethernet (ATAoE) or iSCSI, then put them into an LVM volume group.

      b) You can share portions of the harddrives over NFS, create a large file on the NFS share, create your filesystem inside the file (so that it'll appear as a block device), and then add that to an LVM volume group.

      c) Same as b, but with samba

      d) same as all three above, but with EVMS instead of LVM.

      Unlike with windows, however, no expensive server version of Linux is required to do this under Linux. You don't need a newer distro of Linux for this either. You can use one from the 90's to do this if you wish.

      Eat that, fanboy!

  110. Encryption key? by Eternal+Annoyance · · Score: 1

    Sure, it would take a while to encrypt some data, but that's a side matter :)

  111. freenet? by Punto · · Score: 1

    It's all encrypted, and since it's a LAN, it'll be fast. It also has the redundancy (and your end-users won't even know what's on their computer)

    Plus, it'll be a great example of a "legitimate use" for it (unless you're just looking for a place to store your child porn). Kinda overkill tho, I'm sure there are other distributed filesystems that make better use of the resources.

    --

    --
    Stay tuned for some shock and awe coming right up after this messages!

  112. Re:Not without heavy utilization of other resource by Z80xxc! · · Score: 1

    Your plan still has the problem that the computers would have to be online. If it's not ON, then you can't do all this stuff. I guess you could use some sort of Wake-on-LAN system, and wake up all the PC's in the place at once, however, there still remains the problem of workstations that are out of service - say a computer has to go in for repairs, or its hard drive fails or something like that, then that data wouldn't be available. With regular servers, it's not that big of a deal, since it's easy to backup and monitor servers; when you start using regular workstations for data storage, you run into problems.
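
    A minimal sketch of the Wake-on-LAN idea, assuming the workstations have WoL enabled in firmware and sit on the sender's broadcast domain. The magic-packet layout (six 0xFF bytes followed by the target MAC repeated 16 times) is the standard one; the MAC address, broadcast address and UDP port 9 used here are placeholder assumptions, not details from this thread.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

    Calling wake("00:11:22:33:44:55") for each machine (a made-up MAC) would power up the pool before a job runs, but it does nothing for boxes that are unplugged or out for repair, so the availability problem remains.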

  113. NSFW by debest · · Score: 1

    The video itself is fine, but the advertising is not!

    --
    Look at the tomato! Isn't it sad? He can't dance! Poor tomato!
    1. Re:NSFW by cammoblammo · · Score: 1

      Wow, I just saw your sig.

      A Veggie Tales reference on /.? I come here to escape, dammit!

      --

      Cogito, ergo sig.

    2. Re:NSFW by debest · · Score: 1

      A Veggie Tales reference on /.? I come here to escape, dammit!

      LOL. I cycle through bizarre-sounding Veggie sigs occasionally. I once got a response like yours only about 4 hours after changing it. Guess it's now time again!
      --
      Look at the tomato! Isn't it sad? He can't dance! Poor tomato!
  114. Re:Not without heavy *use* of other resources by maxwell+demon · · Score: 1

    Indeed, he should have used "Orwell being in the state of having a large angular momentum in his ultimate disposal place" instead.

    --
    The Tao of math: The numbers you can count are not the real numbers.
  115. Re:Not without heavy *use* of other resources by maxwell+demon · · Score: 1

    I can believe that your comment would get posted on slashdot.

    Not believing it would indeed be hard, given that it can be easily seen (and you obviously did see it, or you wouldn't have replied to it).
    --
    The Tao of math: The numbers you can count are not the real numbers.
  116. Re:vista? - DFS by porttikivi · · Score: 1

    AFAIK DFS will not suit this. But MS is active on advanced distributed file systems like Farsite http://research.microsoft.com/Farsite/faq.aspx Unfortunately it does not seem to be publicly available.

    --
    Anssi Porttikivi / app@iki.fi
  117. Cleversafe sounds like a possible solution by ApostleJohn · · Score: 1

    http://www.cleversafe.org/ It's open source, dispersed storage, encrypted, redundant... seems like it's worth giving a try. I haven't used it personally but it has been around a while. The more machines using it, the better a solution it is from what I can tell. The Windows support may be the big question... but the project seems worth keeping an eye on.

  118. Re:vista? - DFS by tfiedler · · Score: 1

    DFS doesn't actually allow you to pool your disparate storage. It acts as a generic namespace that allows you to have multiple replicas of the same data, and keep your users from actually knowing where the stuff is kept.

    --
    Democrats and Republicans are like AIDS and Cancer, I want neither!
  119. Use Chirp by Anonymous Coward · · Score: 0

    Chirp, a distributed file system which gives you unix-esque access to files and doesn't require administrative privileges to set up, would be pretty much perfect for pooling your free hard drive space.

  120. Re:you need a better gun by ScrewMaster · · Score: 1

    They get significantly less creepy once you've had somebody point a gun at you and pull the trigger. If you survive the experience and you've half a brain, you'll start learning a few things about guns. Those creepy technophiles are the best place to start.

    --
    The higher the technology, the sharper that two-edged sword.
  121. Re:you need a better gun by maxwell+demon · · Score: 1

    How exactly is an in-depth knowledge about the inner workings of a gun going to help me if someone points one at me? If I myself have a gun, the only necessary gun knowledge is how to use it. Otherwise, I don't see how gun knowledge will help me. Or did you maybe think about talking with him about his gun, a la "Oh, nice gun you have, did you know that ...", hoping that he finds that conversation interesting and therefore decides not to shoot you?

    --
    The Tao of math: The numbers you can count are not the real numbers.
  122. Re:Not without heavy *use* of other resources by youthoftoday · · Score: 1

    At this time I am able to answer your query in the affirmative.

    --
    -1 not first post
  123. Re:Not without heavy *use* of other resources by smallfries · · Score: 1

    Am I supposed to take your word for that? :)

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  124. Raises the real question by fuliginous · · Score: 1

    Doesn't this show that a potentially better organisation is to have diskless machines with remote storage, which allows a massively redundant storage arrangement?

  125. An answer by Zey · · Score: 1

    I do wish people would use Google or Wikipedia for such simple questions...

    See the list of filesystems listed under "Distributed fault tolerant file systems", "Distributed parallel file systems" and "Distributed parallel fault tolerant file systems" here.

  126. Re:vista? - DFS by hitmark · · Score: 1

    ah, the artificial barriers of windows strike again...

    --
    comment first, facts later. http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
  127. Mod parent up by Anonymous Coward · · Score: 0

    Oh come on, that was a joke people.

  128. Re:Not without heavy *use* of other resources by tubapro12 · · Score: 1
  129. Sesquipedalian verbalization by CustomDesigned · · Score: 3, Informative

    Since the Romans invaded Britain, English speakers have used latinate phrasing to appear scholarly. Anglo Saxon words were short and pithy, like "home", "pig", "horse", "cat". But scholars learn latin, so it's "domicile", "porcine", "equine", "feline". In modern English, the choice gives you a palette of moods - like colors on a web page.

    1. Re:Sesquipedalian verbalization by Anonymous Coward · · Score: 0

      Nothing to do with the Romans - prior to about the 10th century there are almost no latinate words in English. They come from French, the language used by the English Royal Family from the Normans until about Henry VIII's reign. French was the language of the court, and it trickled down into regular speech. That's why "Desire" and "Require" are slightly more upper-class than "Need" or "Want".

      The use of Latin for scientific terms would, I expect, be more a result of the fact most of the scholars were monks, and that it was the "lingua franca" (sorry) understood in monasteries across Europe.

    2. Re:Sesquipedalian verbalization by gregconquest · · Score: 1

      I've never before heard this. Latin/old English pairs (people/folk, pork/pig, etc.) did not come from the Roman invasion of England. It came from the Norman Conquest of England in 1066:

      http://community.middlebury.edu/~harris/EngLatGrammar.html
      The Norman Invasion of the 11th century made French the court and official language of England for several centuries, during which time English as a language came under the influence of French, not only in terms of words and manners of speech, but also in the way French grammar functions.

      French comes from Latin.

      Greg

    3. Re:Sesquipedalian verbalization by dave420 · · Score: 1

      I thought "cat" came from the Latin "cattus"...

  130. AFS? Coda? by SanityInAnarchy · · Score: 1

    Hardly a Microsoft invention. Of course, one of the two I mentioned might actually run on XP -- or you could switch all the client OSes to virtual machines and run Coda on Linux.

    --
    Don't thank God, thank a doctor!
  131. Re:Not without heavy *use* of other resources by NMZNMZNMZ · · Score: 2, Funny

    today's "businessspeak" (mindless repetition of words and phrases that have long since been driven into the ground by thoughtless, banal, stupid repetition)

    Kids! That word, meaning "trite" or "unoriginal", is pronounced "ba-NAHL". If you say it the wrong way like I did in an interview, it sounds naughty and you sound stupid.

  132. Re:you need a better gun by ScrewMaster · · Score: 2, Insightful

    Detailed knowledge of any technological artifact will make you better at using it, maintaining it, knowing when to use it, whether it's an automobile or an AK-47. Yes, some people find guns interesting to a greater degree than others (I don't, personally, nor do I own one) but whatever floats one's boat. Let me ask: do you find someone that has an advanced knowledge of computers creepy? Probably not, if you're on Slashdot ... but there are many that do, until they need him.

    When the time comes that I need a brain to pick, it's those "creepy" nerd types that I seek out. They're the ones most likely to be able to help. Maybe you're anti-gun, and the fact that some people are not is offensive to you, I don't know. Regardless, you should look at people who know much more than you about a given subject as a potentially valuable resource, not an object of scorn.

    --
    The higher the technology, the sharper that two-edged sword.
  133. Re:Not without heavy *use* of other resources by raftpeople · · Score: 1

    Utilization is a standard term in computer and technical environments. It means something specific to most of us on Slashdot.

    Possibly you mistook this forum for a different one?

  134. Freenet by nurb432 · · Score: 1

    just install freenet and make your entire organization trusted with itself. Expand the local store by several GB.

    --
    ---- Booth was a patriot ----
  135. No Problem by PPH · · Score: 1

    Just post the addresses to your servers here, leave a few ports open and some industrious person is certain to put your system to good use.

    --
    Have gnu, will travel.
    1. Re:No Problem by kernspaltung · · Score: 1

      192.168.1.100, .101, .102, .103... And port 546 is open on 'em all. (+1 You Rock if you get why that's funny.)

      Cheers!

  136. File Hosting by rdradar · · Score: 1

    If you have lots of bandwidth as well, set up a file hosting site similar to TubeShare. At least you shouldn't run out of space :)

  137. Re:Not without heavy *use* of other resources by BeanThere · · Score: 1

    The best (and most entertaining) essay/book I've read on this topic is Less than Words Can Say ... it makes a compelling argument for clear, direct and precise language usage. I wholly recommend it (it's free online). Most importantly, as some of the responders to your post have failed to realise, there is a very big difference between "dumbing down" your language use, and making it clearer. Frivolous excess 'business' or bureaucratic verbiage usually *is* actually dumbing down the language in a different way as it makes meaning more opaque, even while giving the superficial appearance of intelligence and insight. Learning to recognize the difference is so critical our future actually depends on it.

  138. Re:Not without heavy *use* of other resources by asdfghjklqwertyuiop · · Score: 1

    As is my understanding, resources are utilised, while tools are used.


    Are tools not also resources?

  139. Re:Not without heavy *use* of other resources by g0at · · Score: 1

    The problem is that people often use big words without knowing what they really mean. In this case, the GP was correct in his use of "utilization", but the AC reply made the erroneous conclusion that the use was spurious (there's another great word). I agree with the spirit of the AC's reply (if not the particulars of this instance).

    -ben

  140. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    Indeed, that's why correct language utilization is important :)

  141. Backup? by simon741 · · Score: 1

    Well, actually, this may be useful as backup space. I'm working in a small company with a small server. Right now, we're regularly doing backups on tapes. But actually it would be quite easy to mirror the few GB we have on this server on one or several desktop computers. Wouldn't it? This should be possible with rsync. Is there any similar tool for Windows? Simon

  142. Re:Not without heavy *use* of other resources by dreamchaser · · Score: 1

    Excellent. Have your people call my people and we'll discuss the potential ramifications of these developments over sushi.

  143. Re:you need a better gun by Anonymous Coward · · Score: 1, Interesting

    Yet the "necessary gun knowledge" in using it involves many things which do require you to understand the rudimentary "inner workings" of a gun.

    What happens if a gun jams? You will need to know how to clean it. What happens if you know the firing pin struck the cap but the round hasn't gone off? Open the chamber to clear the round and it might explode in your face. Sure, these things probably only happen 0.5% of the time but just *one* occasion of bad luck is enough to snuff your life out.

    You really do not have the time to skip and pick up knowledge like this only when a problem comes up. It takes more than just pulling the trigger to really use a gun.

  144. #include great.success.h by newr00tic · · Score: 1

    Now, I could go and read through Microsoft's webpage about DFS, and spend a few minutes paraphrasing it into a post for your edification; or maybe you could, I don't know, go do it yourself...? He could! - - Instant profit stems from your wisdom(!)
    --
    A horse can't be sick, you know, even if he wants to.
  145. Error correcting codes are the answer by DamnStupidElf · · Score: 1

    Reed Solomon error correcting codes are a good example.

    Reed Solomon codes are of the form (N,M), where there are N code words of which M are data and (N-M) are parity or error correction words. Reed Solomon codes can correct (N-M)/2 errors if the position of the errors is unknown, e.g. out of N words it is not known which are in error, but Reed Solomon codes can correct all (N-M) errors if the position of the error words are known, as is the case in a distributed storage system (assuming each device stores data perfectly, which can be enforced by another level of redundancy on each device). If there are N storage devices out of which a maximum of E can be lost at any one time, an (N,N-E) Reed Solomon code will provide perfect data integrity. All data is encoded as an N word code, one word of each is stored on each of the N devices. If a storage device is lost, a new one can be added and all the words that were stored on the failed device can be rebuilt from existing data on the new device. RAID5 is just an implementation of a (N,N-1) Reed Solomon code, and many RAID6 implementations are (N,N-2) Reed Solomon codes.
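
    The simplest case described above, an (N,N-1) code where the position of the lost word is known, can be sketched with plain XOR parity. This illustrates the erasure-recovery idea only and is not a Reed Solomon implementation; the function names are made up for the example, and every block is assumed to have the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the M data blocks plus one parity block, one block per device (N = M + 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored, lost_index):
    """Rebuild the block at a known lost position from the N-1 surviving blocks."""
    survivors = [block for i, block in enumerate(stored) if i != lost_index]
    return xor_blocks(survivors)
```

    Because the XOR of all N blocks is zero, any single missing block (data or parity) equals the XOR of the others. Tolerating more than one lost device needs the finite-field arithmetic discussed in the next paragraph.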

    The speed of performing Reed Solomon encoding and error recovery in general is not as fast as RAID5 and RAID6, because the implementations do arithmetic in a finite field such as GF(2^8) (whose elements are polynomials over GF(2)), and that arithmetic doesn't have direct hardware support in most processors like XOR does for RAID5. From what I've found, the time to calculate parity can be made to scale roughly linearly with the number of parity words being calculated, but that relies on using 8 bit words and pre-calculated multiplication tables for each data word which scales quadratically in space. With 16 parity words the coding and error correction can take about 16 cycles per byte of data, and with 32 parity words the speed drops to 30 cycles per byte. That means a theoretical maximum of several dozen MB/s on modern processors. If each storage device can calculate its own parity information, the speed could probably be increased significantly for very large storage arrays.

    Aside from the original story's idea of using free hard disk space of workstations, I think it would be cool for people to be able to group their machines together on the Internet using the above scheme so that each user could donate X bytes of storage to the common pool, and once an acceptable level of redundancy (an (N,M) Reed Solomon code) was decided upon each user would have X*(M/N) bytes of highly redundant storage available. I hope to write either a FUSE driver or network block device driver that implements this for Linux. If anyone's interested, I'm planning on making everything available under either BSD or LGPL licensing when it's finished.
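
    The X*(M/N) figure can be checked with a few lines of arithmetic. The numbers in the usage note (100 machines donating 30 GiB each, tolerating 20 simultaneous losses) are illustrative assumptions, and usable_bytes is a hypothetical helper name.

```python
def usable_bytes(donated: int, n: int, m: int) -> int:
    """Redundant storage each user gets back under an (N, M) code: X * (M / N)."""
    if not 0 < m <= n:
        raise ValueError("need 0 < M <= N")
    return donated * m // n


GIB = 2**30
per_user = usable_bytes(30 * GIB, n=100, m=80)  # a (100, 80) code survives 20 lost devices
pool_total = per_user * 100
```

    Under those assumptions each user keeps 24 GiB of the 30 GiB donated, so the pool holds 2400 GiB of data that survives the loss of any 20 machines.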

  146. Re:Not without heavy utilization of other resource by pionzypher · · Score: 1

    Agreed, that would be another hurdle. The solution that immediately comes to mind is a freenet style clustering. That would present more issues with versioning and redundancy. Maybe those have already been addressed elsewhere?

    --
    I'll believe in corporations having personhood when Texas executes one... - advocate_one
  147. Yes, DFS by KStieers · · Score: 1

    Not a replication DFS, but a namespace DFS.

    Create a "stand alone root", NOT a domain root, on a server.
    Then add links to it, where they point to shares on workstations.

    This acts more like a bunch of symbolic links to the various boxes, with one entry point to the share. Not the same data everywhere, like a domain root would be.
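
    The "bunch of symbolic links" behaviour can be modelled as a simple lookup table. This is a toy model of the idea, not the DFS API; the server name, link names and share paths are all hypothetical.

```python
# Hypothetical namespace root and link targets; none of these names come from the thread.
NAMESPACE_ROOT = r"\\fileserver\pool"
LINKS = {
    "ws001": r"\\ws001\spare",
    "ws002": r"\\ws002\spare",
}


def resolve(path: str) -> str:
    """Translate a path under the namespace root into the workstation share it points to."""
    prefix = NAMESPACE_ROOT + "\\"
    if not path.lower().startswith(prefix.lower()):
        raise ValueError(f"{path!r} is not under {NAMESPACE_ROOT!r}")
    link, _, rest = path[len(prefix):].partition("\\")
    target = LINKS.get(link.lower())
    if target is None:
        raise KeyError(f"no link named {link!r}")
    return target + ("\\" + rest if rest else "")
```

    Each link still maps to exactly one workstation, so in this model a single file can never be larger than the free space behind one link; the namespace gives one entry point rather than one big volume.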

    http://www.microsoft.com/windowsserver2003/technologies/storage/dfs/default.mspx

  148. Re:Not without heavy *use* of other resources by ojustgiveitup · · Score: 1

    As is my understanding, resources are utilised, while tools are used. He was correct in its usage.
    Or maybe its utilization?
  149. Re:vista? - DFS by hjf · · Score: 2, Informative
  150. step two... by whopub · · Score: 1

    Porn Now all we need to do is explain the concept of "Unused Storage" to a porn collector...
  151. Re:Typical IT guy by kernspaltung · · Score: 5, Informative

    Way to jump to conclusions about me and how I manage a network. I honestly didn't ask the question as a "control freak", I don't spy on the employees, and I don't play Internet cop. I try to get them the tools they need to do their jobs, help them when things don't work, and otherwise stay out of their way. I also didn't imply the pool would be for me to do with as I please; I can see several ways in which that storage would benefit our business were it not spread out in small chunks. The users have all that space, and they simply DO NOT use it. In our business, they don't have much call for large files like photos, movies, etc. It's mostly spreadsheets and OpenOffice Writer documents. But thanks for being an ass.

  152. Run Community Projects by njdube · · Score: 1

    BOINC, Tor, Freenet and/or I2P are good examples of things you can put your extra resources to use on. Here are the BOINC projects I would run if I had hundreds of systems at my disposal.

    Artificial Intelligence System, NanoHive@Home, Predictor@Home, Project TANPAKU, Spinhenge@Home, The Lattice Project, World Community Grid, SIMAP, Malaria Control, Proteins@Home and Rosetta@Home.

  153. May I recommend against this? by IanDanforth · · Score: 2, Insightful

    Having tried this in college, I can tell you a couple things.

    1. You will noticeably reduce the lifespan of the discs. (Which can anger cost conscious supervisors)

    2. Doing ongoing hardware maintenance, because of this reduced lifespan, on closed, used by others, boxes is a *serious* pain.

    Storage setups make hot swapping discs easy, trying to do this with full blown systems just gets tiresome. The solution I eventually came up with was the following.

    Implement a two tiered hardware replacement cycle where you reduce the time a user is allowed to keep any hard drive in their box before replacement. Then using the still reasonably good drives, create a centralized storage solution in which the drives can live out the rest of their useful spans. Data security, user happiness, and redundancy are all good selling points of this system. You still have to deal with monkeying around in user boxes but if it's on a schedule and it nets you more drives, it's not so bad.

    -Ian

    1. Re:May I recommend against this? by Anonymous Coward · · Score: 0

      not to mention no physical security on data.

      just not a good idea for so many reasons

  154. Re:Typical IT guy by Anonymous Coward · · Score: 0

    Or how about if all attachments are stored locally automatically and deleted from "your" servers - this would be great for all those idiots that make 30Meg office documents and send them to everyone in the department :-) This is probably the best idea, but I don't know of any email clients that do it (killer feature request here for Mozilla and Evolution teams). Oh right, the server would need the ability to delete just the attachments after downloading by clients (another feature request for someone). Lotus Notes does this with the MyAttachments tool and it's supported by Domino
  155. Re:Typical IT guy by RMH101 · · Score: 1

    +3 insightful? For this shite? Jesus wept.

  156. Sing together: by yoprst · · Score: 1

    Those terabytes are for porn!
    Those terabytes are for porn!
    Why'd you think hard drives were born?
    Porn, porn, porn.

  157. tahoe? by ntk · · Score: 1

    It's probably not there yet for you, but you might want to keep an eye on AllMyData's Tahoe project.

  158. Re:Typical IT guy by AlecLyons · · Score: 2, Insightful

    You make it sound like it's a bad policy keeping all business data somewhere properly managed. It won't mitigate any damage done to your company or your career because you told them to be careful. People will store data in the most convenient location, that's not stupidity - just human nature.

  159. REAL DEAD Product - Mangosoft Medley97 - May if re by ziperle · · Score: 1



    Mango pooling is the biggest idea we've seen since network computers
    The following is an excerpt from an InfoWorld article from January 12, 1998.

    Mango, in Westborough, Mass., is not your average software start-up. In 30 months the company has raised $30 million. Its first product, Medley97, has shipped, transparently "pooling" workgroup storage.

  160. Re:Not without heavy *use* of other resources by DarrenBaker · · Score: 1

    I digress.

  161. Re:vista? - DFS by AlecLyons · · Score: 1

    Across a bunch of user machines? Sounds really unreliable. I mean you wouldn't give users a reset button for your servers. It sounds like the guy just wants to use this space for the hell of it. In a business context either you need more storage space so you buy the hardware, or you don't. I think this proposal stems from geek mentality to find it abhorrent when a potential resource doesn't have a use, rather than a sound business reason.

  162. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    Rebutted? Do you mean recanted? I never heard that. Do you have a reference?

  163. Re:vista? - DFS by mrterrysilver · · Score: 0

    who said it was windows xp? obviously we're assuming its windows server, so what are you smoking?

    and it doesn't matter what type of a client you have to access DFS either, that's just applicable to the clients. but i'm not sure DFS even makes sense here, the intended use is to distribute files, not make one large drive.

    --
    -mr silver
  164. Starfish by Anonymous Coward · · Score: 0

    I won't bother going through comments to see if this has been posted.

    This story has some interesting comments from a guy who claims he's CEO of digitalbazaar.com, a company that created a distributed filesystem named Starfish.

    Open source and cross platform.

    AC because I can't remember my freaking password.

  165. Re:vista? - DFS by kernspaltung · · Score: 1

    You know, I'm no tree-hugging environmentalist and I'm as guilty as the next guy of buying all kinds of stuff I don't really need, but even I realize there comes a point when "buying more" isn't the best answer. If we could use all this space, maybe I wouldn't need to buy a fancy storage array. And the power to run it. And the drives to stick in it. And the place to put it. And all the garbage generated when they built it. And the room it's gonna take in the landfill when its time is done.

    What if I said, "I have found a way to magically extract all the gasoline sitting in gas tanks in junked cars, and now I can give everyone with a car a free tank of gas." You'd surely raise your hand to get your tank filled. Sure, doing so is nearly impossible and completely impractical. But in the case of pooling unused desktop PC storage to use on the network, I know it's far from impossible, and with the right software, could even be practical. Otherwise I wouldn't have asked the question.

  166. Re:Typical IT guy by Profane+MuthaFucka · · Score: 1

    Yes, but Lotus Notes will also make baby Jesus and 300,000 IBM employees cry.

    --
    Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
  167. Re:vista? - DFS by hostyle · · Score: 1

    Gary Larson is turning in his grave ...

    --
    Caesar si viveret, ad remum dareris.
  168. Re:Not without heavy *use* of other resources by Profane+MuthaFucka · · Score: 2, Funny

    You've toolized the language!

    --
    Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
  169. Re:vista? - DFS by c_g_hills · · Score: 1

    Unfortunately you misunderstand DFS. You cannot pool multiple targets in the way you intend. DFS is to provide a unified namespace for disparate locations. DFS-R can be used to keep multiple targets in sync with the same data.

  170. Samba + UnionFS by jdb2 · · Score: 1

    First, create a uniquely named empty directory on each drive. Next, set up a Linux
    file server running Samba to be used as a proxy to access the distributed storage
    on the Windows machines. Finally create a union of all the empty directories using UnionFS :

    http://www.fsl.cs.sunysb.edu/project-unionfs.html/

    Problem solved.

    jdb2

  171. Veritas by thatmarkguy · · Score: 1

    I didn't read every single reply, but I have a similar situation: about 60 PCs, each with 300-500GB of hard drive capacity but only 5-20GB used, all connected by a local CAT6/gbit network. I'm using Veritas BackupExec for our disaster recovery solution. I've set up shares on each box, with restricted permissions, and set up "Backup-to-Disk-Folders" (a feature in Veritas) on the server that point to each remote host. You can set a max size (how much to use) and a cushion (how much to leave free). Create a pool containing all the BTDFs, and run your backups to them. The great part is if a box is offline, or full, it will just dump over to another one. You should use tapes regularly as well. If you lose one box you could lose everything. But it sure gives the tapes a break, and it's nice to know that space is being used.
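
    The max-size, cushion and fail-over behaviour described above can be sketched as follows. This is a guess at the selection logic for illustration, not how Backup Exec actually implements it; the Target fields and pick_target are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    online: bool
    max_size: int   # most the backups may occupy on this host, in bytes
    used: int       # bytes already occupied by backups on this host
    disk_free: int  # bytes currently free on the host's disk
    cushion: int    # bytes that must always stay free for the user

    def room(self) -> int:
        """Bytes a new backup job may still write to this target."""
        if not self.online:
            return 0
        return max(0, min(self.max_size - self.used, self.disk_free - self.cushion))


def pick_target(pool, job_size):
    """Return the first target in the pool with room for the job, or None."""
    for target in pool:
        if target.room() >= job_size:
            return target
    return None
```

    An offline or full box simply reports zero room, so the job falls through to the next host in the pool, which matches the "dump over to another one" behaviour.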

    --
    -Mark
  172. Re:vista? - DFS by porttikivi · · Score: 1

    Well the name is obviously a tribute, microsoftphobia aside.

    --
    Anssi Porttikivi / app@iki.fi
  173. Sun is working on it by jfim · · Score: 2, Interesting

    Project Celeste is basically what the OP is talking about. It's a distributed filesystem with automatic replication, handles rogue nodes via voting and also exports the "filesystem" as CIFS. It's essentially a distributed object store, which can be used to implement a filesystem on top of it. I saw a demo of it last year and I was pretty surprised, it seems to work quite well for a research project.

    1. Re:Sun is working on it by jfim · · Score: 1

      Sorry, here's a better link about Project Celeste.

  174. Re:vista? - DFS by Anonymous Coward · · Score: 0

    The above-average Windows user's first question would be: how the heck do you install Windows (XP Pro) on something that is NOT a local disk? Excluding the odd case of adding a "vendor floppy with a driver" that emulates a block device (think NBD in Linux), the installer does not seem to know about any networked filesystems. And the Windows boot process does not seem like it could be running as a diskless environment either. (I welcome pointers to the contrary, and ideally, free solutions.)

  175. Re:vista? - DFS by EdelFactor19 · · Score: 1

    DFS was not even remotely created by MS or Vista. It's similar to AFS except there is more than one, more robust than NFS, and Unix based. But sure, go ahead and try to give Microsoft credit for it... this is the most uninformative thing I have read all year. Windows XP can also mount NFS, AFS, and DFS drives... as can Unix and Linux and OS X.

    --
    "Jazz isn't dead, it just smells funny" ~Frank Zappa
    EdelFactor
  176. Re:Not without heavy *use* of other resources by thegrassyknowl · · Score: 1

    Don't you worry about BLANK, let ME worry about blank!

    --
    I drink to make other people interesting!
  177. Those drives don't have good duty cycles by Tenareth · · Score: 1

    PC hard drives don't have long lives under heavy load, if you started using them more often your failure rate would go up considerably.

    And considering how cheap these cheap drives are, it's really not worth the effort.

    --
    This sig is the express property of someone.
  178. BitTorrent to distribute Virtual Machines by erexx23 · · Score: 1

    I used BitTorrent to distribute virtual machines across a LAN.

    Not only does this make use of extra space on the hard drives,
    also the network and spare CPU cycles of every PC available to distribute the files.

    VMs can be huge and are not easily restored or cleaned after use.
    Point-to-point copies of a VM can take a very long time and are not guaranteed to be 100% free of CRC errors.

    A "restored VM", or a torrent that has been restarted on a file that has been used
    or changed, can be reverted very quickly back to its original CRC/hash and is 100% perfect.

    Essentially I was able to take a job that would take 2-3 hours to complete down to about 35-40 minutes.
    The job required copying multiple 8-12GB VMs across 20 machines with 100% accuracy.
    Point-to-point copies are not 100% perfect.
    A torrent copy is.
    I can't tell you how important this is for a student who needs their VM to work 100%
    and not crash in the middle of a week because there was an unknown error resulting from a straight point-to-point download.

    In addition a "restoration" of the VM would be a simple restart of the Torrent download.
    This would only take about 10 minutes.
    So if a student made a mistake, restoring that VM would only take minutes and not require replacing the whole VM with a new one.
    This is because all it does first is a CRC/hash check,
    and then it only downloads the bits needed to restore the file to its original state.
    It's an instantly restored VM.
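
    The "hash check, then fetch only what changed" step can be sketched roughly as below. This is a minimal illustration, not how any particular torrent client is implemented: the function names and the 256 KiB piece size are assumptions made for the example, and a real client would fetch the flagged pieces from peers.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece length; real torrents choose their own


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Return the SHA-1 digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def pieces_to_refetch(local: bytes, reference: list,
                      piece_size: int = PIECE_SIZE) -> list:
    """Indexes of pieces whose hash no longer matches the reference list.

    `reference` is the list of piece hashes of the pristine VM image
    (hypothetical name; in a torrent this comes from the .torrent file).
    """
    local_hashes = piece_hashes(local, piece_size)
    bad = []
    for index, expected in enumerate(reference):
        # A missing or altered piece must be downloaded again.
        if index >= len(local_hashes) or local_hashes[index] != expected:
            bad.append(index)
    return bad
```

    Only the pieces returned by `pieces_to_refetch` need to travel over the network, which is why a "restore" of a lightly modified image is so much faster than a full copy.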

    No FTP, No Media, No open shares.
    Just a simple Torrent file server with a HTML page for all the torrents.
    The clients are their own trackers.

    Just an example.
    I do not know exactly how this could be applied to other uses.

  179. 9mm vs .45 by Firethorn · · Score: 2, Interesting

    Don't forget that at those sizes, a .45 is nearly 30% larger in diameter, and has far more mass. A 9mm will normally have a 124 grain bullet with a velocity of 1150 ft/s, 364 foot-pounds of energy. A .45 can be shooting 230 grain rounds at 900ft/s for 414 ft-lbs of energy.
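
    For anyone who wants to check the energy figures, here is a small sketch of the standard kinetic energy calculation in grains and feet per second. The function name is made up for the example; the constants are the usual 7000 grains per pound and standard gravity of 32.174 ft/s^2.

```python
GRAINS_PER_POUND = 7000.0
STANDARD_GRAVITY = 32.174  # ft/s^2, converts pounds (weight) to slugs (mass)


def muzzle_energy_ft_lbs(weight_grains: float, velocity_fps: float) -> float:
    """Kinetic energy, 1/2 * m * v^2, expressed in foot-pounds."""
    mass_slugs = weight_grains / GRAINS_PER_POUND / STANDARD_GRAVITY
    return 0.5 * mass_slugs * velocity_fps ** 2


print(round(muzzle_energy_ft_lbs(124, 1150)))  # 9mm load quoted above: 364
print(round(muzzle_energy_ft_lbs(230, 900)))   # .45 load quoted above: 414
```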

    Despite all this, I think that when it comes down to the army, it's mostly because of ammunition selection. Troops are issued non-expanding FMJ ammunition, which leads to 9mm over-penetrating and under-performing. The 1911, chambered in .45, was designed for FMJ ammunition from the outset. The larger and slower .45 round will use more of its energy in a body, causing more damage. A 9mm HP will out-stop a .45 FMJ, but US soldiers are forbidden expanding ammunition. A .45 HP will stop more often than a .45 FMJ, but the difference is nowhere near as large as the difference between a 9mm HP and FMJ.

    As for the rifle comment, I have to agree. Consider the 'poodle-shooter', the .223/5.56 round our military uses in most of its rifles: 1300 ft-lbs of energy in a 60-70 grain bullet traveling at over 3k ft/s. That is sufficient velocity that the round will often fragment when it strikes a target.

    --
    I don't read AC A human right
    1. Re:9mm vs .45 by WillAdams · · Score: 1

      .45s are so highly-regarded in Iraq that one soldier was able to sell his Kimber Desert Warrior, a couple of stainless steel Wilson Combat magazines and a couple of hundred rounds of 230 grain FMJ .45 ACP for $3,000 when he rotated out.

      William

      --
      Sphinx of black quartz, judge my vow.
  180. What about security? by Anonymous Coward · · Score: 0

    Seriously, no confidential data or even users data can be stored in such a system. It would make stealing data very easy, since every computer on fx a campus would contain a portion of the whole. Now if it was a secure corporate building, it could be feasible if you trusted every employee, but who does?

    Nah, rip out all the disks, and put them in a datacenter. Then run thin clients on the computers...

    Anonymous Coward

  181. A little late to be commenting now, but.. by Anonymous Coward · · Score: 0

    I manage a network for a small medical office in my spare time. They use a LinkSys NFS box for their two networked databases, and have about ten overpowered workstations around the clinic for office work and instrument applications. Lots of extra disk space.

    I made them buy the "pro" edition of xxcopy (for the network support) and wrote scheduled backup scripts for the NFS. Xxcopy has reasonably fine-grained file comparison, so I can back up changed files without taxing the drives too much. I made a separate admin account on each machine just to run the backups, and scheduled the scripts under the new accounts. It ain't cron, but it does the trick.

    The backups run once at noon, once at night, and there's also a snapshot of the NFS taken every Friday. As a result there are about 24 backups of the working databases at any given time.

    This way they didn't have to buy a dedicated backup system, and they get much more redundancy than a single backup solution would provide.

  182. Re:vista? - DFS by RedK · · Score: 2, Informative

    The poster asked how to use the wasted space on all the Desktops in his business by pooling them as one big hard drive. So yes, we are in fact looking for ways to make 1 big hard drive, not just share files, and yes, we're pretty sure he's not running a Windows Server Family Operating System (tm).

    So you can count DFS as a big NOGO.

    --
    "Not to mention all the idiots who use words like boxen."
    Anonymous Coward on Monday August 04, @06:49PM
  183. Please don't by mnmn · · Score: 5, Interesting

    Please do not use the space for anything else. Do not try to actively use the space.

    The reason is the obscenely large amount of power required: using a few gigabytes of that space requires the whole machine to be running, and uses its CPU, which can't draw less than 21 watts by itself.

    It's actually cheaper to get a 1TB drive and use it elsewhere than use the power on so many desktops (or worse, servers). Even with the desktops in use by active users.
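
    A rough sketch of the arithmetic, for what it's worth. The 21 W figure is the CPU floor mentioned above and the 100 machines come from the original question; running around the clock is an assumption, and a whole PC draws considerably more than its CPU alone, so treat the result as a lower bound.

```python
def annual_kwh(watts_per_machine: float, machines: int,
               hours_per_day: float = 24.0) -> float:
    """Energy used per year, in kWh, by machines kept on for storage."""
    hours_per_year = hours_per_day * 365
    return watts_per_machine * machines * hours_per_year / 1000.0


# 100 desktops at the 21 W CPU floor, left on 24/7 (assumed duty cycle).
print(annual_kwh(21, 100))  # 18396.0 kWh per year
```

    Multiply by your local electricity price to compare against the cost of a single 1TB drive.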

    --
    "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
  184. been done already? by reiisi · · Score: 3, Informative

    limitations?

    And, if you're claiming some kind of market race, you might want to check for relevant dates concerning ZFS

    Of course, if you're just trolling, ignore me.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  185. Distributed backup by argent · · Score: 1

    The question is, what kind of data would you want to store in a file system where huge chunks of it are likely to be unavailable or just gone at any time?

    I'm thinking, backups.

    Let your desktops provide redundant backups of all your other desktops. Each night, the computers that are still up, would each make multiple copies of themselves on several of their neighbors... copies of everything but their backup directories and those system files that Microsoft makes it unreasonably hard to backup. A copy of the registry, a copy of the profiles, copies of installed programs, copies of all those files that make this system different from that one. Each would select systems that didn't have recent backup copies of themselves, and then at the end of the night they would prune the least useful backups... say, three redundant week-old backups of one of them, and a month-old backup of another... and report to the master control program what they had done.
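
    The "select systems that didn't have recent backup copies" step might look something like this. It is only a sketch of one possible selection rule: the function name, the `holders` mapping and the tie-breaking by name are all assumptions, not part of any existing tool.

```python
from typing import Dict, List, Optional


def choose_targets(holders: Dict[str, Optional[float]], copies: int) -> List[str]:
    """Pick the peers that hold the stalest (or no) backup of this machine.

    `holders` maps a peer name to the age in days of the backup it holds
    of us, or None if it holds none (hypothetical bookkeeping format).
    """
    def staleness(item):
        name, age = item
        # Peers with no copy come first, then the oldest copies.
        return (0, 0.0, name) if age is None else (1, -age, name)

    ranked = sorted(holders.items(), key=staleness)
    return [name for name, _ in ranked[:copies]]
```

    For example, `choose_targets({"pc2": 1.0, "pc3": None, "pc4": 7.0, "pc5": 30.0}, 2)` picks `pc3` (no copy yet) and `pc5` (month-old copy). Pruning would be the mirror image: drop the copies that are both old and already duplicated elsewhere.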

    Now if you lose your system, you bring up a standard install during the day, log in, and it will become your computer, find its most recent backup, copy itself back from its helpful neighbor and decrypt itself with your password, and by the time you get back from the morning staff meeting all will be well.

    I donate this idea to the public domain.

  186. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    Tools are resources. We've come full circle here, folks!

  187. Re:Typical IT guy by Dr.+Cody · · Score: 1

    Wow, fitting username.

  188. What by Kamineko · · Score: 1

    What's wrong with just leaving the disks as they are for heaven's sake?

  189. A Torrential Idea by stereoroid · · Score: 1

    Use Torrents to distribute large files across the corporate network:
    - administrator sets up a torrent tracker server, and a torrent client on each PC;
    - administrator seeds the file; each client that needs the file downloads it, getting faster as more peers come online;
    - it needs some admin tools to keep the clients going: cleanup of old files if disk gets too full. (Feature request? Tracker could tell client which of its hosted files are least in demand?)

    --
    (this is not a .sig)
  190. GFarmFS? by click170 · · Score: 1

    Has anyone ever tried or come across GFarmFS? I literally stumbled across their page by accident, but I've read all the documentation I can get on it, and I'm interested in implementing it myself. It seems to offer most if not all of the features you want, so maybe it's worth a look.

  191. Is there a distributed file storage system by mrmeval · · Score: 1

    that is cryptographically secure, secure, and GPL'd?

    --
    I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    1. Re:Is there a distributed file storage system by laymil · · Score: 1

      http://cleversafe.org/
      I did some benchmarking of it for a project a while ago. While it is slow and still appears to be in its infancy, the product does work.

  192. unused space? don't use disk at all by Anonymous Coward · · Score: 0


    Mem: 1027696k total, 919316k used, 108380k free, 31248k buffers
    Swap: 2048248k total, 0k used, 2048248k free, 554324k cached

    A typical system with openSUSE 10.2 and KDE might have the memory usage shown above. At the moment there is no need for swap, and hence no need for a hard drive once the OS and apps are loaded from the network. All applications will fit within main memory if you pick applications that are not bloated. For the workplace, diskless workstations are the way to go. Store all data centrally for simpler management. Set up a gigabit Ethernet network.

  193. Don't use them. by Ant+P. · · Score: 1

    Take all the hard drives out and run everything from a server using two or three of them. You'll save a few kilowatts.
    If you really need local storage put some solid-state drives in.

  194. Re:vista? - DFS by Mysticalfruit · · Score: 1

    Seagate makes a storage product that works on top of DFS called "Storage X" that lets you do this... you can take a whole bunch of servers and combine them.

    Well, that's what their literature says... I have no idea about the real product.

    --
    Yes Francis, the world has gone crazy.
  195. Use them for backups by Rhys_Lewis · · Score: 1

    If you have rsync running on each one at different intervals and use hard links to create efficient differential backups (yes - it is possible in windows), then you could have masses of copies of your data stored between them all. That way you could spend less on backing up your file server (you will still need some off-site backup), and probably have better recovery options for lost files.
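
    To show what the hard-link trick buys you, here is a small sketch of the idea in Python rather than rsync itself (with rsync you would normally reach for its `--link-dest` option). The function name and directory layout are assumptions for illustration: unchanged files are hard-linked to the previous snapshot, so each snapshot looks complete but only changed files consume new space.

```python
import filecmp
import os
import shutil
from typing import Optional


def snapshot(source: str, previous: Optional[str], dest: str) -> None:
    """Create a full-looking snapshot of `source` in `dest`.

    Files identical to the copy in `previous` are hard-linked instead of
    copied. `dest` is assumed to be a fresh directory on the same volume
    as `previous` (hard links cannot cross volumes).
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(dest, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.join(dest, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the existing data
            else:
                shutil.copy2(src, new)  # new or changed: make a real copy
```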

  196. FreeLoader : Scavenged Distributed Storage System by ksamer · · Score: 1

    I guess FreeLoader answers your call. FreeLoader is a distributed storage system designed to harness the unused storage space of LAN-connected commodity desktops. A dedicated space manager maintains metadata such as node status, chunk distribution, and file attributes. The FreeLoader project is under active development and can be found at the following link: http://www.ece.ubc.ca/~samera/projects/freeloader/

  197. Hadoop on Demand by deerpig · · Score: 1

    Yahoo is developing something called Hadoop on Demand which might work. Hadoop is the Apache open-source project whose file system (HDFS) is modeled on Google's GFS (Google File System). Hadoop on Demand is supposed to allow you to use unused volumes on any machine to create an ad hoc Hadoop cluster. I'm not sure if it's been released yet, or if it works with Windows. But it would be cool if it did.

  198. Re:vista? - DFS by vikstar · · Score: 1

    Artificial barriers? See Hanlon's razor.

    --
    The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
  199. Fink Mirror? by HoneyBeeSpace · · Score: 1

    I recently saw that fink is looking for mirror space. Donate to them?

  200. Many Problems by smwny · · Score: 1
    I have not played with this stuff, but I can not see this working. Here are the problems I can think of:
    1. All of these machines would need to be running 24/7. (Solved by repeating data.)
    2. Each machine would have downgraded performance.
    3. If one hard drive breaks it can break any number of files. (Solved with checksums and repeating data.)
    4. A user may need to use the disk space.
  201. solution in search of a problem by gr8scot · · Score: 1

    What would be a productive use for these terabytes of wasted space? ... Something that could offer the fault-tolerance and ease-of-use of ZFS across a network of PCs would be great for small-to-medium organizations.

    There might be some costs to save in some places, but the fact that you begin by asking for, not telling about, "a productive use for these terabytes of wasted space" suggests that at most, striping empty shared partitions into a monster drive would be "good," not "great for [some] small-to-medium organizations." I think providing end users with imaging software, and reminding them that giga = 9 zeroes if they're being miserly with their space and discarding important files, is as ambitious as I'd like a network admin to get about utilizing unused disk space, if you worked for me. Storage is cheap.
    --
    All 19 hijackers were known terrorists 09-10-2001. Lack of FBI intelligence does not justify warrantless wiretaps..
  202. Hadoop Distributed File System by owenomalley · · Score: 2, Interesting

    You could put a Hadoop Distributed File System (HDFS) on them. HDFS allows you to use the storage as a single file system that is stable and reliable. We have multiple 2000 node clusters with petabytes of user data on them. Because the blocks are each replicated to 3 hosts, if a node goes down, your data on that node is not lost.

  203. been looking for the same thing.. by pjr.cc · · Score: 1

    Haven't heard of the existence of it yet, but I've been looking for just the same thing... IIRC though, Google's FS sounds almost on the money for what we'd want, and it's a shame we can't get it. There are alternatives like AFS, Coda, etc. (although I find some of the suggestions of DFS people have been making quite strange). But at the end of the day what you really need is something like:

    1) the ability to spawn off say 10gb on users computers all over the place (maybe 100's to 1000's of machines)
    2) machines can be on or off at any time
    3) data must not be lost because a machine went off (or at least, data must be kept intact if a machine is switched off)

    This is not even close to what DFS is capable of, nor would managing 100's to 1000's of machines with it (even if it were capable on XP/Vista) be even remotely manageable. Likewise AFS and Coda don't really lend themselves well to the job. I started working on something like this, where a program running on a machine would grab 10% of the free space, create a "disk" file, then talk to some central computer telling it that it was available for use. I never really got far though, because there's a lot of exception handling to do (and metadata to track). At the end of the day you can't guarantee the availability of any one "block": unless you mirror it to every computer, you can't be sure that even one machine holding it is on at any given time. Unfortunately, that approach is almost pointless, because the data might as well just sit on 2 physical volumes on a storage array rather than across all your machines.

    Which is not to say it's a bad idea, but it'd be good to find a "desktop storage aggregator" or some such!
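
    The availability problem can be put into numbers with a back-of-the-envelope sketch. It assumes each machine is switched on independently with the same probability, which real offices (everyone going home at 6pm) do not satisfy, so read the results as optimistic. The function names are made up for the example.

```python
import math


def block_availability(p_online: float, replicas: int) -> float:
    """Chance that at least one of `replicas` copies is on a powered-on machine."""
    return 1.0 - (1.0 - p_online) ** replicas


def replicas_needed(p_online: float, target: float) -> int:
    """Smallest replica count whose availability reaches `target`."""
    if not (0.0 < p_online < 1.0 and 0.0 < target < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_online))
```

    With machines that are on half the time, three copies give `block_availability(0.5, 3)` = 0.875, and `replicas_needed(0.5, 0.99)` comes out at 7 copies for 99% availability, which is where the "might as well use a storage array" argument comes from.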

    1. Re:been looking for the same thing.. by RevStor · · Score: 1

      We recently released a product that does just what you described by aggregating desktop disk space easily. Check us out at www.revstor.com.

      --
      RevStor, LLC - The revolution in storage http://www.revstor.com
  204. Re:Not without heavy utilization of other resource by SleepyHappyDoc · · Score: 1

    Assuming a recent CPU with hardware virtualization, could you have one partition (say, 20GB) with Windows for the user, and another partition running something else to serve up the remaining hard drive space, with a hypervisor running them both at the same time, invisible to one another?

    I thought this was the kind of scenario that virtualization was intended for.

    --
    Stasis is death. Embrace change.
  205. Easy (in theory) by tzot · · Score: 1

    Devote some disk space from each machine by creating empty files of the desired size (on W98 workstations, create many 1 GiB files). Write a simple Python script (or whatever) that allows the devoted space to be treated as a remote device (elementary control requests plus read and write operations in block sizes, whatever that block size may be).
    Consolidate on a Linux machine, write a little more code using LUFS to connect to these remote devices and build soft raid devices (md-raid, evms, whatever). Make sure that you distribute your subvolumes as evenly as possible among the available remote devices. Create a filesystem on the mega-device. Share (NFS, SMB).
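
    A minimal sketch of the workstation side, assuming a 4096-byte block size and leaving out the network layer entirely: a preallocated file exposed through block-sized read and write operations. The class and method names are invented for the example; a real version would wrap these calls in whatever protocol the Linux side speaks.

```python
import os


class FileBlockDevice:
    """Treat one preallocated file as an array of fixed-size blocks."""

    def __init__(self, path: str, size_bytes: int, block_size: int = 4096):
        if size_bytes % block_size:
            raise ValueError("size must be a multiple of the block size")
        self.block_size = block_size
        self.block_count = size_bytes // block_size
        mode = "r+b" if os.path.exists(path) else "w+b"
        self._file = open(path, mode)
        self._file.truncate(size_bytes)  # reserve the devoted space

    def _check(self, index: int) -> None:
        if not 0 <= index < self.block_count:
            raise IndexError("block index out of range")

    def read_block(self, index: int) -> bytes:
        self._check(index)
        self._file.seek(index * self.block_size)
        return self._file.read(self.block_size)

    def write_block(self, index: int, data: bytes) -> None:
        self._check(index)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._file.seek(index * self.block_size)
        self._file.write(data)
        self._file.flush()

    def close(self) -> None:
        self._file.close()
```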

    --
    I speak England very best
  206. Manage != own by ediron2 · · Score: 1

    My advice?

    Don't.

    40 gigs free, 100 desktops == 4 terabytes. That's roughly a grand now for a homebrew system, adding in the drive controller or surrounding boxen. DIY, on your own hardware, write it up and we'll all link and blog and rave about the cool hack. Well, we'll be less impressed now that drivespace is at $200 a T, but we'll at least nod approvingly when we see it on your resume.

    Outside data introduces risks. Inside data has risks like HR or payroll or company secrets disclosure. Network and power utilization go up expensively. Someone will demand data recovery beyond your ability to provide it. Someone else will complain that your data got corrupted when some end user turned off a system midwrite.

    Put another way, imagine trying to make a business case for this to the CEO of your company. If that doesn't turn off your urge to do this, start your way up the food chain until you get every superior's approval or adequately shot down to drop this fool's errand.

    Did I mention you can get fired for this?

  207. Old product - what became of it? by badmemory333 · · Score: 1

    I remember about 5-8 years ago a product that I thought had "orange" in its name (or maybe an orange logo) that used the concept of spreading out one's documents across PC hard drives on the LAN for the purpose of backups/fault tolerance. Does anyone remember what that was called or what happened to it? Basically it would break up the file into chunks and store them on different clients, with encryption, etc. I'm not sure if it ever made it to v1.0.

    1. Re:Old product - what became of it? by badmemory333 · · Score: 1

      OK here it is, as noted by someone @ REAL DEAD product...: "Medley97 is a virtually zero-administration, plug-and-play network operating system that creates a pooled network drive and disk cache from unused disk space and free memory on workstations. Available: Now (well if you live in 1997, that is). $695 per server MangoSoft Corp. (888) 88-MANGO; fax (508) 898-9166 info@mango.com or www.mango.com

  208. Re:Mod grandparent down by Anonymous Coward · · Score: 0

    It might have been a "joke", but it's not funny. Therefore, mod grandparent down.

  209. Re:vista? - DFS by Anonymous Coward · · Score: 0

    Once again, the lowly system admins must come crawling to the developers for a solution to their problems. Learn to write code and roll your own.

  210. Re:vista? - DFS by PingPongBoy · · Score: 1

    DFS - Distributed File System. Just create a share with each of these and POOL IT with a DFS system

    It sounds useful, but really it's more trouble than it is worth to treat separate computer drives as one volume, in a large office. DFS would be useful for a network of computers in one place isolated from other meddling people, like a rack of servers.

    Notwithstanding the low price of just buying a terabyte disk should you need the space, trying to make a hundred computers serve up a lot of bitty disk space is really silly in terms of the cost-benefit. The first problem is someone power cycling their computer or disconnecting it. Then there's the slow access over a long cable. Already that's cause for lots of running around making sure the computers are up. Then there's the problem of someone just moving a computer somewhere else, and you could spend days trying to find out what happened, especially if that somewhere else happened to be out-of-town, or just into storage when an upgrade is obtained. Further keep in mind that the small drives are getting to be old drives and liable to zonk out.

    So, if your environment is lots of small files, just share the drives to use for redundant backups of files that are small yet important, and don't lose sleep looking for data in a suddenly unavailable drive. Also encrypt because you may well lose track of where the file went or what file exists. If you need to handle larger files and need more space surely the company will be able to afford larger capacity drives, as they are already using so many computers.

    --
    Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
  211. Re:Not without heavy *use* of other resources by Anonymous Coward · · Score: 0

    A big problem is that so many people write things and say things because somebody else did, and they think it's cool. Unfortunately, these phrases or expressions quickly become overused, and that makes them annoying and ultimately meaningless.

    For example, how many times have you read people misusing the phrase "per se"? Often it is written as "per say". But what does it actually mean? Most people don't really know. Then there is the New English version, "in and of itself". What, is "by itself" not good enough? Why does it have to be "in and of" itself?

    "Period" is not a sentence. Don't spell out the punctuation in your sentence. It is redundant and silly. English has plenty of words that can be used to emphasise something. Why not try using one? There is no need to spell out your punctuation, comma exclamation mark! See? It's silly.

    Similarly, when speaking English, it is silly to waggle your fingers around to indicate quote marks. There are words available which can express the desired intention with no need for hand movements. And saying "quote unquote" before quoting somebody is absurd. Surely it'd be better to simply say "quote" before the quoted material, and possibly "unquote" afterwards, if necessary.

  212. GoogleFS by unicode · · Score: 0

    Google could open source GFS

  213. Re:Not without heavy utilization of other resource by donaldm · · Score: 1

    I am sure some kind of linked filesystem would be possible. In most practical situations, I think this idea would be a non-starter.

    I fully concur with your statement. If you had, say, 100 PCs with 160GB each and 60GB free on each, you could use that 60GB and get 6TB. This sounds great, except you now need the software to do this (we are talking about MS Windows, and the company has to pay for it) and an enterprise backup solution (not cheap either). That is assuming each PC is connected to a gigabit network. Considering that most PCs have disks that run at 5400 rpm (laptop) or 7200 rpm (desktop), your disks become your limiting throughput and also a single point of failure, and that does not even take into account someone switching off their PC, especially if the company has an energy policy.

    This discussion is nothing new. It was discussed back in the 1980s, and the conclusion was "if you don't like wasted disk space then get a centralised server and thin clients". That applies equally today. If the business requires some of its workforce to have laptops, then connecting them into a distributed file system is not worth the trouble because of their portability.
    --
    There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
  214. Oh yea. by killmofasta · · Score: 1

    The idea of PORN, or something akin to it.
    Best distributed file system? None! Use BitTorrent, and fill the drives with backups. Do a fit match-up... start from large to small. When something gets knocked off, you just have to get a torrent running, and in a few minutes, voilà! (But it will cause a bit of a slowdown, with all the traffic suddenly.)

    Just be sure to close ALL torrent traffic to the outside, or someone else will 'share' your files... try looking for win.ini on the internet, you will get a huge surprise...

    be sure also to AUDIT YOUR ENTIRE NETWORK FOR ACCESS RIGHTS, and all the most common mistakes...sharing 'C' and Administrator account...

  215. Wow.... by SoulDrift · · Score: 1

    Dude, relax.

  216. Don't forget to consider by Z00L00K · · Score: 1

    that when a disk starts to reach its capacity, the performance degradation impact on the operating system can be considerable. This is because seek times increase and fragmentation increases as well.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  217. back to basics by Anonymous Coward · · Score: 0

    input -> process -> output
    everything got a virtual platform
    and
    move things dynamically
    got a big mainframe? got a big storage?
    then get ur workstations 'image'-wise and diskless
    just give dumbs dumb terms

  218. Write once as a backup/archive? by Peter+(Professor)+Fo · · Score: 1
    Setting up a file retrieval system across such a network would be a bit of a hassle and fraught. What about looking for an application that involves writing large chunks of data, occasionally, to dusty discs just in case it might be needed later.

    An application that springs to mind is backing up 'the bits that don't get backed up' possibly as an image. (Ok, lots of issues here, but I'm only thinking aloud.) You might set up a couple of PCs that fetched the whole of the company web site if their copy was more than two weeks old, or another three that stole files from the database server hourly, daily and weekly.

  219. defensive much.. comedy gold! by Anonymous Coward · · Score: 0

    ... that was unintentionally hilarious... 'porn!' was a default expected response but then you got all defensive about it. It really makes it look like you've got something to hide :D

    plz torrent kthxbai.

  220. Re:Not without heavy *use* of other resources by g0at · · Score: 1

    Amen, brotha!

    Now, why the heck did you post as AC? You know, the uh, cool kids judge you by your cojones revealed by posting under a username. :p

    -b

  221. Donate to anonymous internet! by Anonymous Coward · · Score: 0

    ... and let Freenet run on it: www.freenetproject.org

    "In computer science, Freenet is a decentralized, censorship-resistant distributed data store originally designed by Ian Clarke. Freenet aims to provide freedom of speech through a peer-to-peer network with strong protection of anonymity. Freenet works by pooling the contributed bandwidth and storage space of member computers to allow users to anonymously publish or retrieve various kinds of information."

  222. Slows down participants; try with RAM instead! by MessyBlob · · Score: 1
    All that extra stuff, I hope is being written to the slower parts of a disk. Otherwise, even relatively well defragged drives will suffer slowness because of the extra piecemeal data.

    Theft of a machine could pose a security problem, if the data is not fragmented enough. What if only some computers are switched on when the data is needed? It would need to be massively redundant to work properly.

    Try something else: sharing the network's RAM; that would make a significant performance difference.

  223. Re:Typical IT guy by RockDoctor · · Score: 1

    thats not stupidity - just human nature.

    Let's get this clear - are you implying that there's no difference between stupidity and human nature (on average)?
    How cynical.
    How realistic.

    --
    Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
  224. Multiple backup by thalassinos · · Score: 1

    Backup daily PC1 to PC2, PC2 to PC3, and so on...

    Use something like rsync.

    Encrypt backups.

    For additional redundancy, you can use a scheme like backing up PC1 to PC3 to PC4, PC2 to PC4 to PC5 and so on.

    This does not give you the virtual file system that you were hoping for, but at least it puts to partial use the unused space.
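
    The ring can be generated mechanically. This sketch uses the simplest variant, where each PC backs up to the next `copies` machines in the list and the last ones wrap around to the start; the function name is made up, and the offsets differ slightly from the PC1-to-PC3 example above, so adjust to taste. The actual copying would still be done by rsync or similar.

```python
from typing import Dict, List


def ring_backup_plan(machines: List[str], copies: int = 1) -> Dict[str, List[str]]:
    """Map each machine to the neighbours that should hold its backups."""
    count = len(machines)
    if copies < 1 or copies >= count:
        raise ValueError("copies must be at least 1 and less than the number of machines")
    return {
        name: [machines[(i + step) % count] for step in range(1, copies + 1)]
        for i, name in enumerate(machines)
    }
```

    With `["PC1", "PC2", "PC3"]` and one copy this gives PC1 to PC2, PC2 to PC3 and PC3 back to PC1, so losing any single machine still leaves its backup on a neighbour.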

  225. Re:Typical IT guy by Anonymous Coward · · Score: 0

    Dude, chill.

    He was joking about what you can use the disk space for, not what it is currently used for.

  226. Not free, but does what you need by Anonymous Coward · · Score: 0

    SANmelody at www.datacore.com

  227. Power Consumption? by simpl3x · · Score: 1

    And, how much power do these boxes use over a year, and how does this relate to the cost of a new hard drive?

    It seems wasteful in terms of time as well as infrastructure. I do understand the desire to utilize unused resources...

  228. Re:vista? - DFS by EvilRyry · · Score: 1

    Some other people have mentioned it, but I'd just like to make it clear. MSDFS sucks at life. First, I believe only Windows Server can participate as a server, so there goes the desktop idea. Even with Windows servers it's quite inflexible, quirky, and unstable.

  229. Re:Not without heavy utilization of other resource by AndrewRUK · · Score: 1

    I think the big hurdle is partitioning off a part of each hard drive so that the user can't access it, so what they don't know about they can't be angry about losing.
    In a Windows domain, it is simple to make a drive on users' machines hidden via group policy (specifically, the "Hide these specified drives in My Computer" setting). Combined with appropriate file and folder permissions, this can create a partition that Windows never shows to the user and doesn't let them access even if they do find out about it.
  230. Pooling of storage resources by Anonymous Coward · · Score: 0

    I'm a systems analyst for a major retail firm. Currently, I'm managing around 500 machines that support all the various applications we implement (Mainly AIX, Linux, Solaris and about 80-90 Windows 2003 Server boxes). Short answer- look into EMC storage using NAS.

    Also- migrating your physical servers to VMware instances will enable you to effectively allocate storage in the form of virtual disks. The idea is to keep your storage in disk-arrays, adding space where you need it as you need it. This eliminates the inefficiencies of multiple physical disks that are allocated to a specific machine.

  231. Re:Not without heavy *use* of other resources by arodland · · Score: 1

    Just because the bad authors who do this have succeeded in making the term utilization standard does not change the AC's point in any way.
    Of course it does, because it's a fucking useful word, and it has a distinct shade of meaning. You can't blindly replace "utilize" by "use" in a given sentence and leave the meaning intact. You should use the best word available (taking into account tone, target audience, and medium); and whenever you're tempted to criticize others' diction without understanding, you should shoot yourself in the head instead. The urge will go away on its own.

    Side note: one advantage of "utilize" is that it inflects decently. It's still deficient, but not as bad as "use".
  232. Re:Not without heavy *use* of other resources by smallfries · · Score: 1

    Wow, thanks for the lecture; it's a shame that you don't know enough about the language to pull it off properly. I guess the fact that, unlike you, I write for a living has taught me a thing or two about the language that we use.

    There is no distinct shade of meaning: utilization is a *synonym* of usage. Do you know what a synonym is? Just so that you are aware - not one person has suggested blindly replacing "utilize" by "use". But the usage of the word utilization is completely unnecessary, it does not have a distinct meaning that is missing from the word usage.

    It is used by pompous asshats who think it makes them sound more knowledgeable, when in fact it merely singles them out to the crowd they wish they were in. Do you know where you stand now? Or has my utilization of the language been overly taxing for your tiny brain?

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  233. Re:vista? - DFS by Anonymous Coward · · Score: 0

    "DFS doesn't actually allow you to pool your disparate storage. It acts as a generic namespace that allows you to have multiple replicas of the same data, and keep your users from actually knowing where the stuff is kept."

    Oh! So instead of a new revolutionary concept coming from Redmond's factories, it is just a reinvention of AFS, which has had a long-running open source variant named OpenAFS, isn't it? (So much for the troll a few messages above.)

  234. Re:vista? - DFS by turbidostato · · Score: 1

    "trying to make a hundred computers serve up a lot of bitty disk space is really silly in terms of the cost-benefit."

    Only if there's no "prepacked solution", since the disk space is *already* paid for. The problem is not "I'm going to buy a lot of desktops with 80GB disks while local clients will only use 10GB so I can use the spare space" but "I can't get desktops with less than 80GB despite the fact they'll only use about 10GB; is there any way I can leverage my already paid-for infrastructure?". There might be no proper solution, but the question seems quite legitimate nevertheless.

    "The first problem is someone power cycling their computer or disconnecting it."

    Problem solved some decades ago: the solution is named "RAID".

    "Then there's the slow access over a long cable."

    Unless his LAN uses some kind of ADSL for local transit, it can't be a problem, since the bandwidth usage for "using a lot of resources from my desktop that are on a central server" is just the same as for "offering a lot of resources from my desktop to a central server".

    "Already that's cause for lots of running around making sure the computers are up"

    Only if there's no RAID over the solution; only if you try to use that spare space as "live speedy mass storage". What if that space is used for "near-line" backup storage or old data rarely needed? The service can even be coupled with some WoL environment so if the data is on a turned off system it will be automatically powered-on.
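
    The Wake-on-LAN part is simple enough to sketch. A magic packet is six 0xFF bytes followed by the target's MAC address repeated sixteen times, usually sent as a UDP broadcast. The function names, the broadcast address and port 9 below are assumptions for illustration; the target's NIC and BIOS also have to have WoL enabled.

```python
import socket


def magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 6 bytes of MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


# Example (not run here): wake("00:11:22:33:44:55")
```

    A storage service could call something like `wake()` when a node holding the requested data does not answer, then retry after the box has had time to boot.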

    "Then there's the problem of someone just moving a computer somewhere else, and you could spend days trying to find out what happened"

    Of course nobody says such a solution will be "as easy as turning on your recently bought computer with Windows Vista Home Edition". As with any other "Enterprisy Solution" it will need to be properly engineered but, again, the problems don't seem insurmountable at first glance. Let's see a simple operational flow:
    1) Space-locators receive a unique ID, so as long as the computers are reachable through the network, you can move them all you want.
    2) The box cannot be reached; it might be turned off; WoL will try to start it up.
    3) The box won't turn on even with the aid of WoL; well, no problem: data is RAIDed so it will be served out of checksums and the "lost" node will be replicated somewhere else.
    See? All of this would be absolutely transparent to the sysadmin as long as the solution is properly engineered.
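
    Step 3 can be illustrated with the simplest parity scheme, the XOR parity used by RAID 4/5: store the XOR of the data blocks on one more node, and any single missing block can be recomputed from the survivors. This is only a sketch of the principle with made-up function names, not a description of any particular product.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity(data_blocks):
    """Parity block to store on an extra node."""
    return xor_blocks(list(data_blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


if __name__ == "__main__":
    nodes = [b"abcd", b"efgh", b"ijkl"]     # data held by three desktops
    p = parity(nodes)                       # held by a fourth
    lost = rebuild([nodes[0], nodes[2]], p) # second desktop is gone
    print(lost == nodes[1])
```

    Single parity only survives one missing node per stripe, so a pool of desktops that get switched off at night would need more redundancy than this.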

    "Further keep in mind that the small drives are getting to be old drives and liable to zonk out."

    That's again an already assumed problem. For one, "server" drives will be old, small and flaky in the near future too, so you'd better already have a plan for this situation; for two, ask Google or any other corporation with massive volumes of data in a "grid": even if they were "server quality", once you've got a lot of disks you have to plan and engineer not for the case where a drive dies but for an environment where a significant percentage of disks are continuously off-line (due to breakages, power outages or whatever).

  235. Re:Not without heavy utilization of other resource by turbidostato · · Score: 1

    "Assuming a recent CPU with hardware virtualization, could you have one partition (say, 20GB) with Windows for the user, and another partition running something else to serve up the remaining hard drive space, with a hypervisor running them both at the same time, invisible to one another?"

    So in order to take advantage of some gigs of data that cost peanuts you are going to sacrifice expensive RAM and CPU cycles? That indeed would be "Not So Intelligent (TM)".

  236. That's what we do by RevStor · · Score: 1

    That's exactly what our software does. Check it out at http://www.revstor.com./

    --
    RevStor, LLC - The revolution in storage http://www.revstor.com
  237. Re:Not without heavy utilization of other resource by SleepyHappyDoc · · Score: 1

    Does your average office drone really need the whole power of a modern processor to bang out documents in Word? The most basic computer you can get from Dell or Lenovo or some other OEM has lots of RAM and CPU cycles to spare.

    --
    Stasis is death. Embrace change.
  238. Re:Not without heavy utilization of other resource by turbidostato · · Score: 1

    "Does your average office drone really need the whole power of a modern processor to bang out documents in Word?"

    Yes. Of course they won't need the CPU at 100% all the time, but they certainly need it from time to time, like when they open said Word documents.

    " The most basic computer you can get from Dell or Lenovo or some other OEM has lots of RAM and CPU cycles to spare."

    I don't think so. Especially RAM; office computers don't usually have any to spare.

  239. Allmydata "Tahoe" by n6mod · · Score: 2, Informative

    I do some work for Allmydata, which is an online storage provider. Their next-gen storage technology is open source and nearly perfect for this application. It's a bit green at this point, but coming along nicely. http://www.allmydata.org/

    --
    You have violated Robot's Rules of Order and will be asked to leave the future immediately.
  240. Re:Typical IT guy by gr8_phk · · Score: 1

    But thanks for being an ass.
    No problem. The first part of my post was just venting about a general frustration I've been having with the IT policies at a certain big company I work for (and others, but not you in particular). OK, and a bit of that continued through the rest of it. It's really unfortunate that I did that, as I think my on-the-spot ideas for dealing with our email problem aren't too bad.

    This company caps email at 120 Meg, and whenever I get a warning that I'm reaching my quota I end up searching email by attachment size. People have bad habits of putting spreadsheets inside word docs, and all sorts of neat ways to clog things up. I just like to keep old email. I've only been there 6 months and have had to clean out the inbox a few times. Other places I've worked have automatically deleted old mail after 60 days and left it up to the users to "archive" it in a place that requires extra effort to get at. This is really frustrating given all the free space I have locally (as you pointed out).

    Sorry for being an ass, but venting on rare occasions is good for the soul. Better to direct it into slashdot than people in the real world :-) Ummm my real world that is.
  241. Proper Tool by maz2331 · · Score: 1

    The proper tool, IMHO, for quickly stopping a charging HDD is a .458 Winchester Magnum, with 450-grain soft point bullets. Great for putting 3/4 - 1" holes through all platters, with the side benefit of sending the drive a further 10 yards downrange, just in case a follow-up shot is required. Of course, I've never had to fire twice. One round seems to do the trick.

  242. Re:Typical IT guy by kernspaltung · · Score: 1

    I totally understand. I realize that the IT-people-are-power-drunk-jerks stereotype didn't appear out of thin air. I try my hardest to surprise my fellow employees by actually being nice, helpful, and non-autocratic. As for attaching spreadsheets and presentations to emails and then forwarding them to fifty people instead of copying them to a public directory, I'm completely with you.

  243. Re:Not without heavy utilization of other resource by SleepyHappyDoc · · Score: 1

    I'd have to disagree with you on the RAM bit. A friend of mine recently got 10 systems from Lenovo for the office, and each one had 2x512MB RAM, which I'd consider heavy overkill for the kind of workload they used. Perhaps that's an unusually large amount of RAM for a system to come with.

    Besides, even a pretty close system would probably be able to spare the 32MB or so the virtualized SAN-node OS would need. It would probably eat up less than the onboard graphics will on those systems. We're not talking about running two full desktops at once.

    --
    Stasis is death. Embrace change.
  244. Data grid by Kolargol00 · · Score: 1

    What about a data grid?

    --
    XML is like violence. If it doesn't solve the problem, use more. Junta
  245. I've done this. Here is my feedback. by advid.net · · Score: 1

    I've merged free disk space on different windows PCs into one big samba share.
    Steps:

    On each windows PC:
    1) Create an ad-hoc account on each PC
    2) Set up a shared folder on each PC with full control by the previous account
    3) Create a 2GB file (or more) in each folder, with a serial number in the filename (I've written a small executable for this)
    On a linux box:
    4) Mount the different windows shares
    5) Set up a loopback device for each big file in the shares
    6) Set up one device using all this space with device mapper. You can have the equivalent of RAID0, RAID1, ...
    7) Make a FS on the device
    8) Mount it
    9) Set up a share
    Et voilà!

    I have two scripts to "mount" and "umount" such a device. The "mount" script does the 4-5-6-8 steps and the "umount" the opposite in reverse order.

    If a windows PC needs to quit the group, put the big file on another windows member, that's easy.

    Warning: it was so slow to make an ext2/ext3 file system (I don't remember which one I chose) that I made a FAT32 file system instead; it was much faster to create (I still wonder why).

    Of course I get very low performance with such a setup, but at least I have a *lot* of space in one device.
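
    For step 6, a sketch of the simplest case: concatenating the loop devices with the device-mapper `linear` target. Each table line has the form `start length linear device offset`, counted in 512-byte sectors. The loop device paths and the pool name are assumptions; the RAID0/RAID1 equivalents mentioned above would use other targets (striped, mirror) that are not shown here.

```python
SECTOR = 512  # device-mapper tables are expressed in 512-byte sectors


def linear_table(devices):
    """Build a dmsetup table that concatenates the given devices.

    `devices` is a list of (path, size_in_bytes) pairs, e.g. one loop
    device per big file on a mounted Windows share.
    """
    lines = []
    start = 0
    for path, size_bytes in devices:
        sectors = size_bytes // SECTOR
        lines.append(f"{start} {sectors} linear {path} 0")
        start += sectors
    return "\n".join(lines)


if __name__ == "__main__":
    two_gb = 2 * 1024 ** 3
    # Hypothetical loop devices set up in step 5
    print(linear_table([("/dev/loop0", two_gb), ("/dev/loop1", two_gb)]))
```

    The output can be piped into `dmsetup create pooled` (the name is arbitrary), after which the file system from step 7 goes on `/dev/mapper/pooled`.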

  246. You can't archive the entire internet by MooseTick · · Score: 1

    "The distributed database could be constantly updated from the original sources, and the distributed storage then becomes in effect a giant cache that contains the entire internet."

    As most people here know, possibly 80% of the internet is secured by passwords or some other mechanism. Google, Yahoosoft, or whoever only provide a small insight to what is out there. Archiving what mcdonalds.com chooses to present isn't that valuable.

    "Now we could employ the distributed computing software to datamine that cache and we could have searching independent of Google or Yahoo or M$FT."

    Effective datamining is more complicated than GREP. Google, Yahoo, and even M$FT have a lot of PhDs and tech gurus that have been working on optimizing search for 10+ years using servers dedicated to that effort.

    Perhaps a more realistic and practical use of that space would be to redundantly back up his company's servers. He could make encrypted bite-sized backups and park them on unused blocks of the available desktops. This still wouldn't be easy to implement but is theoretically doable.
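
    A rough sketch of the "bite-sized backups" idea, assuming the backup has already been encrypted by some other tool: cut it into fixed-size chunks, name each chunk by its SHA-256 digest, and assign every chunk to a couple of desktops in rotation. The chunk size, the number of copies, the host names and the function names are all arbitrary choices for illustration.

```python
import hashlib


def split_chunks(data, chunk_size):
    """Cut a byte string into chunks of at most `chunk_size` bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(data, hosts, chunk_size=4 * 1024 * 1024, copies=2):
    """Return a list of (sha256_hex, [hosts]) telling where each chunk goes.

    Chunks are spread round-robin so that consecutive chunks land on
    different machines and each chunk lives on `copies` distinct hosts.
    """
    if not 0 < copies <= len(hosts):
        raise ValueError("copies must be between 1 and len(hosts)")
    placement = []
    for index, chunk in enumerate(split_chunks(data, chunk_size)):
        digest = hashlib.sha256(chunk).hexdigest()
        targets = [hosts[(index + k) % len(hosts)] for k in range(copies)]
        placement.append((digest, targets))
    return placement


if __name__ == "__main__":
    blob = b"already-encrypted backup bytes " * 1000
    for digest, targets in place_chunks(blob, ["pc1", "pc2", "pc3"], chunk_size=8192):
        print(digest[:12], targets)
```

    The digest doubles as an integrity check when a chunk is read back, which matters when the chunks sit on machines nobody is watching.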

    1. Re:You can't archive the entire internet by TropicalCoder · · Score: 1

      possibly 80% of the internet is secured by passwords or some other mechanism

      Well, you raise a good point there, so archiving the entire internet is out. But as you say, archiving what mcdonalds.com chooses to present isn't that valuable. Then we could just turn it to another purpose. How about a completely independent, non-commercial web that is only for people, and this web is accessed by P2P? Something like a giant social web site. Of course, it will soon fill up with commercial messages and spam, and end up just like the Web, so I don't know... only speculating on what use the resources I mentioned could be put to - the idea of unlimited storage and CPU cycles. Must be some good use we could put it to. Any ideas?

  247. Where does it end by Disoculated · · Score: 1

    Next we'll be talking about harnessing unused RAM on all those workstations.

    Unless for some reason there are extra disks on these hosts, it's not worth the effort of trying to access/lock/manage security/etc for storage that you'd have to access across a network, especially when users could change/reboot hosts and the increase in nodes overexposes you to failures.

    Just buy disks of a size/speed/cost that's appropriate to give them room for anything that needs to be installed locally and set up a SAN or NAS for saving their files, where you can manage bulk storage most effectively. If you're feeling that you can invest a lot of time in the name of storage efficiency, give users individual iSCSI virtual drives on a NAS.

  248. Re:vista? - DFS by Pictish+Prince · · Score: 1

    The above-average Windows user's
    You mean "above average for Windows user's?"
    --
    Only his tendency toward a dazed stupor prevented him from screaming aloud.
  249. Re:vista? - DFS by Pictish+Prince · · Score: 1

    Gary Larson is still alive, although turning in his grave is a distinct possibility.

    --
    Only his tendency toward a dazed stupor prevented him from screaming aloud.
  250. This may fill the bill... by DrHex · · Score: 1

    Take a look at Allmydata Tahoe. I think it will do what you're looking to do. It also sounds robust as well. Hope this helps.

    --
    Scientia et Potentia
  251. DIBS by blacksqr · · Score: 1

    Distributed Internet Backup System (DIBS) is a free, cross-platform python-based solution.

    http://web.mit.edu/~emin/www/source_code/dibs/index.html

  252. freenet? by apostle5406 · · Score: 1

    What about using freenet, possibly in a closed miniature of the full network? Install a node on these machines with all this idle space and assign the lion's share of it as that node's datastore. It's got the advantage of effectively combining all available storage and keeping stored content encrypted.