Slashdot Mirror


Bush Administration's E-Mail Deluge May Overload Archive System

Lucas123 writes "The Clinton administration generated 32 million e-mails. Bush's administration has generated 50 times as much data — 140TB, 20TB of which is email — which soon will have to be archived through a new government-built records management system. The new system may not be up to the task because the technology behind it may not be able to handle the sheer volume of data along with the fact that the Bush administration has been slow in providing the National Archives and Records Administration (NARA) with needed information about the records, according to a Computerworld story. Questions have also been raised about millions of missing e-mails from between March 2003 and October 2006. 'It wasn't until this summer that an intensive effort began to share information,' said Ken Thibodeau, director of NARA's Electronic Records Archives."

169 comments

  1. It is Clinton's porn stash! by Anonymous Coward · · Score: 5, Funny

    The other 120 TB was probably just Clinton's porn stash that the Bush administration found while purging off records.

    1. Re:It is Clinton's porn stash! by Ethanol-fueled · · Score: 4, Funny

      Probably 120 TB because of the larger image sizes needed to accommodate fat chicks.

  2. Number of emails generated. by Ice+Wewe · · Score: 5, Insightful

    "The Clinton administration generated 32 million e-mails. Bush's administration has generated 50 times as much data -- 140TB, 20TB of which is email -- which soon will have to be archived through a new government-built records management system.

    Well, to be fair, email wasn't quite as popular during Clinton's administration as it is now. Then again, the 400GB of e-mails that the Clinton administration must have generated (if it is 50 times less than 20TB) must have been rather hard to store when he left office.

    1. Re:Number of emails generated. by GMonkeyLouie · · Score: 3, Interesting

      Well, most of that 400GB from Clinton's administration was dirty pictures of interns. In all seriousness, though, I don't think the problem will be finding a way to store all that data. The real kicker will be finding information you need in it. Seems to me like the best way to hide relevant and/or damaging e-mails would be to have them stored right alongside truckloads of chain letters.

    2. Re:Number of emails generated. by Anonymous Coward · · Score: 0

      The current administration has generated 50 times more data in total, *including* e-mail. That does not necessarily mean that they generated exactly 50 times more e-mail data.

    3. Re:Number of emails generated. by xouumalperxe · · Score: 1

      My knee-jerk reaction was the same. Then I realized the Clinton reference is probably there to provide some term of comparison.

    4. Re:Number of emails generated. by hedwards · · Score: 2, Interesting

      It isn't storage and it isn't finding it, the problem is preserving it long enough to look through and index it. I'm sure that Google and companies that do similar work have the technology to do it. I'm also quite sure that for the right price the Federal government could obtain software to do most of the heavy lifting.

      The problem is that the Bush administration deliberately migrated only partially to a new system leaving it in a state of constant risk for bit rot and corruption. It's hard to say how much of it has already been lost due to incompetence.

      And remember this is tax payer dollars and a Republican President, I'm sure he's OK with us writing a check for millions upon millions of dollars to correct his inept decision.

    5. Re:Number of emails generated. by geekmux · · Score: 2, Funny

      Well, to be fair, email wasn't quite as popular during Clinton's administration as it is now.

      Good point. I mean hell, Al Gore had just invented the Internet the year prior. Cut the guy some slack.

    6. Re:Number of emails generated. by Anonymous Coward · · Score: 0

      It took half a dozen responses before someone asserted that Bush himself ineptly decided, yet deliberately, to blah blah blah.

      Nice strawman. He said "Bush administration".

  3. First, not enough emails... by Anonymous Coward · · Score: 4, Insightful

    ...Now too many emails.

    Whining is Washington's most favorite thing to do.

    1. Re:First, not enough emails... by Anonymous Coward · · Score: 1, Insightful

      Whining is Washington's most favorite thing to do.

      What Bush is about to do is like giving a teacher a stack of books from the library and saying "here's my research paper".
      What Bush has already done is cut out chapters from those books, so that his teacher won't be able to get the full picture.

      Thus you both have too much and not enough.

  4. Let's Have a War on Feudalism by mfh · · Score: 0, Offtopic

    There are always going to be people complaining about something, but usually that day-to-day stuff can be handled quite easily enough. When your organization is making the right decisions however, typically everyone remains quiet and they are quite happy.

    --
    The dangers of knowledge trigger emotional distress in human beings.
  5. Text only, no html by Teun · · Score: 5, Insightful

    Start by mandating text only mail.

    No more fancy signatures and html crap will cause a 60-80% drop in volume if not more.
    Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats.
    Stop the addition of stupid and ineffective disclaimers.

    Teach the use of (ftp) servers for sharing large documents, no more Microsoft sized attachments, send a link.

    --
    "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    1. Re:Text only, no html by ai3 · · Score: 4, Insightful

      I would rather buy another hard disk than waste precious time editing the mail I'm replying to, in most cases it simply isn't necessary. For Usenet it's a different story as many people read it, so it's worth the effort.

    2. Re:Text only, no html by Neoprofin · · Score: 5, Informative

      If it is anything like our corporate mail server I would bet you the number one space filler is people making minor changes on documents then reattaching them and forwarding them back to the same 50 people who just got the previous version of the document, repeated over 100 iterations as the email soon becomes a 2GB mess.

    3. Re:Text only, no html by tkelechogi · · Score: 1

      "I'm sorry, Madam Secretary of State...can you please delete your previous email and resend it as plain-text without a signature? Only then will we inform the Secretary of Defense of the escalating issue. Thanks."

    4. Re:Text only, no html by EvilRyry · · Score: 2, Informative

      Some database driven mail servers like Citadel, Exchange, Zimbra and probably Domino support only storing the message and attachments once no matter how many people it was sent to.

      It goes a long way in preventing the attachment * user mess.
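
      A minimal sketch of the single-instance idea described above, not how Citadel, Exchange, Zimbra or Domino actually implement it. The class and method names are hypothetical; the store keys each attachment by its SHA-256 digest and hands every mailbox a reference instead of a copy.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: each distinct attachment is kept once."""

    def __init__(self):
        self.blobs = {}      # SHA-256 hex digest -> attachment bytes
        self.mailboxes = {}  # recipient -> list of digests (references only)

    def deliver(self, recipients, attachment):
        """Store the attachment once and give every recipient a reference."""
        digest = hashlib.sha256(attachment).hexdigest()
        self.blobs.setdefault(digest, attachment)
        for recipient in recipients:
            self.mailboxes.setdefault(recipient, []).append(digest)
        return digest

    def stored_bytes(self):
        """Bytes actually held by the store."""
        return sum(len(blob) for blob in self.blobs.values())

    def naive_bytes(self):
        """Bytes that one copy per recipient would have taken."""
        return sum(
            len(self.blobs[digest])
            for refs in self.mailboxes.values()
            for digest in refs
        )
```

      Delivering one 1000-byte attachment to 50 recipients leaves `stored_bytes()` at 1000 while `naive_bytes()` reports 50000. A slightly edited document hashes differently, so each new version is still stored in full.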

    5. Re:Text only, no html by malkavian · · Score: 5, Insightful

      Longer email threads seem to end up forwarded and brought to the attention of many people you never expected at the outset.
      Judicious editing of the emails to include only the relevant sections for the replies, giving the context of the emerging thread of conversation means that someone being brought up to speed with that segment of the conversation doesn't need to trawl through masses of irrelevant junk to get at the meat of the issue.
      I tend to do it as an efficiency gain, rather than taking storage space into account. All comes back to that quote you hear people come out with after sitting through a bad movie "Well, that's an hour of my life I'll never get back". It may only be a few minutes at a time, but they mount up over time. Plus, crafting things to cut to the heart of the matter puts things into sharp perspective, and means people are far less likely to digress, saving even more wasted time.

    6. Re:Text only, no html by Teun · · Score: 2, Informative

      When not on Slashdot we're expected to read the message we reply to.

      Deleting the bit that's already answered, not relevant or whatever can hardly be called 'editing', it has more to do with comprehension.

      One of the worst things for the latter is a typical corporate Outlook mail exchange (I know that word...) with at the bottom text that hasn't been read for the last ten replies.

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    7. Re:Text only, no html by Randall311 · · Score: 1

      That will never happen in the real world. People always and forever will reply above the original. You can't force the way people reply to emails, it's personal choice.

    8. Re:Text only, no html by Teun · · Score: 1

      But maybe I'm the decision maker at the end of the chain...

      Various departments work together on a proposal and eventually I am involved for the final say.

      It's not uncommon that I ask for some information on how they have come to the proposal and I'm confronted with these weird -read-from-bottom-to-top- conversations; they're not conducive to a smooth process.

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    9. Re:Text only, no html by Teun · · Score: 1

      Madam Secretary would likely not know how to change from text to html, the issue starts with mail clients that are defaulting to rich text or whatever.
      Besides, the mail server can strip all html before storing, all it needs is support from a corporate policy.
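
      A rough sketch of what "strip all html before storing" could look like, using only Python's standard library. The function names are hypothetical and the tag stripping is deliberately crude (no entity-aware layout, no handling of attachments); it assumes the server sees the raw message as a string.

```python
from email import message_from_string, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Drop all markup and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def plain_text_body(raw_message):
    """Return the text to archive: the plain part if present, else stripped HTML."""
    msg = message_from_string(raw_message, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    content = body.get_content()
    if body.get_content_type() == "text/html":
        return html_to_text(content)
    return content.strip()
```

      Preferring the existing text/plain part means the HTML fallback only runs for HTML-only mail. Whether discarding the original markup is acceptable for an official record is a policy question this sketch does not answer.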

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    10. Re:Text only, no html by Anonymous Coward · · Score: 0

      Start by mandating text only mail.
      Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats.

      A better solution is to use a USENET-type service for written communication. I miss the days (back in the early to mid 1980s) when USENET was still the best place to ask and get answers for a wide range of topics. Now it is uselessly clogged with idiots and spammers.

      Why do I need to see a spreadsheet embedded in my email, much less have attachments referenced in the message but no longer linked to the email because the email was forwarded? A simple URL-style link would be fine.

    11. Re:Text only, no html by Anonymous Coward · · Score: 0

      I would rather buy another hard disk than waste precious time editing the mail I'm replying to, in most cases it simply isn't necessary.

      Except the truth is, you'd rather waste other people's time instead of your own. Editing is for the readers, not for saving disk space.

    12. Re:Text only, no html by kevin_conaway · · Score: 4, Insightful

      Did you stop paying attention to email in 1995?

      No more fancy signatures and html crap will cause a 60-80% drop in volume if not more.

      I know you hate it when your mom or the boss' secretary at work sends out a cutesy formatted email but some people can actually use HTML email effectively in lieu of sending a document or a link

      Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats.

      Irrelevant repeats for you may be important context for someone else.

      Stop the addition of stupid and ineffective disclaimers.

      Often times, those disclaimers are required by law. Most people don't add them for fun or to make themselves feel important.

      Teach the use of (ftp) servers for sharing large documents, no more Microsoft sized attachments, send a link.

      FTP? Are you serious? Sending documents by carrier pigeon is more secure and reliable than FTP

    13. Re:Text only, no html by Reality+Master+101 · · Score: 4, Insightful

      Sheesh. I call this phenomenon "technological puritanism". All tech must be ugly! 80 columns should be enough for anyone! Fixed-width fonts were good enough for my granddaddy, they were good enough for me, and they should be good enough for everyone! Words are worth a thousand pictures! Get off my damn lawn!

      Nothing personal, but if people like you were in charge of the world, we'd all be living in gray, cast concrete cubes. Think of the efficiency! No more wasted paint. You can just make a bigger house by stacking the blocks and adding a ladder.

      Most of us *like* color, pictures, paragraphs, and most of all, convenience. Use FTP when I can just add an attachment that goes directly to the source? Give me a frickin' break. No one gives you respect points when you prove how miserably you can live.

      Let's put this in perspective... that 120 terabytes costs 12,000 dollars in hard drives. Retail at Fry's. The entire output of the Bush Administration costs less than what they probably spend on coffee in a month.

      P.S. And, yes, this is from someone who used a teletype in high school, and was ecstatic when we got a 300 baud modem (whoa! It's almost 3 times faster than the ol' 110!) and a Televideo terminal. Those days were not better.

      --
      Sometimes it's best to just let stupid people be stupid.
    14. Re:Text only, no html by msromike · · Score: 0

      That's what top posting is for, to save you and the reader time while at the same time keeping a reasonable record of the conversation. Unnecessary editing wastes more money than it saves.

      There is absolutely no reason not to top post and to ping pong unedited emails around. That may change in the future with the way the economy is going. If wages drop and storage and bandwidth prices rise, and that is still a stretch, it could be a problem. There is a good chance that bandwidth and storage prices rise instead of the steady drop that is taken for granted.

      As the dollar inevitably becomes devalued against the yuan the prices of imported hard drives will rise. By that time we will have lost the ability for that industry to quickly ramp up.

    15. Re:Text only, no html by DarkOx · · Score: 1

      Exchange's underlying database and storage engines support pointers to single-copy attachments and even the message bodies themselves, but if you don't have some third party product that does that then Exchange stores multiple copies.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    16. Re:Text only, no html by damn_registrars · · Score: 1

      Teach the use of (ftp) servers for sharing large documents,

      Considering how terrible ftp is for security, do we really want to teach government officials to use that? Not that email is good either, but ftp almost asks to be compromised.

      And I could just see our government IT officials trying something brilliant with ftp like changing the port number with each successive POTUS to match their number in the order of US presidents.

      --
      Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    17. Re:Text only, no html by geekmux · · Score: 1

      Start by mandating text only mail.

      No more fancy signatures and html crap will cause a 60-80% drop in volume if not more. Mandate the Usenet way with replies after the original, (it will) teach people to cut irrelevant repeats. Stop the addition of stupid and ineffective disclaimers.

      Teach the use of (ftp) servers for sharing large documents, no more Microsoft sized attachments, send a link.

      Holy timewarp batman, I thought I had just picked up my E-mail etiquette guide from 1993 when I read this.

      These all would work very effectively if not for one minor technicality. Common Sense is also known as Kryptonite in Government. They try and stay away from that stuff.

      As far as your disclaimer comment, perhaps if we got rid of stupid and ineffective lawyers who bring pointless lawsuits to a courtroom where stupid and ineffective judges actually allow said lawyer to continue....Well, you get my point.

    18. Re:Text only, no html by Anonymous Coward · · Score: 0

      I would rather buy another hard disk than waste precious time editing the mail I'm replying to, in most cases it simply isn't necessary. For Usenet it's a different story as many people read it, so it's worth the effort.

      If the 3.7 seconds it takes you to do a select-all and delete function to strip away 274 email addresses embedded in 17 headers to make an email thread much more manageable and relevant for the end user to read is your definition of "wasting precious time", I'm curious why are you posting here?

      Sorry if this was too long, I tried to keep it short for fear you wouldn't actually read it.

    19. Re:Text only, no html by Joce640k · · Score: 2, Insightful

      140 1TB Hard disks (plus another for RAID) probably costs less than a couple of government office chairs so what's the problem?

      [Most likely the fact that it's in secret, proprietary formats and spread across hundreds of PCs instead of being archived by the mail gateway]

      --
      No sig today...
    20. Re:Text only, no html by Repossessed · · Score: 1

      that 120 terabytes costs 12,000 dollars in hard drives

      Where the hell do you live that you can buy a terabyte of storage at a retail store? (unless maybe you're talking about backup tape?) I can't even get that online. Overall, your cost assessment isn't significantly off though, since I can't push it higher than half a million (including all the equipment to access the data, and redundant storage), even assuming the worst possible purchasing decisions (it's the government after all).

      --
      Liberte, Egalite, Fraternite (TM)
    21. Re:Text only, no html by Tubal-Cain · · Score: 1

      Deleting the bit that's already answered, not relevant or whatever can hardly be called 'editing', it has more to do with comprehension.

      Indeed. Think <quote> tags.

    22. Re:Text only, no html by Reality+Master+101 · · Score: 1

      Where the hell do you live that you can buy a terabyte of storage at a retail store?

      What? Did you just step out of a time machine from 1990? Where can't you buy a terabyte of storage?

      Here's a terabyte for $119, with an external enclosure.

      --
      Sometimes it's best to just let stupid people be stupid.
    23. Re:Text only, no html by Anonymous Coward · · Score: 0

      "Where the hell do you live that you can buy a terabyte of storage at a retail store? (unless maybe you're talking about backup tape?) I can't even get that online."

      Where the hell do you live that you *can't* find 1TB hard drives in retail or online? Those have been out for a year. I walked into a Best Buy six months ago and walked out with a shiny new 1TB SATA drive.

    24. Re:Text only, no html by Xolotl · · Score: 1

      Where the hell do you live that you can buy a terabyte of storage at a retail store?

      Where do you live that you can't? Even Walmart sells them!

    25. Re:Text only, no html by Vancorps · · Score: 1

      Why would image based signatures take up a lot of space? You would only need to store it physical once and then the rest of the emails use links to the content when they are opened. If I recall they went to Exchange so they have single-instance storage available to them right out of the box.

      Archiving 20TB of content is not a significant problem at all and could be done in less than 30 days from scratch. All you need is your SAN of choice, mine would be NetApp where 20TB is pretty much the starting capacity and then a tape library for long term storage. Hell for 20TB you could rig up something for less than 300k that would last for quite a long time as you would obviously want to make multiple tapes of the content. At 800 gigs per tape with LTO4 that's not even that many tapes. I don't understand why even 140TB is hard for them to manage. It doesn't sound like these guys would last very long in the corporate world. I put together a 90TB SAN in two months and I'd never done it before. Surely they have people familiar with SANs already on staff?
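
      For a sense of scale, the tape arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: the 800 GB figure is the per-tape capacity quoted above, while the decimal units (1 TB = 1000 GB), the lack of compression, and the number of copies are assumptions.

```python
import math

TAPE_CAPACITY_GB = 800  # per-tape LTO4 capacity quoted in the comment


def tapes_needed(data_tb, copies=1):
    """Tapes required to hold data_tb terabytes, written `copies` times."""
    data_gb = data_tb * 1000  # assumption: decimal units, no compression
    return math.ceil(data_gb / TAPE_CAPACITY_GB) * copies


print(tapes_needed(20))             # the e-mail portion alone
print(tapes_needed(140))            # the full 140TB
print(tapes_needed(140, copies=2))  # two full sets for redundancy
```

      Under those assumptions the 20TB of e-mail fits on 25 tapes and the full 140TB on 175, or 350 for two complete sets.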

    26. Re:Text only, no html by Anonymous Coward · · Score: 0

      Let's put this in perspective... that 120 terabytes costs 12,000 dollars in hard drives. Retail at Fry's. The entire output of the Bush Administration costs less than what they probably spend on coffee in a month.

      What an ignorant statement.
      This isn't some guy setting up his own personal fileserver.

      A Federal Agency is going to create an archival system to store 120TB. So at a minimum, that's 2x or 5x as many hard drives for redundancy, then buy a pile more as spares. Next, you have to budget in enterprise class servers, with maintenance & support contracts. Then the database software and who knows what other custom coding needs to be done.

      And ultimately, all that will be dwarfed by the man hours (at $XY per hour) it takes to sort the information.

    27. Re:Text only, no html by Vancorps · · Score: 2, Informative

      Your statement doesn't make sense. Exchange supports and automatically takes advantage of single instance storage right out of the box. What do you need 3rd party software for that disables it?

      I run Exchange on a NetApp SAN so everything gets deduped and archived to tier 2 storage if it hasn't been accessed within 90 days. Tier 2 is a lot of SATA disks that are backed up to tape. It's not even an expensive solution when you start talking about the cost of enterprise storage.

    28. Re:Text only, no html by Vancorps · · Score: 1

      It might have been better had the parent suggested SFTP which would be slightly encrypted as compared to email which can be heavily encrypted on the back-end for sensitive content.

      Parent clearly isn't responsible for email for a corporate entity as much of what he says conflicts with their interests as you effectively pointed out.

    29. Re:Text only, no html by demonlapin · · Score: 1

      Judicious editing of the emails

      And you want to be the guy who has to explain what you cut out to Congress when this email ends up being part of a scandal?

      Seriously, Bush sent an email telling everyone that he knew that he wouldn't be emailing while president, in order to protect them from any chance of special prosecutors and the like. Obama is going to have to do the same. The presidential record-keeping required post-Watergate has made damned sure that nobody says anything not suitable for public consumption in any way except face to face. This is not a net benefit.

    30. Re:Text only, no html by Neoprofin · · Score: 1

      You're right. Doesn't help that there are still 50 versions in the single email, but it isn't multiplied by the number of people sent to.

    31. Re:Text only, no html by negRo_slim · · Score: 1

      Most likely the fact that it's in secret, proprietary formats and spread across hundreds of PCs instead of being archived by the mail gateway

      Of course, the government is sinister. Even something as benign as e-mail must involve inflated costs, secret programs and sheer ineptness. Get over it, there is this myth of a super competent government out to screw over its citizens. There is no doubt the government screws us over as it tries to be an expression of our will. But to suggest it maintains nefarious purposes and isn't just a cluster fuck of red tape and retarded bureaucrats is just tiresome.

      --
      On the Oregon Cost born and raised, On the beach is where I spent most of my days
    32. Re:Text only, no html by T-Ranger · · Score: 1

      Or, as an alternative, you could actually make technology that doesn't suck and that does what users want and expect.

    33. Re:Text only, no html by MooUK · · Score: 1

      Editing of emails when replying is what was being discussed. Not editing the originals.

    34. Re:Text only, no html by MooUK · · Score: 1

      Often times, those disclaimers are required by law. Most people don't add them for fun or to make themselves feel important.

      Really? I've never seen anything firm stating they are required. Like a lot of things, it seems more like legal fashion than legal necessity.

    35. Re:Text only, no html by MooUK · · Score: 1

      Quite apart from the fact that terabyte drives are everywhere, you could also do it the same way you're storing the whole 120TB - ON MULTIPLE DRIVES!

    36. Re:Text only, no html by Urkki · · Score: 1

      In corporate e-mails top-posting is the only sensible way to go. New recipients might be added at any time, and with e-mails there's no way for them to get the old messages. And editing a message to cut out anything irrelevant while keeping everything relevant is a waste of time=money, and disk space is cheap today.

      Also, sharing files via e-mail in any way other than attachments lacks a standard and has no software support. To use FTP or something like that, you'll need to do everything manually (including stuff like creating user accounts for e-mail recipients). No way. This works somewhat when linking to files in shared network disks, but that's only practical with internal e-mails, i.e. when everybody has access to shared disks. And even then there's the problem of the e-mail being archived, while somebody might delete or change the linked file at any time.

      It's not a user problem, it's a protocol and software problem, a standards problem. But regular e-mail is so broken anyway (just look at all the spam) and yet no new protocol has been accepted, so I wouldn't hold my breath for something better just to make quoting and file attachments more efficient.

    37. Re:Text only, no html by Criminally+Insane+Ro · · Score: 1

      What a terrific ad for online documents like gdocs and zoho

    38. Re:Text only, no html by the_other_chewey · · Score: 2, Insightful

      Stop the addition of stupid and ineffective disclaimers.

      Often times, those disclaimers are required by law. Most people don't add them for fun or to make themselves feel important.

      I don't know about the situation in the USA, but in most parts of the world, this is exactly the reason.
      "Because everybody else does it" is another.
      In multiple european countries, those disclaimers are entirely worthless, and even in some cases came back
      to bite those using them in court by proving that the sender was aware that some piece of information might
      end up in the wrong place.

      Disclaimers don't replace common sense or encryption.

    39. Re:Text only, no html by gad_zuki! · · Score: 1

      >that 120 terabytes costs 12,000 dollars in hard drives.

      Or a few minutes of the Iraq war.

      Not to mention that proper compression can really shrink that down to size. I bet they're just looking at the Exchange data stores of various servers and adding them up.

    40. Re:Text only, no html by dbIII · · Score: 1

      Top posting, which is really what is being talked about, is annoying and confusing, as per this example which should get the point across when I'm really replying to the line below.

      Irrelevant repeats for you may be important context for someone else.

      but some people can actually use HTML email effectively in lieu of sending a document or a link

      A better example of people who don't know how to use email is the subjectless and textless email with a 400kb MS Word attachment called "document1.doc" which contains the actual five lines of text to read. I really have no idea how they can get such a small amount of content to be so large, so there must be weird formatting in a template that blows up the size. It's really not the HTML formatted emails that are the problem, it is the people that misuse various email formatting options in a way that annoys everyone on, say, a text-only mailing list, or people that expect professional business communication, or spam filters that interpret big attachments with no context as being undesirable content. A Word document dumped to HTML may give you a good looking signature file but a few hundred kilobytes telling it what font to use for four lines of text is really going to annoy people that read emails as text and is going to get blocked by spam filters at gmail and many other places.

      As for FTP, it's more secure and reliable than http which is the usual alternative while people have mailboxes that block 5MB attachments. While https and sftp would be a vastly better idea it is hard enough to teach the person at the other end how to pick things up with FTP let alone anything else. At least with FTP you can send the user a link that includes the username and password - it works fairly well so long as you don't have a lot of people trying to get large files from you and have to keep generating usernames.

      There must be a simple way to convert multiple FTP sites to simple https sites, still support reget and work in every widely used web browser - but for now FTP is dead easy to manage.

      As for disclaimers - they are added for policy (i.e. interpretation of the law as it applies to a single organisation and a few other bits added on) and not actually the law, which is why they differ from place to place. The point of failure is not the actual disclaimer. The failure is attaching multiple copies of the disclaimer on an email that has been replied to several times. What we need in this case is better email software that determines when the appropriate disclaimer is already on the message.
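
      The check being asked for is small enough to sketch. This is an illustration under assumptions, not how any real mail client or gateway does it: the function names are hypothetical, and it only recognises a disclaimer that survives in the quoted text word for word (quote markers, line wrapping and letter case are ignored).

```python
import re


def _normalize(text):
    """Strip leading '>' quote markers, collapse whitespace, lowercase."""
    lines = [re.sub(r"^[>\s]+", "", line) for line in text.splitlines()]
    return " ".join(" ".join(lines).split()).lower()


def append_disclaimer(body, disclaimer):
    """Append the disclaimer only if the body does not already carry it."""
    if _normalize(disclaimer) in _normalize(body):
        return body
    return body.rstrip("\n") + "\n\n" + disclaimer + "\n"
```

      A reply that quotes an earlier message ending in the same disclaimer comes back unchanged, so a thread carries one copy instead of one per reply. A disclaimer that was reworded or mangled by a client would still be appended again.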

    41. Re:Text only, no html by jesterzog · · Score: 2, Insightful

      If it is anything like our corporate mail server I would bet you the number one space filler is people making minor changes on documents then reattaching them and forwarding them back to the same 50 people who just got the previous version of the document, repeated over 100 iterations as the email soon becomes a 2GB mess.

      In our organisation (government but non-US) we just give people a document management system and we educate people about why they should use it. If that doesn't work we point out it's policy and make them use it, because if they don't then it means that important records might go missing and we could end up in trouble if anyone officially requests the records we hold on any particular topic.

      Nobody sends attached documents, they send links to documents in the DMS. After using it for a short time everyone seems to appreciate the benefits of only having a single master copy of documents.

    42. Re:Text only, no html by Neoprofin · · Score: 1

      Or people using standardized document control systems since we already have a system for multi version documents.

      Or just removing the previous attachment when forwarding and replying.

      These are all solutions that make far too much sense when it's easier to just cause problems.

    43. Re:Text only, no html by kubrick · · Score: 1

      I agree completely.

      When not on Slashdot we're expected to read the message we reply to.

      Deleting the bit that's already answered, not relevant or whatever can hardly be called 'editing', it has more to do with comprehension.

      One of the worst things for the latter is a typical corporate Outlook mail exchange (I know that word...) with at the bottom text that hasn't been read for the last ten replies.

      --
      deus does not exist but if he does
    44. Re:Text only, no html by Chrisje · · Score: 1

      While I agree with much of what you say, that $12.000 comment was particularly silly.

      A load of raw data on top of unreliable physical media does not an archive make. I happen to work as a support engineer for e-mail and medical imaging archives, and before that I've worked with data storage, backup and recovery for 12 years of my life, and I've observed that the hard disk is irrelevant. Associated with archiving there are a couple of problems that need to be dealt with:

      - Data Life Cycle.
      If you want to store a medical image for the lifetime of a patient, and you realize a hard disk will have a data retention span of between two and five years, you will realize you need to keep that data "alive" for 68 more years.

      - Authenticity / Logical disasters.
      You will want to digitally sign and encrypt that data and make it read-only because at the end of the day your archive needs to be tamper-proof.

      - Formatting.
      100 years from now, when Adobe and Microsoft might or might not be distant memories, you will still want to be able to access and read your presidential records made in formats of those companies, so your archive will have to be able to decode/display what's stored in it.

      - Physical disaster.
      You will want that archive to be replicated in case a disaster/malice lays waste to the original site.

      - Indexed.
      You will want to be able to search that archive for certain phrases and you will want to get a reply to that query within a reasonable time-span. For this purpose, you can find grid-based archives with on-storage indexes of individual cells that can traverse their bit of data in a couple of seconds.

      - Ownership.
      Some data will go public completely, some data will only need to be accessible to certain audit officers, some data will start out as being confidential and will become public record as time goes by, and something needs to facilitate that process.

      Now, dumping a bunch of raw data on a bunch of hard disks gives you a time bomb, which is pretty useless anyhow: even if the disks don't fail, you will never find the proverbial needle in the haystack, and even if you do, you won't be able to decipher what's written on the bugger.

      That's *exactly* why a 120TB archive is *not* and *never will be* a $12.000 question.
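
      The authenticity and replication points in that list can be illustrated with a small fixity check. This is only a sketch under stated assumptions: the helper names (make_manifest, verify) and the record IDs are made up for illustration, a bare SHA-256 digest stands in for a real digital signature, and nothing here reflects how any actual archive product works.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a record's bytes."""
    return hashlib.sha256(data).hexdigest()


def make_manifest(records: dict[str, bytes]) -> dict[str, str]:
    """Map each record ID to its digest at ingest time (hypothetical helper)."""
    return {rid: digest(body) for rid, body in records.items()}


def verify(records: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return IDs of records that are missing or no longer match the manifest."""
    return sorted(
        rid
        for rid, expected in manifest.items()
        if rid not in records or digest(records[rid]) != expected
    )


# Illustrative records only; the IDs and contents are invented.
original = {"msg-001": b"Meeting at 11", "msg-002": b"Budget attached"}
manifest = make_manifest(original)

# A replica in which one record was altered and another was lost.
replica = {"msg-001": b"Meeting at 12"}
damaged = verify(replica, manifest)
print(damaged)
```

      A real archive would also have to protect the manifest itself (for example by signing it and keeping copies off site), which is where most of the cost beyond raw disk comes from.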

    45. Re:Text only, no html by Reality+Master+101 · · Score: 1

      That's *exactly* why a 120TB archive is *not* and *never will be* a $12.000 question.

      My point wasn't that the actual cost of archival was $12K, my point was that the actual quantity of data was irrelevant. Whether it's 1TB or 120TB, you still have all the issues you mention. The OP's point that we should care how big our emails are in the interest of conserving bytes is ridiculous.

      --
      Sometimes it's best to just let stupid people be stupid.
    46. Re:Text only, no html by Opportunist · · Score: 1

      It's not just the hardware cost. Especially when dealing with the government and similar organisations.

      Even when ignoring things like kickbacks, "friends" who have some sort of hardware business who need a fed contract and other bribery, they'll want to buy from someone who can instantly replace their hardware, who gives extended warranties (not only extended time) and so on. In other words, additional service that you usually don't want to pay for or simply don't need. I wouldn't count on 12k paying for those hard drives.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    47. Re:Text only, no html by Overzeetop · · Score: 1

      Not if there are a lot of images in the emails, which is usually what bogs it all down. There is no efficient, automated way (of which I'm aware) to compress image attachments (including scanned documents embedded in PDFs) without losing information. For example, a color scanned document may be color for very few pages, but there's no way to automatically tell reliably in software.

      Now, if they have multiple copies of the same file, it may be possible to consolidate.

      --
      Is it just my observation, or are there way too many stupid people in the world?
    48. Re:Text only, no html by lysergic.acid · · Score: 1

      but that's just not how HTML e-mails are stored/archived. they're generally stored as MIME HTML (.mht/.mhtml), which Microsoft also uses as their "Web Archive" format to save web pages, or other similar formats (.chm, .webarchive, etc.).

      like the data URI scheme, these formats bind an HTML document with normally externally linked in-line resources, storing the e-mail as a single file. this is similar to how the MIME standard handles "multipart/mixed" content-types, such as e-mail attachments. it would take a lot of time & resources to extract all of the embedded images and attachments from over a billion e-mails and eliminate duplicate resources.

      and even if they manage to do all that, it would just make moving, copying and transferring individual e-mail messages that much more difficult since now all the in-line resources are being externally linked to via a complex directory structure that all the e-mail messages are integrated into.
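
      To make the trade-off concrete, here is a rough sketch of what "extract the attachments and eliminate duplicates" could look like for MIME messages, using Python's standard email package. The helpers build_message and dedupe_attachments are hypothetical names, the sample messages are invented, and the sketch ignores inline resources, storage layout, and scale; it only shows content-addressed deduplication by hash.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message(subject: str, attachment: bytes) -> bytes:
    """Build a small multipart message with one attachment (test fixture)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content("See attached.")
    msg.add_attachment(
        attachment,
        maintype="application",
        subtype="octet-stream",
        filename="report.bin",
    )
    return msg.as_bytes()


def dedupe_attachments(raw_messages):
    """Return (store, total): unique attachments keyed by SHA-256, and the count seen."""
    store = {}
    total = 0
    for raw in raw_messages:
        msg = message_from_bytes(raw, policy=policy.default)
        for part in msg.iter_attachments():
            payload = part.get_payload(decode=True)  # undo base64 transfer encoding
            total += 1
            store.setdefault(hashlib.sha256(payload).hexdigest(), payload)
    return store, total


raws = [
    build_message("v1", b"same bytes"),
    build_message("Re: v1", b"same bytes"),  # forwarded copy of the same file
    build_message("other", b"different bytes"),
]
store, total = dedupe_attachments(raws)
print(total, "attachments,", len(store), "unique")
```

      Note that each message would then need a reference back to the stored attachment, which is exactly the extra linkage described above.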

    49. Re:Text only, no html by demonlapin · · Score: 1

      Yeah, I should have said to assume the originals are lost - possibly by genuine accident. It'll still look bad.

    50. Re:Text only, no html by Repossessed · · Score: 1

      No, I can buy them, I've just never seen them at a price below 200 before now.

      --
      Liberte, Egalite, Fraternite (TM)
    51. Re:Text only, no html by narcberry · · Score: 1

      Many people prefer to have the full thread contained in the last e-mail. It can be tedious to dig through a full thread spanning weeks, and maybe into an archive, to find important information.

      > > But
      > > it's hard to
      > > try
      > > and read long e-mail's
      > Try our new spam-free e-mail system for free!
      > > when you have to
      > > deal with this kind of
      > > formatting too.

      --
      Modding me -1 troll doesn't make me wrong.
    52. Re:Text only, no html by Fulcrum+of+Evil · · Score: 1

      uh huh, I'd just tell congress that I was removing irrelevant stuff; that's hardly unusual.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
    53. Re:Text only, no html by Shotgun · · Score: 1

      And why does a note from the boss saying "Meeting at 11" need to be pretty? Why does a document saying that Iraq has terrorists need to be pretty?

      Email is for sending bits of information quickly. It's not something you post on the wall to show friends and family. Your name and phone number in a fancy font, next to a small picture of the company logo, just tells me that you've wasted a lot of time creating a .sig file.

      Yes. Get off my lawn...and do something useful for a change.
         

      --
      Aah, change is good. -- Rafiki
      Yeah, but it ain't easy. -- Simba
    54. Re:Text only, no html by pacinpm · · Score: 1

      The entire output of the Bush Administration costs less than what they probably spend on coffee in a month.

      "The entire output of the Bush Administration is worth less than what they probably spend on coffee in a month."

      Here, I fixed it for you.

  6. What the hell does the summary say? by Anonymous Coward · · Score: 0

    The Clinton administration generated 32 million e-mails. Bush's administration has generated 50 times as much data -- 140TB, 20TB of which is email

    I'm so confuzzled. 50 x 32 million emails is 140TB, except 20TB which is email. When you make lots of email messages does it eventually start spawning and budding off things that aren't email?

    Besides, only 140TB (or 20 TB)? That's child's play for any competent DB admin, never mind only about $2k worth of hardware to hold it.

    1. Re:What the hell does the summary say? by jimicus · · Score: 4, Insightful

      Besides, only 140TB (or 20 TB)? That's child's play for any competent DB admin, never mind only about $2k worth of hardware to hold it.

      Assuming that none of it's been put into the archival system yet, that means they're dumping 140TB on it in one go.

      You index 140TB on $2k worth of hardware and come back to me when you're done. Hopefully I won't have died by then.

    2. Re:What the hell does the summary say? by Kent+Recal · · Score: 2, Insightful

      Maybe not $2k worth of hardware but $200k will do. Which is still peanuts in government terms. They probably spend that amount on paperclips and toilet paper in the Pentagon alone.
      Honestly, storing and indexing 140TB of e-mail is a trivial task when you can apply a six digit budget to it.

      If their "archival system" blinks at the sight of 140TB of mostly text then it doesn't even deserve the name.

    3. Re:What the hell does the summary say? by Kent+Recal · · Score: 1

      After some quick back-of-the-envelope math I'd even say heck, pass me $300k and I'll build the damn thing for you in under 6 months.
      With encryption, authentication and all the she-bang that your little governmental heart desires.

      Governments. Sheesh...

    4. Re:What the hell does the summary say? by jimicus · · Score: 1

      Yes, but whenever a subject involving storage comes up some numpty announces "I can store all that for $2k!"; I've even seen raging arguments based on people pointing out that no, it'll cost quite a bit more than that.

      Granted, in governmental terms that's still a blip on the balance sheet.

    5. Re:What the hell does the summary say? by TheRaven64 · · Score: 1

      Where are you getting 140TB of storage for $2K? The cheapest 1.5TB disks I can find are $160, which works out at $15K just for the disks, with no redundancy, not to mention enough controllers to run 94 disks. With RAID-5 you're looking at $22.5K for the disks, plus the cost of the controllers, plus the cost of a system with enough backplane bandwidth to handle them, plus the cost of a backup system. And with that many disks, you probably want something a bit more reliable than RAID-5.
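
      The disk arithmetic can be reproduced in a few lines. The capacity and price are the figures quoted in this comment; the three-disk RAID-5 group size is an assumption chosen because it happens to land near the $22.5K figure, not something the comment states.

```python
import math

TOTAL_TB = 140     # data to store, from the summary
DISK_TB = 1.5      # capacity per disk, from the comment
DISK_PRICE = 160   # dollars per disk, from the comment


def raid5_disks(data_disks: int, group_size: int) -> int:
    """Disks needed when each RAID-5 group gives up one disk's worth to parity."""
    groups = math.ceil(data_disks / (group_size - 1))
    return groups * group_size


disks = math.ceil(TOTAL_TB / DISK_TB)
raw_cost = disks * DISK_PRICE

# Assumed layout: RAID-5 groups of 3 disks (2 data + 1 parity).
raid_disks = raid5_disks(disks, group_size=3)
raid_cost = raid_disks * DISK_PRICE

print(f"{disks} disks, ${raw_cost:,} with no redundancy")
print(f"{raid_disks} disks, ${raid_cost:,} with 3-disk RAID-5 groups")
```

      Larger RAID-5 groups lower the parity overhead but increase the exposure to a second failure during a rebuild, which is the reliability concern raised at the end of the comment.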

      --
      I am TheRaven on Soylent News
    6. Re:What the hell does the summary say? by TaoPhoenix · · Score: 1

      "all the she-bang that your little governmental heart desires."

      Yea, it's the Personal Services that get expensive.

      --
      My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
    7. Re:What the hell does the summary say? by Florian+Weimer · · Score: 1

      After some quick back-of-the-envelope math I'd even say heck, pass me $300k and I'll build the damn thing for you in under 6 months.

      I think they're looking for something more long-term than just six months. 8-)

      Anyway, 300K for processing an unknown number of documents in an unknown number of formats is a bit optimistic.

    8. Re:What the hell does the summary say? by Kent+Recal · · Score: 1

      Sure, it's just a ballpark figure.
      This is a government project. So in reality it won't even get started for under $5 mio, will not show visible results earlier than 2 years after kickoff and become obsolete long before completion...

    9. Re:What the hell does the summary say? by inKubus · · Score: 1

      I was thinking though, this is definitely a job for Sun's new ZFS-based SAN. Obviously you'll need more space to index it, also.

      LTO-4 tapes to hold 140TB uncompressed is probably about $14K also (about $100/TB)

      You're looking at $3-400K for storage, plus some hardware for indexing. You'd probably want something you can add storage to as you index it, which ZFS is perfect for. You could get whatever storage you need for say 3 months worth of importing, then in 3 months buy more storage at future (lower) prices. The 140TB isn't going to get imported overnight.

      Using infiniband or FibreChannel at 4 or 8Gbps, it would take about 140,000 seconds or about 39 hours at full speed to transfer that data.
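
      As a sanity check on the transfer estimate, here is the arithmetic written out. It assumes the link speeds are in gigabits per second, decimal terabytes, and a link that is kept fully saturated with no protocol or encoding overhead; the function name transfer_hours is just for illustration.

```python
TOTAL_BYTES = 140 * 10**12  # 140 TB, decimal units


def transfer_hours(total_bytes: int, gigabits_per_second: float) -> float:
    """Hours to move total_bytes over an ideal, fully saturated link."""
    bytes_per_second = gigabits_per_second * 10**9 / 8
    return total_bytes / bytes_per_second / 3600


for speed in (4, 8):
    hours = transfer_hours(TOTAL_BYTES, speed)
    print(f"{speed} Gbit/s: {hours:.1f} hours")
```

      Under these assumptions the 39-hour figure corresponds to the 8 Gbit/s case; at 4 Gbit/s the same transfer takes twice as long.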

      --
      Cool! Amazing Toys.
    10. Re:What the hell does the summary say? by Anonymous Coward · · Score: 0

      Your ideas are intriguing to me and I wish to subscribe to your newsletter

    11. Re:What the hell does the summary say? by jimicus · · Score: 1

      Using infiniband or FibreChannel at 4 or 8Gbps, it would take about 140,000 seconds or about 39 hours at full speed to transfer that data.

      Assuming you can get data off the system(s) it's currently on at that kind of speed - and even then, indexing's going to be an absolute killer because you wind up doing a lot of seeking.

  7. What's up? by Anonymous Coward · · Score: 3, Interesting

    It hasn't helped that the Bush administration has been slow in providing NARA with needed information about the types and volume of data that will need to be archived. It wasn't until this summer that an intensive effort began to share information, Thibodeau says.

    I can understand the reasoning that for national security, some information needs to be kept secret. The thing is, the more I hear of this administration's obfuscation of their communications and dealings, I can't help but wonder what in the World they are hiding.

    1. Re:What's up? by psnyder · · Score: 1

      If you're communicating about war a lot, chances are every other email has a word or 2 that may reference weapons, tactics, troop movements, etc., even if it's not explicitly stated. This is not the kind of information you want people poring over to try and figure out, since any time dots are put together, military and political enemies will better know what to expect and how to counter.

      But you still want to be free to talk about current situations. I'm guessing not all staff wrote their emails with the thought that someone other than the recipient would read them.

    2. Re:What's up? by Anonymous Coward · · Score: 0

      Are we to assume that you would be able to provide a summary of 140TB of data over a weekend?

      If we were talking about 140GB, you might have a point, but we're talking about massive amounts of data that would take two people a year to catalog.

  8. Shadowy Government by GMonkeyLouie · · Score: 4, Interesting

    Whenever I receive news that information that we're supposed to have access to from the Bush administration has gone missing, it makes me queasy. There's so much secrecy surrounding random little things that it's started to make me paranoid. Maybe it's just me wanting to blame the last eight years on a scapegoat, but I feel like someone at the top is trying to hide something really big and succeeding.

    1. Re:Shadowy Government by Jawn98685 · · Score: 1

      Whenever I receive news that information that we're supposed to have access to from the Bush administration has gone missing, it makes me queasy. There's so much secrecy surrounding random little things that it's started to make me paranoid. Maybe it's just me wanting to blame the last eight years on a scapegoat, but I feel like someone at the top is trying to hide something really big and succeeding.

      Jeez..., ya think? Oh, wait. I get it - you were being sarcastic, right? Right?

    2. Re:Shadowy Government by StopKoolaidPoliticsT · · Score: 2, Interesting

      Perhaps you're too young to remember, but Clinton's administration had a problem with missing emails during investigations too (Lewinsky, why hundreds of FBI records on their political enemies ended up in the White House, illegal campaign donations from China, etc).

      I'd say it's par for the course and if you think just one side is doing shady stuff, it might be because you're a bit partisan.

      --
      Stop Koolaid Politics
    3. Re:Shadowy Government by GMonkeyLouie · · Score: 2, Interesting

      Well, Clinton never tried to insist that his VP wasn't part of the executive branch, never tried to put Harriet Miers on the supreme court... Actually I think the shadiest person in the administration is Cheney. He's certainly one of the only members of the 2001 Bush team left, and he keeps so many secrets! Also, he shot a man in the face one time. I love adding that to the end of my Dick Cheney rants. Is that too partisan?

    4. Re:Shadowy Government by StopKoolaidPoliticsT · · Score: 3, Informative

      Well, Clinton never tried to insist that his VP wasn't part of the executive branch,

      It's called "Unitary Executive." That is, there's only one guy in the executive branch that gets to make the decisions. It's entirely up to the President how much of a role he gives the Vice President. Under George Washington, John Adams lamented that the only thing he could do was preside over the Senate and then, he had no say on anything unless there was a tie. It drove him nuts.

      If you read the Constitution, Article II groups the Vice President in with the executive branch, but the ONLY place it provides a job description for him, other than sitting around, waiting for something to happen to the President, is in Article I, Section 3 where it says he is to preside over the Senate and break ties.

      As for the rest of your comment, every administration keeps secrets and covers things up. All of them. It's not just a Bush thing or a Cheney thing, it's a Bush, Clinton, Bush, Reagan, Carter, Ford, Nixon, Johnson, Kennedy, et al thing. Some are better at hiding it than others... and some people simply refuse to open their eyes if it is "their guy" in the White House.

      --
      Stop Koolaid Politics
    5. Re:Shadowy Government by rtfa-troll · · Score: 5, Insightful

      No; you are partisan when you think an accusation against one side can be answered by an accusation against the other side. They are both bad (they are US politicians; corruption is so endemic that it's legal and called lobbying), but Clinton's presidency ended about eight years ago and isn't something worth discussing now.

      The questions are; how to make sure Bush follows the law for what he still does? How to make sure Obama doesn't start off like Bush?

      --
      =~ s,(.*),<sarcasm>$1</sarcasm>,g if any_point_you_wish();
    6. Re:Shadowy Government by StopKoolaidPoliticsT · · Score: 1

      Did I say Clinton was a bad guy, or was I establishing that this problem has a precedent and isn't a one time thing, so perhaps we need to do something to fix it so it doesn't happen in the future.

      Just because Clinton is out of office doesn't mean his tenure in office was irrelevant. Further, a single data point is meaningless without context. Is the Bush administration solely responsible for a problem because they're evil(tm) or is it a systemic problem across all administrations that actually requires a fix?

      --
      Stop Koolaid Politics
    7. Re:Shadowy Government by mhollis · · Score: 4, Interesting

      I have understood this outgoing administration to be more than secretive. They're positively paranoid and the only administration in memory that was similar was Nixon's. All internal memos have been classified first. Declassification only happens when there is a strong and abiding reason why the memo should be declassified. Contrast that with Clinton, where all internal memos were not classified, unless there was a strong and abiding reason why the memo(s) should be classified.

      When Bush announced that his administration would immediately prepare for a transition (before the 4th of November, which was election day in the US), I assumed that the first course of action was that this Bush administration would do what the last Bush administration did: [Rip] the hard drives out of their computers and tried to erase "sensitive" computer files in the White House and West Wing.

      To say that the Clinton Administration started with a "clean slate" was an understatement. Later, Clinton lawyers ignored the dangers of historical archive deletion when faced with Republican destruction of historical records. Presumably, they wanted a "pass" from future Republican administrations.

      Republican administrations tend to be very secretive. Democratic administrations tend to not. I shall expect the Obama administration shall have to purchase all new computers -- or at least hard drives -- in order to simply start up in their first week. This is a horrid waste of taxpayers' money all in the name of whitewashing one's past deeds (for good or ill).

      Due to record-keeping, we now know that Nixon did know about the Watergate break-in. And we do know that he was very interested in its coverup. Nobody can be prosecuted at this time for that (those who were found guilty have already served their time). I would be very interested to know if Reagan's CIA planted the stacks of AK-47s used as evidence by his administration that the attack on Grenada was justified. And we still do not know everything about the Iran-Contra affair. These historical records are worth keeping because, well after the Statute of Limitations, America gets another look at how an administration dealt with the world.

      It is a shame that any Administration is that interested in "rewriting history" in order to unfairly burnish a legacy, which in the case of "W" is hardly salvageable.

      --
      Gods don't kill people, people with gods kill people.
    8. Re:Shadowy Government by Anonymous Coward · · Score: 0

      Try reading the actual posts there, before name calling.

      The original post referred to "the last eight years". The second post referred to the Clinton administration, and said that "it's par for the course" and "if you think just one side is doing shady stuff, it might be because you're a bit partisan".

      He said the same exact thing you did. Dumbass.

    9. Re:Shadowy Government by Anonymous Coward · · Score: 0

      Don't blame just the criminal at the top. Congress sent us to war by a secret vote, just to cover their asses if things went bad. What other reason could it have been? The war was legal, right?

      Barely a week ago those fucking cowards had themselves another secret vote on a matter that had nothing to do with National Security.

      Perot was right: It's just Insiders and Outsiders, and neither respects the demands of the Constitution.

    10. Re:Shadowy Government by xant · · Score: 1

      > Maybe it's just me wanting to blame the last eight years on a scapegoat

      Your willingness to blame the problems of our country over the last 8 years on the people who were responsible for running it for the last 8 years sickens me.

      I hope you've learned your lesson about not paying attention to cause and effect.

      --
      It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
    11. Re:Shadowy Government by MillenneumMan · · Score: 1

      Lobbying is legal because it is specifically allowed in the first amendment: "...and to petition the Government for a redress of grievances". I hate corruption as much as anyone, but hate the players, don't hate the game.

    12. Re:Shadowy Government by Achromatic1978 · · Score: 1

      Interesting. There are plenty of constitutional lawyers who seem to disagree with that interpretation of things, so I wouldn't say it's at all clear cut.

    13. Re:Shadowy Government by StopKoolaidPoliticsT · · Score: 2, Informative
      Yes... the same ones that think there is a Constitutional role for the government to provide health care and retirement to the people of the United States, while clearly ignoring the Tenth Amendment.

      It's real simple, the Constitution isn't that hard to figure out.

      Here is every mention of the Vice President in the Constitution:

      Article I, Section 3:
      The Vice President of the United States shall be President of the Senate, but shall have no Vote, unless they be equally divided.

      The Senate shall chuse their other Officers, and also a President pro tempore, in the Absence of the Vice President, or when he shall exercise the Office of President of the United States.

      Article II, Section 1
      The executive Power shall be vested in a President of the United States of America. He shall hold his Office during the Term of four Years, and, together with the Vice President, chosen for the same Term, be elected, as follows:

      ...
      In every Case, after the Choice of the President, the Person having the greatest Number of Votes of the Electors shall be the Vice President. But if there should remain two or more who have equal Votes, the Senate shall chuse from them by Ballot the Vice President.

      In Case of the Removal of the President from Office, or of his Death, Resignation, or Inability to discharge the Powers and Duties of the said Office, the Same shall devolve on the Vice President, and the Congress may by Law provide for the Case of Removal, Death, Resignation or Inability, both of the President and Vice President, declaring what Officer shall then act as President, and such Officer shall act accordingly, until the Disability be removed, or a President shall be elected.

      Article II, Section 4:
      The President, Vice President and all civil Officers of the United States, shall be removed from Office on Impeachment for, and Conviction of, Treason, Bribery, or other high Crimes and Misdemeanors.

      The office is also mentioned in the Twelfth Amendment (which changes how the VP is selected), the Fourteenth (regarding who can vote for VP and that they can't have been part of a rebellion against the US), the Twentieth (defines the executive term and succession to President), the Twenty-third (giving DC electors), the Twenty-fourth (banning poll taxes), and the Twenty-fifth (succession).

      That said, none of those amendments gives the office any more of a defined role, so we must go by the Constitution itself... which is very straightforward and if there's any doubt, look at the first Presidency, which ultimately defined the office in front of the very people that drafted the document creating it.

      So yeah, those "Constitutional lawyers" that see the Vice President's role defined otherwise have a pretty bad reading comprehension problem.

      --
      Stop Koolaid Politics
    14. Re:Shadowy Government by Kagura · · Score: 1

      Your last four posts here need to be modded up highly, and fast. I've learned quite a bit from your posts. Thanks.

    15. Re:Shadowy Government by dbIII · · Score: 1

      but I feel like someone at the top is trying to hide something really big and succeeding.

      Personally I think it will be petty little corruption being hidden, like in the Oliver North case. IMHO the overblown secrecy and mass email deletions weren't really to hide the sale of weapons to Iran, which a lot of people (right up to the President and maybe even some Democrats) knew about; they were about hiding the embezzlement that paid for Oliver North's convertible and the air conditioning for his house. We don't really know about what deals were involved with Blackwater and a lot of other situations where taxpayers' money vanished into unaccountable holes - perhaps a lot was diverted into personal pockets? I don't know but I do suspect these people, particularly since some of them were even mixed up with Nixon's administration and the scandals we've seen so far. In my opinion when Bush spoke of a "CEO inspired Presidency" he was talking about using Enron as the example, and at the time Enron looked like it was going to keep running on lies without being caught out by reality.

    16. Re:Shadowy Government by instarx · · Score: 1

      It's entirely up to the President how much of a role he gives the Vice President.

      No, not "entirely". He has to conform to the description of the Vice-Presidential role as laid out in the Constitution. He can assign him any role in the executive, but he cannot assign him a role in the Senate. Unilaterally "interpreting" the Constitution to give the VP rights he is not specifically granted, or defining the position as being part of the legislative branch and claiming the protections of both offices simultaneously is NOT up to the President or the VP. Doing that clearly violates separation of powers.

      The problem is that Cheney has wanted it both ways - claiming he was a member of the Senate when it suited him legally, and of the Executive when that suited him. NO administration in the past has ever made the ludicrous assertion that the Vice-President was not part of the Executive. So do not try to claim that Cheney playing games with the Constitution is just "more of the same".

    17. Re:Shadowy Government by Anonymous Coward · · Score: 0

      I'm sorry, but am I the only one who sees "Bush" as a new type of swear word? In fact, we should just start using it as one.

      Instead of: don't be an asshat, dude.
      Don't be a bush, dude.

  9. Good strategy by s4ltyd0g · · Score: 1

    For hiding all your nefarious emails in the noise.
    Half those old geezers will be dead before anyone gets around to reading them.

    1. Re:Good strategy by xant · · Score: 1

      Cheney plans to shoot the other half in the face.

      --
      It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  10. Welcometotheclub by Bandman · · Score: 1

    Great, now they've got to deal with the same sort of things we do. Archiving every bit of email that comes into the system, and making sure it's available online for searching and retrieval.

    I'm interested in how they're going to be doing it. I've been looking at Global Relay for my own mail archiving. I wonder what they'll end up going with. I asked this a while ago on my blog, too.

    1. Re:Welcometotheclub by Vancorps · · Score: 1

      Do you know how much Global Relay costs? I'm looking at setting up a DR site and archiving content there, and I would be interested to know if these guys are cheaper than me doing it myself at IO Data.

    2. Re:Welcometotheclub by Bandman · · Score: 1

      I don't think I'm allowed to tell you the price they quoted me, but they're really friendly if you give them a call and ask for a quote. I don't think their prices are excessive at all. The steepest part is that there is an initial cost of purchasing a WORM (write once, read many) drive for archival purposes. The nice part of that is that the drive is yours. You take it if/when you cancel your service along with your discs.

      If you'd like, I can get you in touch with my contact there.

    3. Re:Welcometotheclub by Anonymous Coward · · Score: 0

      $10-15/account depending, plus install

  11. Nonsense, basic signed/unsigned math by Anonymous Coward · · Score: 1, Funny

    32,000,000 * 50 = 1,600,000,000 = 0x5F5E1000

    0xFFFFFFFF - 0x5F5E1000 = 0xA0A1EFFF = 2,694,967,295

    If they are using unsigned 32 bit quantities, they still have room to index nearly 2.7 billion more emails.
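
    The arithmetic is easy to check in a few lines of Python. This sketch only assumes, as the post does, that each e-mail gets one ID from a 32-bit counter; the signed case is added for comparison because the title mentions it.

```python
clinton_emails = 32_000_000
bush_emails = clinton_emails * 50  # the "50 times" figure from the summary

UINT32_MAX = 0xFFFFFFFF  # largest unsigned 32-bit value
INT32_MAX = 0x7FFFFFFF   # largest signed 32-bit value

remaining_unsigned = UINT32_MAX - bush_emails
remaining_signed = INT32_MAX - bush_emails

print(f"emails:             {bush_emails:,} = {bush_emails:#010X}")
print(f"room left (uint32): {remaining_unsigned:,} = {remaining_unsigned:#010X}")
print(f"room left (int32):  {remaining_signed:,} = {remaining_signed:#010X}")
```

    Either way the counter does not overflow at 1.6 billion messages, though a signed counter leaves far less headroom.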

  12. (T)error messages by FreshKarma · · Score: 1

    The bulk of the data is probably screenshots of popup messages on the Presidential PC, sent to White House tech support.

    --
    The future ain't what it used to be.
  13. Just delete it all by Anonymous Coward · · Score: 0

    I think this country would rather just forget the last 8 years.

    All the truly interesting stuff was sent through outside mailservers operated by the Republican party, anyway.

    1. Re:Just delete it all by gmuslera · · Score: 1

      Forget? And miss the chance of the biggest mass trial since Nuremberg?

  14. Word/Excel Attachments by randallman · · Score: 1

    I'd bet a large part of that is uncompressed attachments and probably Word and/or Excel. Also, from Windows users I tend to get bitmaps as screenshots.

    1. Re:Word/Excel Attachments by rantingkitten · · Score: 1

      Or better yet, bitmaps pasted into a Word document, sent as a screenshot. I used to get those a lot.

      Then there were (thankfully less frequent) times when you'd get something like "Shortcut to Shortcut to New Document[1] [2].doc" with the user insisting they sent you an important screenshot in that.

      --
      mirrorshades radio -- darkwave, industrial, futurepop, ebm.
  15. ISPs by Midnight+Thunder · · Score: 1

    Although this is about White House e-mails, this sort of stuff shows how ridiculous it is trying to ask ISPs to record all traffic. At least here taxpayer money is being used, but an ISP simply does not have that sort of budget. I feel all too often the layman confuses IT with magic and the people in the field with magicians. We are lucky enough if we manage to become a level one mage :)

    --
    Jumpstart the tartan drive.
    1. Re:ISPs by Teun · · Score: 1
      ISPs are to keep records of their subscribers' contacts, not the messages themselves.

      A rather big difference.

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
  16. The perfect trap by RealGrouchy · · Score: 1

    Just tell them that NARA needs liberating, and that a precise attack using the Bush administration's archives will save them.

    Tell them that 200-300TB of data will be necessary. They'll go in with 140TB and no exit strategy, and their e-mails will be in the archives for decades to come.

    - RG>

    --
    Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
  17. Open it all up by Anonymous Coward · · Score: 0

    Why not make every aspect of government open, transparent, publicly archived, and participatory?

    1. Re:Open it all up by Anonymous Coward · · Score: 0
  18. How much is spam? by houghi · · Score: 3, Interesting

    How much of that is spam? I can imagine they are not allowed to delete spam. Spam has increased, so this would mean that all of it is still there.

    The rest can mean a lot of different things. I am forced to work (otherwise no food) with 150MB Excel files that I would love to put in a database, where they would take up at least 10 times less space. And I am not even talking about the speed increase and ease of use: right now, whenever somebody else has the file open, I cannot change the content.

    Or perhaps Clinton did not keep everything. Or ...

    --
    Don't fight for your country, if your country does not fight for you.
    1. Re:How much is spam? by damn_registrars · · Score: 1

      How much of that is spam?

      That is an interesting question. Though believe it or not, a large amount of spam does come from address-harvesting. Have you tried doing a Google search for any email address that you receive spam at? You'll likely find that address on a website somewhere.

      Not much spam goes around by way of spammers randomly trying email addresses, and last I heard the address *@*.* does not actually send an email to every valid address.

      I can imagine they are not allowed to delete spam

      The current administration has already shown lack of concern over what they are or are not allowed to delete in terms of email. Why would they worry about email deletion regulations when spam is the issue?

      --
      Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
  19. If they hadn't gone to exchange.... by CFD339 · · Score: 5, Informative

    The Bush administration moved the White House from a Notes/Domino based system to a Microsoft Exchange based system.

    Before moving, they'd had no downtime -- even when Congress was taken out for 2 days by the Code Red worm (they were on Exchange).

    In moving, they mysteriously 'lost' all their backups for a period of time that was suspicious as hell, and now they can't scale to handle the capacity issues they face.

    In a Notes/Domino world, this kind of archiving problem wouldn't be all that hard to deal with. You'd just need enough storage for it, and create archives per week/month/year (or an archive per individual's mailbox, or whatever) to put on as much hardware as was required. A single checkbox would be all that was needed to have it encrypted as well.
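
    To make the "archive per month, or per mailbox" idea concrete, here is a toy Python sketch. It is not Domino's API or anything like it; the function names, the bucket naming scheme and the sample messages are all invented for illustration.

```python
from collections import defaultdict
from datetime import date


def archive_key(sent: date, mailbox: str, per_mailbox: bool = False) -> str:
    """Name the archive bucket for a message (hypothetical naming scheme)."""
    period = f"{sent.year:04d}-{sent.month:02d}"
    return f"{mailbox}/{period}" if per_mailbox else period


def partition(messages, per_mailbox: bool = False) -> dict:
    """Group (sent_date, mailbox, subject) tuples into archive buckets."""
    buckets = defaultdict(list)
    for sent, mailbox, subject in messages:
        buckets[archive_key(sent, mailbox, per_mailbox)].append(subject)
    return dict(buckets)


# Invented sample data, just to exercise the two partitioning modes.
messages = [
    (date(2003, 3, 14), "alice", "budget"),
    (date(2003, 3, 20), "bob", "re: budget"),
    (date(2006, 10, 2), "alice", "schedule"),
]
by_month = partition(messages)
by_mailbox = partition(messages, per_mailbox=True)
print(by_month)
print(by_mailbox)
```

    Each bucket can then live on whatever hardware has room, which is the whole point of splitting by date or by person.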

    Oh well. I guess if conveniently "loosing" mail when you don't want it found is one of your design goals, then you probably want to migrate to something less reliable.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
    1. Re:If they hadn't gone to exchange.... by Anonymous Coward · · Score: 0

      If anything they're only hiding the mail -- they're not loosing it on anyone

  20. Yet they 'lost' the important/incriminating ones.. by toby · · Score: 1

    n/t

    --
    you had me at #!
  21. Arabic spam by Storydor · · Score: 1

    Something tells me that most of the data stored was rated as 'spam'. The messages were written in Arabic and had some bomb images and something like 'we'll destroy you in the name of Allah'

  22. Dear staff by keraneuology · · Score: 4, Insightful

    It has come to my attention that as I prepare to leave office my previous instructions to make all email and other documentation available to the shredder was incorrect. The correct policy is to make everything available to the archiver. If you have any concerns please feel free to pick up a copy of the standard presidential pardon boilerplate from my secretary's desk. Thank you, W

    --
    If the g'vt kept the data on you that google does you'd better believe you'd be calling it "doing evil"
    1. Re:Dear staff by Mex · · Score: 1

      "PS: Tacos rule

      - George W. Bush, The Decider "

  23. How is that too much? by Anonymous Coward · · Score: 0

    So they have to archive 140TB of data? A corner computer shop has enough hard drives in stock. How is that hard?

    1. Re:How is that too much? by Frosty+Piss · · Score: 1

      Hell, I've got that much porn. On a RAID, naturally.

      --
      If you want news from today, you have to come back tomorrow.
    2. Re:How is that too much? by Criminally+Insane+Ro · · Score: 1

      Stay away from Sony's Peta-file system (pronounce it out loud), lol

  24. So that's the excuse now for "losing" data by meist3r · · Score: 1

    Self-proclaimed "Most Advanced nation on earth" that doesn't have enough hard drives ... my ass.

  25. Reply all by Princeofcups · · Score: 1

    See what happens when you keep sending that same Excel spreadsheet back and forth to the whole distribution list?

    --
    The only thing worse than a Democrat is a Republican.
  26. Top posts by CarpetShark · · Score: 1, Redundant

    Actually, 139TB is redundant data from endless logs of previous emails being top-posted over. Come on... you didn't expect Bush's administration to actually be able to quote properly and tell YouTube spam from important government work, did you?

    1. Re:Top posts by wisty · · Score: 1

      Or 200 copies of "FederaL Buget 2002.xls", forwarded back and forth.

  27. bulk by Frank+Grimes · · Score: 1

    I blame bulky MS-Word documents. If everyone used gzipped, utf-8 text files, you could save fifty terabytes right there.
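
    A minimal sketch of the gzipped UTF-8 idea, using Python's standard gzip module. The sample text is invented and deliberately repetitive, so the ratio it prints says nothing about real mail or about the fifty-terabyte guess.

```python
import gzip

# Invented, highly repetitive sample text; real mail will compress less well.
text = "Please see the attached budget figures.\n" * 200

raw = text.encode("utf-8")        # plain UTF-8 bytes
packed = gzip.compress(raw)       # the same bytes, gzipped in memory

ratio = len(packed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes, ratio: {ratio:.3f}")

# Round trip: decompressing must give back exactly the original text.
restored = gzip.decompress(packed).decode("utf-8")
print("round trip ok:", restored == text)
```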

    --
    CfkRAp1041vYQVbFY1aIwA== RV/hBCLKKcSTP5UFK3kqsg==
  28. Not very tech savvy by Anonymous Coward · · Score: 0

    Part of the problem is the technical knowledge of some of the older members of his staff. The problem came to light when VP Dick Cheney demanded a shredder for his e-mails.

  29. Not the same thing. by Frosty+Piss · · Score: 4, Insightful

    Perhaps you're too young to remember, but Clinton's administration had a problem with missing emails during investigations too (Lewinsky, why hundreds of FBI records on their political enemies ended up in the White House, illegal campaign donations from China, etc).

    Yes, but there is a magnitude of difference in importance between lost emails about blow jobs and a little dirty money, and emails about the loss of privacy and civil liberties of US citizens, torture of POWs, and the various other nastiness that GWB et al are suspected of. Much different.

    --
    If you want news from today, you have to come back tomorrow.
    1. Re:Not the same thing. by Anonymous Coward · · Score: 0

      A little dirty money? The money buys influence. Dirty money from China means China can influence the White House. It doesn't matter which party is in office, either of them taking orders from a foreign country is scary.

  30. Junk mail. by Anonymous Coward · · Score: 0

    I guess they had too many secretaries forwarding emails of pictures/jokes/power-point-shits ranging from 5 to 35 megs, to every one of their co-workers. lmao

  31. Google by Anonymous Coward · · Score: 0

    Isn't this why we have Google? Come on government, don't reinvent the wheel. Support our industries and contract Google to do this archival and indexing for you.

  32. The problem is Law, and convention, not volume by omb · · Score: 2, Insightful

    As with almost all problems where electronic/internet technologies bump into real life issues, e.g. privacy, non-repudiability and simple confidence, it is because the Law has not kept up with technology, and that in the USA is the responsibility of the Congress. Writing was thousands of years old, and the printing-press more than 300 years old, when the Constitution was adopted on September 17, 1787. The drafters understood the technology.

    Today we are blessed with ignorant self-serving legislators who do not, and are far too happy to follow hard-cases-make-bad-law herd thinking, e.g. children, porn, paedophilia, drugs and terrorism. The courts have long held that you can read post-cards, but that if your letter-in-an-envelope is opened then a felony is committed or the information is normally inadmissible.

    For this to work people have to start encrypting and signing their e-mails and the Congress and the SCOTUS must enforce identical rules for electronic and hand-written communication.

    Specifically you cannot go out and discover the entire contents of someone's library and papers in a lawsuit, and expect to go on a search-engine enabled fishing expedition.

  33. Desired Outcome on a Platter by RulerOf · · Score: 2, Informative

    Riiight... Blame it on Exchange.

    Seriously, if "conveniently [losing] mail" was the goal of the transition, they could have moved from Exchange to Domino and gotten the same effect.

    Forget not, throwing storage (read: money) at any system tends to fix the problem given a competent staff. You don't make a very compelling argument.

    --
    Boot Windows, Linux, and ESX over the network for free.
  34. 3xSun Fire X4540 Server by Anonymous Coward · · Score: 0

    They are not saying that misallocating a resource which costs less than a few missiles fired will hinder the reasonable transition of the American government.

  35. re: re: re: re: re: re: re: no subject by swschrad · · Score: 1

    > check out the blue dress! forward to everybody you know.

    we need to save 140 Tb of THAT ?!?

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  36. Re: Judicious editing FTW! by TaoPhoenix · · Score: 2

    Yea, and there's an aesthetic feel to it too. If I'm in a 20 reply discussion, I like to edit out anything more than 2 exchanges old, and I change the subject title every two mails.

    Nothing annoys me more than 20 mails titled "re: call"

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  37. Citadel to the rescue? by Anonymous Coward · · Score: 0

    A single Citadel server can replace dozens of MS Exchange servers. The BerkeleyDB used by Citadel can store up to 256 TB.

    However, I guess that using a single server for the President's email would be too cheap and would be rejected in favour of a multi-billion dollar email system the size of a Google farm.

  38. Google by Anonymous Coward · · Score: 0

    Ask Google to archive it.

  39. Not proven by this result by Anonymous Coward · · Score: 0

    This result has zero downtime, system change then lots of downtime and losses.

    It could be done the other way around, but then they had already started on Domino and so moving to Domino was impossible as an excuse.

    And saying that they could have gone the other way to lose email doesn't prove that they didn't lose email deliberately this time either, so why you said it is a mystery to me.

  40. Should've used ZFS by An+dochasac · · Score: 2, Funny

    At the current presidential email growth rate, NTFS isn't gonna cut it for Obama.

  41. Throwing storage won't solve Exchange's issues by CFD339 · · Score: 2, Interesting

    There's an inherent architectural difference between storing mail in a database built on Microsoft's JET technology, and one which stores its data in something that is (although distinctly odd) very much like an XML data store. The Domino architecture makes segmenting the archive into manageable parts by date, by person, or by any combination thereof much simpler.

    Essentially, the Domino architecture results in exactly what you describe -- throw more storage space at it and you can keep storing more data. The Microsoft architecture does not.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
    1. Re:Throwing storage won't solve Exchange's issues by dbIII · · Score: 2, Informative
      The theoretically competent staff with perfect hindsight and unlimited budget could just alias all the emails to also go to a decent system that actually works properly (and can be backed up easily) - however release cycles with MS Exchange are fairly short and the advertising is so good that people could be convinced that it is a half decent system THIS time. Backups have been a horrible problem with MS Exchange for years and it just seems bandaids have been placed over the problems to keep things going in most situations instead of actually getting it into a reliable state.

      For some ideas of how badly things can go take a look at an MS Exchange users mailing list archive. The thing is so fragile that it doesn't take a lot to lose emails. Those who wish to make personal attacks should consider that I'm talking about more competent and experienced MS Exchange admins than myself. I personally haven't touched it since 2002 where the users thought it was reliable because there were really three servers doing the job that any other mail system could be doing on a single server, and where multiple backups and/or mirroring got around the problem of backups being unreliable.

      Apparently the current version is not a steaming pile of crap, however I only have MS advertising and fanboys to assure me of this and I also doubt that the White House actually upgraded to the current version. They are most likely running whatever version was installed at the start of the administration. If it's MS Exchange version 5.5 all bets are off and all emails could collapse into a hole of random data and the backup tapes might only contain the statement "file in use, unable to copy".

  42. in related news by nimbius · · Score: 1

    netapp just posted a surprisingly upbeat 2009 earnings forecast.

    --
    Good people go to bed earlier.
  43. Just join the dots by Colin+Smith · · Score: 1

    It's simple. Here's how to play.

    For all of the top people in an administration, do:

    Find out who they worked for. Then find out who owns or runs that organisation. Draw lines between the names to represent associations. Then simply count the number of associations each of the names gets.

    For example. The shiny new Timothy Geithner worked for:

    Kissinger Associates -> Which is a member of Council of the Americas -> Which was set up by David Rockefeller.

    or ...

    he's a member of the:
    Council on Foreign Relations -> which David Rockefeller was a director of.

    After you do that a few times with different people on both the Democratic and Republican sides, you find a small set of names start racking up larger numbers of associations with people in the administrations. The more "hits" they have, the more influence they are likely to have with that government.
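
    The counting game described above amounts to tallying how many lines lead to each name. A small Python sketch of just that bookkeeping follows; the only data in it is the Geithner example as given in the comment, the function name is made up, and the tally by itself proves nothing about influence.

```python
from collections import Counter

# Each pair is one "line" drawn from a name to the name it leads to.
# These rows are only the example from the comment, taken at face value;
# an actual run would need many more people and sourced links.
links = [
    ("Timothy Geithner", "Kissinger Associates"),
    ("Kissinger Associates", "Council of the Americas"),
    ("Council of the Americas", "David Rockefeller"),
    ("Timothy Geithner", "Council on Foreign Relations"),
    ("Council on Foreign Relations", "David Rockefeller"),
]


def count_hits(links):
    """Count how many lines lead to each name (hypothetical helper)."""
    hits = Counter()
    for _source, target in links:
        hits[target] += 1
    return hits


hits = count_hits(links)
for name, n in hits.most_common():
    print(f"{n}  {name}")
```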

    You'll start to see the nature of the real politics going on. The political parties are just a sideshow.
     

    --
    Deleted
  44. Let the Internet Archive and Google do it by Animats · · Score: 1

    Just ship copies of the raw files to the Internet Archive. 140TB isn't that much; they put a petabyte in a rack.

    Once the file formats have been translated, just point Google at the starting URL and wait a day while it indexes everything.

  45. Moore's Law: 16x in 8 years by SamuraiMike · · Score: 2, Informative

    It's been eight years since the Clinton administration. This is 4x the doubling period based on Moore's Law. While Moore's Law relates to transistor density, Wikipedia says that it's roughly similar to gains in disk storage. So in the last eight years, we could estimate disk storage gains of 2^4 = 16x. This doesn't get you all the way to 50x, but it cuts out a big chunk of the gains.
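
    The back-of-the-envelope numbers can be run as a short Python sketch. The two-year doubling period is an assumption inferred from "4x the doubling period" over eight years; the 50x figure is the one from the story summary.

```python
YEARS = 8                    # time between the two administrations' handovers
DOUBLING_PERIOD_YEARS = 2    # assumption: implied by "4x the doubling period"
OBSERVED_GROWTH = 50         # "50 times as much data" from the story summary

doublings = YEARS / DOUBLING_PERIOD_YEARS
capacity_gain = 2 ** doublings                 # storage growth you'd expect anyway
unexplained = OBSERVED_GROWTH / capacity_gain  # growth beyond cheaper disks

print(f"doublings: {doublings:g}")
print(f"expected capacity gain: {capacity_gain:g}x")
print(f"growth not covered by cheaper storage: {unexplained:g}x")
```

    So even granting the full 16x, roughly a factor of three is left over that cheaper disks alone don't account for.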

  46. Well yeah... by RyuuzakiTetsuya · · Score: 2, Funny

    Of course that happens when you embed the 1600x1200 raw image of Dick Cheney giving everyone the finger with each email

    --
    Non impediti ratione cogitationus.
  47. why insightful? by thegnu · · Score: 1

    Funny, maybe.
    First, they broke the freaking law. Now, they've put the people who need to handle their data in a difficult situation by not being cooperative.

    They're not mutually exclusive, IMO

    --
    Please stop stalking me, bro.
    1. Re:why insightful? by Anonymous Coward · · Score: 0

      Did I mention Slashdot is also exceptional at Whining, bitching and moaning?

    2. Re:why insightful? by Anonymous Coward · · Score: 0

      lol
      -thegnu

  48. Massive Typo by dbIII · · Score: 1
    Please read the above as:

    Personally I think it will be petty little corruption

  49. Very compressible by Anonymous Coward · · Score: 0

    But it's OK. The Clinton administration email was full of avis (movies), so it didn't compress very well. The Bush administration's email is much more along the lines of mpgs (phone conversations), so it will get great compression!

  50. 20x1TB hard disks is all it takes by Anonymous Coward · · Score: 0

    Or fewer, using bigger disks. Claims that this is impossible are ridiculous when an average home user can have several terabytes of storage space in a single PC.

  51. Re: Judicious editing FTW! by skroops · · Score: 1

    I can see from your e-mail address, @yahoo.com, the reason why you have a problem with 20 mails titled "re: call"

  52. Not to worry, Bush has a solution by Douglas+Goodall · · Score: 1

    President Bush has a solution to the problem. He and Vice President Cheney help reduce the problem by using other email systems. It is ok because Cheney says he isn't in the executive branch, so it's ok. He is doing everything he can to use alternative channels to communicate because he has the highest regard for the National Archives and he doesn't want to make more work for them. I think he is very considerate because he has done quite a lot to reduce their workload. We will look back on President Bush and say he astounded us and that we had no idea how inventive he was and the lengths he would go to so that the people wouldn't be burdened knowing every petty little detail of his work, after all, he is the President.

  53. Re: 20 mails by TaoPhoenix · · Score: 1

    Well, that too, but then I don't get into chats with Spammers.

    It's actually a work thing except the title would be "re: invoice" (and always that vague) before totally drifting into totally different topics.

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  54. Re: Judicious editing FTW! by narcberry · · Score: 1

    I prefer e-mail threads to maintain a consistent subject, especially if people are editing out old entries on the e-mail thread.

    With the sheer quantity of e-mails we receive, the number of mailing lists, and the ridiculous number of rules I require to keep my e-mail sorted, I don't need people like you making it harder. If you have something important to say, make it easy to find.

    --
    Modding me -1 troll doesn't make me wrong.
  55. Re: Judicious editing FTW! by TaoPhoenix · · Score: 1

    Okay, for my reply I'll keep the same header.

    Such are preferences. What's easier for one is harder for another. My intent was "to make the new point easy to find" for me.

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine