Slashdot Mirror


Online Document Search Reveals Secrets

An anonymous reader writes "New Scientist is reporting that many documents published online may unintentionally reveal sensitive corporate or personal information, according to a US computer researcher. Simon Byers, at AT&T's research laboratory in the US, was able to unearth hidden information from many thousands of Microsoft Word documents posted online using a few freely available software tools and some basic programming techniques." Update: 08/16 19:06 GMT by H : The story is originally from Crypto-gram, not New Scientist.

271 comments

  1. crypto by Feyr · · Score: 1, Informative

    funny how the latest cryptogram treats of exactly the same subject, i just received it an hour ago

    http://www.schneier.com/crypto-gram.html

    1. Re:crypto by peculiarmethod · · Score: 1

      why exactly is that funny, again?

      p

      --
      ** "It's not my job to stand between the people talking to me, and the ones listening to me." -- Pego the Jerk
    2. Re:crypto by xv4n · · Score: 4, Funny

      No one can tell, man. That post is encrypted in itself.

    3. Re:crypto by broken.data · · Score: 1

      Funny.. the article f*cking says that!!!

      Oh wait.. this is /.

    4. Re:crypto by randyest · · Score: 2, Insightful

      Well, not sure about what the OP thought was funny, but I sure do think this is, from the article:

      "It is feasible that an individual may include their social security number on copies of a resume sent to prospective employers, but delete it from the version put online to guard against identity theft," Byers writes.

      Who in their right mind puts their SSN in any version of a resume??!

      --
      everything in moderation
    5. Re:crypto by jovlinger · · Score: 1

      cryptogram-filter

    6. Re:crypto by Waffle+Iron · · Score: 2, Informative
      I would never put my SSN on a resume, but the last time I made a resume I ran the .doc file through 'less'. Sure enough, most all of my edit history was in there.

      I exported it to RTF then reimported before saving it again as .doc. This erased other people's access to my thought processes, and it reduced the file size by 80% to boot.

      In the end it didn't matter much, though. I usually include a plain text version of the resume right in my email as a backup along with the .doc attachment. On interviews, I've noticed that most people just print the plain text version. If I really didn't need to make the word doc, and people are too lazy to print it, why do companies insist you send it in .doc format anyway?

    7. Re:crypto by chipperdog · · Score: 1

      Why do you even send a .doc (application/msword) resume? If you want to send a "pretty" printable resume, send it as pdf (encoded as mime-type application/pdf), you can create a pdf by piping the print job through 'gs -sDEVICE=pdfwrite -'. Make sure you stick with the standard fonts so it renders the same on the recipient's computer.

    8. Re:crypto by BrokenHalo · · Score: 1

      I saw the article at New Scientist earlier yesterday, and my immediate reaction was the thought "and this is news?". Let's face it, anybody who has ever run a .doc file through a hex dump or even a text editor (this is not new) should be aware that there's a lot more information than appears on the screen in MS Word. And I fail to see how Microsoft's incorporating DRM into their files is going to improve matters. Their claim that all the editing history being present in the file is "useful" is specious. It is only useful to someone who wants to spy on you.

    9. Re:crypto by McAddress · · Score: 1

      I am not sure about resumes, but I remember that I had to put my SSN on some of the essays for my college application.
      I even remember having a teacher that insisted on us putting our SSN on our CS homework so she would be able to identify us more easily. She even used to hand it back by calling out the last 4 digits.
      Stupid idiot.

    10. Re:crypto by blibbleblobble · · Score: 1

      "why do companies insist you send it [your resume] in .doc format anyway?"

      Well, if you haven't spent hundreds of dollars on a word processor, you're obviously not serious about this "computer interweb" thingy... and who would want to employ someone who can't even use Microsoft Word?

      </sarcasm>

      I did once get an email from a supposedly "technical" company saying they couldn't use my .txt resume, could I send them a .doc version. Dumbasses.

      Of course, all the linux-migration success-stories seem to mention the Personnel and HR departments as being the "you'll pry microsoft licenses from our cold dead hands" ones who insist on their own thing regardless of everybody else. No matter what the rest of a company is like, you can guarantee that HR will be computer-illiterate.

    11. Re:crypto by TaranRampersad · · Score: 1

      I say stick HR departments on LTSPs. The resumes are centralized, and if they REALLY want to play solitaire, they can buy their own damn PDAs.

      They can use OpenOffice, and they WILL like it. It's a job. Waste money on your own time. :D

    12. Re:crypto by psm321 · · Score: 1

      Resumes for government jobs require SSN (at least in my experience).

  2. Nothing New by JRHelgeson · · Score: 4, Insightful

    Just go into the document properties section. This is why I publish everything to Adobe Acrobat before posting online.

    --
    Good security is based upon reality and common sense. Common sense is a function of having common knowledge.
    1. Re:Nothing New by Sky-217 · · Score: 4, Informative

      In the article they mentioned that this applies to pdf files too...

      "For example, in 2002 the Washington Post published a version of a letter sent by the Washington sniper in Adobe PDF format. Names and telephone numbers were visibly blacked out, but still found embedded in the file."

    2. Re:Nothing New by I8TheWorm · · Score: 1

      Of course, that makes the text unsearchable by google.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
    3. Re:Nothing New by Frymaster · · Score: 4, Funny
      In the article they mentioned that this applies to pdf files too...

      which is why you should use latex! nobody understands that stuff. security through obscurity!

    4. Re:Nothing New by rf0 · · Score: 1

      cat filename | strings.

      Edit in vi
      Run over custom script to add basic HTML

      Works for me

      or just use LaTeX

      Rus

    5. Re:Nothing New by gblues · · Score: 5, Informative

      That is because the people who published the PDF were idiots.

      Acrobat has a number of commenting tools. What the Washington Post staff did in that case was use the Highlight tool, set the color to black, and use it to draw over the names.

      Only problem? The highlighter is an object that is drawn on top of the text object it is attached to. The underlying text is not modified at all. In fact, if you watch closely, you can see the name for a split second before the renderer draws the highlights.

      If the Washington Post had used the TouchUp Text tool to delete the names, the information would not have been leaked.

      Nathan

    6. Re:Nothing New by mistered · · Score: 1
      But that just shows all the text in the document. Most of the strings returned will be what's supposed to be public, which is therefore not interesting. The technique mentioned is basically the same, but eliminates the public text, leaving only the good bits.

      Anyway, you don't need the cat -- strings filename does what you want.

      --
      Enjoy your job, make lots of money, work within the law. Choose any two.
    7. Re:Nothing New by merlyn · · Score: 1

      That'd be a useless use of cat. Simple "strings < filename" would work. Probably don't even need the <.

    8. Re:Nothing New by neitzsche · · Score: 1

      That's overkill. How about

      $ strings filename.doc

      It works for me.
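
For anyone without `strings` handy, roughly the same idea fits in a few lines of Python. This is only a sketch, not a reimplementation of the real tool: `printable_runs` is a made-up helper name, and the minimum run length of 4 is an assumption chosen for illustration.

```python
import re
import sys

def printable_runs(data: bytes, min_len: int = 4):
    """Return every run of printable ASCII bytes that is at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

if __name__ == "__main__":
    # Optional: pass a file name to dump its printable runs, one per line.
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            for run in printable_runs(f.read()):
                print(run)
```

Like `strings`, this only catches single-byte text, so anything Word stored as two-byte characters will slip past it.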

      --
      "God is dead." - Frederik Nietzsche
    9. Re:Nothing New by dvdeug · · Score: 1

      That is because the people who published the PDF were idiots.

      Acrobat has a number of commenting tools. What the Washington Post staff did in that case was use the Highlight tool, set the color to black, and use it to draw over the names.


      And that makes you an idiot? Not tech savvy, maybe, but that's the exact thing you'd do in releasing hardcopy, and unless you think in terms of the internals of a computer, there's no reason you'd think twice about doing that.

    10. Re:Nothing New by jovlinger · · Score: 1

      And under infra-red light, you can see through the black highlighter (well, ink at least. I'm guessing this applies to dried ink as well). If your job is to redact documents, and you do that poor a job of it, you may not be an idiot, but you are incompetent.

      Or is competence no longer a job requirement?

    11. Re:Nothing New by aengblom · · Score: 2, Interesting
      That is because the people who published the PDF were idiots.
      And that makes you an idiot?
      Yes, we were idiots. I work for the Post in a limited degree and we now have a sheet of paper on a quite visible bulletin board describing how we were idiots.

      The .com folks who would post such a document are well aware that they need to check whether the blacking out was done correctly... now.
      --


      So close and yet so far from the world's perfect ID number
    12. Re:Nothing New by GigsVT · · Score: 1

      LyX is really easy, and really good at the kinds of documents LaTeX is designed to create. Try it sometime, you'll be surprised. Do read the help docs that are in the help menu if you get stuck though.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    13. Re:Nothing New by Anonymous Coward · · Score: 0

      UUOC (useless use of cat)

    14. Re:Nothing New by Snaller · · Score: 1

      Publish to text, then we can all read it.

      --
      If Google really cared they would fix Android Chrome to reflow text, instead of discriminating
  3. WHAT?!?? by zedmelon · · Score: 5, Funny

    From the article:

    • "He says hidden information can "incredibly useful" in improving the functionality of the software. "But if some of that data is sensitive, there have to be ways of ensuring that it isn't distributed where it shouldn't be," he says."

    I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

    I know that the guy stressed the fact that Microsoft isn't alone in this distinction, but this is just another example of why Microsoft SUCKS.

    I put the doc in a samba share and viewed it with vi. I found the path to the doc, the original name, my userid on my laptop, and the company name. All were hidden from the simple searches like this:

    s.l.a.s.h.d.o.t...o.r.g

    WTF?!?

    Oh, WAIT a minute! This is also from the article:

    • "The next edition of Office 2003 will include tools that will allow users to remove personal information from a document. It will also include new "information rights management" that will let an author specify who can read or forward a document."

    WHEW! I feel so much better. Please disregard the first six paragraphs. Thanks.

    --
    Mom says my .sig can beat up your .sig.
    1. Re:WHAT?!?? by zedmelon · · Score: 1

      And yes, I know. I'm a fucking idiot for assuming ANYTHING or taking anything for granted when it comes to good ol' Billy.

      --
      Mom says my .sig can beat up your .sig.
    2. Re:WHAT?!?? by brondsem · · Score: 1
      "He says hidden information can "incredibly useful" in improving the functionality of the software. "But if some of that data is sensitive, there have to be ways of ensuring that it isn't distributed where it shouldn't be," he says."
      I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

      You only have the convenience while the file is open. If you could undo after you re-opened a file, these "hidden secrets" wouldn't be hidden at all!

      I put the doc in a samba share and viewed it with vi. I found the path to the doc, the original name, my userid on my laptop, and the company name. All were hidden from the simple searches like this:

      s.l.a.s.h.d.o.t...o.r.g

      It's probably Unicode stored as multi-byte characters, and vi displays each byte separately.

      --
      "a quote" -me
    3. Re:WHAT?!?? by zedmelon · · Score: 5, Insightful

      "You only have the convenience while the file is open. If you could undo after you re-opened a file, these "hidden secrets" wouldn't be hidden at all!"

      Exactly. I knew that to begin with, but I did it and then vi'd the file to confirm. If I delete text from a document, that means I don't want that text in the document. Neil Laver says "...hidden information can "incredibly useful" in improving the functionality of the software."

      So my main point is, if I am being supposedly CONVENIENCED by this "feature," HOW is the software helping me by storing these things in my document?

      --
      Mom says my .sig can beat up your .sig.
    4. Re:WHAT?!?? by Koyaanisqatsi · · Score: 1

      All were hidden from the simple searches like this: s.l.a.s.h.d.o.t...o.r.g

      It's not hidden. It's unicode (double-byte), just that;

    5. Re:WHAT?!?? by wortelslaai3434 · · Score: 5, Funny

      As a sidenote...

      I. .t.h.i.n.k. .y.o.u.r. .s.e.e.i.n.g. .u.n.i.c.o.d.e. .t.e.x.t.

    6. Re:WHAT?!?? by zedmelon · · Score: 1

      MOD PARENT UP

      pahahahah! That ruled!

      Okay, no, I didn't know that was unicode, and I'm sure that security gurus will scoff at my use of the word "hidden," but what I meant by that is still true:

      You can't just search for a string you're seeking. In vi, you can't go "/slashdot.org," because the string won't turn up.

      Same thing in most apps, even if they WILL let the user search for "hidden text."
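
A rough sketch of how a search could cope with that, assuming the text is stored as UTF-16LE as other replies in this thread suggest: encode the search string both ways and look for each byte pattern. `find_text` and the sample bytes are made up for illustration and are not taken from any real Word file.

```python
def find_text(data: bytes, needle: str):
    """Return (encoding, byte offset) pairs where needle occurs as ASCII or UTF-16LE."""
    hits = []
    for encoding in ("ascii", "utf-16-le"):
        pattern = needle.encode(encoding)
        start = data.find(pattern)
        while start != -1:
            hits.append((encoding, start))
            start = data.find(pattern, start + 1)
    return hits

# Made-up sample: one two-byte-per-character copy and one plain copy of the string.
sample = b"\x00\x01" + "slashdot.org".encode("utf-16-le") + b"\xff plain slashdot.org"
print(find_text(sample, "slashdot.org"))
```

The plain search only finds the second copy; the first one is the "s.l.a.s.h.d.o.t" kind, where every letter is followed by a zero byte.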

      --
      Mom says my .sig can beat up your .sig.
    7. Re:WHAT?!?? by tfinniga · · Score: 1
      I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

      The convenience is in large files - I think the idea was that you store a diff on the end of the document, rather than rewriting the entire document to disk. You can disable this by unchecking the "Allow fast saves" option.

      Just in case you were actually interested in having your question answered.

      --
      Powered by Web3.5 RC 2
    8. Re:WHAT?!?? by BennyTheBall · · Score: 1
      just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

      My guess is that they are referring to the "track changes" feature. It basically highlights deleted text, rather than actually deleting it. I believe it also uses different colors for every user who modified the document.

      Personally, I find this feature more annoying than useful but I can see how people could take advantage of it when a lot of users will update the same doc

    9. Re:WHAT?!?? by tzanger · · Score: 1

      s.l.a.s.h.d.o.t...o.r.g

      actually you were probably viewing unicode text in 8-bit charset. Not hidden, just obscured. :-)

    10. Re:WHAT?!?? by whoever57 · · Score: 1

      I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

      You have to turn on the "Track Changes" (under "tools") feature and then make some changes (then save it, etc.)

      --
      The real "Libtards" are the Libertarians!
    11. Re:WHAT?!?? by zedmelon · · Score: 1
      Hmmm interesting. I never realized that.
      BennyTheBall also has a good theory below.

      Thanks, guys.

      Just in case you were actually interested in having your question answered.

      Heh. Yeah, I know that's a legitimate concern on /.

      Also I found this post by 26199 quite interesting. Richard Stallman's reasoning for the dispensing of Werd documents for good. I like it.

      --
      Mom says my .sig can beat up your .sig.
    12. Re:WHAT?!?? by Pieroxy · · Score: 1

      Just a small post to notify you that (...).y.o.u.'.r.e. (...) is the correct wording.

      Thanks

    13. Re:WHAT?!?? by cicho · · Score: 1

      What you did was probably not enough to trigger the behavior the article describes. The article doesn't go into the fine points of the Word file format, and it's not only the undo function that presents a problem. Word's files are "compound documents" in MS-speak, actually a nifty idea, where a single file has a structure that resembles nested directories and files on disk (except the 'directories' inside a compound document are called storages and 'files' are called streams). The way Word optimizes writes to these files is that it doesn't constantly rewrite the whole file when you make a small or even a substantial change. If you replace a small chunk of text with a larger chunk, the old small chunk may well remain in the file and only be marked as deleted, or unused (much as files on disk when you delete them), while the new chunk is appended to the end of the file (or wherever it fits). There's a process built in to 'compress' such files by removing those unused spaces, but exactly when and how often Word does so I've no idea. So after *many* edits you may find that some old text remains in the physical file, although it's not accessible to you or Word.

      You may also notice that Word files don't grow by single bytes, but by bigger chunks. I suppose that Word simply grabs some unused disk space when the file needs to grow, but it does not clear that space, so a Word file could potentially contain any random data that happens to remain unerased on the disk space marked as free.
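
The chunk size can be read straight out of the file header. A minimal sketch, assuming the published compound-file header layout (an 8-byte signature at the start and a 16-bit "sector shift" at byte offset 30); `compound_file_sector_size` is a made-up helper name, and this says nothing about what Word leaves inside the unused parts of a sector.

```python
import struct

# Signature that opens every OLE compound file (old .doc, .xls, .ppt).
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

def compound_file_sector_size(header: bytes):
    """Return the sector size in bytes if header starts a compound file, else None."""
    if len(header) < 32 or header[:8] != OLE_MAGIC:
        return None
    # The sector shift is a little-endian 16-bit value; size = 2 ** shift.
    (sector_shift,) = struct.unpack_from("<H", header, 30)
    return 1 << sector_shift
```

To try it on a real document, read the first 32 bytes of the file in binary mode and pass them in. A shift of 9 gives 512-byte sectors.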

      --
      "Only the small secrets need to be protected. The big ones are kept secret by public incredulity." - Marshall McLuhan
    14. Re:WHAT?!?? by noctrl · · Score: 1

      I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

      Yeah broder!

      Is there a way to reactivate "undo" in a saved .doc ??? !

    15. Re:WHAT?!?? by Anonymous Coward · · Score: 0

      More specifically, UTF-16. Unicode comes in a variety of encodings.

    16. Re:WHAT?!?? by malfunct · · Score: 1
      The features were added for business convenience, unfortunately at the time security and privacy implications of the features were not a high priority. Now we are reaping the rewards.

      So basically in some ways you want to be mad at MS (and any of the other companies that did similar things) but in other ways you have to give them some credit for trying to do things that users wanted. It's a case where you can't please everyone so you have to figure out who is most important to please. More importantly you have to second guess and see if the feature that the user requested is actually something they want, and whether they are willing to deal with any of its side effects. It's a pain.

      --

      "You can now flame me, I am full of love,"

    17. Re:WHAT?!?? by 42forty-two42 · · Score: 1
      All were hidden from the simple searches like this: s.l.a.s.h.d.o.t...o.r.g
      That's UTF-16 unicode encoding. Microsoft Office uses it by default, as do all sorts of other Windows programs.
    18. Re:WHAT?!?? by FingerDemon · · Score: 1
      Is there a way to reactivate "undo" in a saved .doc ??? !
      Yeah, I just tried this. I typed a message with "sensitive info" in it and saved it. Then deleted that part and saved the file again. Reopen that word doc in even something as crappy as notepad and you can look for "sensitive" and find it. After copying and pasting this into a different word file, I can't find it in Notepad. (Not that notepad is so terribly sophisticated that it rules out "sensitive info" still being around.)

      My wife works at a law firm and they had major problems with this, since the features were turned on by default for their install. That may have been a choice of their IT dept., or Microsoft, I don't know. I thought they said that even copy and paste to a new doc wasn't secure enough, but I'm not sure of the details on that.

      I don't think it is Microsoft being so crappy that they coded this feature in, like it was a bug. I think it is an incredible feature... that they didn't clearly inform the users about or give nearly enough control over.
      --

      "Contrarily the lookaside buffer might not be the panacea... "
  4. Prediction by JessLeah · · Score: 2, Insightful

    This will become a common way for 'big' corps to spy on 'small' corps (and individual users?), to find new ways to both screw them over, and appear 'omniscient'. They'll never (or rarely) get called on it. Meanwhile, anyone who tries to reveal information discovered in this way which is incriminating towards said big corps will get sued for being "hackers" and/or "terrorists".

    1. Re:Prediction by TopShelf · · Score: 2, Insightful

      This is "Insightful"??? Yeesh!

      I had no idea that the sloppy handling of non-displayed data in output files (not just Word, mind you), and their publication on the web was actually Another Way For The Man To Keep Us Down...

      --
      Stop by my site where I write about ERP systems & more
    2. Re:Prediction by Spunk · · Score: 2, Insightful
      You don't think that it's possible?

      I recall an article (possibly here) about companies using this "feature" on job applicants to read what was in previous versions. For example, you overwrite this letter

      Dear IBM,
      Thanks for the Linux job offer. Gimme $60,000 and I'm yours.
      Love, Spunk
      with
      Dear Microsoft,
      You guys are the best. I'll take that C# coder job for $70,000.
      Love, Spunk
      It would be easy for MS to see that you are asking IBM for $10,000 less. Letter-writing skills notwithstanding, I don't expect this would help your negotiating position.
    3. Re:Prediction by TopShelf · · Score: 1

      The point is, there's nothing about this issue that has anything to do with big corporations vs. individuals/small companies. It's a simple matter of documents posted on the web containing more info than their posters probably realize. You could just as well address those letters to "Joe's Local ISP," and the same thing could happen...

      --
      Stop by my site where I write about ERP systems & more
    4. Re:Prediction by Spunk · · Score: 1

      Aha, I see what you're saying now. True.

  5. It's been said hundreds if not thousands of times: by NightSpots · · Score: 5, Insightful

    It doesn't matter how good your corporate security is if you don't train your users (including managers) in basic security practices.

    Lots of people put sensitive documents in public webspace, primarily because they don't know any better. Eventually the cost-benefit analysis will be done, and corporations will pay to have their users trained. Until then, this type of thing will continue to happen.

  6. P2P has been doing this for a long time by genner · · Score: 1

    Idiotic use of P2P software has led to stuff like this more than once. People often overlook the document tab on Kazaa.

    1. Re:P2P has been doing this for a long time by ComaVN · · Score: 2, Interesting

      Indeed. Search for system.dat, user.dat or pwl on Kazaa, there are always some files found.

      Although I cannot guess how many of those are honeypots.

      --
      Be wary of any facts that confirm your opinion.
    2. Re:P2P has been doing this for a long time by Chess_the_cat · · Score: 1

      Even better, search for *.eml.

      --
      Support the First Amendment. Read at -1
    3. Re:P2P has been doing this for a long time by mccrew · · Score: 1

      From the be-careful-what-you-search-for department:
      Search for *anything* on Kazaa, and there are always some files found. However, upon closer inspection, you'll see that they really are hidden pornography URLs, viruses, and other poison payloads, definitely not what you were really looking for.

      --
      Hey, Windows users, there is no such thing as "forward" slash, there is only slash and backslash.
    4. Re:P2P has been doing this for a long time by Reziac · · Score: 1

      Aside from the "poison files" another reply mentions, there was once a bug in kazaa that when it was running on a box using an oriental language version of Windows (not sure which one, but Thai was affected for sure), ALL files would be "shared" whether the user intended it or not. AFAIK it was never addressed as a bug, but I deduced its existence from the fact that I never saw it on other non-English boxes (so it wasn't just misconfiguration by non-English speakers), and the consistency with which I saw examples.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    5. Re:P2P has been doing this for a long time by ComaVN · · Score: 1

      I never downloaded it, but I'm pretty sure a 8MB system.dat file is NOT a virus.

      In general however, you are of course correct.

      --
      Be wary of any facts that confirm your opinion.
  7. I thought this was common knowledge? by 26199 · · Score: 4, Interesting

    Well, it is amongst people who object to being mailed Word documents, anyway. They're just a really bad format for publishing information in.

    See Richard Stallman's 'no-word-attachments' article, for example...

    1. Re:I thought this was common knowledge? by broken.data · · Score: 3, Interesting

      This is not limited to Word. This trick has been around for ages with PDF and everything else I can think of.

      Hell, this is how slashdot figured out that the Microsoft Switch was a fake.

    2. Re:I thought this was common knowledge? by worm+eater · · Score: 1

      You can't count on anything being 'common knowledge' among many of today's office workers.

      Example:
      "My document won't open..."
      "What type of file is it?"
      "Huh?"
      "Which application was the file made in?"
      "Microsoft."
      "Which Microsoft application?"
      etc.

      The user should be somewhat responsible for knowing how to use the machine, just like with any other machine. However, many of today's operating systems and applications were designed so that the user doesn't have to think or know about what they are doing. So be it. Who is going to think for them? Microsoft? Obviously in this case, Microsoft failed to think through what they were doing 'for the user.' This is a common problem with them... AutoCorrect in Word drives me nuts. I know I can turn that off, but many features like this cannot be turned off.

      --
      Maybe partying will help...
    3. Re:I thought this was common knowledge? by Aidtopia · · Score: 2, Funny

      My friend got so tired of people on his team sending him word docs, that he learned TeX and started sending his replies that way. When he feels really nasty about it, he sends the .dvi files.

    4. Re:I thought this was common knowledge? by tchapin · · Score: 1
      I got a copy of this article and put it on the web in a format that us Windows users can read.

      Todd

      --
      -- !todd erases a red dot! I steal music on the internet.
    5. Re:I thought this was common knowledge? by romfordofficesupplie · · Score: 1

      I've not used Word for Windows for some years and have only recently been using Word for Mac OS X. I wonder if there is any kind of set-up assistant (or wizard) that allows you to set all of these options.

      Any time I install Office, I have to go through those steps, disabling auto-correction and the assistant. Would be nice if there was some kind of setup-assistant to walk you through these options (including disabling the hidden information).

      Anyone know if this feature already exists?

  8. An Important Question by linuxislandsucks · · Score: 3, Interesting

    How many word processing programs place hidden meta data within their formats?

    For example, do OpenOffice/StarOffice and other open source programs have the same security problem?

    --
    Don't Tread on OpenSource
    1. Re:An Important Question by Anonymous+Shepard · · Score: 1

      I don't think so. You can easily unzip an OOo .sxw document using WinZip, then open the individual xml files to see what they contain (and even edit them). I have done so; I haven't really checked for sensitive hidden data in them, but I don't believe there is anything unnecessary in there. And of course, an OOo file is just about a third the size of a Word .doc file.

      --
      I have a life. I really do. I've just chosen to ignore it.
    2. Re:An Important Question by __past__ · · Score: 4, Informative
      The OOo file format is just a bunch of zipped XML files, you can easily look for yourself. Deleted text is not in it, as it seems. Unless you turned on version tracking, of course.

      It does, however, save things like when the document was last printed, how often it has been edited and by whom, etc. unless you tell it otherwise. It's easy to get rid of the data (there is a huge "Delete" button in the properties dialog), but not many people will be aware of it.

      So, basically, if you don't know what you are doing, you could give out more information than you want to with your OOo files.
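
Looking for yourself takes only the standard zip tools. A small sketch, with the caveat that the archive built here is a made-up stand-in (`make_demo_package` and its XML are invented for illustration, not real OOo output); only the member names `content.xml` and `meta.xml` mirror what a real document contains.

```python
import io
import zipfile

def list_members(package: bytes):
    """Return the names of the files stored inside a zipped document."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()

def read_member(package: bytes, name: str = "meta.xml") -> str:
    """Return one member of the zipped document decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read(name).decode("utf-8")

def make_demo_package() -> bytes:
    """Build a tiny stand-in archive; the XML is invented, not real OOo markup."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<doc>visible text</doc>")
        zf.writestr("meta.xml", "<meta><editing-cycles>12</editing-cycles></meta>")
    return buf.getvalue()

demo = make_demo_package()
print(list_members(demo))
print(read_member(demo))
```

For a real document, read the file in binary mode and pass the bytes to the same two functions; the metadata mentioned above should be in `meta.xml`.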

    3. Re:An Important Question by shaitand · · Score: 1

      copyleft = freedom
      bsdtype license = free to steal

    4. Re:An Important Question by Bullet-Dodger · · Score: 1
      copyleft = freedom
      bsdtype license = free to steal

      copyleft = freedom for code
      bsdtype license = freedom for coders

      Which is more free is a matter of opinion.

    5. Re:An Important Question by shaitand · · Score: 1

      copyleft = freedom for code and THE ONES WHO CODE IT
      bsdtype license = freedom for coder who would like to make that code their own.

      As to which is free in the sense of no cost for those who want to use it, without a doubt it's BSD. Those who release gpl'd code aren't working for free, they want their payment in code instead of cash. Considering human nature, I find the gpl to be an ideal that is definitely in closer harmony with man, and at the same time still an ideal of what will benefit mankind.

      But as you said, it's a matter of opinion which is better.

    6. Re:An Important Question by WWWWolf · · Score: 1

      Interesting. I wish I had known this before I sent a short story I wrote for people to review. Now they see exactly how much coffee I needed to pull that together. =) But at least I'll probably also see their changes...

      And I think it's good that it stores the changes, though I would much prefer it if there would be some kind of external revision control mechanism - like this article shows, storing revisions in the file itself is not good, safe or even desirable. Has anyone tried to design a RCS/CVS-like system for OpenOffice.org? Since it's zipped XML, It wouldn't even be particularly challenging...

    7. Re:An Important Question by __past__ · · Score: 2, Informative
      At least the 1.1 betas have an option to save as "flat xml". The format is basically the same as the zipped one, but uncompressed and in a single file (binary files like embedded images seem to be base64 encoded).

      In principle there is no problem using that with any version management system, CVS, RCS, Subversion etc should work fine with it. You'll be more happy to have an XML-aware diff at least, though - my simple test doc ended up with all content in a single long line.
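One way around the single-long-line problem, sketched in Python with only the standard library: re-indent the XML so each element lands on its own line, then run an ordinary line diff. `pretty_lines` and `xml_diff` are hypothetical names, and this is only an approximation of an XML-aware diff, since it still compares text and knows nothing about document structure.

```python
import difflib
import xml.dom.minidom


def pretty_lines(xml_text):
    """Re-indent an XML string so each element sits on its own line."""
    pretty = xml.dom.minidom.parseString(xml_text).toprettyxml(indent="  ")
    # toprettyxml can leave whitespace-only lines behind; drop them.
    return [line for line in pretty.splitlines() if line.strip()]


def xml_diff(old_xml, new_xml):
    """Return a unified diff of two XML documents after re-indenting both."""
    return list(difflib.unified_diff(
        pretty_lines(old_xml), pretty_lines(new_xml),
        fromfile="old", tofile="new", lineterm=""))
```

The same re-indenting step could be run before each commit, so that CVS or Subversion stores and diffs the line-per-element form rather than the one-line original.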

  9. Well... by CGP314 · · Score: 3, Funny

    Simon Byers, at AT&T's research laboratory in the US, was able to unearth hidden information from many thousands of Microsoft Word documents posted online using a few freely available software tools and some basic programming techniques.

    Are you going to share that info or what?

    Throw it up on freenet man!

    1. Re:Well... by rkz · · Score: 1

      well just do this i suppose.

      antiword -s blah.doc > /tmp/a.txt
      antiword blah.doc >/tmp/b.txt

      # if the two outputs differ, the doc has hidden text; show it
      cmp -s /tmp/a.txt /tmp/b.txt || diff /tmp/a.txt /tmp/b.txt

  10. in html by SHEENmaster · · Score: 1

    http://www.schneier.com/crypto-gram.html

    To do this yourself, just type:
    <a href="http://foo/">bar</a>

    --
    You can't judge a book by the way it wears its hair.
    1. Re:in html by Feyr · · Score: 0, Offtopic

      i know, i just don't care to provide clickable links.

      beside, i DID get first post didn't i? :)

    2. Re:in html by zedmelon · · Score: 1
      Heheh, yeah, and you probably oughta send an apology to the admins at schneier.com. I just subscribed to their newsletter with a sendmail -v and watched the progress, and it was S-L-O-W.
      ;)

      Fire when ready.
      Slashdotting in progress, sir.

      --
      Mom says my .sig can beat up your .sig.
  11. infrastructure data? by at_kernel_99 · · Score: 2, Offtopic

    An accomplished searcher can learn much about the world we live in, as slashdot reported some time ago.

    An interesting reminder, to be sure, given yesterday's blackout.

    Makes a guy wonder just how much is still available regarding key electrical and telephone infrastructure. Emergency power capabilities of broadcasters (radio, television, mobile phone). Gas lines, in the parts of the country that have them. Water systems. There's likely a bunch of data out there, ready to be mined.

  12. LaTeX by ParadigmLA · · Score: 4, Funny

    Everyone should just be forced to use LaTeX and then there won't be any hidden information. . .

    1. Re:LaTeX by GarvMaster · · Score: 5, Funny

      Because 99.9% of the world would go back to pen and paper

    2. Re:LaTeX by Anonymous Coward · · Score: 0

      \documentclass{article}
      \begin{document}
      \sectio n{Beyond normal people, shirley}
      Yeah, those scientists who manage to use \LaTeX must be fucking geniuses.
      I have no idea how they can use a programming language to write documents
      and still stay sane. How on earth can they spend time fiddling with all
      that unimportant gobbledygook when they should be writing the text and
      thinking about the ideas they want to get across, is completely beyond me.
      \end{document}

    3. Re:LaTeX by Anonymous Coward · · Score: 0

      Would that even work?
      "\sectio n{Beyond normal people, shirley}"
      My guess is your text document will have a compiler error!

    4. Re:LaTeX by WWWWolf · · Score: 1

      Ah, but LaTeX supports % comments. Which is unfortunate, because when people know they can leave Notes For Themselves, they sometimes write stupid things in them.

      Ask around if anyone's ever found odd comments from corporate source code, or worse, released source code...

      (A simple example: "Fuck it, I do it in perl. Fucking sh drives me nuts." - 3Delight 0.9.6 Linux install script)
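A rough Python sketch of scrubbing % comments from LaTeX source before publishing it. `strip_comments` is a hypothetical name. It leaves the `%` itself in place, because a `%` also swallows the line break that follows it and removing it could change the typeset spacing. It does not understand verbatim environments, `\verb`, or URLs containing `%`, so those need checking by hand.

```python
import re

# A '%' starts a comment unless it is escaped as '\%'. An even number of
# backslashes before it (e.g. '\\%') is a line break followed by a real comment.
_COMMENT = re.compile(r"(?<!\\)((?:\\\\)*)%.*")


def strip_comments(source):
    """Delete the text of every % comment, keeping the '%' marker itself."""
    # '.' does not match newlines, so each match stops at the end of its line.
    return _COMMENT.sub(r"\1%", source)
```

For example, a line reading `50\% done % secret note` comes out as `50\% done %`: the escaped percent sign survives and the note is gone.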

    5. Re:LaTeX by Anonymous Coward · · Score: 0

      Why? What's the difference whether Word is spitting out its normal format or LaTeX if the front end is exactly the same?

  13. OMG by Anonymous Coward · · Score: 3, Funny
    Stupid people messing stuff up? I'm SHOCKED!

    How long until someone blames Microsoft, I wonder...

    1. Re:OMG by psoriac · · Score: 1

      You forget, this is Slashdot. Blame of Microsoft is understood to apply whenever possible.

      --
      I browse Slashdot at +3, Funny
    2. Re: OMG by Black+Parrot · · Score: 1


      > Stupid people messing stuff up? I'm SHOCKED!

      I, for one, welcome our new fucktard overlords!

      --
      Sheesh, evil *and* a jerk. -- Jade
  14. Re:It's been said hundreds if not thousands of tim by TMB · · Score: 5, Insightful

    Sure, but the point they're making is that it's not intuitively obvious to most people that there could be text in a Word document other than what appears.

    So a relatively security-conscious person who just doesn't know anything about Word file formats could easily publish something online on purpose without knowing that there is (invisible) sensitive information in it, even if they'd never put that information in a public place on purpose.

    [TMB]

  15. Here's An Idea by msblaster.exe · · Score: 0

    Someone needs to script an open source program that searches for this information on the web. Could you imagine knowing all the financial information about Microsoft? We could use this information against companies we don't like, and the open source revolution would grow massively.

    1. Re:Here's An Idea by shaitand · · Score: 1

      Microsoft has a lot of money, don't you think they can afford to buy a decent word processor instead of Word?

      oh wait...

  16. True story. by oni · · Score: 4, Interesting

    A sysadmin once sent me a form letter type thing with my new password in it. The username/password was a spreadsheet object and I was able to open it to see everyone's passwords. He changed them all when I pointed this out. BTW, why do people send email messages that just say "see attached file" and the attached file is a memo with some trivial content that could have been the text of the email??

    Anyway, I have to admit that I was also burned by word. I was in the habit of opening the last memo I wrote from the recent documents list and using it as the starting point for newer ones. At some point, I put a bunch of policy statements on a CD and was later told that everyone was reading the hidden text. Doh!

    This was back in the days of office 97 I believe. I'm not sure if Office 2k or XP still have this feature/bug.

    1. Re:True story. by DrSkwid · · Score: 2, Informative

      why do people send email messages that just say "see attached file"

      because they select "send document" from the File menu and get a blank email with the document attached

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    2. Re:True story. by homer_ca · · Score: 3, Interesting

      Saving Word to HTML gets rid of the hidden text, but it does still save Author information. I got this HTML spam where he saved a Word file to HTML and sent that as the message. Sure enough, the dumbass's real name was in the source as the author.

    3. Re:True story. by Li0n · · Score: 1

      Even worse is when people send me a 4 MB email because they attached a word document with a pasted BMP screenshot of a full 1280x1024 desktop to see the text "404 page not found".

      --

      ~
      ~
      :wq
    4. Re:True story. by Anonymous Coward · · Score: 0
      BTW, why do people send email messages that just say "see attached file" and the attached file is a memo with some trivial content that could have been the text of the email??

      Uh, I dunno... because it's faster than cut and paste? Because it preserves metadata?

    5. Re:True story. by bailster · · Score: 1

      In response to these kinds of file size/convenience/privacy concerns, some people use various shortcuts to get Word-generated text into emails sent through Outlook. (Also, if you do that, the recipient can listen to the email's contents using remote access systems, etc., that may not work with attachments.)

      Question: if you cut and paste (MS) Word text into an (MS) Outlook email message, how much of the hidden/deleted data will follow along into the email? Does the answer depend on the "format" of the email (html, etc)?

      What about if you use the "send as email" feature in Word?

      --
      ...
  17. Dang... by DarkBlackFox · · Score: 4, Funny

    Remind me not to save my important documents to C:\My Documents\Porn\Annual Budget Report.doc anymore.

    1. Re:Dang... by Anonymous Coward · · Score: 0

      Here at our office we thought about hax0ring some people's drives that have porn pics and changing them to pictures of praying nuns.

      It's kind of a defacement... just the other way :-)

  18. Tools by rf0 · · Score: 1

    See Google. It can read word/pdf etc. Sure there is a mountain of information there if you look

    Rus

  19. Job Recruiters by Anonymous Coward · · Score: 5, Interesting

    I have received two such Word documents from two separate job recruiters. The actual companies looking for the employee were hidden in the document, as well as contact information for the person at the company. Screw the middle man

    1. Re:Job Recruiters by Anonymous Coward · · Score: 0

      Hey, sometimes even on job search websites it is worth looking at the HTML source of a job from a recruiter - there can be extra info in there hidden...

    2. Re:Job Recruiters by bird · · Score: 3, Funny

      Back in 1997, we were interviewing my putative replacement, and one fine fellow sent us a Word resume and cover letter. In the cover letter, he shared with us the delightful sentiment that-- while he was interviewing several other places (1997, remember), we were his current top choice.

      A colleague on the review team who didn't use Windows turned to strings(1) to get the data from these documents, which yielded us the information that a *lot* of this guy's other prospects were also his current top choice. Maybe it was true every time he wrote it, but... I hate to think... could he have been trying to *manipulate* us?

    3. Re:Job Recruiters by Anonymous Coward · · Score: 0

      I got two .DOC files from headhunters. The first one I couldn't read because I didn't have the correct version of Wurd. The other was about 100K -- 5K for the text and 95K for the company's logo graphic. 100K doesn't sound like much, but I live in the country and I have a slow dial-up connection, so it pissed me off.

      Idiots. Use a resource editor to rename "Notepad" to "Microsoft Word Lite", and make 'em use that.

    4. Re:Job Recruiters by Sparkle · · Score: 0

      Indeed. Underscores why m$word files should not be used on the web, besides the compatibility issues.

      A few years ago when I took my last job, the job offer came in m$word format. Lacking the correct product, I used strings(1) on it to see what it said and was able to see 4 or 5 previous offers to 4 or 5 previous candidates. When I learned what my offer really said and expressed disappointment to my headhunter, I was able to get a better starting bonus. Very nice.

    5. Re:Job Recruiters by tealover · · Score: 1

      Wait a minute. Your colleague "innocently" used strings to get the data because he didn't use Windows...and accidentally became privy to private information ?

      I hope you guys didn't offer him the job because he's better off not working with sneaky bastards like you guys.

      --
      -- You see, there would be these conclusions that you could jump to
  20. their fucking spawn of satan track changes by Unknown+Poltroon · · Score: 0, Troll

    is probably what's doing this. GOD it sucks. They've managed to make it more confusing and less usable in XP than ever before. You ever tried to tell a user what to click on in a toolbar? WTF happened to putting the goddamn command in the toolbar???

    --
    All Troll + "offtopic" mods are meta moderated as "Unfair", because you abused the system.
  21. What exactly's the big deal here? by GillBates0 · · Score: 0
    Using an ordinary online search engine and a random selection of keywords, Byers was able to find more than 100,000 Word documents including business documents and individual resumes.

    No really, what IS the big deal? So supposedly, he did an online search, and did some text-extraction from Word docs, which Google helpfully does for you anyway, and came up with some "secrets" which were published online anyway, thus contradicting the term itself. Google also indexes PDF, DOC, PPT and many other formats anyway.

    Moreover, if the information was indexed, it was either put online intentionally (either because it wasn't secret data, or out of malicious intent), or unintentionally. The latter case was probably because of poor sysadmining/webmastering, which isn't a big secret anyway.

    Sorry for the sorry rant, but it's yet another friday evening with nothing to do.

    --
    An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
    1. Re:What exactly's the big deal here? by Li0n · · Score: 1

      It isn't the text but the metadata that this is all about.

      For example, the article mentions the case of the Washington sniper. Some data of a published PDF was blacked out but if you opened the file in a text editor, you could still see the fields.

      --

      ~
      ~
      :wq
    2. Re:What exactly's the big deal here? by William+Tanksley · · Score: 2, Informative

      Incorrect. You didn't read the article.

      He did the search, as you said, but he didn't use Google's conversion; instead, he looked directly inside the DOC file, where Word keeps a bunch of information for its own purposes -- stuff that was deleted, stuff that was just in the wrong memory location when the save happened -- whatever.

      He found legitimate docs, with legit contents; but they also contained some stuff that the authors didn't intend to publish.

      -Billy

    3. Re:What exactly's the big deal here? by B3ryllium · · Score: 1

      I haven't read the article - nor do I intend to - but I gather that the information in question is NOT displayed when viewing the document in Word, but has been "deleted" by the user. Word preserves the deleted text, it seems, as part of the undo feature - and yet, the undo feature doesn't work if you have freshly opened a file.

      Sort of like deleting password emails from Outlook Express, and then having someone retrieve them from the .mbx file despite the fact that you deleted them.

    4. Re:What exactly's the big deal here? by ratfynk · · Score: 1

      So you load a Word .doc that contains some nice little mail macros and, bingo, because you were stupid enough to open it with an unpatched illegal copy of Word 97 (which is what one hell of a lot of small business owners use), you get slammed. End of story. Go buy Longhorny and help Microsoft control stupid users! Pay Pay Pay

      --
      OH THE SHAME I fell off the wagon and use sigs again!
    5. Re:What exactly's the big deal here? by Anonymous Coward · · Score: 0

      No shit...

      133 000 hits

  22. Helpful Hint by cgreuter · · Score: 4, Funny

    Remember kids: strings is your friend. If you happen to get a job offer in the form of a Word document and the HR drone who sent it to you wasn't careful, you can often see the version that got sent to other candidates and, more importantly, how much money they were offered. It can do wonders for your bargaining position.
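For anyone without strings(1) handy, a rough Python equivalent is sketched below. `extract_strings` is a hypothetical name. It looks for runs of printable characters stored one byte each and also as 16-bit little-endian text, on the assumption that a Word file may hold text in either form (GNU strings needs `-e l` for the latter). It finds raw text only and knows nothing about the .doc format itself.

```python
import re


def extract_strings(data, min_len=4):
    """Return sorted (offset, text) pairs for printable runs in a bytes object."""
    # Runs of printable ASCII stored one byte per character.
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII stored as UTF-16LE (each character followed by 0x00).
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [(m.start(), m.group().decode("ascii"))
             for m in ascii_run.finditer(data)]
    found += [(m.start(), m.group().decode("utf-16-le"))
              for m in utf16_run.finditer(data)]
    return sorted(found)
```

Feed it the raw bytes of the file, e.g. `extract_strings(open("offer.doc", "rb").read())` (file name made up), and compare what comes out against what Word shows on screen.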

    1. Re:Helpful Hint by Anonymous Coward · · Score: 0

      how? How?! HOW!??!?!?!

    2. Re:Helpful Hint by Anonymous Coward · · Score: 0

      By using strings, you fuck.

    3. Re:Helpful Hint by Anonymous Coward · · Score: 0

      C:\> strings.exe document_name

    4. Re:Helpful Hint by Reziac · · Score: 2, Informative

      Or for us DOS folks, there's XRay, last seen floating around Simtel (xray102.zip or something like that, in /textutils). It does a nice job pulling text strings out of any binary, and redirects handily to a file or your fave viewer (frex, LIST). I've used it to retrieve the complete content from a Word document that was hopelessly corrupted, and to see what fun was to be had in another document's "deleted" space.

      XRAY is also handy for pulling text out of executables. Frex, a brief rant about upper management, found lurking inside an .EXE from an old version of Paradox. :)

      Or if you're used to looking at raw binaries, skip the middleman and just use Buerg's LIST, as I do. :)

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  23. passwords.txt by mindsuck · · Score: 1

    Oh, so I'm not supposed to save all my important passwords in plaintext in a clearly marked "passwords.txt" file in my webserver for easy access?

    Oh damn.

    --
    --- I w00t, therefore I'm l33t.
  24. I'm seeing a weird problem... by Anonymous Coward · · Score: 0
    I think it might be a new virus attack.

    It seems that goatse is down.

    Can anyone check to see if it's still working?

    thx!

    1. Re:I'm seeing a weird problem... by Anonymous Coward · · Score: 0

      It is not down! I have been reloading it, just to make sure. I IM it to all my friends, and they all can see it... are you sure there is a problem with goatse or with your computer? Let me check once more... yep, it is online. Once more... yep! online!

    2. Re:I'm seeing a weird problem... by Anonymous Coward · · Score: 0

      You're full of shit. I have it set as my homepage.

  25. Re:Slow news day by zcat_NZ · · Score: 1

    Anyone wants to guess what's the second most dangerous animal for human beings?

    Other human beings?

    --
    455fe10422ca29c4933f95052b792ab2
  26. Not just documents by I8TheWorm · · Score: 3, Informative

    It doesn't pertain to just documents. I've seen code samples posted to sites like experts-exchange where DB connection strings still had UID and PW data in them. Seems people don't re-read before they post very often.
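A small Python sketch of the re-read step done mechanically: mask credential values in connection strings before pasting code anywhere public. `redact` and the key list (uid, user id, pwd, pw, password) are illustrative assumptions, not a complete list for any particular database driver, so anything it misses still has to be caught by eye.

```python
import re

# Illustrative key names only; real drivers accept other spellings too.
_SECRET_KEYS = ("uid", "user id", "pwd", "pw", "password")
_PATTERN = re.compile(
    r"(?i)\b(" + "|".join(re.escape(k) for k in _SECRET_KEYS) + r")"
    r"(\s*=\s*)([^;\"'\s]+)")


def redact(text):
    """Replace the value of each credential-like key=value pair with ****."""
    return _PATTERN.sub(lambda m: m.group(1) + m.group(2) + "****", text)
```

Run on `Provider=SQLOLEDB;Server=db1;UID=sa;PWD=hunter2;` it leaves the provider and server alone and masks the two credential values.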

    --
    Saying Android is a family of phones is akin to saying Linux is a family of PCs.
  27. Clippy did it by sbillard · · Score: 5, Funny

    It looks like you're trying to post a document on the web.
    Would you like to...
    1. Divulge corporate secrets?
    2. List your passwords?
    3. Remove KB823980 and open port 135?


    It looks like your trying to close Clippy.
    Would you like to...
    1. Shit in your hat?
    2. Put fist through bling bling flat panel?
    3. Go home for teh weekend?

    1. Re:Clippy did it by Anonymous Coward · · Score: 0

      Clippy was sure annoying, but at least he spelled "you're" correctly.

  28. Re:Slow news day by Anonymous Coward · · Score: 0

    It's probably something like a jellyfish. But I'm going to guess that it's those vile bugs that live inside the beards of Linux Communist Hippies.

  29. eh? by DrSkwid · · Score: 2, Interesting

    Google indexes PDF documents, it even turns them into HTML

    of course you could always try http://searchpdf.adobe.com/

    Now there's a way to search through more than a million summaries of Adobe(R) Portable Document Format (PDF) files on the Web. Your search results will allow you to see the summaries before deciding to view the original Adobe PDF.

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    1. Re:eh? by I8TheWorm · · Score: 1

      Crap! Maybe I should use google more often then.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
    2. Re:eh? by DrSkwid · · Score: 1

      once would work

      8)

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  30. google by FFON · · Score: 0

    how did he get paid so much for his research?
    i'm a doofus and came up with this search
    "index of" secret .doc
    google it baby

    --
    .cig
  31. Check this out... by Geminatron · · Score: 4, Informative

    View some of the past word docs you've received in a hex editor...

    Near the bottom there is often information from other documents of the sender that they were recently working on. I don't know why it saves this. Maybe something to do with the undo buffer?

    At work I used to look at internal memos that would be sent out on a weekly basis and find out all sorts of other stuff that was going on.
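If no hex editor is at hand, the tail of a file can be dumped with a few lines of Python. This is only a sketch; `hexdump_tail` is a hypothetical name, and the 256-byte default is arbitrary, since how much leftover material sits near the end of a given .doc is anyone's guess.

```python
def hexdump_tail(data, count=256, width=16):
    """Format the last `count` bytes of `data` as offset, hex and ASCII columns."""
    start = max(0, len(data) - count)
    lines = []
    for offset in range(start, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return lines
```

Typical use would be `print("\n".join(hexdump_tail(open("memo.doc", "rb").read(), count=2048)))`, with the file name and byte count being made-up values.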

    1. Re:Check this out... by Anonymous Coward · · Score: 0

      Quite possibly uninitialized memory.

    2. Re:Check this out... by Reziac · · Score: 1

      Word thinks it needs some parts of its document format padded out to a certain size. To this end, it grabs random junk from RAM, other documents, or even the swapfile.

      I've seen a case where someone whoopsidentally sent a .DOC to a mailing list. Upon decoding it, I discovered not only the intended innocent joke, but also some bad porn written by the sender's boss, grabbed from gods know where (not originally part of this document).

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  32. Repost or just a big DUH! by jriskin · · Score: 0, Redundant

    You mean word documents/[INSERT alternative MS product here] may contain crap you didn't think they did?! NO SH*T...

    This has been a problem and reported for years now...

    It's called Save As...

    Oh? You say end users can't be trusted to understand technology and or can't be trusted to dispose of or not reveal sensitive information? Another...DUH!

  33. OH NO! by SatanicPuppy · · Score: 2, Insightful

    NOT MY PERSONAL INFO! NOOOOOOOOO!

    This isn't just nothing new, it's old news. Wasn't this how they caught the guy who wrote the melissa virus? When that little popup window from MS Office came up asking for their personal info, did they just think Office was trying to get to know them better, in order to be their friend?

    It's just silly pressmongering. Those dumbasses have to come up with a terrifying computer factoid every day, or the ignorant compu-phobes they prey on might come to their senses.

    Just my opinion.

    --
    ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    1. Re:OH NO! by Gonoff · · Score: 1

      I read the contents of that when it started arriving on our fileservers. It was pretty obvious that the guy who wrote it wasn't particularly worried about covering his tracks.

      --
      I'll see your Constitution and raise you a Queen.
    2. Re:OH NO! by Anonymous Coward · · Score: 0
      Wasn't this how they caught the guy who wrote the melissa virus?


      Can't say for sure, but this is how slashdotters found out the identity of the firm and the person who drafted the "Microsoft Switch" campaign. Through the Word document that was linked.
  34. Old News! by ivanmarsh · · Score: 1

    The fact that MS "productivity" products store user information in the files they produce hit the news very shortly after MS Office '95 came out.

    It seems no one much cared back then because MS has obviously left this serious security flaw in their software.

    Imagine that?

    1. Re:Old News! by ivanmarsh · · Score: 1

      Thought I'd follow that up with a link from the horse's mouth:

      http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnword2k2/html/odc_ProtectWord.asp

    2. Re:Old News! by ratfynk · · Score: 1

      The reason why they leave the security flaws intact is obvious. They are going to need to sell LONGHORNY big time to show corporate software sales growth to the stockholders. Their Xbox is still costing them money every time someone buys one, they are not making huge enough money from educational certification extortion any more, and they are having a hard time keeping up with the number of software patent violation lawsuits. They really need to eat Symantec's lunch again without getting sued to the nines. They need Longhorn to sell in the billions as soon as it is released, otherwise their paltry 30% margins will start to dry up. Their margins were over 40% before the 2000 dot-com balloon blew up. They might have to settle for normal corporate profit levels if Longhorn takes too long to weasel its way into every computer.

      --
      OH THE SHAME I fell off the wagon and use sigs again!
  35. My 2c.. and a terrible pun. by zcat_NZ · · Score: 4, Interesting

    It's only going to get worse; Google's really expanded on the number of file types it indexes and caches.

    One of my clients was recently caught out when google indexed private metadata she didn't know was still there, so I can well understand the gravity of this situation.

    --
    455fe10422ca29c4933f95052b792ab2
    1. Re:My 2c.. and a terrible pun. by randyest · · Score: 2, Interesting

      Whoa, that's very cool. I love it when I learn a new google goodie.

      If you didn't try that 'gravity' link in the parent, check this out. Google calculator -- takes input in standard algebraic format, and knows some variables and units too (such as "G" being the universal gravitational constant, "mass of earth", and "radius of earth"), so you can just use the variable name and google fills in the values, converts units as needed, and gives a numeric result. Nice.

      However, unless I'm doing something wrong or they're still updating, the known variables seem rather limited. ( population of china ) / ( surface area of earth) didn't work. Neither did ( 1 barleycorn ) / (1 mm).

      Anyone have tips on this new google gem?

      --
      everything in moderation
    2. Re:My 2c.. and a terrible pun. by inerte · · Score: 1

      Anyone have tips on this new google gem?

      Look, look:

      http://www.kuro5hin.org/story/2003/8/14/21307/5189

    3. Re:My 2c.. and a terrible pun. by pherris · · Score: 1

      Just to build on this thought try searching google for +"internal use only" filetype:doc or +"internal use only" site:.gov filetype:doc.

      --
      "And a voice was screaming: 'Holy Jesus! What are these goddamn animals?'" - HST
  36. Basic programming techniques, eh? by Xeth · · Score: 1

    Is that all it takes to hack into Microsoft's file servers anymore?

    --
    If your theory is different from practice, then your theory is wrong.
  37. i have my own special program that does this... by jkitchel · · Score: 4, Funny


    it's called http://www.google.com and you search by "top secret documents filetype:doc".

  38. Is this news to anyone? by Anonymous Coward · · Score: 0

    I mean, c'mon. Anyone with half a brain can open a M$ Word document in a plain text editor and, without any work whatsoever, find out what the SMB name of that computer is, on what drive the OS is installed, what printer is used and what the name of that printer is, and so on and so forth.

    If this is newsworthy, so is telling "If you press F3 in Explorer, or if you start DirectPlay, your computer will try to connect to Microsoft servers to do stuff behind your back".

  39. It's easy... by inertia187 · · Score: 4, Informative
    This is the easy way:
    "Index of" "Name Last modified Size Description"
    Then you add file extensions or other things. For example:Anyway, as you can see, it's pretty effective. Sometimes admins wise up, and all you have is the Google cache. But sometimes they don't, and you get to look. Thanks Google!
    --
    A programmer is a machine for converting coffee into code.
    1. Re:It's easy... by Anonymous Coward · · Score: 1, Funny

      Wow... all i have to say is wow. so i figured it'd be fun to go find kids grades and i added "grades" to your "my documents" link. guess what popped up in google? Some teacher's affair. This is kind of scary when you think about it....

    2. Re:It's easy... by CSG_SurferDude · · Score: 1

      OMG!!!

      I'm ashamed to say that I never even thought of that one.

      Thanks for the tip. Now I have a chance of completing my bootleg collection of Pink Floyd albums ;-)

    3. Re:It's easy... by tfazzone · · Score: 1

      here's another one to try in google: internal use only confidential

    4. Re:It's easy... by Reziac · · Score: 1

      On substituting "hidden" for "secret", one of the first results was this amusing bit:

      Index of /Courses/S03/ECSE-4961/Finding Hidden Things

      Well, yeah, that was the whole idea... ;)

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    5. Re:It's easy... by nicky_d · · Score: 1

      Of course, once porn vendors latch onto this (they seem to have been onto the "index of" part for a while), going through the results will take long enough to invalidate the technique. Unless you add something like "-blowjob", I guess. Ah, Google wins again. Don't forget you can specify the above search strings as "intitle:" phrases to focus the results a little more.

    6. Re:It's easy... by inertia187 · · Score: 1

      Actually, the porn vendors have to be careful to keep things well behaved. For example, if they use a real Apache directory listing, they have to stock it full of html files, which are easy to spot and ignore. But if they use a copy of an Apache directory listing and modify it so that it can do popups, there might be a way to get Google to filter it. For instance, an Apache directory listing has no business having a element. I just don't know if Google will let you filter based on HTML elements.

      --
      A programmer is a machine for converting coffee into code.
    7. Re:It's easy... by mcrandello · · Score: 1

      I always added "Parent Directory" in to make sure I'm getting a real index. May help weed out porn pages as someone above me suggested may happen.

  40. I hate to state the obvious but, by pair-a-noyd · · Score: 2, Insightful

    how many incidents will it take before people realize that ALL Microsoft products are insecure?

    What will it take? What happens when a script kiddie hacks a hospital and shuts down the life support systems in ICU? Or just juggles the meds for the patients so that everyone in the hospital gets the wrong meds?

    Or perhaps they glitch the Air Traffic Control system and airplanes rain down from the sky and tens or hundreds of thousands of people die??

    Before the last war in Iraq started they showed the "state of the art" US command center just across the border in a big tent.

    Tens of dozens or more, soldiers and dozens upon dozens of PC's. You could clearly see on the displays that they were *ALL* running Windows.

    I thought, "Oh shit, the security of this country is being placed in the trust of the worst product ever..."

    Those PC's I saw were NOT Tempest, for one, and then add the Windows factor in plus the state of war and you're asking for serious trouble.

    Windows will at some point cause a massive catastrophe and cause great loss of life and property. You can bet on it.

    This country is far too dependent upon computers to operate. When the computer goes down, well, sit on your hands for awhile...
    I remember the days before computers, everyone got things done just fine. Now no one knows how to function without them..

    1. Re:I hate to state the obvious but, by lyingidle · · Score: 1

      Ok,

      I have worked in hospital IS departments and have also spent some time in the Army, and in neither place was any critical system run by Windows machines. What you do see are lots of cheap Windows boxes using terminals to access UNIX machines.

      Think about it. Currently it's a lot easier to support a bunch of Windows desktops than Unix or Linux desktops. More people are familiar with them. Most users know how to launch applications in Windows (double click the big "E"), all they need to learn now is one more app, THE ONE RUNNING ON UNIX, being presented to them in a friendly Windows context.

      Done

      - Learn from the mistakes of others. You can't live long enough to make them all yourself.

    2. Re:I hate to state the obvious but, by GigsVT · · Score: 1

      a lot easier to support a bunch of Windows desktops than Unix or Linux desktops

      You picked a bad day to say that, after thousands of us spent last week running around putting patches on Windows machines.

      A simple shell script would have done it all automatically had those clients been anything other than Windows.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  41. Re:Mleh by shades66 · · Score: 1

    The next edition of Office 2003 will include tools that will allow users to remove personal information from a document. It will also include new "information rights management" that will let an author specify who can read or forward a document.

    So Microsoft is going to add tools to remove what shouldn't be there in the first place? Can't they just fix their software to not include it in the first place? What is next? INTERNET EXPLORER 2008 will have a new feature that allows the user to stop viruses being automatically executed. Order now at $399 to protect your computer from our shoddy software!

    --
    ---- There are 10 types of people in the world. Those that understand binary and those that don't
  42. Don't worry by ratfynk · · Score: 2, Interesting

    Gates and co will take care of all your sensitive info, very soon. With the help of the DMCA, Sen. Fritz, and MS servers, we all will be so secure that no one other than MS and the right government agencies will be able to unlock your locked online .docs. So smarten up, bow to Redmond, and pay up, suckers! It's upgrade-or-lose mania time again; can your business not afford the wonderful new security that's coming? Good luck getting your secretaries to use anything other than MS orafice!

    --
    OH THE SHAME I fell off the wagon and use sigs again!
    1. Re:Don't worry by ratfynk · · Score: 1

      Wow I pulled a good spelling error SECRETARIES, a new job classification, for employees who do all your dirty work and keep their mouths shut!

      --
      OH THE SHAME I fell off the wagon and use sigs again!
  43. Update: Because he confessed... by finallyHasANickname · · Score: 1
    ...that he did that in public, a swarm of black helicopters arrived immediately in front of the laboratory. He was promptly hauled to jail for violating the DMCA.

    At the courthouse, all people who use Google and who got caught were also standing in a queue to await indictment proceedings before a federal grand jury...

    Microsoft blames the slowup on commodity protocols and recommends MS-Jailer 2.0 (just released) to speed up the whole process "with only one degree of separation".

    Scott McNealy's face turned red, and he proclaimed, "Look. Those people are standing out there in the heat, which is about all you should expect with the power efficiency levels of throwback 32-bit CISC technology throwing the book at them and with Microsoft's jailing software countercompetitively tied to the kernel, which is in C++, not C, not Java, not standard, not open like..."

  44. Another way to find "secret" data by cnb · · Score: 4, Informative

    How many people actually protect their website
    statistics?

    Adding a simple /stat/ or /stats/ or a variation
    with a combination of "web" or the name of any of
    the common statistic generation programs gets you
    access to the statistics of a *lot* of websites.

    Then from the stats you could find any "hidden"
    data which is not linked on the site including
    internal company documents, girlfriend's nude
    photos or mp3s.

    Alternately you could just google for the
    statistic reports of sites and get there
    more easily.

    This is another case of ill informed or lazy
    users not following what should be a simple
    security policy which could cause serious
    repercussions.

    For those who want to know how to protect
    yourself, read this link.

  45. Word documents are stupid. by rice_burners_suck · · Score: 1
    This is WONDERFUL!!! This information should be pointed out to those annoying people who email you those annoying Microsoft Word documents, when the content could have been presented just as effectively (or more so) in plain ASCII text.

    But instead of explaining it all technical and telling people how they can strip private information, you should use Microsoft's own techniques of FUD against them by telling people that Microsoft Word files contain all their private information and that information is gathered into a database by a ring of 1337 h4x0rz around the world, who then use the information to steal your credit card numbers.

    People are so stupid that they will actually believe that.

    1. Re:Word documents are stupid. by bsd+troll · · Score: 0

      But not stupid enough to listen to you, fortunately.

  46. Re:flamebait ?!?? by zedmelon · · Score: 1

    Kiss my ass. How is this flamebait?

    --
    Mom says my .sig can beat up your .sig.
  47. MS Word got Tony Blair busted in the WMD case by leoaugust · · Score: 5, Informative

    Tony Blair got busted in the WMD case because the names of the people who revised the WMD documents were still in the Word file. Now, it seems, Downing Street only puts PDF files on the web, and has removed all the MS Word documents that were already there ....

    Tools reveal secret life of documents - Documents like in Word save too much Info - Blair Episode

    By Mark Ward

    July 03, 2003

    The UK Government was just the latest in a long line of organisations that has learned to its cost just how much information can be gleaned from innocent looking files. Earlier this year it issued a document called the "dodgy dossier" about Iraq's concealment of weapons of mass destruction that was written using Microsoft Word. Every Word document remembers who made the last few revisions to it. The log reveals the names of four of the people who prepared the Iraq document for publication and the government Communications Information Centre that some of them work for. It was this log that Number 10 press chief Alastair Campbell had to explain to the House of Commons Foreign Affairs Select Committee in late June as part of its investigation into the Iraq dossier's history. Some of this information can be seen simply by right-clicking to view the properties of the downloaded document in a file listing. Utility programs can get even more information from Word revision logs.

    The life stories of the documents we create are becoming increasingly important as the scrutiny of industries and governments gathers pace. Every time you write or edit these files you leave a trail of information revealing what you did and when you did it. With the right tools it is possible to extract this data and work out the trail of authors and workers who created a document. That is why we should all use opensource and open data formats - so that we can humanly read what all we are "putting" into the document. The Word version of this document has now been removed from government websites but copies of it are still available elsewhere on the net.

    Unabridged and unedited article at

    http://news.bbc.co.uk/2/hi/technology/3037760.stm

    --
    To see a world in a grain of sand, and then to step back and see the beach where the sand lies ...
    1. Re:MS Word got Tony Blair busted in the WMD case by Anonymous Coward · · Score: 0

      You trust the BBC?

      BBC Leak Blows Al-Qaeda Sting Operation

      They are obviously more concerned about getting attention than with things like accuracy and responsibility.

    2. Re:MS Word got Tony Blair busted in the WMD case by Anonymous Coward · · Score: 0

      I always said a responsible press does exactly what a foreign government tells them to do.

    3. Re:MS Word got Tony Blair busted in the WMD case by Anonymous Coward · · Score: 0
      You trust the BBC?

      BBC Leak Blows Al-Qaeda Sting Operation

      They are obviously more concerned about getting attention than with things like accuracy and responsibility.


      I would have said, "they are more concerned about accuracy and responsibility than furthering the agenda of a fascist government". Do you really think that this guy (an east end barrow boy, no less) should just be "disappeared" off of the streets with nary a word said in the press? That this was a clear case of entrapment is hardly relevant; the really scary part of this story is that dolts like you are prepared to trade in liberty for an illusion of safety.
    4. Re:MS Word got Tony Blair busted in the WMD case by meringuoid · · Score: 1
      They are obviously more concerned about getting attention than with things like accuracy and responsibility.

      Irresponsible? Possibly... it's conceivable that someone might have escaped being disappeared to Gula^H^Hantanamo as a result of the BBC breaking the story so early. But inaccurate? As far as I am aware everything in the BBC's report was true.

      Personally, I'd trust a news agency that broke the news regardless of whether or not Bush liked it, rather than one which only published news that the Party considered helpful.

      --
      Real Daleks don't climb stairs - they level the building.
  48. The British experience - government stupidity by pyrotic · · Score: 2, Informative

    Have to post a link to this famous example, the dodgy dossier. There was a writeup here. If you're thinking of making the case for war, don't release Word documents to the press - unless they're very very docile.

  49. Images too! by Anonymous Coward · · Score: 0, Informative

    Cat Schwartz, of TechTV fame, discovered that cropped JPEG images may also contain uncropped thumbnail images (warning: PG-13 content). There's some debate whether the images in question came from Photoshop or from a thumbnail image stored by the digital camera, but it was a humbling oversight in either case.

    1. Re:Images too! by Anonymous Coward · · Score: 0

      I wish she had never found that out. There's no telling what we could have seen in other photos.

    2. Re:Images too! by Anonymous Coward · · Score: 0

      According to the EXIF information stored in the jpg, the camera was a Nikon D100. The thumbnails are created when you take the picture.

      Nice camera, but nicer tits.

  50. if anything, the opposite by siskbc · · Score: 2, Interesting
    This will become a common way for 'big' corps to spy on 'small' corps (and individual users?), to find new ways to both screw them over, and appear 'omniscient'. They'll never (or rarely) get called on it. Meanwhile, anyone who tries to reveal information discovered in this way which is incriminating towards said big corps will get sued for being "hackers" and/or "terrorists".

    Aside from the paranoia overtones, I still disagree. The tools for doing this are on the web. Right now. So in other words, a weapon has been released that is free and easy to use. If anything, this will help small, poor companies with no resources for industrial espionage get a little information out of people who don't know any better, including their large-company rivals. All they have to do is hire one of the celibate wonders that read slashdot, and they're in business.

    --

    -Looking for a job as a materials chemist or multivariat

  51. DMCA violation? by notcreative · · Score: 4, Interesting

    By using tools that break the "encryption" on, for example, the Washington Post .pdf file mentioned in the article, isn't the researcher violating the DMCA? Isn't his whole project bragging about doing this, a la 2600?

    I hope he remembers a few packs of cigarettes in order to buy himself a few nights of sleep in the Big House.

  52. Re:It's been said hundreds if not thousands of tim by shaitand · · Score: 0, Troll

    Solution: stop installing word on your corporate network. Stop allowing users to install software. No more problem.

  53. Re:Prediction (x1488) by Anonymous Coward · · Score: 0

    Objection, your Honor, this is pure speculation. The prosecution has not established sufficient grounds that this witness is qualified to predict what dominant and powerful techniques will be used in the future.

    SUSTAINED!

  54. Re:WHERE ARE THE WMD'S by Anonymous Coward · · Score: 0

    Thank you, Saddam.
    Or, if you're not Saddam himself, you are at least his personal cock sucker.

  55. Re:Slow news day by Anonymous Coward · · Score: 0

    Actually, the truth is that the mosquito kills more humans each year than any other animal. Go look it up.

  56. Didn't I already write about something similar? by NewtonsLaw · · Score: 3, Interesting

    This isn't really new -- check out this story I wrote for CNet/ZDNet over a year ago.

  57. This not anything new. by gnuforpresident2004 · · Score: 2, Interesting

    This type of thing happens all the time, not just with digital media but with other media also. People go through others' garbage recreating shredded documents; camcorders catch people in the act; then there's carbon paper and copying machines. You always need to be careful when dealing with your own and others' information.

  58. Newsflash: People shoot themselves in feet by darthwader · · Score: 3, Informative

    ... then suffer foot wounds.

    At the risk of being moderated Troll and Redundant,
    Why are these people posting Word Documents online?

    The Word Wide Web is not the Microsoft Wide Web.

    Post in plain ASCII text, or HTML if you feel the need to pretty it up.

    People keep using tools that are far more powerful and complex than they need, then they screw up, and blame the tools. Pick a simple tool to do a simple job, and you don't need to worry about your ignorance of the tools you are using causing you problems.

    --
    I hate it when I make a joke and I get modded "+5 insightful". Mod the stupid comments "funny", not "insightful", pleas
    1. Re:Newsflash: People shoot themselves in feet by Anonymous Coward · · Score: 0

      Why are these people posting Word Documents online?

      Because they know how to use Word and perhaps a simple CMS, but they don't know a lot of HTML. How would the average Wordphile go about putting a document with a chart in it on the web? Organisations with underdeveloped IT policies and staff are conducive to this kind of behaviour.

    2. Re:Newsflash: People shoot themselves in feet by chrislee35 · · Score: 1

      Typing in Word is convenient for most people because of simple speel-checking :). Then if they want to publish online, they have to export it, right? Well, if you export from Word, it adds tons of meta-tags with possibly unwanted information in there as well, and Word does a crappy job of exporting to text. Besides, people don't like looking at boring monospaced text documents. HTML just looks better, right?

    3. Re:Newsflash: People shoot themselves in feet by ishmaelflood · · Score: 1

      Well, they'd probably hit the "Save as Web Page" command in the File menu.

      If you don't mind the size it is pretty painless, and often works.

  59. It comes down to one question by Alethes · · Score: 1

    Is security the responsibility of the software or the users? Should we point the finger at that horribly insecure software that shouldn't allow this sort of thing to happen or the ignorant users who put the sensitive data in the document? Both?

  60. Re:Slow news day by Anonymous Coward · · Score: 0

    No, it's an insect. We discount the people from the list.

  61. UK govt caught out by g_attrill · · Score: 3, Interesting

    This has happened to the UK government several times. The latter link shows whose sticky fingers were on the infamous "dodgy dossier".

    Gareth

  62. Re:Slow news day by Anonymous Coward · · Score: 0

    Mosquitos just transfer viruses and existing diseases known to humans. A mosquito bite by itself is harmless to humans, unless there's also a virus injected into the blood. You're right in terms of numbers, but let's say we leave the mosquitos out of this list.

  63. Re:WHERE ARE THE WMD'S by Anonymous Coward · · Score: 0

    As Saddam's personal advisor, I'd advise you to crawl back into Dubya's stinking ass.

  64. Re:Slow news day by Anonymous Coward · · Score: 0

    Ok, time is up.

    The most dangerous animals (counted by the number of humans, killed globally within the recent years):
    1: Snakes.
    2: Wasps and bees.
    3: Alligators/crocodiles.

    See the post above explaining why mosquitos is not exactly correct.

  65. Re:Slow news day by Anonymous Coward · · Score: 0

    let's say we leave the mosquitos out of this list

    Okay. It's just that they are so small and easy to pick on.

  66. It Must Be A "Technological Measure" by John+Hasler · · Score: 1

    So who is going to be the first to claim that running a Word document through strings violates the DMCA?

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
  67. Just cat it by Anonymous Coward · · Score: 0

    In some cases all you need to dig up the sensitive information is to "cat word.doc". Assuming you have some *n*x utilities lying around. *n*x should be outlawed.
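
    For anyone without *n*x utilities lying around, here is a rough Python sketch of the same idea as strings: pull runs of printable characters out of a binary file. The function names are made up for illustration, and the 16-bit variant assumes some of the text in a .doc is stored as little-endian 16-bit characters. It knows nothing about the actual file structure.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def utf16_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Same idea for printable characters stored as 16-bit little-endian."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


# Typical use (the path is a placeholder):
#   data = open("word.doc", "rb").read()
#   for s in printable_runs(data) + utf16_runs(data):
#       print(s)
```

    Anything that turns up here but not in the visible document is worth a second look before you post the file.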

  68. You belive the FBI ? ? Re:MS Word got Tony Blair by leoaugust · · Score: 1

    Well, interesting that you should hit the BBC where it hurts, and in some cases maybe they deserve it. But I think this "story" that you talk of is not one of those. Look at my take of the story behind the story behind the story .... The Blair-Bush team is more devious than you might want to believe ...

    ALL Quotations below are from the MSNBC article ... my comments are in the [TT] [/TT] format ....

    At the end of the day, officials say, the Lakhani case remains a story about potential threats and not the real-life terrorists they had once hoped to nail. For all the hoopla over the case, the official confirmed, it was essentially a government-arranged sting that never involved any contact with actual terrorists. If there was no contact with a "real" terrorist, how were they supposed to trap these "real" terrorists ??? But, wait, it gets even better .. the terrorists in the case were essentially all actors -undercover informants playing associates of terrorists for the purposes of making a case against Lakhani, an international arms dealer who, according to Christie, has a history of alleged criminal activity. So, now a person who tends towards a certain kind of behavior is now enticed ... Does this bring to mind the word "entrapment?" They noted that Lakhani, in taped conversations with undercover operatives, expressed his hostility for the United States, his sympathy for bin Laden and his willingness to work with terrorists to supply them with the weapons they needed to shoot down airliners. Is this hostility really surprising when it is known that the majority of the world population thought Mr. Bush's actions have been consistently unjust and immoral? We didn't want this to get out before we could determine whether this guy would cooperate or not.? So, now you tempt him even more. Promise the fix to the junkie. Even if the tendency to addiction is genetic.

    And what better way to tempt the susceptible than by playing both sides - being the buyer, and being the supplier. It sort of reminds me of the "Truman Show." The FBI and Department of Homeland Security agents then arranged to involve the Russian FSB, that country's security service. When Lakhani flew to Moscow last month to actually inspect the missile he would be buying, he met with two FSB agents posing as missile suppliers. Looks like a scene from a C grade movie. The snookered Lakhani then arranged for the missile to be shipped from St. Petersburg and made a commitment to the two FSB undercover operatives to buy 50 more such weapons as well as a multiton quantity of C-4 plastic explosives. Snookered Lakhani ... yes, snookered .. you know what snookered means ? It means Fooled or duped

    When you wake up, ask yourself ... could you be living in times scarier than these ....

    --
    To see a world in a grain of sand, and then to step back and see the beach where the sand lies ...
  69. Re:OMG-Crimes of the century. by Anonymous Coward · · Score: 0

    Microsoft is to blame for:

    Bubonic Plague.
    Baldness.
    The Hair Club For Men president, getting run over.
    Sonny and Cher breaking up.
    Sonny smacking into that tree.
    The Solid Gold Dancers sucking.
    Michael Jackson turning white.
    Global Warming.
    The previous Ice Age.
    WMD's in Iraq.
    Dogs and Cats living together.
    That birthmark on Gorbachev's forehead.
    Plastic fake vomit.
    Fake switch ads.

  70. morons almost choking on smoke&mirrors ?pr?.. by Anonymous Coward · · Score: 0

    execrable.

    fauxking ?pr? ?firm? 'information'/blame filtering(tm) is worse than useless. in fact, it's a crime against humanity. those who practice deception for a living, will take their place among the walking dead.

    the lights are coming up now (no pun intended).

    the stars were quite visible for folks who almost never see them, last night. don't forget to glance skyward tonight/often.

    you can pretend all you want. our advise is to be as far away from the walking dead contingent as possible, when the big flash occurs. you wouldn't want to get any of that evile on you.

    as to the free unlimited energy plan, as the lights come up, more&more folks will stop being misled into sucking up more&more of the infant killing barrolls of crudeness, & learn that it's more than ok to use newclear power generated by natural (hydro, solar, etc...) methods. of course more information about not wasting anything/behaving less frivolously is bound to show up, here&there.

    cyphering how many babies it costs for a barroll of crudeness, we've decided to cut back, a lot, on wasteful things like giving monIE to felons, to help them destroy the planet/population.

    no matter. the #1 task is planet/population rescue. the lights are coming up. we're in crisis mode. you can help.

    the unlimited power (such as has never been seen before) is freely available to all, with the possible exception of the aforementioned walking dead.

    consult with/trust in yOUR creator. more breathing. vote with yOUR wallet. seek others of non-aggressive intentions/behaviours. that's the spirit, moving you.

    pay no heed/monIE to the greed/fear based walking dead.

    each harmed innocent carries with it a bad toll. it will be repaid by you/us. the Godless felons will not be available to make reparations.

    pay attention (to the weather, for example). that's definitely affordable, plus you might develop skills which could prevent you from being misled any further by phonIE ?pr? ?firm? generated misinformation.

    good work so far. there's still much to be done. see you there. tell 'em robbIE.

  71. Re:It's been said hundreds if not thousands of tim by Anonymous Coward · · Score: 0

    Since when is .la Los Angeles? Wtf??

  72. mynuts won, again by Anonymous Coward · · Score: 0

    just kidding robbIE.

  73. Heh by dodell · · Score: 2, Funny

    He says hidden information can "incredibly useful" in improving the functionality of the software. "But if some of that data is sensitive, there have to be ways of ensuring that it isn't distributed where it shouldn't be," he says.

    Apparently they need to use some of the software he used to get a conjugation of the infinitive "to be" back into their text.

    1. Re:Heh by Anonymous Coward · · Score: 0

      That's ok, when you run the document through 'strings', everything looks fine.

  74. In Capitalist America... by Anonymous Coward · · Score: 0

    In Soviet Russia, you make document with Word.

    In Capitalist America, Word documents you!

    --AC

  75. Online what's that? by qat · · Score: 1

    Rights online? What? Where did THEY come from?

    --
    Pls No Negative Modding!
  76. Here's the doc by Pentagram · · Score: 1

    The Word version of this document has now been removed from government websites but copies of it are still available elsewhere on the net.


    Here's a copy of the document. Should save anyone else the trouble of googling for it </karmawhore>.

    1. Re:Here's the doc by Anonymous Coward · · Score: 0

      Can't find anything in that doc other than "MKahn". Does say 4 revisions tho. What am I missing? Strings doesn't seem to return anything useful

  77. Sanitizer by Spunk · · Score: 1

    Certainly, the best solution would be not to use proprietary formats.

    But for those who don't want to change, is there a "Word sanitizer" tool available? Something that will convert one Word doc to another, minus the hidden text?

  78. I stopped reading after "Microsoft Word documents" by sproketboy · · Score: 0, Flamebait

    This is old news. Anyone who would post a DOC file on the web is a dunce anyway. My 2 cents. :)

  79. up to parent directory leaks by whovian · · Score: 1

    Finding personal information in the document metadata is one thing, but finding the documents is another.

    I still find user accounts on which if you do a manual "up to parent directory" and the user has no index.htm{,l} file, you often get a fully navigable listing of their entire html directory.

    Sometimes you find personal files that were never directly linked to, nor intended to be.

    --
    To-do List: Receive telemarketing call during a tornado warning. Check.
  80. Word doc Cleaning Program? by superyooser · · Score: 2, Interesting

    Does anybody know of a program that can clean up deleted info in Word docs? I'm thinking of something like Ad-Aware that scans for certain files, shows you possible security issues (supposedly deleted text, metadata in document properties, etc.), and asks you what action it should take (wipe out/edit text, delete file, etc.).

  81. Microsoft's article on reducing MS Word metadata by 200_success · · Score: 5, Informative

    It has been known for a long time that metadata are hidden within Microsoft Word documents. Microsoft even has Knowledge Base article 237361 explaining how to reduce the amount of metadata appearing in MS Word 2000 documents. Here's an excerpt:

    This step-by-step article explains various methods that you can use to minimize the amount of metadata in your Word documents.

    Whenever you create, open, or save a document in Microsoft Word 2000, the document may contain content that you may not want to share with others when you distribute the document electronically. This information is known as "metadata". Metadata is used for a variety of purposes to enhance the editing, viewing, filing, and retrieval of Office documents.

    Some metadata is easily accessible through the Microsoft Word user interface; other metadata is only accessible through extraordinary means, such as opening a document in a low-level binary file editor. Here are some examples of metadata that may be stored in your documents:

    • Your name
    • Your initials
    • Your company or organization name
    • The name of your computer
    • The name of the network server or hard disk where you saved the document
    • Other file properties and summary information
    • Non-visible portions of embedded OLE objects
    • The names of previous document authors
    • Document revisions
    • Document versions
    • Template information
    • Hidden text
    • Comments

    Metadata is created in a variety of ways in Word documents. As a result, there is no single method to remove all such content from your documents. The following sections describe areas where metadata may be saved in Word documents.

    I'll bet there are more, but they won't disclose them.

    It's a pity that more people don't just save as RTF. It's just as good for most uses, and it's a less obscure format.

  82. Why Word Does This by spectecjr · · Score: 5, Informative
    I just created a Word document, blah.doc and put some text into it. I made sure I had a couple of undo points. I closed it and opened it back up, I couldn't undo SHIT. So where the hell am I being granted this mysterious "convenience?"

    You're not.

    There are two ways of saving a word document:

    • Fast Save
    • Full Save


    Fast Save dumps the binary from memory into the file. Full Save compacts the binary image, and reorders it. This takes time.

    Word's text stream is stored using a piece table. One of the benefits of a piece table is that if you keep the meta information about the text, you can get nearly infinite undo. The way it does this is by having an original data stream, and an appended data stream. Whenever you add data to the file, it gets added as a chunk to the end of the appended data stream. Whenever you delete, the meta table is updated to remove the text from the stream, but otherwise the text itself is left unaffected.

    As a result, text is never removed from the document. A Fast Save (which is the default) under Word dumps the Piece Table as-is (there is probably some compaction over time to remove the no-longer-used data, but it probably only occurs above a given threshold of used to unused text). A full save deconstructs the piece table's meta information, and turns it back into one contiguous stream of data.

    It's all just a function of the way the text is stored while it's being edited. Different editors have different mechanisms; some store data based on lines, and some store it using a gap buffer. But ultimately, the problem exists because Word uses a piece table, and it dumps the entire table to a file by default.

    It's actually a sensible way of handling the text data. However, whoever designed the Fast Save algorithm probably didn't consider the ramifications of the text still being stored in the document. The best workaround? Wipe the unused sections of the piece table. But then you might as well return to using a Full Save, as you'll be ditching the performance benefits anyway.
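
    To make that concrete, here is a toy piece table in Python. It is only a sketch of the general data structure as described above, not Word's real on-disk format, and every name in it (Piece, PieceTable, fast_save, full_save) is invented for illustration. The point is that a delete only edits the table, so a dump of the raw buffers still contains the "deleted" text.

```python
from dataclasses import dataclass, field


@dataclass
class Piece:
    source: str  # "orig" or "add": which buffer this piece points into
    start: int
    length: int


@dataclass
class PieceTable:
    original: str
    added: str = ""
    pieces: list = field(default_factory=list)

    def __post_init__(self):
        if self.original:
            self.pieces = [Piece("orig", 0, len(self.original))]

    def _buf(self, source):
        return self.original if source == "orig" else self.added

    def text(self):
        """The document as the user sees it."""
        return "".join(
            self._buf(p.source)[p.start:p.start + p.length] for p in self.pieces
        )

    def _split(self, pos):
        """Make sure a piece starts at pos; return that piece's index."""
        offset = 0
        for i, p in enumerate(self.pieces):
            if pos == offset:
                return i
            if pos < offset + p.length:
                left = pos - offset
                self.pieces[i:i + 1] = [
                    Piece(p.source, p.start, left),
                    Piece(p.source, p.start + left, p.length - left),
                ]
                return i + 1
            offset += p.length
        return len(self.pieces)

    def insert(self, pos, s):
        i = self._split(pos)
        self.pieces.insert(i, Piece("add", len(self.added), len(s)))
        self.added += s  # new text is only ever appended

    def delete(self, pos, length):
        i = self._split(pos)
        j = self._split(pos + length)
        del self.pieces[i:j]  # buffers are untouched; only the table changes

    def fast_save(self):
        """Simplified: dump both buffers verbatim (table itself omitted)."""
        return self.original + self.added

    def full_save(self):
        """Rebuild one contiguous stream containing only the visible text."""
        return self.text()
```

    With pt = PieceTable("Salary: 90000. Hello."), calling pt.delete(0, 15) and then pt.insert(0, "Dear Bob. ") leaves pt.text() reading "Dear Bob. Hello.", yet fast_save() still carries the salary figure while full_save() does not.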

    Simon
    --
    Coming soon - pyrogyra
    1. Re:Why Word Does This by Psychic+Burrito · · Score: 2, Insightful

      What I don't understand is why Microsoft even does this distinction between fast and full save when it would be possible to create a single save mode that is both fast and full, bear with me for a moment:

      At the moment the user hits "save", "fast save" is faster because Word doesn't have to do any re-interpreting of what is already in memory. This step is what makes full save slower. But the re-interpreting doesn't have to happen at the moment the user hits "save"; it can happen all the time while the user is editing his document. During editing, the performance of the machine is largely unused anyway. And when the user hits "save" in this better version of Word, the application can just save the interpreted data to disk, which is even faster than "fast save", since it's less data!

      Any comments? Thanks! :-)

    2. Re:Why Word Does This by Anonymous Coward · · Score: 1

      This doesn't sound bad to me, and I write software for a living.

      But I think you are missing the point. When you are developing a large complex software project, there are always hundreds of different trade-offs and design decisions to be made everywhere. In general, each design decision needs to be examined for exactly the gotchas that caused this article to be written in the first place. That simply is not done at Microsoft. Not that other software developers don't skip this kind of analysis, but at Microsoft, it seems to be a standard way to do anything! And the result is the insecure, hacked-up, unstable piece of crap software mess that Windows and Windows software has grown into.

      Everyone on /. seems to be hung up on the monetary cost of Windows and Office software. I am far more concerned with the hidden dangers mentioned in this article.

      As for the stupidities mentioned in the article about PDF files, this is a problem with training. Too much computer software "training" is a rote description of "point here, click there" with absolutely no understanding of what the underlying processes are.

    3. Re:Why Word Does This by blibbleblobble · · Score: 1

      "However, whoever designed the [MS-Word] Fast Save algorithm probably didn't consider the ramifications of the text still being stored in the document."

      I have to say, emacs is probably one of the worst offenders in this regard: there must be loads of people with the source-code to their web pages available as index.php~ or #index.php# which they've forgotten to delete.
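A quick way to check your own web root for these leftovers, as a rough Python sketch. The two patterns cover only the names mentioned above (`file~` backups and `#file#` autosaves); other editors leave other droppings, and the function names are invented for this example.

```python
import fnmatch
import os

# Backup (name~) and autosave (#name#) patterns as described above.
LEFTOVER_PATTERNS = ("*~", "#*#")


def is_editor_leftover(name):
    """True if a bare file name looks like an editor backup or autosave."""
    return any(fnmatch.fnmatch(name, pat) for pat in LEFTOVER_PATTERNS)


def find_editor_leftovers(root):
    """Walk root and return the sorted paths of suspicious files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if is_editor_leftover(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # "." is a placeholder; point it at your document root.
    for path in find_editor_leftovers("."):
        print(path)
```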

    4. Re:Why Word Does This by Anonymous Coward · · Score: 0

      You sir, are full of shit.

  83. SCO vs. Microsoft lawsuit in the works already! by Anonymous Coward · · Score: 0
    Not sure if you read the article, but certainly Microsoft does deserve part of the blame. How much - it will be up to the courts to decide.

    McBride's still working on his angles here with his team of lawyers, but you can be assured mr. bill bigpockets will pay for this.

    Something to do with proprietary, copyrighted Xenix code comments being published in Word documents on a Microsoft.com server...

  84. Not Just for Fast Save Anymore? by ewhac · · Score: 1

    Back in the days of Word 5.1a (the last good version), I recall hidden data only getting saved if you used Word's "Fast Save" feature. Since Fast Save wasn't measurably faster, I turned it off. Is this no longer the case? (A quick look through the preferences panel in my copy of Word reveals a Fast Save option; it's turned off.)

    Schwab

  85. Best way to avoid this by foniksonik · · Score: 1

    The absolute best way to avoid this happening.

    Copy your final text from your working draft into a brand new document. Yep, good ol' copy and paste. You will only copy the selected text. All the auto-save data and edit history will not be copied into the new document. If your document has charts/graphs/placed images, etc., you will need to do a select all to be sure you got it.

    If you always do this for final drafts you won't ever have a problem again. If in doubt of whether your current copy is clean.. just do it again, then delete the old copy.

    Try it out. If you need to confirm... go ahead save as plain text, or whatever. It works as advertised.

    --
    A fool throws a stone into a well and a thousand sages can not remove it.
    1. Re:Best way to avoid this by Reziac · · Score: 1

      Not exactly.

      What gets saved out as plaintext doesn't include the "hidden" stuff. So it's not a good test of what's under a .DOC's covers.

      What your paste-to-new-document method does, is get rid of the "deleted" stuff. However, it does nothing for padding (random junk which can be grabbed from anywhere, including the swapfile) that may occur when you save the new document, nor for your UID string. And it assumes that your normal.dot (or whatever file Word now uses as its empty-document template) hasn't accumulated any "deleted" data.

      Some of the template .DOTs that ship with Word contain leftovers straight from Microsoft. Nothing too entertaining -- just a few employees' names, working paths, and default printer IDs.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  86. Slightly OT: just read Crypto-gram by bmckeever · · Score: 1

    If you're interested in these articles, read them in the Crypto-Gram newsletter instead of waiting for /.ers to read them and post them here.

    --
    Your favorite .sig sucks
  87. DRM by Anonymous Coward · · Score: 0

    Don't worry, happy, happy not-stalinist-totalitarian-at-all-honest DRM will make sure this can't happen in future!

  88. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  89. MS Word by drakewyrm · · Score: 1

    Unfortunately, a great many companies require documents to be in MS Word format. I have heard horror stories about people being required to submit MS Word resumes for jobs working with open/free software. What /is/ the attraction of that particular file format, anyway?

    I know. Vendor lock-in. Still hadda ask.

    --
    Batou: Hey, Major... You ever hear of "human rights"? Major: I understand the concept, but I've never seen it in action
    1. Re:MS Word by little_fluffy_clouds · · Score: 1

      I hardly see that as a horror story. In my experience, the HR dept run what the HR dept wants to run (guess). The backbone of the organisation may very well be open source, but it's HR that do the employing.

      --
      What were the skies like when you were young?
  90. You can read old .doc files with Kwrite by mistermax · · Score: 1

    The ability to read deleted/old info is not new to me. Recently my girlfriend was updating her Curriculum Vitae. It had been edited on several versions of Word, on a 3.5 floppy. When we opened it in OpenOffice, we got the last version she had edited. Then she reopened it in Kwrite, and we saw there an entirely different document - in fact, the same CV, but from a whole year prior to the last version. Maybe other people have noticed you can do this?
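If you don't have a raw text viewer handy, something like the following Python sketch does roughly what opening the file in a plain editor (or running `strings`) does: it pulls runs of printable characters out of the raw bytes. It assumes the leftover text is stored either as plain ASCII or as UTF-16LE, which is a guess about any given .doc and not a parser for the format; the function name and the CV.doc file name are made up for the example.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable runs found in raw bytes (ASCII, then UTF-16LE)."""
    # Runs of at least min_len printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE: each followed by a zero byte.
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return found


if __name__ == "__main__":
    # "CV.doc" is a placeholder file name.
    with open("CV.doc", "rb") as fh:
        for text in extract_strings(fh.read()):
            print(text)
```

Anything that shows up here but not in the document as displayed is a candidate for old or "deleted" content.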

  91. duh by simontek2 · · Score: 0

    i find those all the time, i enjoy searching for files and such people do not intend for the mass public to see. you just need to know how and where to search. heck, have fun looking at Gov't spending.

    --
    SimonTek
  92. PDF by Anonymous Coward · · Score: 0

    If you make a PDF with ghostscript or PDFTex is there still metadata in them?

  93. Re:flamebait ?!?? by drakewyrm · · Score: 1

    What /are/ you ranting about?

    --
    Batou: Hey, Major... You ever hear of "human rights"? Major: I understand the concept, but I've never seen it in action
  94. There are advantages! by oren · · Score: 4, Funny

    Once, when negotiating an investment deal, we got a Word document with the investment bank's comments on our proposed contract.

    They tracked changes. All we needed to do was display them... and we got juicy stuff like "if they accept either our fix for clause X or for clause Y we can still s---w them royally in scenario Z".

    Made for a very effective negotiation. For us.

    Oh, wait, the article was about the problems this raises for the document's _author_.

    Never mind :-)

  95. Word attachments by BrokenHalo · · Score: 1

    I've started getting nasty about Word attachments. I've set up my mail transport agent to automatically send spurious responses to messages with .doc attachments, with a polite policy message to the effect that their attachment had been rejected as being prepared with potentially insecure and malicious code, and if they care to send it in a sensible format I might deign to read it. If the sender isn't in my whitelist, I just consign it to the spam bin. Saves a lot of time, and some people have got the message. And if they don't get the message, I don't much care if I don't hear from them.

    1. Re:Word attachments by Anonymous Coward · · Score: 0

      Send them *roff documents. When they complain about not being able to open it, say "What do you mean? this format has been around forever! If I can open a .doc, you can open this!"

  96. I hate stereotypes by Anonymous Coward · · Score: 0

    It's articles like this that really kind of tick me off. I'm CONSTANTLY telling my customers that regardless of what they heard, Windows is not the best way to go. I'm informing them about the ups and downs of linux, and Mac. I also happily inform them that I can build their networks with a Linux main and windows in vmware which minimizes the amount of damage that can happen due to viruses. I've gone so far as to bring clients to my home to see my personal network so they can make an educated decision about if linux/vmware is the right choice for them. And I show them the mac machines I have.

    But when people read articles like this... it makes my job all that much harder. While it's true that MANY of the people in my field in this area do just that, I've worked for YEARS to prove I'm trustworthy. Things like this cause my clients to have doubts in my dedication to THEIR needs, security, and functionality, and quite frankly it #$%#$% me off.

  97. Re:It's been said hundreds if not thousands of tim by MegaFur · · Score: 1

    Yes, but it's precisely because it isn't intuitive that training is required. In theory, tips along the lines of "don't-put-MSWord-documents-on-the-web" would be covered in the security thingie.

    --
    Furry cows moo and decompress.
  98. http://www.tu-darmstadt.de/~rkibria by Anonymous Coward · · Score: 0

    I use a free hex editor called frhed. The product says its home page is http://www.tu-darmstadt.de/~rkibria

    It's great for checking AND CHANGING the actual contents of ANY file.

    1. Re:http://www.tu-darmstadt.de/~rkibria by Reziac · · Score: 1

      Thanks -- it's moved to http://www.kibria.de/frhed.html
      (but a redirector is in place from the old link). Will give it a whirl; sounds nicely featureful.

      For hex editing, I'm still using the ancient and tiny CalTech FM, "File Modify" -- dated 1985! Some good programs never die. :)

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  99. Internal Dutch Microsoft Marketing powerpoint file by cerberusss · · Score: 1

    I actually experienced something like this myself. My gf was working on her graduation report and while googling, she found a powerpoint file on the Dutch Microsoft site. It was about their partner strategy, i.e. who would manage their partners and in what way. I mailed it to the director of my division, who found it rather interesting. Some days later, it was gone from the MS site.

    --
    8 of 13 people found this answer helpful. Did you?
  100. Re:Slow news day by ComaVN · · Score: 1

    Snakes just transfer poisonous chemicals known to the humans. Snake byte by itself is harmless for humans, unless there's also a poisonous chemical injected into the human blood. You're right in terms of numbers, but let's say we leave the snakes out of this list.

    --
    Be wary of any facts that confirm your opinion.
  101. www.... by Anonymous Coward · · Score: 0

    www.internalmemos.com :D

  102. Re:Microsoft's article on reducing MS Word metadat by mnewton32 · · Score: 1

    It's a pity that more people don't just save as RTF. It's just as good for most uses, and it's a less obscure format.

    If you've ever tried opening an RTF document from MS Word in any other program, you'll realise why this is a bad idea.
    You know what HTML from Word looks like, right?...

  103. Who the hell needs that by Snaller · · Score: 1

    you can get nearly infinite undo

    A paragraph's worth should suffice.

    --
    If Google really cared they would fix Android Chrome to reflow text, instead of discriminating
  104. great dept. by McAddress · · Score: 1

    The first thing that came to mind when I saw the article's dept. (when-is-delete-not-delete? dept) was the press release from the Windows XP London launch

  105. Re:It's been said hundreds if not thousands of tim by shaitand · · Score: 1

    interesting, we have a story about word being nasty, bloated, and most importantly, a security risk. I suggest getting rid of the security risk since there are plenty of other choices.

    Somehow this makes me a troll? What, do we mod down ANYTHING that even hints at being a negative statement about microsoft products nowadays? How slashdot has changed...

  106. Re:flamebait ?!?? by zedmelon · · Score: 1
    I viewed my original post by clicking the link to it in my comments page, the same link can be found here, and it gives you more info about moderation:

    WHAT?!?? (Score:5, Funny)
    by zedmelon (583487) on 2003.08.15 15:36 (#6708189)
    (Last Journal: 2003.08.02 12:09)

    ...and I saw that some assmonkey had moderated me as flamebait. I suppose I can see how a staunch MS fan could see it as flamebait, but I didn't think they were allowed to post on /.
    ;)

    --
    Mom says my .sig can beat up your .sig.
  107. Plus, you can train users until the end of time... by laupsavid · · Score: 1

    ...and they still won't give a fraction of a shit about security. Convenience is God to 98% of all the users out there. Their attitude is, "the (programmers/techs who installed this) shouldn't have made it possible, anyway. I have no responsibility for anything I do. Plus I hate to learn and think at all."
    And of course their managers are just as irresponsible and hold no one accountable except for tech people, for anything that happens via computer.

  108. Old News by outanowhere · · Score: 1

    This news is so old it has fossilised.

    Shouldn't someone ask why nothing was done when this was first publicised?

  109. Re:-1 Troll, -1 Overrated, -1 Redundant by Litta_Feller · · Score: 1

    So that's how it works, now, huh?

    1. An AC posts a message purporting to be helpful, but it's not.
    2. The post contains a nice little troll hidden in what appears at a glance to be helpful text.
    3. The post still gets moderated as 'informative.'
    4. While the troll post is still at +2 informative, someone observes both that the post is NOT helpful and that it is also a troll.
    5. They also get moderated as offtopic.

    ...which brings us to

    6. We see so many posts that say /. sucks and wonder why they say so. I wish I had been around to see it in all its glory, before this stupid shit became the norm.