Metadata in Vista Could Be Too Helpful
linumax writes "Windows Vista will improve search functionality on a PC by letting users tag files with metadata, but those tags could cause unwanted and embarrassing information disclosure, Gartner analysts have warned. Search and organization capabilities are among the primary features of Windows Vista, the successor to Windows XP due out late in 2006. While building those features, Microsoft is not paying enough attention to managing the descriptive information, or metadata, that users can add to files to make it easier to find and organize data on a PC, according to Gartner. 'This opens up the possibility of the inadvertent disclosure of this metadata to other users inside and outside of your organization,' Gartner analysts Michael Silver and Neil MacDonald wrote in a research note published on Thursday."
Windows Vista will improve search functionality on a PC by letting users tag files with metadata, but those tags could cause unwanted and embarrassing information disclosure, Gartner analysts have warned.
Ha-ha! You're using Windows!
The new version of Windows will be insecure???
Say it ain't so.....
If my metadata could be viewed by other people inside and outside my organization, there's an easy solution to this.
Don't fill out the metadata fields!
Isn't this like saying airbags are too safe? I thought the whole point of metadata was to make it easier to search and find data? How can it be *too* helpful?
The greatest experience we can have is the mysterious.
- Albert Einstein
Should it be a surprise MS hasn't taken adequate security measures in the "advance" of its operating system that seems like another attempt to compete with Google? I say stick to Google Desktop (http://desktop.google.com/) and your own directory architecture for organization.
Walk with Music;
Now we have a business analyst group trying to direct a computer software company how to write its software. When Gartner starts making new technology or being otherwise reasonably involved in technology, they can have a seat at the table. For now, this is just horrendously bad policy. Anyways, the Microsoft DOC format already contains a horrendous amount of metadata, the full history of changes that led to the current document, among other things. Where are Gartner's whines about that?
No... say it ain't so...
:)
Surely Microsoft aren't adding a feature to Windows without giving thorough consideration as to how the feature will work in a multi-user, internet-connected environment?
After all, they've shown time and time again how much they care about these things.
Sky subscribers are morons. They pay to be advertised at !
My colleague at my former job once sent our boss a report in a file named 'for_dickhead_2003_11'. He changed the file name before attaching it to the email. Unfortunately, a self-reference in the file contents remained, showing the unfortunately chosen first name. Fortunately, our boss just politely reminded him to pick more neutral names, just in case.
Nothing worse than searching for one thing, and coming up with a "*ultra-midget-fetish-sex-in-chocolate*" result when your g/f is around.......... That's my biggest gripe with indexers. Too easy to accidentally find files. Like if you search for your g/f's name because you want pictures of her (and she is looking over your shoulder wanting them), she may see her name come up in a convo between you and your bud that you'd rather she not see.
In undeveloped countries, the consumer controls the market. In capitalist America, the market controls you.
Help me out here, but what's so difficult about not storing metadata in-line?
After 10 years of M$ Word disclosing secret information, you'd have guessed that "a removal tool" as mentioned in the article is obvious to anyone with half a brain as not good enough.
Storing the meta-data in a separate file, or how about with the other metadata (i.e. with the inode) isn't so hard, is it? And it is quite obviously the right thing. There's even a big, red hint right there in your face: It's called meta-data. Might want to treat it differently from the actual data, you know?
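For what it's worth, here is a minimal Python sketch of the out-of-line idea. Everything in it (the `SidecarTags` class, the `.tags.json` file name, the `demo` helper) is made up for illustration and is not how Vista or any real filesystem does it. The tags live in a per-directory sidecar file keyed by file name, so copying the document somewhere else carries none of them along.

```python
import json
import shutil
import tempfile
from pathlib import Path


class SidecarTags:
    """Hypothetical tag store: tags live in a per-directory JSON file,
    never inside the documents they describe."""

    SIDECAR_NAME = ".tags.json"  # illustrative name, not a real convention

    def __init__(self, directory):
        self.directory = Path(directory)
        self.sidecar = self.directory / self.SIDECAR_NAME

    def _load(self):
        if self.sidecar.exists():
            return json.loads(self.sidecar.read_text(encoding="utf-8"))
        return {}

    def tag(self, filename, *tags):
        """Attach tags to a file name without touching the file's bytes."""
        data = self._load()
        data[filename] = sorted(set(data.get(filename, [])) | set(tags))
        self.sidecar.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def tags_for(self, filename):
        return self._load().get(filename, [])


def demo():
    """Tag a file, 'send' it elsewhere, and show the tags stay behind."""
    with tempfile.TemporaryDirectory() as mine, tempfile.TemporaryDirectory() as theirs:
        report = Path(mine) / "report.txt"
        report.write_text("Quarterly numbers.\n", encoding="utf-8")

        store = SidecarTags(mine)
        store.tag("report.txt", "idiot-customer", "q3")

        # Copying only the document leaves the sidecar (and the tags) at home.
        sent = Path(shutil.copy(report, theirs))
        return {
            "local_tags": store.tags_for("report.txt"),
            "sent_bytes": sent.read_bytes(),
            "remote_tags": SidecarTags(theirs).tags_for("report.txt"),
        }


if __name__ == "__main__":
    print(demo())
```

The obvious trade-off is that tags kept this way also get lost when you move your own files around with a tool that doesn't know about the sidecar, which is presumably why in-line storage is tempting in the first place.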
Assorted stuff I do sometimes: Lemuria.org
I find it a little annoying when someone does a "doom and gloom" review of a beta product, focusing on bugs or immature features. It's like doing a review of a building in progress and shouting out: "It has no roof! The rain will come right in! What are they thinking!"
Humor from a Genetically Molested Mind
I hear that the 2008 Toyota Prius will have a 7' high spoiler. What's up with that?
Oh, sorry... I just figured that we're talking about products that are still a few years down the pipe that haven't been anywhere close to finalized yet.
I don't know about anybody else, but we not only don't evaluate software years before it's released, but we generally wait until the software has been out for at least a year before even looking at it. I don't know what the point is of reviewing a product this early. The only thing that I can figure out is that it's a way to get a few more pageviews.
I don't respond to AC's.
sounds like he's worried about people finding his porn collection when they search for seemingly unrelated things(scat music, majestic horse paintings, old lady jokes, kiddie books and toys, etc). maybe someone should just tell him not to tag that stuff
if i'm not immortal, what's the point of living?
...te?
is to make the metadata attached to document files viewable only on the Vista installation it was created on. Perhaps it would be possible to have the operating system strip the data off files that are being copied or moved to other network locations, as a precursor to each respective process. In this case, they would also have to work some kind of functionality into the next iteration of Outlook, so that the problem could be stemmed from the email side of things.
What 3rd party vendors would do to accommodate this is anyone's guess.
How is this different than naming your file "Invoice for Asshole Larry.doc" and mailing it to the client? Simple solution: don't put potentially embarrassing stuff in the metadata fields.
Do people really need an analysis to tell them this?
I've often been amused by what appears in the Properties pane of Word documents sent by clients or what you can dredge up from Track Changes. Evidence of re-used documents, other projects, other clients, and deft attempts at redaction abound in the hidden metadata and edits.
The more data a computer saves (especially if hidden from plain sight), the greater the chance of embarrassment and unintended leakage of sensitive info.
Two wrongs don't make a right, but three lefts do.
If you have any kind of data which needs to be kept private (we have HIPAA compliance to worry about at our medical office), using Google Desktop is a bit scary. Yes, it allows you to "lock out" certain data sources, but on machines where private data passes through in a lot of different formats, things can easily slip through the cracks.
Of course, we don't have it on our main office machines, because they are running Slackware. Our machines that are locked into Windows for hardware interface reasons had to have Desktop removed from them after a couple of almost-incidents.
YMMV
Using plain ol' text since 1968
Having something like "post-it notes" that do not stick to the file, but instead are part of the directory entry for that file, might be more useful and safer. If someone sends me a file, I don't want that person's metadata to pollute my classification of files.
That's somewhat like what happens with e-mail - I receive plenty of mails that the sender marked as "high priority", but that are low priority to me. Metadata on the file should be objective; subjective information should be stored somewhere else and not be transmitted together with the file.
We never send any raw documents out to customers. We always print them to PDF first. Looking back, I wonder if there is still a chance private data could be leaked: that somehow PDF layers the hidden stuff underneath, where someone could find it by peeling back the top.
But this will just be an extension to that policy to check for any meta data.
It's all under control. Just train your users to manage their own metadata.
Even the much vaunted Google Desktop had information disclosure issues.
as this type of technology comes to the mainstream it's to be expected the early stuff may have a bug or two. (see: Google Desktop)
and here they are slamming Microsoft for a new feature people are asking for, and telling them how to do it, when they have no idea how hard this kind of thing is to do from a software engineering perspective.
I mean, sheesh, the product is in BETA. make a bug report to Microsoft as a beta tester if you find a bug.
I mean Windows Vista has a lot of very new stuff under the hood which is very cool. much of the stuff affects security and stability, which is a good thing.
-Nex6
"but those tags could cause unwanted and embarrassing information disclosure, Gartner analysts have warned."
Oh, you mean more embarrassing than finding cookies and cached images from pr0n sites and the like? Unless you're considering self comments like "he's so hawt! I'd so tap that!" Not to mention that most people's surfing already involuntarily leaks their personal data like a sieve.
I'd be more concerned about people appending credit card numbers and such to files than about embarrassment.
You need a FREE iPod Nano
Homer: From now on, there are three ways to do things: the right way, the wrong way, and the Windows Vista way.
Bart: Isn't that just the wrong way?
Homer: Yeah, but faster!
Developers: We can use your help.
I'm shocked, shocked to see Microsoft prioritizing features over security.
</Claude Rains>
All movements for social change begin as missions, evolve into businesses, and end up as rackets.
The Mac OS (offering previews of the next Windows OS since 1984) already suffers from this problem, and so far there are no graceful solutions. Namely, Spotlight gathers sensitive info in ways I wish it would not. To be specific, I deal with a lot of confidential e-mail that can include personnel problems of employees. At the same time it's got all my project info on it. When an employee comes to talk about a project, I will often search Spotlight for terms related to the project, or sometimes for the employee's name, while they sit around my screen. Spotlight pulls up the docs and the e-mails onto the same search results screen. Seeing titles of certain e-mails or possibly just the addresses can reveal confidential information or be embarrassing.
As a result I no longer have Spotlight index my e-mails. And of course that's a pain in the ass since it means Mail.app's search feature is busted. While I can figure out how to work around that (e.g. don't use Mail.app, which would be a pity), the story does not end there. Unfortunately, Spotlight indexes my backup volumes too, and it can blunder across old mail there and index it.
Now you might think I could also turn off indexing the backup volumes, but there's the rub. First, I might not want to. Second, you can't always do it. Spotlight has some bugs in how it handles logical partitions on disks, and in particular it sometimes ignores being told not to index a volume if another partition is being indexed.
Anyhow, eventually there will be more fine-grained control on privacy, but then the interface will become more kludgy too. In fact that may just kill the whole fine-grained control effort, since most folks don't worry about this sort of thing and would prefer simplicity.
It's perhaps worth noting that Windows dropped making the filesystem a database (for now). That might be a smart move, since making it a wrapper like Spotlight means they are less locked into a single search design. Problems like this will emerge slowly, and flexibility to plug problems will be needed.
Some drink at the fountain of knowledge. Others just gargle.
For example, several years ago Microsoft reportedly posted its annual report as a Word document, which contained evidence that it was composed on a Macintosh.
That example is good for a chuckle (OK, maybe a belly laugh for us Mac fanboys), but suppose someone sent a document to a customer that showed it was filed in a folder named "Correspondence with Idiot Customers" without the sender realizing it...
LOL, did Ballmer piss in your bed this morning? :-)
Beware: In C++, your friends can see your privates!
Another problem with meta data is the generation of meta data. If people generated their own data they could control what goes into it. But the problem here is that you just don't do it normally. Plus, as documents change, get copied and modified and so on, it gets out of sync unless you keep modifying it. The last thing most people would want is some rigorous change control protocol for every document and e-mail.
Which of course means automated meta-data scraping. This leads to the problem of confidential info disclosure. That's obvious. But it also leads to another problem that's annoying. When do you update the meta data? When the file is created or modified? After a small lag? Or in batch overnight?
On Macs you can force a batch overnight search. But the default is instant updates. If you add a search term to a document WHILE a search is being performed in another window, it will find it! Amazing. And very useful too. And it assures that things like computers that sleep at night and detachable drives stay indexed.
But it's also amazingly annoying when you stop doing conventional desktop activities and start doing more unix-like things. Take, for example, untarring a 30 GB archive with twenty thousand small files in it, or something that is generating transient files in a rapid-fire fashion. Well, you start untarring and for the first few files it zips along. Then suddenly throughput nose-dives. Why? You look at your processes and you see MDL, the indexing program, is chewing up your disk access.
You can work around this if you can control the file names and make sure they are ones it will not index. But that's not assured or always possible, and it will vary from computer to computer.
So anyhow there's lots of fine tuning needed on these ubiquitous metadata systems. Fine grained privacy control and fine grained operation modes so it's live in desktop application mode and lags in unix/high performance modes.
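To make the "live on the desktop, lagging under load" idea concrete, here is a rough Python sketch. It is purely hypothetical: `AdaptiveIndexer`, its thresholds, and the `demo` helper are invented for illustration and have nothing to do with how Spotlight actually schedules its work. It indexes a file immediately while events are sparse, and queues files for a later batch when they arrive in a burst.

```python
from collections import deque


class AdaptiveIndexer:
    """Hypothetical indexer front end: index immediately while file events
    are sparse, but queue them for a later batch during a burst."""

    def __init__(self, max_events=5, window_seconds=1.0):
        self.max_events = max_events          # events allowed per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.recent = deque()                 # timestamps of recent events
        self.indexed = []                     # paths indexed so far
        self.deferred = []                    # paths waiting for a batch run

    def on_file_event(self, path, now):
        """Handle one create/modify event seen at time `now` (seconds)."""
        self.recent.append(now)
        # Forget events that have slid out of the window.
        while self.recent and now - self.recent[0] > self.window_seconds:
            self.recent.popleft()

        if len(self.recent) > self.max_events:
            self.deferred.append(path)   # burst: stay out of the way
        else:
            self.indexed.append(path)    # quiet: index live

    def flush(self):
        """Index everything that was deferred, e.g. once the disk is idle."""
        self.indexed.extend(self.deferred)
        self.deferred.clear()


def demo():
    """One lone desktop edit, then an untar-style burst of ten files."""
    idx = AdaptiveIndexer(max_events=3, window_seconds=1.0)
    idx.on_file_event("notes.txt", now=0.0)
    for i in range(10):
        idx.on_file_event(f"archive/file{i}", now=10.0 + i * 0.01)
    live, waiting = len(idx.indexed), len(idx.deferred)
    idx.flush()
    return live, waiting, len(idx.indexed)


if __name__ == "__main__":
    print(demo())
```

Timestamps are passed in rather than read from the clock so the behavior is easy to reason about. A real indexer would also need to decide when the burst is over and the deferred batch can run, which is the hard part this sketch skips.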
Some drink at the fountain of knowledge. Others just gargle.
drwx------ 8 root admin 272 Dec 23 03:39 .Spotlight-V100
:)
Yes, if they manage to apply a rights-based system system-wide, something like OS X, it won't be a problem.
I mean if they are stealing, steal it completely
Note I had to 'sudo ls -la' to see it even.
(OS X 10.4 "Tiger")
Allchin stressed that Microsoft has broken new ground in Longhorn. For example, document icons are no longer a hint of the type of file, but rather a small picture of the file itself. The icon for a Word document, for example, is a tiny iteration of the first page of the file. Folders, too, show glimpses of what's inside. Such images can be rather small, but they offer a visual cue that aids in the searching process, Allchin said.
Kind of like Gnome has been doing for a few years now? How out of touch are these people???
I'm no computer expert, but I do understand the argument against "security by obscurity" which has to do with FOSS vs closed source software.
Medicine is different, though. HIPAA basically requires that you use this kind of security (obscurity). Let me give you an example. If I have your (HIPAA protected) chart in the office on my desk, that's OK. If I leave it in the waiting room, it's not. Information does not have to be hidden from a determined (and illegal!) search, because, well, that's illegal, and because medical practice would grind to a halt if you added that much paperwork overhead.
But if you make it too easy for someone to "accidentally" stumble on HIPAA protected information, you're in a lot of trouble. And Google Desktop does exactly that - offering "suggested" completions as you type, allowing you to find out that your neighbor Paul Smith has a patient letter on my computer while you were looking for your dad Paul Jones.
Using plain ol' text since 1968
Isn't the solution to your problem to not let the person you're searching about to stand around your screen?