Microsoft Invents Symbolic Links
scromp writes, "Microsoft really does innovate! See for yourself!" I can't decide if this is supposed to be a joke or not. I mean, it's funny, but I just can't tell. Perhaps I need a cup of coffee before I try to post stories.
Update: 03/03 06:11 by H: To be fair to Microsoft, they did talk about symlinks.
That might have been funny, if DOS & NT didn't already have such a device. You just direct output to NUL
Last I recall, Some Linux versions need 600+ MBs just to run.... so watching an OS mature to possibly become a threat... I would be defending *nix too...
Um, aliases.
Hrmm, well, you know I can even boot Windows 2k with 0 MBs? Wonder how? Network boot... So nothing new here.... Actually you all are sounding like children... Bitch bitch here, bitch bitch that... Knowing Slashdot is biased, I still read it to get a few laughs out of the bickering you *nix folks have to do. If you really think you've got a solid OS that will take over the world, how come it has not been done yet? OSS may be good for the techie, but the corporate world will 'Just say No!' because you cannot sue someone with OSS... but you can sue Microsoft and other OS companies if they designed a flaw..
Will changes made by old MS-DOS apps filter down to the original files being pointed to?
So therefore will this break legacy apps?
(apologies. my english is not good)
IIIIIIII| HEIL JON KATZ!
IIII|
IIIIIIIIII|The Fourth Reich is Upon Us!
IIII|
IIIIIIII| jonkatz@slashdot.org
Well.. I'm sorry.. you're somewhat right.. the title is *wrong*. Unfortunately for your purpose, it would actually be better for Microsoft if the title was correct... this "redundancy" is just not likely to happen.. it will not save as much disk space as it will cost in CPU and disk usage... unless, of course, you think about using Winshit somewhat like you use unix, with a bunch of users downloading many things, and *perhaps* downloading the same things.. well, in that case, it would be MUCH better and MUCH easier to have a control at download time, instead of having a daemon-like solution, because at other times, it just won't happen... the best part is that, as I said, this is for the case where you use Microsoft Macroshit Winbugs as you would use a unix box.... well.. please don't do that.... you know.. just... just don't do it.. I don't know you, but you don't deserve it :)
Sorry - they never claimed to do that. Windows has had links (as they are called in the windows world) of some sort from 3.1. They've never claimed to invent the concept of links.
I really hate to sound like an MS advocate, but please get it through your chkdsking (haha - get it? fsck = chkdsk) head that Single Instance Store != symlink. It is a kind of compression mechanism, not just a linking mechanism. RTFA. Windows has had linking mechanisms from day 1 (well, day 3.1) and they never claimed to invent links. End of story.
I do not mean to sound like an MS advocate, but this is cool. I would never in a million years use it because if one file is corrupted, they are all corrupted. However, it is definitely something that has been needed in computers for a while and *n?x, BSD, &c. should add this functionality (provided it's not patented) ASAP.
Please moderate this article up! (Finally a comment from someone with a real clue who actually bothered to do some research before knee-jerking)
berst!
> I haven't read anything on that page that says they've invented symlinks, speech recognition or anything else like that. Maybe the KDE team should credit Microsoft for their .kdelnk files, and the "K" button, and the "Taskbar" (amongst other things). ---- :) :) well, that's because you didn't understand what you read :) all you guys crying "nope nope it's not symlinks, you linux geeks are sooo stupid you don't get the difference" just make me laugh... come on.. I agree it is not exactly the same.. but please, read the article carefully, and think about what Macroshit is trying to get credit for.. it is trying to get credit for the major, wonderful, incredibly innovative symlink and hardlink concept... now it adds a feature to it (which btw looks a lot like a cool new source of trouble at best), but does not change the concept.. in the unix world, if I need such a feature (which by no means needs to be at the OS level), the OS provides me many, many tools so that I can do the very same thing, or any other feature like this one, in a much more controlled way.. oh.. but it's not innovative then.. ok.. I didn't get it, sorry.. oh jeez.. you're incredible, really :) oh.. and when you see KDE, please tell them about that great innovation WHICH THEY DON'T CLAIM TO BE ANY INNOVATION OF THEIRS IN THE FIRST PLACE, and don't forget to tell them they got it from OS/2.. not windows, ok? hmm.. and seriously, don't ever play on the "you copied me" ground if you are on the Microsoft side... this is ridiculous.. ask Digital what they think about Quick and Dirty OS... oh... but I forgot, Microsoft's definition of invention is based on slight modifications (too bad) of earlier existing stuff. hmm.. hope no Microsoft guy is listening here.. they will soon understand they actually invented invention and will patent this.. Guignol...
you did such a good job noticing the first troll what happened to the second? Remember they travel in pairs and triads and quads, hmmm, quad...
The first text-to-speech I remember was on PC hardware (not to say it didn't exist before), but it was bundled with an old Creative Labs Sound Blaster 8-bit job. They called it Monologue for Windows. It could read text from the clipboard back to you. There was also an ELIZA-type DOS program that did speech that came bundled with that card.
That's simply marvellous. Microsoft is definitely getting much better at innovating. Just think about it, I do have some files that are duplicates of each other, and they wasted a whopping 20MB!!! Imagine, I can save 20 MB of space just like that. I must really thank Microsoft. First it gives me FAT32 and I have saved hundreds of megabytes of space!!! And now this. I must be in heaven and Bill Gates must be GOD!! I think they have also managed to implement something similar in their Exchange mail. Imagine, if the mail is sent to thousands of people, it only keeps a SINGLE copy!!! What's more, it also keeps all the email addresses of all the people the mail is sent to. Never mind that my mailbox only crashes 20-30 times. It sure is a good idea. You Linux fools should start innovating and stop whining.
Bzzzt wrong!! try again.
Come back when Linux is pre-installed on 99.9% of the world's PCs and Linus Torvalds's personal wealth is approaching the trillion dollar figure.
thank you
So, you're talking about saving space by storing files in a central location? Improved performance? Maybe security & reliability? OS/2 Warp's "Work Space On Demand" already has been doing that for 3 years. Wake up and get Warped, Mr. Gates!!!
Visit http://www.os2hq.com/ for more "Warped Perspectives".
I imagine that microsoft will implement some sort of diff-on-modify for files that are shared with this "technology". Then instead of only creating hdd churn when you modify or create a file, it can do it 24/7! When you're through, you'll have a complete disaster on your hard drive; one real file, 10000 diffs.
Love the innovation, Microsoft! Keep it up! I'll be damned if I ever pay for any of your software.
the amount of space they'll save is large.
Unix doesn't need this. My entire OS with optional GUI (X), 3 desktop environments and tons of apps, plus source code and development headers for most of it and 20 users (one called "httpd" that has a *lot of files*) still only takes about 2.5 gigs on my drive. I could likely trim a server down to 80 megabytes.
Since my global config files are in /etc/configfile ... users can add their own under ~/.configfile if they want - but they don't have to. Most apps work quite well with the central config.
If I really want to find duplicate files and use hard or soft links instead, well, a one-line "find" command will do it for me. But there's very little motivation to do this.
dmg
Hi there...yeah, real funny one, this "innovation" thing... what I was wondering about is the following logical gap which might exist: Let's say I do have 2 similar files (stupid me!). Before I try to do anything with them, the magical wonderfully intelligent system (guess the name?) replaces all the copies with links. The next day I have to modify file 1 and file 2 differently. Will the mechanism be cool enough to restore & modify both of my files? Or will I end up with 2 copies of the 1st modified file? Or the second? As I see it, this kind of activity might *really* make my work obsolete by trashing it all the time. oops... I meant .. innovating!
What, you haven't heard of the UMM... calendar? Yah know, as in UMM, this can't be for real...
Well, the brilliant engineers at M$ hadn't thought of it either -- they were too busy copying Linux.
I guess to the non-educated user, this might seem like a wonder. Remember, Microsoft's best customers are those who don't know any better
How do I set Linux up to automatically make links to duplicate files (instead of doing it manually)? Or are you guys really too stupid to understand the technology?
I don't want to be a spoilsport, but it's a system which does symlinks automatically. It finds out whether two files are the same based on a hash of some sort, I'd guess.
So it's not just symbolic links. OTOH I wouldn't rant about this being the second best invention since the insemination of Bill Gates, as the folks at Microsoft apparently do.
I respectfully disagree. UNIX and MSDOS (err, Win2000) don't differentiate between binary & text files. If the contents are the same, the contents are the same.
If 2+ files are going to be bit-for-bit identical, there is an advantage to using a hard link for all but one of them. The overhead of managing this automatically by the OS & creating a new file on write may be an issue, but your points are, well, pointless.
Words fail me ..... this is such a revolutionary idea, how come nobody ever thought of it before?
Why when I want to sound sarcastic I fail miserably! Then perhaps you should read the article eh? This is *not* the same as symlinks - it takes the symlink idea a lot further. Me and Bob have the same file (say a .cshrc) - with this single store thing my .cshrc and Bob's .cshrc are symlinks to the single store's .cshrc. When we edit our own .cshrc then the symlink gets removed and we get our own copy - this is a very cool idea
It is *NOT* what you and I call symlinks.
Give them some credit...
How many hours a day do you Linux zealots spend looking around the Microsoft site?
It took me 30 seconds of reading the article to discover that what the M$ guys had done was not just re-invent symbolic links
And I would hope that anyone else who read the article would discover the same thing just as quickly, but that doesn't make the post "inaccurate". There was no claim that the post referred to the whole article.
The article is about "Microsoft Research Innovations" (including something similar to symlinks, something that creates them, something about IPV6, and something about text-to-speech among other things)
I believe the poster (Scromp) was referring to the part about links, not the part about how they are maintained or the parts about IPV6, or the parts about text to speech software. If Scromp had claimed to summarise the whole microsoft document, then perhaps his claims would have been inaccurate, but he didn't.
So they basically took an intro comp sci class and heard someone say that redundancy is bad and had a flash of insight. Well, I don't know about the rest of you, but this is a very, very basic concept. The only real difference is that Microsoft now acknowledges it, and applies it to DLLs.
See http://incolor.inetnebr.com/jepler/sis-1.0.tar.gz
Jeff
jepler@inetnebr.com
too lazy to figure out why his login doesn't work
They are in effect a link and a real document.
Then user 1 edits the document. Now user 2 has lost the original and is working with the "User 1" version.
Well I'm sure MS views this as a feature ;)
Not to be a retard... but what is the difference between a symbolic link and a hard link?
Thanks,
Anonymous Coward
RPM sets up links automatically when you install software in Linux. The pkgadd command in Solaris sets up links automatically when you install a new software package. There is nothing new here...
php is a WEB programming language, dumbass. Perl does much more than just web stuff.
Or did they just grab some office lackey and tell them to interview them while they received a blow job.
I want her to give me a blow job
Yay! Now all the exact duplicate files on my system won't take up all the space any more.
Just think of how many gig I will free up by not having to allocate space for all those textfile.bak files.
[cough!]
copy priceless.doc priceless.bak I may be dead wrong here, and probably am (After all, I'm not a FS geek...), but doesn't the fact that one is called priceless.bak instead of priceless.doc make them NON bit-for-bit copies? Ie: priceless.doc != priceless.bak on a bit for bit level, simply because of the name difference?
Ah, sweet spin control! Now that it's been pointed out this isn't just 'symbolic links' which was innovative in 1974 on the Unisaurus operating system (may it rest in peace) we can all dig in and come up with many reasons why it's bad.
Bad bad bad! Microsoft bad!
baa baa baa! Microsoft baaad!
So then what happens when your file system is full? Modify a file, even make the file smaller than it once was, and get an out of disk space error because there happened to be several other copies of the same file lying around linked to it? What data would get precedence then? Your new data, the saved data? I suspect you'd probably get a blue screen. Unfortunately in a closed source environment like Win2k these questions will never be answered without experience, and that experience comes at too great a cost. _FS
We're Rubber and You're Glue. Neener, neener.
Try and be a little more transparently childish with your next comment, please, Mr. Gates.
"The first piece searches for duplicate files, computes a signature for each file and stores these signatures in a database. It then compares the signatures in the database and merges duplicate files."
Let's say this runs automatically. Let's say I've got a file which computes to the same checksum as NT's kernel. Let's say I put it on the same partition as the kernel. Of course, this is easily solved by comparing actual file contents with matching checksums before merging.
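The two-step check described above (signature first, then a full comparison before merging) can be sketched roughly like this. It is only an illustration, not how the Single Instance Store actually works: the SHA-256 signature and the names `file_signature` and `find_duplicates` are assumptions made up for the example.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_signature(path, chunk_size=65536):
    """Return a SHA-256 hex digest of the file's contents (the name is not hashed)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root whose contents are byte-for-byte identical."""
    by_signature = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_signature[file_signature(path)].append(path)

    groups = []
    for candidates in by_signature.values():
        # A matching signature is only a hint; confirm with a full comparison.
        remaining = sorted(candidates)
        while remaining:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

The sketch only reports groups of duplicates; what to do with them (hard link, symlink, copy-on-write) is a separate decision.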
And what's wrong with the OS doing these things for you? Would you rather have the filesystem do this auto symlinking automatically or a perl script your hacker administrator wrote up?
Oh so I see how it is...
My hacker administrator didn't spend 1.5 years developing the feature and doesn't have the attention of the media, so the admin's code is garbage!?
The feature was "hacked" into Windows 2000 at the Redmond facility. But, just because Microsoft is Microsoft that makes it better.
hobby OS
8:31am up 103 days, 22:52
Wish someone would take up Windows as a hobby...
What I haven't seen anyone else mention is that if you create a bit-for-bit backup and it's replaced by a symlink, and the copy it's linking to is corrupted (say, due to an unexpected shutdown caused by instability), then you would be screwed, as both copies would be corrupt.
Because one of the major parts of the slashdot agenda is to spread FUD and narrow-mindedness.
Microsoft's new feature isn't a symbolic link. It's an automatic transparent hard-linking feature built into the file system. I've never seen anything like this in Linux.
I'm a hardcore Linux supporter and I'm all for slugging it out toe-to-toe with Microsoft. But fluffy inaccurate /. articles like this hurt the cause, not help it.
Oh, yes, it has been done before. I think LZW comes to mind... this is just the same thing applied at the OS level -- something that IMHO no engineer in his right mind should ever think it to be a good idea.
So if I understand correctly, duplicated files are found automatically, and the system creates the link.
I was thinking: OK, MS think duplicates are useless, lets keep just an original and link to it . Now, why would I want to have duplicates?
1. Backup.
I copy a file on a floppy because I need it at work. Zac, the system deletes it and saves a link. At work, I drool on "Can't find the original file."
2. Hacking
I often mess around with binaries to cra...ehmmm...enhance, and normally I duplicate the file before messing around.
3. Apps
Many many many apps create working copies, duplicates, backups and so on...I don't think they'll react very well when their spooling file disappears.
I really hope for win users that MS somehow will let them stop the automatic detection.
A file does not contain a reference to its own name, unless placed there deliberately. The reference to the name is in the directory.
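A quick way to check this for yourself, as a rough sketch: copy a file under a different name and compare digests of the contents. The file names come from the priceless.doc example in this thread; the helper names are made up for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def content_digest(path):
    """Hash only the bytes stored in the file; the name is never read."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def copy_has_same_content(data=b"priceless report\n"):
    """Copy priceless.doc to priceless.bak and compare content digests."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "priceless.doc")
        backup = os.path.join(workdir, "priceless.bak")
        with open(original, "wb") as handle:
            handle.write(data)
        shutil.copyfile(original, backup)
        return content_digest(original) == content_digest(backup)


print(copy_has_same_content())
```

The two names live in the directory, so the stored bytes (and therefore the digests) match even though the names differ.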
I think the motivation for this system isn't to make life easier for end users, it is to clean up their own mess.
With the exception of backups and copies made for editing, data files are like snowflakes, you have to look for a long time to find two the same. It will clean up library and binary files though as these are spread wantonly all over the place. If MS had stuck to a standard such as POSIX they would have a place for these things and duplication wouldn't be a problem. Probably why this 'innovation' isn't needed under *NIX.
But what would I know?
I did something like this in 1989 for Xenix System V. To keep users from cluttering up the huge 200 meg hard drive, I had a program that ran at midnight that would scan the user directories and see if they had made duplicates of the exact same item (sans anything with the extension BKP or OLD) and symlink the items. Now, here's the kicker.. Joe1 has FileA.. Joe1 sends FileA to Fred3.. Fred3 accidentally fills FileA with crap, now Joe1 just lost his file! With hard drive space being a NON-issue in this day and age (18 gig SCSI ultra drives under 300 bucks), this is more a waste of time than an innovation, and it has the possibility in a server environment to do a huge amount of damage. Please, Microsoft, keep giving us stupid innovations that will make the Windows product more and more unstable... it helps me prove to others that Linux is the way.
And what happens when a bit is accidentally flipped in the only image of the file? "That's ok, I saved a copy over here... DOH!" It's a symlink. Thanks, but that's being annoying, not smart. Bad computer. Quit deleting my files.
Do you mean that if i change "Microsoft" for "Piece of shit" in one Word document, it will change in all files on my hard disk (maybe even on the network, ... or the internet ... ?...)
This just confirms(paraphrased): "Those who don't understand unix are doomed to reinvent it, poorly"
Well, now that you've identified Linus Torvalds and Linux, we have to ask: what does it have to do with this article?
I don't really know why everyone is picking on Microsoft. If we just leave them alone they are on track to invent telnet sessions by the year 2018.
Just imagine what would happen as your disk gets full, and gets close to running out of room. Then you change one of the files (say about 3 megs) that is a symlink. The system now does a copy on write, but you don't have enough room. What happens? Do you lose the file? Does the system crash, or does it just tell you that you can't edit the file?
Why was this marked "offtopic"? I think it's a good point!
Kinda like the article, I'm not terribly technical when it comes to filesystems and such. What I was wondering is... by storing a file, if you use a unique name for each file (myfile.doc, myfile.doc.old for example), doesn't the change in file names render the documents dissimilar on the bit-by-bit level? Or does a document/file contain no reference to its own name? If no reference is made... then there is a serious problem... but if a file references its own name (in the header or whatever.. remember.. it's bit by bit.. not what's actually IN the file), then they become separate files, and therefore not 'hashed' or 'signed' by the OS/process for linking.
You may have "considerable marketing expertise," but I'll wager you don't know a lot about software, Open Source or otherwise. Your argument is analogous to someone at Ford or GM saying that cars with hoods that open, and engines that are accessible, will never work because backyard mechanics will tinker with them. The only "solution" you can offer to this "problem" is to securely lock the hood and require special tools to perform the simplest maintenance. (Needless to say, the key to the hood remains the property of the dealership and is not available to the car owner, and the "special tools" are available only to dealership service departments.)
Would you buy such a car? If not, why do you insist that that sort of closed model is the only possible model for software?
Your example of the "autoexec.bat" file is a poor analogy. "Autoexec.bat" is meant to be customized, and is typically different on every system. The Linux kernel source code is not meant to be customized, and is the same on every system. (The kernel itself isn't, but users who build custom kernels -- and unsophisticated users typically don't -- do that through the "make config" process, not by actually hacking the C code.)
For sophisticated users, making source code available means that they can actually find and fix bugs!!! In real time!!! The same day!!! They don't have to wait for the next release like the sheep who run the latest M$ bloatware. Some of us prefer to have systems we can actually fix, instead of systems we just have to suffer with.
You are dead wrong: they are still bit-for-bit copies.
From what I understand... a symbolic link is a file that points to another file on a file system. A hard link makes two file names point to the same inode.
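For anyone who wants to see that difference on a Unix-like box, here is a small sketch using Python's standard `os.link` and `os.symlink`. The file names and the `compare_links` helper are invented for the example, and it assumes a filesystem that supports both kinds of link.

```python
import os
import tempfile


def compare_links():
    """Create one hard link and one symlink to a file and report what each is."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "original.txt")
        hard = os.path.join(workdir, "hard.txt")
        soft = os.path.join(workdir, "soft.txt")
        with open(target, "w") as handle:
            handle.write("hello\n")

        os.link(target, hard)      # second directory entry for the same inode
        os.symlink(target, soft)   # separate small file that stores a path

        result = {
            "hard_shares_inode": os.stat(target).st_ino == os.stat(hard).st_ino,
            "link_count": os.stat(target).st_nlink,
            "soft_is_symlink": os.path.islink(soft),
            "soft_points_to_target": os.readlink(soft) == target,
        }

        os.remove(target)  # remove the original name
        result["hard_survives"] = os.path.exists(hard)
        result["soft_dangles"] = not os.path.exists(soft)
        return result
```

After the original name is removed, the hard link still reaches the data, while the symlink is left pointing at a path that no longer exists.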
I imagine MS's new feature is more intelligent than your script, such that a change in the file invokes a new instance of the file, not a change to the original. Duh.
You cannot choose NOT to install it when you install office! It takes a PhD to figure out how to disable it (if you take it out of the startup folder, the OS just PUTS IT BACK).
And guess what? IT DOES NOTHING!!!!!!!!!!
Stupid users who spew their documents all over the hard drive have NO FREAKIN' IDEA how to use Fast Find!
mmmmmm, Info-Tainment!!!!!
No, only stupid fucks call them "Microsloth."
The rest of us call them Microsoft.
apparently that's why it took a year and a half to implement, sym links would probably take a week.
This sounds like the standard knee-jerk Slashdot reaction to anyone who dares to be positive about the multi-billion-dollar earning, hyper-successful Redmond-based corporation.
I am from Europe, and we are always hearing about how the Americans like to reward success, hard work, achievement, and winning etc etc. Judging by the comments on Slashdot, this is not true. Don't be blinkered by jealousy.
Take a leaf out of the Bowman's book. Come to terms with success or your bitterness will eat you up from inside.
Maybe it's just me, but IPv6 seems just a *bit* excessive, changing from 32 bit addressing to 128 bit addressing. 128 bit = (after a little algebra and guestimation) 1 address per ~1E12 atoms in the earth, or 1 address for every piece of the earth the size of the tip of the sharpest needle you can find. (Keep in mind that this is through the *entire* earth, not just on its surface.) Well, Draves is right in "It will allow the internet to continue to grow into the future". Yes, 32 bit might eventually get 'crowded'. 40 or 48 bit? Not a chance.
--Some guy without a login (yet)
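The back-of-the-envelope figure above can be checked with a few lines of arithmetic. The atom count used here (about 1.33e50 atoms in the Earth) is an assumed rough estimate, not something taken from the article.

```python
ADDRESS_BITS_IPV4 = 32
ADDRESS_BITS_IPV6 = 128
# Assumption: a commonly quoted rough estimate of the number of atoms in the Earth.
ATOMS_IN_EARTH = 1.33e50

ipv4_addresses = 2 ** ADDRESS_BITS_IPV4
ipv6_addresses = 2 ** ADDRESS_BITS_IPV6
atoms_per_address = ATOMS_IN_EARTH / ipv6_addresses

print(f"IPv4 addresses: {ipv4_addresses}")
print(f"IPv6 addresses: {ipv6_addresses:.3e}")
print(f"Atoms per IPv6 address: {atoms_per_address:.1e}")
```

With that estimate the result lands between 1e11 and 1e12 atoms per address, the same ballpark as the ~1E12 quoted above.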
No, that'd be like crediting Ritchie every time someone created a C rip-off.
They never tried to take credit for symlinks
From the article (sounds like they want credit to me):
Your points about COM are valid, however.
No, it isn't.
I can guess from the previous post that the new W2K feature allows us to make one modification in one Word document (for instance replacing "Windows" by "dumb") and all files on the hard disk will be impacted (not only Word documents, but also Excel, Access, etc...). Well, I think the Linux community missed this functionality. We should sell RedHat stock and buy MSFT.
One major flaw in this design is that if a person actually wants two copies of the same file, they won't get that. Let's say for instance you're keeping a document as a backup of your original report. You leave your system on during the night and in so doing Windows will scan and set up a symlink to your backup. In the morning you come to edit your document and make some unfixable mistakes and go to start over with your backup, only to realize that your backup is now a symlink to your new document, the messed up one.
You have to give Microsoft credit - I think they've got a Reality Distortion Field(tm) larger and more intense than Steve Jobs! -Johan
I respectfully disagree. UNIX and MSDOS (err,
Win2000) don't differentiate between binary & text files. If the
contents are the same, the contents are the same.
I respectfully disagree with your point that text and binary are the same under Windows. They are not. That's why you have the O_BINARY flag to the open() C call under Windows - it's implemented under unix, but unnecessary.
I've no experience with NT or W2K, but wouldn't that cause a lot of file fragmentation?
Fake news releases aren't funny.
And damn! If they use the word 'innovate' one more time, I'm going to puke.
First, I'm pretty sure fsck has, from time to time, been able to clean up a partition without requiring a complete reformat.
Yes, I'm sure that there is anecdotal evidence of Linux being able to recover from a power-failure induced crash with a simple fsck.
Not often, of course, but from time to time it works.
Face it, your hobby OS is an admirable... hobby OS.
You effectively mount a directory over the top of another directory. The background copy is used if you don't change the files. If you do make a change, a local copy is put into the foreground.
In a nutshell:
Hard Link: Behaves as though it were a true file. It doesn't appear as a link to the user.
Soft (Symbolic) Link: A small text file that points to the location of the original file, as well as containing a 'link bit' in the file attributes.
Type 'man ln' at the command prompt to understand the difference better. Please correct me if I'm wrong.
Do we want the changes to appear in all incarnations of the data, or do we want to do a copy-on-write?
Windows already has links. If you want to edit all copies, you use a link. If you don't, then you should get copy on write. I'm not saying this is how MS will end up implementing it, but that's the logical way it would be done.
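To make the copy-on-write option concrete, here is a rough user-space sketch on top of ordinary hard links. It is not a description of how Microsoft implements it; `write_copy_on_write` and the .cshrc file names are made up for the example. The idea is that a write goes to a fresh file which then replaces the writer's name, so every other name keeps the old contents.

```python
import os
import tempfile


def write_copy_on_write(path, data):
    """Replace path's contents without touching other hard links to the old data."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a fresh file (new inode) in the same directory.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Atomically point this name at the new inode; other names keep the old one.
        os.replace(temp_path, path)
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


def demo():
    """Link two names to one stored copy, then modify only one of them."""
    with tempfile.TemporaryDirectory() as workdir:
        mine = os.path.join(workdir, "mine.cshrc")
        bobs = os.path.join(workdir, "bobs.cshrc")
        with open(mine, "wb") as handle:
            handle.write(b"set prompt='% '\n")
        os.link(mine, bobs)  # both names share one stored copy

        write_copy_on_write(mine, b"set prompt='me> '\n")

        with open(mine, "rb") as a, open(bobs, "rb") as b:
            return a.read(), b.read()
```

In this sketch, if writing the new copy fails (for instance because the disk is full), the exception is raised before the rename, so the shared original is left as it was.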
Maybe you should read the article, boneheads. They are not talking about symbolic links. They are talking about comparing files bit by bit, and if they are the same then the system intelligently and automatically creates the symbolic link for you. That is actually a new idea. Not revolutionary, but it takes the concept of the symbolic link into a more active place. Can we get some more reading before everyone whines around here?
Scromp's still wrong. He/Taco implied that the innovation was symlinks, which is not the case. If we are allowed to pick any particular aspect of a technology then there is like less than 1 innovation per hundred billion years.
If you're gonna quote a stat, at least quote it correctly. Yes, it shows them "dumping" 10 million shares of stock at a time. ...and Gates has (at last published date) almost a billion shares, and Allen has almost a third of that gratuitous number.
is that when this kind of extension was to be released as part of the Linux kernel, it would have been the greatest invention since sliced bread.
But now that it's a Microsoft invention, it HAS to be bad... you don't even have to read the article for that!
Disclaimer: I'm an AC and have no preference for whatever OS/religion/platform/license.
Maybe American vehicles are less sophisticated (pushrod V8s with low gas mileage) but in Europe we are far ahead of you.
I am thoroughly disappointed we were not able to /. the MS site.
I vote we keep posting MS articles til we do.
The complaints listed here have substance. You mark yourself as a Redmond Troll by accusing everyone else of exactly the sin'n'spin M$ is pulling in this attempt to break a good free s/w feature and sell it back to the public.
[~/phpdev]# phpcgi -q myapp.php3
-Well.. as you can see, when the burglar trips the alarm, the house rises from its foundations and runs down the street, around the corner to safety...
-(house falls over and bursts into flames, small plastic human figures fall out on fire)
-Well the.... the real humans won't uh... won't, won't burn quite so fast in there, n-ghey.
I could say the exact same thing about your supposition that Linux cannot, in fact, tolerate repeated abuse of the ext2 filesystem.
Bother to support your suppositions about ext2 with more than just anecdotes or don't bother.
When I was still running dual boot Win9x and my wife used my machine, I just plain let her turn off Linux. My ext2 partitions never suffered for it.
And you thought that "findfast" took a shitload of CPU up. Actually, the CPU usage might not be that bad since one could CRC the files during creation or something like that..
Automount (or amd) already does this with it's
So what you are saying is that you fucked your users?
That's what I got out of the article as well; however, why would anyone want that? If you can make a file, then symlink/hardlink, and set copy on write, then you're guaranteed where the file really is, because you put it there. With MickeySoft's new innovation, it might save a little space, but you have no control over it. If the file gets put on a drive that becomes unstable, you can't move the file to a safer place and update your links, thus saving data, because you aren't guaranteed where the file is.
At least, that was my take on it.
How do you know this is how their system works?
I imagine it would work in a manner similar to Exchange Server, where a message with multiple recipients is stored as a single instance in a database, but if a recipient edits that attachment and saves, the original is intact and the modified version gets its own instance.
Further, the Single Instance Store appears to actively search for files that are binary duplicates and consolidate the storage of them.
If symbolic links do all this, then Unix is further ahead than I thought!
I can tell by the way the article describes it that it's a mess. "The system recognizes the data is duplicated..."
Symbolic links work better, and if I understand the programming correctly, they have a simpler, easier to understand architecture.
Nice idea, but it's been done already.
Programming Perl has a script in it which will detect redundancy and create symlinks for you. Page 316.
Why?
The fact that it took them 10 years to fully exploit the 386 in consumer products.
The fact that it took them 10 more years than Apple to deliver a reasonable GUI/OS/Shell.
The fact that they took a perfectly good microkernel architecture and let marketing pollute it.
The fact that even head hunting members of the VMS development team didn't quite help them build a robust OS.
The windows registry.
The fact that nearly everything Microsoft has ever implemented has been swiped from others, typically those that had the vision and talent to implement the new tech in question.
>is there something like this for *nix ?
Cron, diff and ln are your friends.
To someone used to a FAT system, this may seem really cool. To anyone who understands how inodes work, it's all a bit obvious really.
The question is, if they're going to automate it and make it transparent, are they sure that the system is always going to get this right in all the little corner cases? They'd better be, for the sake of their customers' data.
Didn't M$ invent the Internet, multimedia and binary digits as well? =) Amiga - the definition of multimedia before windoze screwed it up!
The Mac has had all this for years. Macintalk text-to-speech, (flaky) voice recognition, shipped with the OS and as handy as the Speech Control Panel.
If they could only ship a box with a decent microphone, it might actually get used....
To follow your analogy: Fixing cars, sorry automobiles, is best left to the experts, since most car owners are not technically qualified to do so. The dealers must love the people, invariably male, who insist on wrecking perfectly good machinery by "maintenance".
Thus the idea of a "black box" approach is a very good one, probably why it is much used in software testing.
If you ever want a good laugh, look at the Solaris 7 source code. You'll see quite a few (C) Microsoft's in there.
No, a shortcut is nothing like a symbolic link.
- it does not operate at the filesystem level
- it stores extra info besides the file it links to
Shortcuts are interpreted only by the Explorer, not by other programs that attempt to open files.
Yup, first devices to have been rolled out are so-called "wheel mice".
This is NOT unix style symlinks. This is MUCH, MUCH more cool!
ANYTHING??!!
Seems to me that I've never had so many problems
as I've had the times I said 'Ok' to a microsoft application
doing something automagically for me.
I don't trust them.
I've never trusted them.
I doubt I'll ever trust them.
Linux. The best code from the best kind of folks.
FOR the best kind of folks.
Just say NO to Microshaft.
... don't tell me M$ is going to patent this ? I am going back to sleep, for a very long time...
Seriously though, if all you knew happened to be Windows, and the only support you ever had came from (no offense!) NT admins in, say, a large company (such as Microsoft)... would you trust your sysadmin to write a script to do something useful? Or would you see it as one of those flashes of scary terminal that flies by after you log in? The less of those the better, eh?
Hopefully, when one alters the original file, it splits off the link altered to point to a new file while leaving all other links pointed to the old file, but then with MICROS~1, you never know!
trollin trollin trollin, keep dem windows blowin! you go girl!
QUICK! LIGHT YOUR LINUX TORCHES, I THINK THAT THERE GUY SAID SOME BAD THING ABOUT LINUX!
"The Single Instance Store, which is used with Windows 2000's Remote Install Server, consists of two pieces, according to Bolosky. The first piece searches for duplicate files, computes a signature for each file and stores these signatures in a database. It then compares the signatures in the database and merges duplicate files. The other piece implements the links, recognizes when someone tries to open a link and directs the link to the common store, where all the duplicate content is stored."
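The first piece described in that quote (scan, compute a signature per file, merge duplicates) can be sketched in a few lines. This is only an illustration, not how the Single Instance Store is implemented: the function names, the choice of SHA-256 as the signature, and the use of ordinary hard links in place of SIS links are all assumptions, and plain hard links give no copy-on-write protection.

```python
import filecmp
import hashlib
import os


def file_signature(path, chunk_size=65536):
    """Return a SHA-256 hex digest of the file's contents (assumed signature)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def merge_duplicates(root):
    """Replace byte-identical files under root with hard links to one copy.

    Returns the number of files that were relinked.
    """
    first_seen = {}  # signature -> path of the copy that is kept
    merged = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave existing symlinks alone
            keeper = first_seen.setdefault(file_signature(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already merged
            if not filecmp.cmp(keeper, path, shallow=False):
                continue  # same signature but different bytes: do not merge
            # Link under a temporary name, then swap it into place, so the
            # duplicate never goes missing if something fails midway.
            temp_path = path + ".sis-tmp"
            os.link(keeper, temp_path)
            os.replace(temp_path, path)
            merged += 1
    return merged
```

Run from cron, something like this is roughly what several posters mean by "cron, diff and ln". The second piece in the quote, the part that intercepts opens and writes, is what a script like this cannot provide.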
Hey why don't you nazis read the article through before you start your flamethrowers. That is NOT a *nix-style symbolic link being described.
It seems to be something very different than links, soft or hard. This will let files share disk space if a subset of their data is the same. Microsoft actually may be on to something here.
Hmmm ... Seems to me that between UNIX and OS/2, all these "Innovations" have already been innovated! OS/2 had Voice Navigation in Warp 4 several years ago. Links have been around for quite a while. IPv6 has been in development for years and is already an option for routers and most UNIX's.
I guess Microsoft has finally given us their definition of "Innovation". Take products, features, etc. from other companies, and put the Microsoft logo on it!
jeffj@technologist.com
Please understand that I have no authority to say anything in here, these are pure observations.
As far as i'm concerned.. This is a Microsoft "innovation?". Of course it is, what other company/OS out there has a system that runs through and creates sym links(hard links?)? I have never seen it. Would it not be innovation to have a flying car? Yes... links are out there, but MS will just be using them a bit different.
Note: I have no opinion of Microsoft/Linux being better than one another.
Microsoft is less stable, but supports a heck of a lot more than Linux. I can't use Linux, just yet, to scan anything, so I can't ditch my MS partition completely. Granted Bill Gates is somewhat of a thieving jerk. He reconstructed an OS... DOS... Linus Torvalds reconstructed Unix... Just Linus Torvalds doesn't have Billions of Dollars. Linus Torvalds seems to be a bit more happy than mister Gates.
This daemon seems pretty cool to me. If you want a secure backup of your data, use a tape backup drive? Or throw backup bit or two and use a backup command that allows you to store a copy?
These issues aren't hard to overcome and MS is saying they have 1.5 years to work on it. What a wonderful IDEA. One comment I do agree with from above however is that Linux is a bit too !userfriendly. Yes, some of you think it's easy as can be to configure and run a Linux system. I'll bet my life on the fact that you weren't able to pull Linux out of a box install it and have everything running great. You most likely had some reading to do, and had to attempt setting it up a couple of times to really learn the system. Not to say that any idiot can learn MSWindows the first time they drop the CD in, but they sure can get the thing running nicely a ton quicker.
On a complete flipside note.
Wow.. How cool would it be for us technical folk if MSWindows dropped out of the picture... There would be even MORE of a demand for us, cause no one would be able to pop a CD in and voila! Our OS is installed. We would be in terribly high demand. Getting paid through the roof! That dang Bill Gates has created a business savvy OS that lowers the overhead of every corporation that uses this easy to install/configure system. He not only stole DOS... Now he's stealing your earning potential.. Ok.. now I'm getting silly and my finger muscles are starting to hurt.. Anonymous Coward....
Hrmm.. Lazy fella I am
can you please pass the quaaludes ?
that article said 'sextillion'
i've read your post, nice work. keep it up !
The really funny thing is their reason for this:
"why not save operating system disk space" seems quite a good question, but with every new windows eating up at least four times the disc space of its predecessor, there are quite other areas to improve.
Oh, and not only does M$ innovate cool never-seen-before stuff like symlinks, they also seem to be under the impression that they heavily invented ipv6.
I don't know everything about this technology but I'll hit it point by possible point:
1. If it's reference-counted links, take a look at hard links in Unix.
2. Take a look at symbolic links.
3. If it's meant to make an administrator's life easier, take a look at distributed filesystems (NFS & CODA).
4. If it magically maintains the symlinks, look at STOW and DEPOT.
5. Everything I've mentioned so far was around before 1997 (3 years ago...) all of it quite mature.
And I did read the article, however infuriating.
Having said all of that, Microsoft could add a small feature or two to one of these, but is the standard of innovation that low? [If the answer is yes, god help all of us.]
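For anyone unsure what points 1 and 2 in the list above refer to, here is a small sketch of how hard links and symbolic links behave differently. The function name and file names are made up for illustration, and it assumes a POSIX-style filesystem (on Windows, creating symlinks may need extra privileges).

```python
import os


def link_behaviour(workdir):
    """Create a file, a hard link and a symlink to it, then delete the original."""
    original = os.path.join(workdir, "original.txt")
    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")
    with open(original, "w") as handle:
        handle.write("payload")
    os.link(original, hard)      # second name for the same inode
    os.symlink(original, soft)   # separate file that stores a path
    links_before = os.stat(original).st_nlink
    os.remove(original)          # drops one name; the data stays while links remain
    return {
        "links_before": links_before,
        "links_after": os.stat(hard).st_nlink,
        "hard_readable": os.path.exists(hard),
        "soft_dangling": os.path.islink(soft) and not os.path.exists(soft),
    }
```

The link count goes from 2 to 1 and the hard link keeps working, while the symlink is left pointing at a name that no longer exists. Neither kind of link finds duplicates or splits on write by itself.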
Nor did you, evidently.
They only ever mentioned files. And as for all the other morons who mentioned COW, which fucking article were *you* (mis)reading?
Looks like symlinks to me. And additionally looks like the files are going to be symlinked automatically and possibly to a network drive. Uh huh. If that doesn't make you shit your pants then point your IE browser elsewhere; technically-oriented sites aren't for you, Bud.
And to all who mentioned that this idea is being slated just because it is from M$, in this case you're wrong, but even when you would be right, that wouldn't be a bad thing. Good *cannot* come from bad, and anyone who has even glanced at the Finding of Fact from Judge Jackson could tell, M$ has used deplorable tactics and hidden plain truths about deficiencies from their users. Any "innovation" from these guys needs to be evaluated with this perspective. Ignoring their modus operandi is worse than assuming from the outset that "innovations" from Redmond are bad. It is analogous to the security mantra of disable everything by default, and then only explicitly allow that which you trust.
The only true M$ innovation is the ability to turn bullshit into marketing manna. Oh, and to rip people off and still get them to defend you to the death, flying in the face of all reason and logic. M$ devotees do that *so well*.
It has always been possible for an application to install DLLs in the application directory. They just didn't do it. The app directory is searched first when loading a DLL, then the Windows (or Windows\System) directory, followed by anything in the path. So c:\progra~1\office\msvcrt.dll would be loaded instead of c:\windows\system\msvcrt.dll when loading office.
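The search order described in the comment above (application directory, then the Windows directories, then the path) can be modelled as a simple first-match lookup. This is a sketch of that described order only, not the actual Windows loader; `resolve_library` and its parameters are hypothetical names.

```python
import os


def resolve_library(name, app_dir, system_dirs, path_dirs):
    """Return the first existing copy of name, or None if no directory has it.

    Directories are tried in the order the comment describes:
    the application's own directory, then system directories, then PATH.
    """
    for directory in [app_dir, *system_dirs, *path_dirs]:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With a copy of `msvcrt.dll` in both the application directory and the system directory, the application's copy wins. That is exactly why a private copy shields an app from system upgrades and also why it misses system-wide fixes.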
So what happens if the DLL it copied needs to be updated for security/stability reasons? Do you then need to mess around with updating the DLLs for each fricking program that used it separately (assuming of course these programs decided to copy the DLL rather than use the system's)? This sounds like opening a can of worms (as does the idea of forcing all files with the same hash to be links to one master file). I suppose this is all really speculative, however, since that article really didn't say.
I'm really sorry I read your article. I now have a really bad headache. Excuse me while I go bang my head against a wall. fooie -ok, I'll admit it! I like the `Comic Sans MS' ttf
Or 10 lines of PHP!
i just have to say: this is the most informative post i've ever seen on slashdot.
- pal
well, it's copy on write, so you would need to change 50 files. There is no "authority", it isn't broken by design, all it's supposed to do is eliminate redundant data. You're just letting your blind hatred of MS get in the way of what's actually happening. Pretty sad.
I hate to tell them, but I implemented something very similar to this in '70-'71. Back then, HD space was VERY valuable. Every night we ran a daemon that found identical files and converted them to COW shared files.
"God! I so much wanted to moderate in this story, but fuck."
Gee mommy, see how I beat up his karma !
I'm the AC to whom you're responding. You're really agreeing with me, although you seem to think you aren't. If you want a "German-trained Ph.D." to work on your vehicle, I think you should go get him. The person who works on your vehicle should be the person you want to work on your vehicle, not the person the manufacturer selects to work on your vehicle. I repeat, would you want a car where only the dealer was permitted to open the hood? No: the decision whether to take the car to the dealership or not is, and ought to be, your decision, not the manufacturer's decision. It's your car, not Porsche's, VW's, or BMW's. It's your computer, not Bill Gates'.
Incidentally, I drive a Volkswagen TDI, which is one of the most technologically sophisticated European cars made. I also do a considerable amount of my own maintenance on it. I wouldn't consider rebuilding the injection pump, and I will probably take it to the dealership to have the timing belt replaced. I am perfectly capable of replacing the air filter and changing the oil ... and so are you, whether you know it or not.
And, trust me, your auto mechanics in Europe don't have Ph.D.'s either.
"but with every new windows eating up at least four times the disc space of its predecessor"
Win95 minimum install 80-90meg
win98 minimum install 120meg
rh minimum install 500meg
sure go ahead & whine about how u dont need all that gui crap thats in redhat
dos minimum install 1meg
"Many of the innovations that are included in Windows 2000 have been built right here from scratch at Microsoft," said Kevin Schofield,
--but most of them we just steal from whomever we feel like.
you know instead of all this nonsense they could
have simply designed permissions so that
ordinary users/programs cant fuck up system
libraries.
wait no! lets not do that! instead lets
have an additional copy of all system libraries
wait! thats too much bloat... lets add filesystem
stuff to do wacky symlink stuff to save space..
why were we using dynamic linking again?
they are going to make such a big mess out of things that vendors will just ship statically
linked executables instead.
Actually, this is more than just symlinks. From
the looks of this technology, it works like this:
When a file is created, it is created in the 'central location'
and a symlink is created to point to that file in the 'central location'
(IE) You copy "foo.txt" to My Documents - The foo.txt resides in central location and you simply have a link in your folder.
If another file is created and it is the same file (size, date stamp)
then the OS will make a link to the already existing file
in this 'central location'. It'll be good for REALLY large files (like mp3's and mpegs), but I think it'll be more trouble than it's worth,
especially when 25GB+ hard disks are around $200.
Rob, You really should read these articles from start to finish before posting. This does not talk about unix similar "Symbolic links" but a single instance database structure for Windows 2000.
"We" are not "all aware" of any such thing, and your statement is prima facie silliness, because it implies that no software can ever be debugged. If fixing a bug introduces others, it's generally for one of two reasons. One, you didn't really fix the bug -- you just covered it up with a hack. Two, the system has serious design problems which are way beyond the category of a simple "bugfix".
To follow your analogy: Fixing cars, sorry automobiles, is best left to the experts, since most car owners are not technically qualified to do so.
Wonderful. I'm an "expert." I'm "technically qualified." I have a master's in C.S. and have developed professionally for 12 years. (Hint: I work for a company you've heard of.) Now, assuming I run a Microsoft OS and Microsoft applications, can you please explain to me how I can fix bugs in that system? By the way, I've found and fixed bugs (and submitted my fixes to the software authors in question) in both the Linux kernel and GCC, so please don't bother telling me "it can't be done." I've done it.
Thus the idea of a "black box" approach is a very good one, probably why it is much used in software testing.
The idea of a black box is wonderful for design, and fine for testing. It is absolutely useless for actually fixing the bugs. I have fixed probably hundreds of bugs in my own and others' code during my career, and you need to see the source code to do it!
So long as the attachment is filed in Exchange, the fs. linked storage won't apply. You'll have 50 GB of redundant data in your Exchange DB. AFAIK (not an NT admin, thank Linus).
well actually, they were not the first to create a spreadsheet application either, VisiCalc was first. -Brap
Code to identify bugs: grep "(C) Microsoft" *.c
If you put a thousand microsoft researchers in a cubicle for a thousand years they will chew through the fscking cables.
Like Java? [a C rip-off]
Sun openly admits that Java was heavily influenced by C++ (while Microsoft never mentions symbolic links in Unix!). Here are some quotes from the original Java White Paper:
See the difference?
|| but a way of transparently and automatically
|| controlling redundancy.
| Correct. This goes beyond symlinks or hardlinks,
| but it is hardly new technology. I know of at
| least one revision control system that used
| something exactly like this
| and it was developed in the late 1980's.
I can imagine that this would reduce storage space if 100 or so people decided to download a particular program / software update / whatever.
Of course, they'd still each have to download it separately first.
IMHO, What we _really_ need is to scrap HTTP/FTP/POP/IMAP and have a unified OO transfer protocol, with unique file identifiers and data authentication checks.
--
Jim Wase
If this puppy searches for duplicate files and links them without asking or telling you about it, two things come to mind. 1. You can't set up the link yourself except by actually copying the file. 2. You can't keep a back-up copy of a file on the server because it would be "linked" to the file in the database. Speaking of the database, this sounds a bit like they're using something like a virtual filesystem within a filesystem dealy. If I remember correctly, don't those run slower than an actual filesystem? Remember DoubleSpace...but no, that was mostly due to the compression. Then again, Lnx4Win tends to run slowly, and its based on a virtual ext2 filesystem... Anyway: Wow, they just thought of this? And it took them a whole 1.5 years to develop the technology for it?
Minimum Linux install: about 2 megs -- a stripped kernel and a statically linked /bin/ash. :-)
Well all they need do now is put all their files in /opt/K and we will get SCO OpenServer 5
ROTFLMAO!!!! Maybe this is why they call it "Zero Brains for Windows"?
Of course, under 10% of the posters so far have read the article, wherein we might then realize that what MS is claiming to have done has very little to do with symlinks- the only part where they mention symlinks is in one sentence in passing. The rest is dedicated to describing an automatic process which detects duplicate files and sets up symlinks in order to have a single instance of them. Some of the credit for this goes to Rob; this was a gross mispost that shouldn't have been described as it was unless he didn't bother to actually read the article.
Now, that said, the innovation sucks. As somebody mentioned, this defeats backups- even if there is some way to turn it off, that's extra overhead, not the "Zero Administration" that MS claims this promotes. Moreover, how about security? This is worthless in any multi-user system since you can't consolidate duplicate files across users for obvious reasons.
It pains me to do this, but... It would appear that what Micro$oft has done is automate the process of creating/managing the symbolic links. They are checking the file system for identical files as things are being written and automatically creating the links, rather than having the admin do it. At least this is how it sounds to me reading the article. If this is in fact the case, then this IS new technology. If they are doing it the same, or similar, way that Unix does it...then you're all right and the slamming is more than well deserved. Now if it is an automated process, like I think, I don't like it. I'd rather have control of what gets linked and what doesn't. The NTFS is not backward compatible from 2000 to NT 4. This might be one of the reasons why.
Namyzarc
You're quite confused here.
The text translation in windows happens before the filesystem layer. The files on the disk do not have text or binary properties.
Basically, when you open a text file in windows, the system libraries convert your \n to \r\n for you before writing to the disk.
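The point above, that newline translation happens in the library layer and not in the filesystem, can be shown with any runtime that has a text mode. Below is a small sketch using Python's `io` module to stand in for the Windows C runtime; the two helper names are made up for the example.

```python
import io


def to_disk_bytes(text):
    """Encode text the way a Windows-style text-mode write would store it."""
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    writer.write(text)   # each "\n" is translated to "\r\n" here, in the library
    writer.flush()
    data = raw.getvalue()
    writer.detach()      # release the buffer without closing it
    return data


def from_disk_bytes(data):
    """Decode stored bytes the way a text-mode read would present them."""
    reader = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return reader.read()  # "\r\n" is translated back to "\n"
```

The bytes that reach the disk are just bytes; two files only count as duplicates for a dedup scheme if those stored bytes match, regardless of how any program chooses to read them.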
Yeah, but what the hell do you do if you really want two separate copies? LMFAO
Xerox-Parc: mostly based on earlier work by Engelbart, Smalltalk based on Simula
Pong: based on tennis :)
It's all really a question of what you consider "based" to mean. A web site I cannot recall listed all of Microsoft's "innovations" and researched which ones were real. The only ones for which no one found prior art were the tabbed dialog and the dancing paperclip.
Well that's what the MicroMorons never learned. They got command interpreters that copy and move files and tell you what time it is. Ken Thompson has prolly been having a fit over these jerks all these years. Look at the UNIX shells and then you know. And it's the same with these monstrosities. Under the Wobbly Arches.
The point remains that he's right, the Slashdot tag line is deliberately wrong and yet all of the MS-haters accept it as gospel anyway.
>Hrmm well, you know I can even Boot Windows 2k
>with 0 MBs? Wonder how? Network Boot...
Wow, you mean I haven't been able to do that with my *nix systems for the 5 years I've been working with them? Damn, someone better tell Sun they don't have network boot in the PROMs after all...
>So nothing new here....
Got that right.
Better check again, liar.
Without the feature you would have run out of space long before this. Any more contrived examples?
You can restore it, because in order to write changes to foo.conf, the symbolic link has to be broken. So foo.conf.old remains unchanged and can be used to restore.
Winblowz 9x is obsolete...
Actually, no it's not. If you look at objective measures, such as patent counts, MS sucked rocks until the very last couple of years. Apple regularly stomped MS in pure number of patents (until 1998, when Apple's business problems took their toll), an incredible feat considering their relative sizes.
From 1990 to date, Apple has been assigned 1,270 patents, compared to MS getting 1,168, and an incredible 15,715 for IBM. (No idea on the quality of those patents though...)
Based on queries to the U.S. Patent Database
While the implementation of this idea may have some new wrinkles, the fundamental ideas on which it's based have been kicking around for a long time.
The article speaks of a program that searches for identical files and then creates symlinks. I think that may be a genuine innovation (but possibly a dangerous one). The real problem with this article is that it does imply that Microsoft has invented symlinks. No, I take that back. It doesn't imply it. It actually says that. I'm glad I got to read this and laugh at it before the Higher Ups at Microsoft see it and pull the article down.
so if abruptly my hard drive gives me some 'Uncorrectable error' on one of these files, symlinked somewhere else, I could recall: "Yes, I have another copy of it..." and pain,pain,pain
Loser, In a unix environment, and even in NT, we have such things as 'permissions' that did not exist in DOS. I have root, my users do not. Perhaps Bill Gates/Microsoft was so successful because people REALLY want a bad OS deep down inside.. If only Apple had close sourced their OS, they would have shared Microsoft's success! I sincerely hope people don't pay you for your thoughts.
You do have a point on this article. That is what I said. Another thing that closed source can do is allow one point of failure and repair for the OS. Once that occurs, you will not need to rely on "hunting" down obscure scripts to fix this problem and that problem. All OSes have problems with one thing or another... but only the closed source OSes get slammed? Hrmm, if I review the bug list for 1999, there were more bugs reported for *nix environments than there were for Win NT. Also, how many builds of *nix sources have there been vs closed source? About 20-30 times more. As an administrator, I would not want to spend my time upgrading to the latest builds every 2 weeks. I would rather work on securing my network and eliminating all the backdoors and buffer overflows that can be used to gain admin access. Anyone else in the same boat here? Anyhow, as the old saying goes, 'to err is human'. The more you have to play with something, the more problems will appear, the more time spent fixing things, the more unhappy your bosses are. Yes, you can program all your fixes yourself, but when you have to upgrade to a new build, you have to reapply your fixes if they are not already in the new source. Hence, extra work for nothing.
they may not be able to take back all the old versions but they can easily take back all the new versions to be released.
>Ahh, I see, so unix has these smart auto symlinks, inbuilt
>text-speech, inbuilt smart troubleshooters etc?
>Um no.
Microsoft being first with inbuilt text-speech? Hardly. Amiga's came with this as a standard feature nearly 10 years ago. So much for Microsoft "innovation"
>They never said they invented text to speech. Adding text to speech in
>an OS for accessibility is in their minds innovation. And besides,
>they were mentioning it as one of the developments contributed by
>their research division.
Yeah right. Ever heard of a computer from Commodore called the Amiga? Of course Microsoft did. That's where they stole this from.........
IS it April 1st yet?
XenoWolf The Original - Since 1993
Shouldn't this have been posted under ;)
"It's funny: Laugh"?
Geeky modern art T-shirts
the article sortof suggests that they set up the links automatically
It looks like they're setting up hard links, as well as one of those annoying background processes that searches for files to make into hard links.
<rant>
Is it just me, or does anyone else have a problem with continually loading up more and more overhead to a system? I don't mind a cron job--I can see it, change it, delete it. But this may be one of those automatic 'features' which just means you'll need an 800MHz processor to get the performance you used to, without the 'feature.' I'm one of those people who hates it when my disk, cpu, or network monitors start showing activity I don't plan myself.
</rant>
Steve
Ahh, I see, so unix has these smart auto symlinks, inbuilt text-speech, inbuilt smart troubleshooters etc?
Um no.
Microsoft invents Office with shitloads of extra features, "ha, that's not innovation, that's been done thousands of years ago, it's called stone tablets".
Yahoo has a nice little story about Microsoft Kerberos. The article is here. Here is a really brief summary, but I suggest you read the article. Kerberos is a technology that prevents user passwords from being sent through a network, pretty standard stuff on UNIX. Anyhow, Microsoft finally decided that it was a good idea and implemented it in W2K. The only problem is that their version has undocumented changes that they made!!
Real men dump cores! Read my journal, I am neat.
But potentially you might have two identical small configuration files for different s/w packages (I might have , I don't know). I don't want my system deciding which files are identical and which not. The argument regarding hard and soft links is in my opinion specious, what IS important is that I decide (and not the system) when I can symlink something which may be important!
Seems like a clever idea to me... MS or not.
From what i read, it seems that their sym-link "technology" automatically detects duplicate files and automatically creates a single file and symlinks from the duplicates. So in the automatic, transparent from the administrator part, it is a new idea. (as far as i know, is there something like this for *nix ?)
I think it wasn't included before simply because NT4 was the first release of NT that was heavily used, (I used 3.51 a lot but we didn't rollout NT at work until NT4/Sp3) It simply wasn't needed yet and since there were a relatively small number of users the feature was never in high demand.
You know what's pretty sad? Slashdot is considered the essential Linux site yet they do more to damage the image of a Linux user than almost any other site. Stories like this, which clearly were not even read by the poster, only make you guys look like rabid zealots who'll simply ignore the truth so you can have a circle jerk over how much you hate "M$"
It's nice to see someone such as yourself taking a stand against bigotry. If the Linux community had an image of fairness and open mindedness they wouldn't be such a joke.
I was worried I was the only person who understood the difference. I really hate to admit it, but for once, MS has really impressed me. As long as it works, and it's optional (very important, because I should have the option to opt for speed instead of space), I think it's good work.
To be honest, I think the linux community (or the OSS community in general) should try real hard to demonstrate prior art lest MS patents this because it is a good idea and I would like to see something similar in the OSS OSs.
Now, to comment on the Slashdot article itself, I am of the firm belief that 99% of all FUD is generated by the Linux community. For crying out loud, please read more carefully, both people submitting and when tacking on comments. "Read before post" rules apply to posting stories as well.
What's really horrible about all this is that they decided in 1997 to fix a problem that was apparent in 1990 (or before) and it took them until 2000 to ship it.
As a highly regarded "open source" expert in the world of marketing, I am someone with considerable insight into the world of "open source" and "GNU" software, which I have been studying from a marketing perspective since the very beginning of the phenomenon over four years ago. I am sure you will find my opinions interesting, even if perhaps some of you may consider them controversial.
To begin, let me just state that I respect the products of great software geniuses, whether they be "open source" proponents, like RMS, GPL supporters like Linus Torvalds, or whether they are simply super-human awesome (closed source) programming experts (like Bill Gates and Steve Ballmer). Anyone who can write efficient bug-free code in a language as arcane and needlessly complex as C++ deserves kudos in my book. Trust me when I say, I have no axe to grind on this issue. When I want to surf the net, I am just as likely to use RedHat, as Windows, and I even recommend Linux as a web-server platform to our smaller clients who don't need real scalability or high performance. But I digress.
The main point that I need to make to the minority of hardcore GPL Linux zealots, is that the whole GPL thing is holding Linux back.
Linus Torvalds needs to produce a BINARY only release of Linux, and must take Linux back to the "closed source" model, if Linux is ever to compete with Microsoft on the desktop.
It's not enough to provide Joe Sixpack with a user friendly gui (KDE/Gnome) or a nice install tool. Joe Sixpack is quite simply not tech-savvy enough to be trusted with the source code to a complex system such as Linux. He probably can't even find the on-off switch on a bad day.
Anyone out there who has ever had to administrate the machines of a bunch of clueless users will be with me on this. How many times have you had to reinstall someones system because they screwed up their "autoexec.bat" file ? How many "config.sys" files have you seen trashed ? How many person-years of productivity have been lost because one of your users managed to fire up "regedit32.dll" and wreak havoc on the registry ?
Now imagine, in a Windows system there are only about 10 files the user can damage. Imagine now that the user has the whole source to the operating system ? Multiply the cluelessness of the average PC user by the number of opportunities for him/her to be clueless, and you will come up with a very large number indeed.
The user will never admit they screwed it up either. For Linux to succeed, Linus must take the source code back under his own copyright, and only make it available to certified Linux professionals, who can be trusted not to screw things up. To make sure they are serious about Linux, a nominal fee should be charged for the source code license (say about $1000).
Some people may whinge and say this is against the spirit of "open source", however, for Linux to survive it must adapt to its market, and in the consumer market place, the "closed source" model rules the roost. An examination of Microsoft (MSFT) stock price will demonstrate this clearly.
Again, I offer my considerable marketing expertise to this forum at no charge, as my attempt to "give something back to the community"
Thank you
dmg
they are still just a single way of doing things. Just because they are not _precisely_ symlinks, and are actually a bit closer to hard links, does not mean it is superior. We want concise clear tools (UNIX mantra) that do each of their little jobs well. If something needs a bigger system we want the points of functionality well reflected and refined in the interface. UNIX has symlinks and hardlinks which are nice, but it also has options requiring more infrastructure as well that blows this M$ option away.
There are many systems in the blurry area between object and page based DSMs and distributed filesystems, but the one that jumps to mind is CODA. Just read about it and then try to defend M$'s great new enhancement to symlinks.
It seems that there are really only two types of shared files--the first being configuration files and the other being actual work files. Config files *should* be small and trivially reproduced for each user as a distinct separate file. If you are worried about the space this will take up then your system isn't suited for the number of users you have.
For larger shared files there are two types--ones that changes should be universal (everyone edits one file) and ones where people need individual copies. Both types are commonly known as database files. You either edit a record or create a new one.
Which brings up the biggest problem--this *innovation* sounds like it is one big itch of a security problem. Finding a bug that allows someone to maliciously make this system erroneously use an evil config file that gets replicated for every user is one of the worst situations I can think of. But the more common problems of granularity control as found in enterprise database systems seem like a major admin headache.
If MS has created something that is genuinely a *good thing* that can be used in specific areas fine. But it seems that this functionality is really only useful for a limited set of machines. The example that comes to mind is a mail server where duplicates can be common but I don't think a general solution can beat an application specific one. The general solution can not address the space savings vs. speed as well as an app specific solution.
ACK (but don't SYN)
- OS searches for system files that are the same. Perhaps also for non-system files but the article doesn't really indicate that. It then creates a link to the appropriate file.
- If one copy of the file changes, it makes a whole new copy of the (now modified) file.
This is more than just a link. Quite a bit more. It does have a lot of potential problems, including the potential for a small change in a file to overflow the hard drive. However backing up should not be impacted. Also, if I'm backing up a file on the same drive/OS for fear of data corruption (by the OS or stray alpha particles) then this will hurt me. As long as this is easy to disable for both the admin and an individual user, I wouldn't expect too many problems.
On the whole, it looks like a good, but minor, idea. This might be useful on some Unix'es for user dot files, but I don't think it is too useful otherwise.
Hob
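For anyone trying to picture the link-then-split behaviour described above, here is a minimal sketch in Python. It is a toy in-memory model and makes no claim about how Windows 2000 actually does it: the class name SingleInstanceStore and its methods are made up for illustration, and SHA-256 simply stands in for whatever "signature" the real thing uses.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory model: identical contents share one blob until written."""

    def __init__(self):
        self.blobs = {}   # content hash -> bytes (the single shared copy)
        self.links = {}   # file name -> content hash

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)   # store the bytes only once
        old = self.links.get(name)
        self.links[name] = digest
        self._collect(old)

    def read(self, name):
        return self.blobs[self.links[name]]

    def append(self, name, extra):
        # Copy-on-write: the writer gets a new blob, other names keep the old one.
        self.put(name, self.read(name) + extra)

    def _collect(self, digest):
        # Drop a blob once no name links to it any more.
        if digest is not None and digest not in self.links.values():
            del self.blobs[digest]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = SingleInstanceStore()
store.put("joe/howto.doc", b"x" * 1000)
store.put("ann/howto.doc", b"x" * 1000)
print(store.stored_bytes())        # 1000: one shared copy
store.append("ann/howto.doc", b"!")
print(store.stored_bytes())        # 2001: a one-byte change costs a full second copy
```

The last two lines show the "small change overflows the drive" worry in miniature: the space saved by merging comes back all at once on the first write.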
MS only talked about this while mentioning "servers." Perhaps it is some add-on just meant for the file server. I know at the last place I worked, the admins were constantly fighting with various departments for having many, many copies of documents and such on their network drives. One user would create a howto document in their directory, say \accounting\joe\howto_balance_company_checkbook. Another user in Accounting would then copy that directory to his own directory to further add/edit/etc. Multiply this by many people and a 10mb document (filled with huge screenshots and the like) started eating up 100+ MB. In this situation, I can see an automatic cleanup and linking service (and unlinking on new writes) quite useful. Lord knows the users refuse to clean up after themselves.
What's the old saying?
If you put a thousand microsoft researchers in a cubicle for a thousand years they can eventually code links and locate?
This is the main thing that got me. I know Windows installs can be large, but freeing up 90%? Hell, even freeing up 50% would indicate to me some sort of problem...
I have a feeling this is in reference to their symlink-to-server feature. Maybe what happens is if you have application foo on your system and it's also installed on a mounted server, your application foo turns into a symlink to the server.
Personally I don't like this at all. The automation process has *so* much potential of going wrong. Last thing I need is windows (well, not that I run it anyway) going off and arbitrarily deciding to delete/symlink files. It's not like windows doesn't have enough problems with keeping track of DLLs *anyway*.
{"/home/green"}$ ls -l big
-rw-r--r-- 1 green green 8796093022207 Jan 31 01:48 big
{"/home/green"}$ uname -a
FreeBSD green.dyndns.org 4.0-CURRENT FreeBSD 4.0-CURRENT #69: Mon Feb 28 00:46:42 EST 2000 green@green.dyndns.org:/usr/src/sys/compile/GREEN i386
--
Brian Fundakowski Feldman
ln -s
sed -f lamescript bigfile
With the Windows COW behavior, this would do the wrong thing and undo the link, create another (huge) file, and possibly run you out of space. The proper way to do this would be to implement a hard link, symbolic link, and THEN if there's a dire need that "COW link". Otherwise, you break POLA for experienced users, and inexperienced users are just thoroughly confused.
--
Brian Fundakowski Feldman
Erm, Apple Macintosh has had symlinks (actually called aliases) since System 7, which even predates Windows95; and unlike UNIX symlinks they don't break when you move the target. And most Apple users are definitely *not* geeks.
> What actually happens is that when an object is stored in the system - the system checks to see whether it is
> identical to another object, and if so - just stores a reference to the other object. This is achieved using a
> "signature" or in non-M$ language - a hash.
This scares me even more than M$ claiming to invent symlinks. Imagine having a web page you're working on. You store a good copy in a directory as a backup, then put another copy in a different directory that has your changes in it. But wait.. before you get to modify your devel copy, Windows steps in and makes them all the same file, so when you go and edit your devel copy, you're also editing your backup copy. Then you make a stupid mistake, delete half the text and save.. well you also lost your emergency backup, so you're royally screwed.
This is yet another case of M$ thinking it knows what you want and doing it without asking you.
Jag
But you're forgetting an additional use of this "innovation". What happens when one of those 1,000 people decides to email his 999 coworkers the latest copy of his 50MB Word file. Now you've got about 50GB of data out on the network, all copies of this single file, which will probably never be modified by those 999 people.
As I understand it, this technique would use just a single copy of the file, saving a bunch of disk space.
theswindle.com is gonna have a monster traffic today, egads.
In their "press release", Microsoft says:
With 450 researchers working in four Microsoft Research labs on three continents, and thousands of engineers developing more than 100 different software products, bridging research with product development poses a formidable challenge. To ensure communication between Microsoft Research and the teams developing Microsoft products, Microsoft Research formed the Technology Transfer Group.
Headed by Schofield, the four-person group serves as a liaison between Microsoft Research and the company's product groups, working to ensure Microsoft research is incorporated into the appropriate software products as they're being developed.
Now, there will be no shameless anti-Microsoft comment here. I think I see the fundamental problem with their "research" group. They clearly are not technical, or endowed programmers. I don't doubt MS has all these folks, but they're basically asking hundreds of starving artists who may have never heard of a symbolic link before, a core UNIX feature, to come up with creative ideas. Of COURSE this very idea will come up and they will honestly feel it's innovative.
For all the Microsoft haters out there, be GLAD they just now figured this out. Things would be much worse for you if their research group consisted of 500 subscribers to the linux-kernel mailing list.
I've always thought that M$ shortcuts were their typically broken attempt at symbolic links. If this is supposed to be their attempt at hard links I guess it will be just as twisted.
Let's make sure that they don't get a patent on this!
The difference between Canada and the USA is that in Canada healthcare is a right and gun ownership is a privilege.
There are many optical backup systems that do pretty much this. They may not scan looking for identical files, but they do automatically migrate files to a near line storage system and replace the original with a link. Absolutely nothing is new with the M$ approach.
The difference between Canada and the USA is that in Canada healthcare is a right and gun ownership is a privilege.
Yes, it's too good to be true. First of all, I've yet to see a Windows app that puts common DLLs in its own directory instead of the Windows directory tree. Second, it wouldn't matter, because of Windows' braindead DLL loader. Third, each app may have a different version of the DLL - so even if the above two conditions weren't true, automated combination wouldn't occur, because they wouldn't be the same files.
Sam: "That was needlessly cryptic."
Max: "I'd be peeing my pants if I wore any!"
> But you're forgetting an additional use of this "innovation". What happens when one of those 1,000 people decides to email his 999 coworkers the latest copy of his 50MB Word file. Now you've got about 50GB of data out on the network, all copies of this single file, which will probably never be modified by those 999 people.
When some luser does that on my network, I'll be sure to make certain they know the depth of their mistake by means of repeated application of the clue stick.
Amiga stored its configuration files in two places - in the virtual file system volumes ENV: and ENVARC:. ENVARC: was typically stored on disk, and ENV: was usually a temporary copy usually stored on the RAM disk. Having a temporary configuration state for the current session was pretty convenient. If you broke a file, you always had an unbroken copy in ENVARC. =) The files in ENV: lived until you turned off your computer. The files in ENVARC: were persistent. However, almost each and every Amiga program that needed configuration files placed them in ENV: and ENVARC: by default, and when added up, these could use quite some space. Memory was more expensive in those days and AmigaOS did not have any support for virtual memory hardware.
What HappyENV did was create a virtual device which would expose each of ENVARC:'s files as ENV:'s. A file would not be loaded into RAM until you actually open()'ed it, at which point a new file would be created in RAM, managed by HappyENV. It was not exactly copy-on-write, more of copy-on-read.. but the principle is there and it can probably be changed into using copy-on-write very easily. I believe that the author has thought of copy-on-write and figured that it would be more efficient to keep files in memory for reading as well.
HappyENV can be found in the software archive Aminet, for instance here and is Cardware (you should send the author a postcard) with source code included. (assembly language)
"We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
Sounds like the infinite number of monkeys typing on an infinite number of typewriters deal to me.
In Republican America phones tap you.
No, the linking is only happening within a specific filesystem. Different disk means a different filesystem. My reading of this is that this linking would be invisible to all user-level programs, it is simply built into the filesystem.
Politas
Does that not lose just about every benefit of having the PHP interpreter resident as part of Apache?
Most of this system could be implemented trivially on any Unix. (The copy-on-write stuff, if it exists, might require a kernel module, depending on the implementation's needs.) Take the output of 'find / -type f -print' (adjusted to taste) and pipe to a Perl script which hashes md5sums and file sizes to filenames, then takes matches and creates (sym)links accordingly. If you're really into this idea of a "central store", copy the files to somewhere under /var/spool first.
The main reason this hasn't been written is that most real sysadmins recognize the pitfalls you describe. So, there's no demand for this "feature".
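The find-and-hash step described above can be sketched in a few lines. This one is in Python rather than Perl, only reports the duplicate groups (a dry run, nothing is linked or deleted), and the function names file_digest and find_duplicates are made up for the example. Note that names which are already hard links to one another will also show up as a group.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def find_duplicates(root):
    """Group regular files under root by (size, MD5); return groups of 2+."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            key = (os.path.getsize(path), file_digest(path))
            groups[key].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Something like `for group in find_duplicates("/home"): print(group)` would list the candidates; whether to replace any of them with links is exactly the judgment call the comment above is warning about.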
Microsoft Research has done it again! New, from the people who brought you the Office Paperclip: Auto-Backup-Destruction!
Symlinks with copy-on-write could be very useful. But the article doesn't mention anything about copy-on-write, and automatic symlinking without COW could be very dangerous.
Has anyone seen any pointers to additional info on this 'feature'?
This is a good thing - it should reduce all the space taken up by the people who email huge dox back and forth. Now if only it works for all those chain letters ...
--
"I find your lack of faith disturbing." -- Darth Vader
I remember hearing about this type of technology a few years ago in relation to Netware. Seems like a sensible way to conserve disk space, and maybe more interesting in a networked environment.
This reminds me of the old saying
... "Their interesting things are not innovative, and their innovative things are not interesting" (in reference to MSFT of course)
"Single Instance Store (SIS) is a file system filter driver that conserves disk space by removing multiple copies of a file and replacing them with links to a single shared copy in a common folder. These links differ from hard links in that if one copy of the file is changed, it then "splits off" from the others and becomes a separate file. SIS, which is installed only as part of Windows 2000 Remote Installation Services (RIS), uses two FSCTLs for private communication (FSCTL_SIS_COPYFILE and FSCTL_SIS_LINK_FILES), which your filter driver should allow to pass through unchanged."
I dunno how it "splits off" anything if "one copy" of the file is changed -- I thought there was only one instance of the thing anyway..
It sounds kinda like there *may be* multiple instances of a given file but a link to only one "official" one..
Again, I dunno...
t_t_b
--
I'm on PJ's "enemies" list! Are you?
It is ambiguous.
What I meant was that when the shell chdirs into a symlinked directory, it fakes out chdiring via .. to make the link seem real. Witness the behaviour of the GTK+ file dialog for contrast.
It's not something new, it's not different, it's not an innovation. It's a small, almost insignificant *feature* to something that's been around longer than me - links. I say insignificant because as so many people have pointed out, this can be done quickly and easily with a perl script.
Ok, so maybe when you change your link to the copy it's no longer a link, but your file. Big, fat, hairy deal. It'll save you some space when you initially setup a system with a gazillion users needing a gazillion copies of .login or .logout or what have you. But when people start making changes to these, which they inevitably will, the saved space is lost anyway. I suppose it may be argued that some may not change the files, and that having the space before anyone does is still nice. However, may I remind you, applications by default should find these files in /etc if the user doesn't have them in their home directory, and not require a link or extra file of any kind.
The space saved is negligible, at most, when dealing with files as small as default home directory configuration files like ".logout". If there are duplicate files larger than this, like dll, or executable, there shouldn't be any extras lying around. That's bloat and OS inconsistency and Microsoft is trying to make up for that with this.
On my Win98 box, I found several copies of various versions of the "OLE" dll files. If Windows had a decent hierarchical file system, for example, and got its libraries organized in specific directories, then the problem of duplication, triplication, etc. would not be as much of a problem. Look at Linux. There are several different directories in which library files are expected to be found, and several different names each file may have, so they're all symlinked to one file.
I'm an inexperienced home network admin, and I can't think of any files other than those small home directory configuration files that a system might have duplicates of. If someone would care to balance my examples by posting some more, please do so.
My argument here is to simply contradict those who say "read the article, it really is new" because it's not. It's an old idea with a new feature. The whole idea or technology is based around a link, which is an old idea; this article gives the impression that it's the most new and innovative thing since sliced bread.
Did you people read the article? It took 3 guys to think of it, and I imagine the initial idea was a symbolic link after which the new feature followed, then it took 3 research guys and Bolosky one and a half years to design, working full time. The article doesn't have much information on "Single Instance Store" (why not just use "link", it's much more concise.) but I don't imagine the technology is that complicated.
It seems to me that most of that is to hype it up, and maybe because they wanted to wait for W2K and say this is one of the brilliant new features making the purchase worthwhile. Case in point:
"A key administrative improvement in Windows 2000, the Single Instance Store is among the many innovations built from the ground up by Microsoft's research arm 'Microsoft Research.'"
First of all, most people here agree that the idea is based around links, as such, the "innovation" was not built from the ground up, since the idea has already been implemented elsewhere. Secondly, it's a "key" improvement to W2K; please tell me the new version of Windows packs a better punch than an old dog with a new trick. The other half of the hype in that excerpt comes from the mention of their "research arm"; it makes the new feature look that much more incredible.
"The Single Instance Store...consists of two pieces. Searches for duplicate files...stores these signatures in a database. Merges duplicate files. The other piece implements the links..." There you have it, a database integrated with an old idea.
This article is so full of ridiculous content that it's, well.. ridiculous. I'd like to trash the whole thing, but I feel I've done my share for now. I'm not a Linux or OSS extremist, nor do I hate Microsoft, I just don't like them and am getting tired of this kind of BS.
-kidlinux.
Well, I just checked one of my boxes and I have 140 duplicate files.
I would save oohh 120Mb on a 100Gb file system. I REALLY need this software... NOT.
Why don't you check your systems? It's easy, find, checksum and sort. Then diff the duplicate checksums.
This software is just a silly toy for NT administrators to drool over.
Deleted
Do you just copy files from the client to server? Why?
That's dumb. That's what tape is for. Tape is cheap storage. Why would you use disk?
You need ADSM pal.
Deleted
You might want to try it.
Deleted
If that's all it is then God help them.
We're talking cron and a 4 line shell script to implement similar functionality on Unix.
Would I implement such functionality? Would I hell...
I don't believe for a second that significant amounts of disk space will be saved. What a waste of time.
Deleted
I just checked one of our file servers (unix) and on a 100Gb file system, ~ 60Gb used, I have 120Mb of duplicate files.
MS are inventing problems to solve.
Deleted
It'd only be useful if all your files are copies of one another and that in itself is just dumb.
It's a solution looking for a problem which doesn't exist.
Deleted
How can anyone want their OS to eliminate files it thinks are duplicates? Doesn't the fact that Unix doesn't do this tip MS off that this is not a good idea?
Why aren't they funny? In this case, the "release" gave me a framework of satire to express my opinion of the subject at hand. Rather than some stupid "first post" or "hot grits", I tried to be humorous in a constructive, contributory manner.
If you don't agree with my sense of humor, I don't mind; I may not agree with yours, either. However, saying something so general as "Fake news releases aren't funny" is similar to "Anonymous Cowards are uninteresting". It attacks the vehicle rather than the content.
I'm sorry if you didn't enjoy my parody, but I meant it in good faith and spirit.
Dewey, what part of this looks like authorities should be involved?
It's the same as hard links (you know, ln without the -s option). Simple, been around as long as soft links. Try creating hard links to files and then modifying or deleting them.
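Taking up the "try it" suggestion above, here is a small sketch of what a plain hard link does when one name is modified and then deleted. It assumes a POSIX-style filesystem that supports hard links; the function name hard_link_demo is just for illustration.

```python
import os
import tempfile


def hard_link_demo():
    """Return what the second name sees after the first is modified, then deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "file1")
        second = os.path.join(tmp, "file2")
        with open(first, "w") as fh:
            fh.write("original\n")
        os.link(first, second)            # hard link: a second name for the same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino

        with open(first, "a") as fh:      # modify in place through the first name
            fh.write("appended\n")
        with open(second) as fh:
            after_write = fh.read()

        os.remove(first)                  # deleting one name leaves the data reachable
        with open(second) as fh:
            after_delete = fh.read()
        return same_inode, after_write, after_delete
```

With an ordinary hard link the change made through one name shows up under the other, and nothing ever gets copied. That is the point where it parts ways with the quoted SIS description, which says a changed file "splits off" into a separate copy.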
Life is complete only for brief intervals in between toys or projects -- John Dalton
we wait for system to "merge" them and right after that we append 1 B to the end of one - system is for a long time (seconds? minutes?) busy with making copy of changed 200 MB file to make it possible to append that one byte.
this does not look like an increase in performance even when we do such a thing very seldom.
to avoid that, one has to be able to limit such automatic merging.
and without automatic merging this SIS is just `ln -s file1 file2` with an additional "daemon" which causes problems from time to time (can't say how often) - i.e. SIS is just a less reliable ln.
and this does not look like a great innovation.
hany
I doubt this will break backups because it does not go between file systems.
A MUCH better way to do this would be to cache identical *blocks* of the file system. This is identical for identical files, and should compress better when files contain identical pieces. This has to be built into the file system, so really they have kludged together a poorer system that does not need to modify the disk layout. Also, this has been done for years and is not an "innovation".
Symbolic links are probably the most serious missing piece of functionality on NT. They prevent the migration of files to/from NT/Unix because of the need to add "drive letters" to the starts of paths. In the software I am working on, the NT version has great amounts of code that basically replaces "/foo/*" with "z:/foo/*" (with a lookup table of values of "foo" and "z") before EVERY SINGLE FILE OPEN! Needless to say this is an incredible pain, especially when dealing with scripting languages.
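The lookup-table translation described above might look roughly like this sketch. The table contents and the names DRIVE_TABLE and to_nt_path are invented for illustration; the actual software is not shown in the comment.

```python
# Hypothetical mount table: Unix-style top-level directory -> NT drive letter.
DRIVE_TABLE = {"foo": "z:", "projects": "y:"}


def to_nt_path(path, table=DRIVE_TABLE):
    """Prefix a Unix-style absolute path with the drive mapped to its first component."""
    if not path.startswith("/"):
        return path                      # relative paths are left alone
    top = path[1:].split("/", 1)[0]      # first path component, e.g. "foo"
    drive = table.get(top)
    return drive + path if drive else path


print(to_nt_path("/foo/data/input.txt"))   # z:/foo/data/input.txt
print(to_nt_path("/unmapped/file"))        # /unmapped/file
```

Having to run every path through something like this before every open is the pain being described; with a symlink or mount point at /foo the paths could stay the same on both systems.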
Symbolic links were purposely removed, too, possibly to discourage Unix/Nt interoperability. MicroSoft will deny it, but versions 4 and above of MSDOS had an "assign" command that was the same as a symbolic link to a disk. If this command still existed our interoperability problems would be solved (and, if MicroSoft is listening, we would be using NT a lot more! Idiots...)
"Shortcuts" are not symbolic links. They are small files containing the actual name of the file and other information such as the hotkey. In fact you can open them in the text editor.
A couple of years ago I was doing some reading on the Linux kernel, and I ran across a research project which did something similar, in main memory. It was called something like the "memory merge kernel", but those words are so often used together that I can't find anything relevant on google (and Linux.com keeps coming up first for no reason - blecch).
The idea was the same - extend copy-on-write (which only works on memory copied from a fork()), by scanning for and merging duplicate pages, regardless of origin. This way if 100 users are running netscape from 100 different shells, the duplicate executable pages can be found and remapped to unique copy-on-write pages. It was supposedly very efficient, with a slight hit in overhead of around 1%.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
let's take advantage of the previous story and register .unixhaditfirst. : )
REDMOND, WA: Microsoft (MSFT) stock jumped 45% today after a press release stating that a new invention "which will revolutionize transportation" had been created in their highly secret labs. The invention, which can be mounted underneath large crates, allows the crates to move with very little friction to the ground. The invention, which quickly was nicknamed "wheels", is a result of considerable team effort, and comes only a week after Microsoft invented the not quite oval-shaped line, nicknamed "circle".
In the same press-release there is a mention of another invention called "sliced bread", but the details about this are not known at this stage.
(Uh... before I forget: Yes, it's quite possible that there actually _is_ something new about "Single Instance Store", but I couldn't help myself...)
Slashdot has it wrong today. Microsoft has had the equivalent to Unix symlinks for a long time--they're called "shortcuts". Like a Unix symlink, a Windows shortcut is a small file that does nothing but point to another path where the real file is.
Huh? Please do not confuse shortcuts with symlinks. Symlink is a file system-level object. If I create a file called foo and make a symlink to it named bar, then, as far as any application is concerned, foo and bar are the same thing. I can open bar in an editor, for example, modify and save it. In fact *any* application that needs to read from/write to a file will not see the difference between foo and bar
If I create a directory called foo and make a symlink to it named bar, foo and bar will be equivalent as far as any application is concerned. Once again symlinks are file system level objects.
"Shortcuts", on the other hand are GUI-level objects. If you create a file foo and make a "shortcut" bar to it, you cannot treat it the same way as real files. Sure, you can create "shortcut" to a file and then double-click on it to open it in an application. But that's about all you can do with it. It's a GUI-level object. Anything that does not deal directly with GUI will not understand shortcuts. That, for instance, includes daemons err... sorry "services".
As for automatic symlink creation (which is what it looks like Micro$oft has "innovated"), I find it of very low value. First of all, you do not want different files to be symlinked just because they have the same contents. In fact you often make copies of config files for backup and what not. But, once again, Microsoft decides to make decisions for you.
Secondly, the overhead of keeping hashes err... sorry "signatures" of files will likely outweigh the benefit of saving space. It will take up a lot of CPU cycles and memory. And HD space is much cheaper than memory and CPU cycles.
Windows 2000 is a piece of bloat as it is already, and it will become even more memory-hungry as a result of this "innovation".
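The foo/bar point made earlier in this comment can be tried directly. The sketch below assumes a system where an unprivileged process may create symlinks (any Unix); the function name symlink_demo is made up for the example.

```python
import os
import tempfile


def symlink_demo():
    """Write through a symlink and read the result back through the real file."""
    with tempfile.TemporaryDirectory() as tmp:
        foo = os.path.join(tmp, "foo")
        bar = os.path.join(tmp, "bar")
        with open(foo, "w") as fh:
            fh.write("hello\n")
        os.symlink(foo, bar)              # bar is a symlink pointing at foo

        with open(bar, "a") as fh:        # an ordinary open() follows the link
            fh.write("edited via bar\n")
        with open(foo) as fh:
            contents = fh.read()
        return os.path.islink(bar), os.readlink(bar) == foo, contents
```

The program doing the open() needs no link-aware code at all, which is the whole distinction being drawn: the filesystem resolves bar, not the application.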
___
___
If you think big enough, you'll never have to do it.
2M... do you know how many single disk distros there truly are? my favorite is tomsrtbt but there are a bunch of others.
Did you mount a military-grade, variable-focus MASER on an unlicensed artificial intelligence?
"Behold! I drag this big file from one directory to another on the WNT box... and it starts copying... I'd say that was about 5 seconds, yes boys and girls? Ok, now let's try the same thing on Windows 2000. Bear witness to the marvel of negligible file copy time!".
Gasps of astonishment fill the auditorium! Rustle, rustle, the sound of a thousand fools reaching for their wallets....
Once more unto the breach, dear friends, once more, Or close the wall up with our American dead!
okay - so the symbolic link is aimed at an NFS mounted file system. Or maybe it's an ftp'able file system - both have been done.
This is a non-invention. Heck, I remember using sym-links on BSD 4.2 in 1984.
Have you compiled your kernel today??
Sounds like their automation may create a new form of DLL Hell. One that would affect all kinds of files.
Too bad they didn't invent the ability to mount a filesystem in a directory path in Windows 2000/ME.
She believed, for instance, having learnt it at school, that the Party had invented airplanes. (In his own schooldays, Winston remembered, in the late Fifties, it was only the helicopter that the Party claimed to have invented; a dozen years later, when Julia was at school, it was already claiming the airplane; one generation more and it would be claiming the steam engine.) And when he told her that airplanes had been in existence before he was born, and long before the Revolution, the fact struck her as totally uninteresting. After all, what did it matter who had invented airplanes?
really? it seemed to me that the article was trying to make someone believe it would work across a network, even though it didn't. in other words, I thought MS was trying to make people believe that if they "read between the lines" but wasn't about to outright lie about a feature that they don't really have.
Somebody get our flag back!
.lnk files are worthless unless you're doing 1 of 2 things:
1) launching an executable
2) launching something that is a registered filetype (i e, C:\>foo.lnk where foo.lnk points to foo.txt)
Quick correction:
3) dragging files into a symlink folder
Hamish
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
"Do you want one more new feature," Lucovsky concurs, "or do you want to fix more bugs?
Well, if you work for Micros~1, the answer is clear.
My personal favorite from the worshipful OS/2-becomes-Needs-Towing was this gem:
"What I think is cool," Cutler interjects, "is that the system doesn't crash, and it doesn't lose my work, and it has functionality. I could care less that the visuals are flashy if my 32-gig hard drive goes away."
My employer-mandated NT4 workstation crashed 10 times in February. And I hate to tell these Olympian gods, but it does lose work. Mebbe I need a 32G HD like them to prevent crashes, huh? Sorry, boys, I ain't that suggestible.
Yes, it's true that this idea is a lot more than just symlinks.
But it's STILL not innovation. It's nothing more than a logical expansion on an existing idea. Symlinks are still the core idea at the center of all this, they just added some of their bloatware on top of them.
Who gives a fig about disk storage anyway? What I can't understand is that we still have DLL's: load load load load load for every request. Does anyone remember reentrant code a-la CICS? Load run run run run run. Maybe someday MS will "discover" reentrant code. Dare we hope that some MS genius will unearth the mysteries of running reentrant code as a nonstop task in the communications subsystem.
Of course it is what you or I call non-symbolic ("hard") links. Exactly the behaviour you describe.
Geoff
It sounds like this thing is more like an agent that actively seeks out duplicated files than an operating system mechanism. It stores a value that's relatively unique to each file. When it encounters two that are the same, it's found a duplicated binary, and it just uses one binary for two files. But that doesn't sound exactly 'innovative' enough to warrant a web page, so there must be something I'm missing.
If they're at all intelligent (and I don't think they'd let something this simple break that badly), they'd put some sort of flag on the duplicated files that tells them to give it its own binary when the original is changed.
Visi Calc was the first.
Shaun Nelson - Bastard Operator (From Hell / For Hire)
"Dear $BLIND_USER, I am Microsoft Text To Speech, and I'm your guide for tonight. To the left here we have a squarish grey object with the letters "start" on it. You can use your mouse to click it, and you'll get the start menu, from which you can launch applications. Now, if you just move your mouse around, I will tell you when you get closer."
It would be easy with a command line, but I'm curious how they intend to do that to something like the Windows GUI...
Wanting to check MS's assertion of 80-90% recoup of space, I wrote a Perl script to go through the files on my Win98 partition, generate an MD5 signature against each 1k block, keep track of duplicates, and spit 'em all out when it's done. I realize this is different from what MS said they did, but I also think it'd be neat to have a file system that did this automatically at the block level. If anything, since I used a finer granularity than MS did, I should have better "compression".
So, the results: on my Win98 partition, I'd save 73338 blocks out of a total of 742448, or about 9.88% of my used space.
Perhaps my file system is atypical (I don't use it very much, after all :). Perhaps I'd get better numbers if I ran it against an actual file server. But, based on this, I'm kinda underwhelmed. 10%, while good, is a long way from the 80% they claimed. (Though they did say "as much as", so I suppose I could cut them some slack.)
Script available at http://www.theclapp.org/hash_disk
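For anyone who wants to repeat the experiment without Perl, a rough Python sketch of the same idea follows. It is not the script at the URL above: `block_dedup_stats` and the 1024-byte block size are choices made for illustration, MD5 is used only because the post mentions it, and the final partial block of each file is hashed as-is.

```python
import hashlib
import os


def block_dedup_stats(root, block_size=1024):
    """Walk `root`, hash every block, and count how many blocks are repeats.

    Returns (total_blocks, duplicate_blocks). Hypothetical helper, not the
    original Perl script.
    """
    seen = set()
    total = 0
    duplicates = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(block_size)
                        if not block:
                            break
                        digest = hashlib.md5(block).digest()
                        total += 1
                        if digest in seen:
                            duplicates += 1
                        else:
                            seen.add(digest)
            except OSError:
                continue  # unreadable file: ignore it
    return total, duplicates
```

Dividing the second number by the first gives the share of blocks a block-level single-instance scheme could reclaim, ignoring whatever space its own bookkeeping would need.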
-- Sir Robin
My
"frees up as much as 80 to 90 percent of the space on a server, allowing users to store as much as five to 10 times the information as they could before"
First of all, how did they get from 80-90% to 5-10x? Secondly, who the hell has 80-90% of their server space taken up by duplicate files? That's one of the most bizarre claims I have ever read. Is anything approaching 80% of the data on your servers duplicated?
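On the first question, the arithmetic does work out if the claim is taken at face value: freeing a fraction f of the disk means the same data fits in 1 - f of the space, so the capacity multiplier is 1 / (1 - f). A quick check (the function name is made up for this example):

```python
def capacity_multiplier(fraction_freed):
    """How many times more data fits if `fraction_freed` of the space is reclaimed."""
    if not 0 <= fraction_freed < 1:
        raise ValueError("fraction_freed must be in [0, 1)")
    return 1.0 / (1.0 - fraction_freed)


for freed in (0.10, 0.80, 0.90):
    print(f"{freed:.0%} freed -> {capacity_multiplier(freed):.1f}x")
```

So 80 to 90 percent freed does correspond to roughly 5 to 10 times the capacity, while a 10 percent saving is only about 1.1 times. Whether any real server is 80 percent duplicates is the other question.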
"I would assume that, in this architecture, if you change one of the instances, it creates a new copy and does not modify the original file. This would allow you to have only one copy of a configuration file that needs to be copied to each user's directory for possible customization and only have the extra copies made if someone did, in fact, customize. Can anyone out there who really knows about Single Instance Store care to comment? "
I don't expect that they've got this working exactly like unix links, because that would really change the semantics in the file-system;
What would make more sense (and what it sounds more like, reading the article) is that links are used where content is identical, and links are replaced by `real data' when the content becomes non-identical.
Ideally, rather than duplicating an entire file when a few bytes change, only these new, changed bytes are represented on the disk as `real', and the other bytes, which are still identical to those in another file, remain links/pointers--stripping out redundant data (and you can do all of this at the level of how the file-system stores data, underneath the level of `how the file-system presents things to userspace').
This is what's known as file-system compression:)
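A toy model of that idea, assuming a store keyed by block hash where each file is just a list of block references. `BlockStore` and its methods are invented for illustration and say nothing about how SIS or any real file-system lays out data; there is also no garbage collection of blocks that nothing references any more.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}   # digest -> block bytes
        self.files = {}    # file name -> list of digests

    def _put_block(self, block):
        digest = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(digest, block)
        return digest

    def write(self, name, data):
        """Store `data` under `name`, sharing any block already present."""
        size = self.block_size
        self.files[name] = [
            self._put_block(data[i:i + size]) for i in range(0, len(data), size)
        ]

    def read(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def patch_block(self, name, index, block):
        """Change one block of one file; other files keep the old block."""
        self.files[name][index] = self._put_block(block)


store = BlockStore()
store.write("a.cfg", b"AAAABBBBCCCC")
store.write("b.cfg", b"AAAABBBBCCCC")   # no new blocks stored
store.patch_block("b.cfg", 1, b"XXXX")  # only one new block is added
print(len(store.blocks), store.read("a.cfg"), store.read("b.cfg"))
```

After the patch the two files differ, yet they still share two of their three blocks, which is the "only the changed bytes are real" behaviour described above.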
Compression has been used in MS OSs (sort of: how did DoubleSpace work?), and has also been used in many other OSs and file-systems.
Of course, it'd be really nice to have symlinks and hardlinks, too:)
I'd also like to see another type of link, which I haven't seen anywhere, yet: it acts like a hard link, but, when the `official' (initial?) copy/link goes away, all of these links go away (like `aliases' in Netscape's bookmark-files).
-rozzin.
If you make "backups" to the same filesystem that stores the originals, then you are asking for trouble anyway.
Ben "You have your mind on computers, it seems."
You see, last night, my dorm had a brownout. My computer, during it, instantly cut off. I was web surfing with Netscape, and compiling something or other, and had a suspended vmware in the background. On restart, all I found in /lost+found was some web pages from the cache. I did a make clean to be sure. I was recovered in about 15 minutes.
I don't know what universe you live in, but I don't even worry about losing data from a power outage.
P.S. Why the hell is my load average so high?
-David T. C.
If corporations are people, aren't stockholders guilty of slavery?
The whole idea here was supposed to be that we only need 1 copy of a system file in 1000 home directories, that sort of thing. If a user has a (read-only) symbolic link to some system file, they can't modify it unless they make a copy and replace that file. So the dumbed-down user interface to "install-whatever" attends to that. Big deal!
To summarize:
1) for large files, this is pointless because modifying one byte does a copy.
2) for small files, all this does is fix system-administration problems they've had all along, because they're trying to make a single-user platform and culture into a multi-user one to glom onto all that Enterprise systems cash.
Oh and BTW, say I have two files located at X, they're the same size and have the same contents. How do I know if one of them is a link? Even better, what if I want one of them to be? Or better still, what if I don't want one of them to be?
Here's another one. How the Hell do I do disk space administration in a situation like this? Buy another ($800/server, $20/user) friggin' package that correctly informs me how much disk space is being used? What? Now, all of a sudden, all my scripts that detect and calculate file sizes are for shit. A symlink reports its file size as 1K, because it is.
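For what it's worth, on a unix box with ordinary hard links both questions can be answered from `stat`: two names are the same file if they share a device and inode number, and real usage is found by counting each inode once. A sketch under those assumptions (the helper names are invented here, and nothing in it says how SIS links would look to such a script on NT):

```python
import os
import stat


def same_file(path_a, path_b):
    """True if both names refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under `root`.

    apparent_bytes adds up every name; unique_bytes counts each inode once.
    """
    seen = set()
    apparent = 0
    unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(info.st_mode):
                continue  # ignore symlinks, sockets, and so on
            apparent += info.st_size
            key = (info.st_dev, info.st_ino)
            if key not in seen:
                seen.add(key)
                unique += info.st_size
    return apparent, unique
```

The gap between the two totals is the space the links are saving, which is the number a quota or capacity script would have to start reporting.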
This bytes!
--------------Rev. C.C.Chips---------------- For the real truth, visit
"single instance storage" is not new; it's been used in mail systems such as Groupwise, WordPerfect Office, and Microsoft Exchange for years. I'd guess that the main reason it hasn't been applied to an OS file system so far is that filesystems are generally searched linearly rather than binarily, at least at a given directory level. Reiser has a new approach to the directory structure; it would only be a matter of time before people would realize that "copy-on-write" technology applies well in such a file system.
As a matter of fact, you could create a nice nightmare for yourself by trying to do single-instance storage at some lower level, like the cluster level. Wouldn't that be fun!
I think the author of the original post definitely had the right sense. This is an Interesting Idea, but it's not God talking. It has more the quality of "I have a shinier shoe polish."
--------------Rev. C.C.Chips---------------- For the real truth, visit
Besides that, I've heard people parade shortcuts as Microsoft's "symbolic links."
Until I can install an OS without running a GUI and get at all the links and all the long file names, the idea of Windows having "symbolic links" because of shortcuts bites my big one.
Case in point: I've installed several different "rescue" Linux disks, with no GUIs, and all of them had both symbolic and hard links. How could it be otherwise? If M$ had had a decent "single instance storage" system, I bet we'd have single-diskette Windows by now. Maybe not much of one, but we'd have it.
--------------Rev. C.C.Chips---------------- For the real truth, visit
Dude! SAM was awesome! I was just trying to remember the name of it...
10Brett-T
Oh, bother.
You could probably do the same in about 20 lines of perl.
MS Technology that puts a ' where it belongs. I'm sure it's not that hard, but I mean symlinks took 1.5 years. So that solution is still at least 3 years down the road.
ALL HAIL BRAK!!!
Why should it be a problem with touch ?
If you think of it as a symlink, you can touch/revision control the link without touching the original file.
I don't know how it works, I'm just saying that if it is well done, there is no problem for touch and revision control.
Excel, was, of course, a clone of Lotus 123, the PC killer app.
-Brent
So if I want to change *ALL* the files, do I change 1 file or 50? If it's automagickal, how does it know which file is the authority?
Seems broken by design. That's the innovation I expect from microsoft.
IMHO you've got it backwards. Your situation will work:
... once *ALL* the files are changed they will all (again) be merged into a single file to which everything else points.
At EOD (end of day) when you make your backups you will only have new names (over night the copies will be merged into one).
At SOD (start of day) when you start making changes your changes will be written to new physical copies (your previous EOD backups will not be modified).
The scenario which does not work is when I have to change a config file for *everybody* because the location of some program's data has changed. I can't just change one file, I have to change *ALL* the files.
If you don't know that you only had to change the file once [on say UNIX] you might not realize what's really going on and that you could have done things more efficiently. As it is for Winder weenies it's just transparent -- but given the cost of disk space it's a real waste of CPU, isn't it?
But have you?
Which then purges the duplicate files and creates new copies of them when they change, so that you only ever store data once?
I think this story was reported poorly, in that everyone here's just thinking that they did symbolic links all over again, when they actually did a bunch more than that. Unfortunately, not everyone has read the article to see what they actually accomplished.
It's the way the scheduling of that daemon is performed that is interesting.
MS has recognised that low priority I/O intensive tasks like disk scanners often impact system performance heavily, because the task scheduler sees the task as a low resource task, because it is consuming only small amounts of CPU, even though it may be generating a large load on the disk system.
It might be interesting, but it's not innovative either. I don't personally know about the schedulers in other *nix OSes, but Solaris already does this (and I would assume that at least the other commercial *nixes do too). It gives processes different priorities depending on CPU and IO needs. In fact, you can tune it to suit your needs... even on a live kernel (i.e. no reboot necessary).
- My favorite error message: xscreensaver, running on an old Sparc 5 w/ 8bit color: bsod: Couldn't allocate color Blue
It's used on MS's Remote Install Server to reduce duplicate files. Remote Install Server is approximately a backup/restore service.
Dantz Retrospect network backup has had this since the early '90s. Application files don't change, so back up the first one, and ignore the rest, or compute a diff.
Filewave does the same thing for install services.
Nothing new here, move on.
Nearly as bad an MS PR blunder as the one on ClearType.
First of all, while I think this is more original than just a symbolic link, it's hardly revolutionary.
Secondly, why is this necessary? Wouldn't it be better to just design the layout of the files in the operating system so that nothing need be duplicated?
I was highly amused reading about their other "innovations." The text-to-speech thing has been revived more times than Show Boat. SAM on the Apple and C64 did it. The Amiga, for which Microsoft wrote a version of BASIC, had it built in. Even speech recognition, which is technically much more difficult, has been produced in numerous forms.
Micro$lop has finally done some real innovation :-)
:-D
Never mind that Unix has had soft and hard links for what, 30 years? Well, Unix ain't an operating system, it's a hack for a few ultrageeks.
'nuf kidding, I think it's good that M$ is closing in on good software practice. They may have a bit left but at least they are getting there Real Soon Now (tm).
And, this is not the first thing that M$ has 'invented' or 'improved'. HTML, Java, distributed objects and GUI just to mention a few.
Groan. You've got a nicely ambiguous sentence. I assume you mean "[under unix] Handling of ".." is done by the shell."
No. .. is a directory entry pointing to the inode of the parent directory. Except on a root directory, it points to itself. (convenient, eh?)
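You can see this from any language that exposes `stat`. A small Python check, assuming a POSIX-style system (the function names are made up for the example):

```python
import os


def parent_inode_matches(directory):
    """True if `directory/..` is the same inode as the directory's parent."""
    real = os.path.realpath(directory)  # resolve symlinks first
    via_dotdot = os.stat(os.path.join(real, ".."))
    parent = os.stat(os.path.dirname(real))
    return os.path.samestat(via_dotdot, parent)


def root_dotdot_is_root():
    """At the root, `..` points back at the root itself."""
    return os.path.samestat(os.stat("/"), os.stat("/.."))
```

No shell is involved in either call: the kernel resolves `..` like any other directory entry while walking the path.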
[lurk]
Friends help you move. Real friends help you move bodies.
However, most systems have a decent number of i/o related tasks. Every time a block is written/changed, blocks are compared and symlinks are made? This has got to be nothing but bloat.
It looks like this is only at the os level. i.e. you can't manually make symlinks
I can understand 10% in saved space, but more than that? The space required for the implementation of this scheme itself has to be substantial.
I must admit that is some cool research material, tho.
pilot
The article says the server actively seeks out identical files and replaces them with symlinks.
Anybody who's ever lost an association (program icon, whatever you call it) in windows just because they moved the original file, will know how much trust I put in this "new technology".
Still, even if the server does automatic de-duping of files, and even if they do manage to solve the can of worms they open, and even if it does not negatively affect performance, it's still too close to symlinks to be really considered "innovative".
Symbolic links!
Even if they are automated!
<calming down> Automated symbolic links would be a bad thing IMHO. Just because there are ten identical files on the server does not mean that we necessarily want them to be identical.
OTOH if it automatically unlinked when you changed one of the files, so that all of the other files were not changed (think any sort of RCS) then this could be a really interesting idea.
Lots of overhead, but all of the sysadmins out there know plenty about users' tendency to keep their own copies (on the server) of standard documents that are available to everyone else (on the same server).
Now that the numbing (from banging thy head against said wall) has softened, this actually doesn't sound like such a bad thing (The idea of auto-linking and unlinking.) Obviously Microsoft claiming PR for "ln -s" is a very bad thing.
chris
-- I need more coffee. It's Monday. There is no such thing as enough coffee on a Monday.
....
Text-Speech - That was built into Mac OS years ago. Another stolen so-called innovation!?
Smart TroubleShooter - Well. How smart could it be?? As far as I can see, after 5 years of struggling to implement "plug and play" in Windows, I still always see "plug and plug and plug and continue to plug". I really don't see any smart things coming out of Microsoft any time sooner than NEVER. I am not saying UNIX is very smart; UNIX is dumb, very dumb, but I think it's the system admin's job to be "smart".
BTW: Microsoft did not invent Office
I've taken Software Engineering .. But I don't Understand .. Probably need to re-take ... ~_~
I am not sure if the symbolic link thing is really that useful, as hard drives nowadays are so _BIG_ I don't really care about space anymore. Maybe it'll help administration of servers, but I don't see how. Anyway, it is certainly different from the traditional type.
I particularly liked the part where it said that the Microsoft Research Dept. contributed "more than 15 innovations to Windows 2000."
Does this mean that anything else that changed was contributed by the Microsoft Copy-other-Ideas Dept.?
<nod> The article described something that went beyond just symbolic links. It actively looks for duplication in files, etc. Sounds like they're trying to fight WinBloat by looking for redundancy after the fact instead of just not introducing it into the code in the first place.
Boy, those innovations sure take a long time.
I hope they patent this exciting new discovery. Maybe we'll get a Linux version
~~~~~~~~~
auntfloyd
I seem to recall reading a paper on doing something similar in a Unix system (to remove all the dot-files that were copied to user directories and never changed), but I can't find a reference at the moment. I also believe there's been work on copy-on-write file systems in Unix as well.
I agree, this sounds rather like "hard links".
But this story comes one month too early. It should be definitely scheduled for April 1st!
I've quickly skimmed over the article, and as far as I can see they're talking about something more like hard links...
Hopefully M$ have implemented this as a Body/Handler pattern for the file system, i.e., if the user edits one of the linked files, that the now different file is unlinked and becomes its own separate file. This would allow you to have backups that didn't change with the original sources.
However, there is a performance hit for this if the linked objects change a lot. Java Strings are implemented as an instance of the Body/Handler, and string manipulation is much slower using Strings than StringBuffers/char arrays.
I'm really going to go out on a limb, but here it goes.
I predict that by the year 2005 Microsoft will discover Fire and invent the Wheel.
I've telephoned the UK HQ and I have an official response coming soon. I think I did a good job at beating down any arguments that they gave to me.
Jonathan.
http://www.jonmasters.org/
It doesn't actually say anything about copy-on-write technology, you know! But even if it did, copy-on-write isn't new and has been around on UNIX for donkey's years. I'm off to phone M$ HQ in Reading (UK) and get a response on this.
Jonathan
http://www.jonmasters.org/
- if I modify a SIS stored file does that modify the copy other people see?
- are there really all that many in the way of duplicate files on most servers? If so, then there wasn't all that much 'information' being stored on the server.
This reminds me of Ted Nelson's Xanadu project, which stored files as lists of references to other files. It would seem to me that storing references to common sub-ranges of files would be a real win, while this just sounds like a cheap hack that may look good on a feature list.
How many of you people actually read the article?
This isn't ln -s, it isn't really hard links either.
This is part of the whole initiative to get rid of shared libraries that Microsoft is taking on (remember the ability to clobber one program by removing a part of another?)
The trouble with locating all the libraries in different places is that you eat an enormous amount of disk space.
Yeah, you could do something similar with ln -s but it becomes a very manual process: Am I the first copy of the code? Am I a bit-for-bit duplicate of existing code? While these things could be done with a few lines of code under _ANY_ OS (not just linux) incorporating into the OS creates a _standard_ way for it to take place, even for code that has already been written.
Yeah, sure Micros~1 sux0rs, and the marketing machine is in full tilt, but why does your undying hatred of this company cloud your reasoning so much? Maybe I'm just wrong in assuming that since this is slashdot, the majority of readers are intelligent?
whatever.
From the article:
The Single Instance Store, which is used with Windows 2000's Remote Install Server, consists of two pieces, according to Bolosky. The first piece searches for duplicate files, computes a signature for each file and stores these signatures in a database. It then compares the signatures in the database and merges duplicate files. The other piece implements the links, recognizes when someone tries to open a link and directs the link to the common store, where all the duplicate content is stored.
There's nothing new... there is no copy-on-write... Write a perl script that makes a database of file signatures and you've got the first piece. That script then uses the second piece, called "ln" on unix, and you've got the same solution. Put the perl script into your crontab and now it is transparent.
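Roughly what that script would look like, in Python rather than perl, using hard links (`os.link`, the same thing plain `ln` does). This is a sketch of the suggestion in this post and not of how SIS works: `merge_duplicates` is a made-up name, it trusts a SHA-256 match without a byte-for-byte compare, it only works within one file system, and because hard links share writes it has exactly the no-copy-on-write behaviour described above.

```python
import hashlib
import os


def file_signature(path, chunk_size=65536):
    """Return a hex digest of the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def merge_duplicates(root):
    """Replace duplicate regular files under `root` with hard links.

    Returns the number of files that were relinked.
    """
    first_seen = {}  # signature -> path of the copy we keep
    merged = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            signature = file_signature(path)
            keeper = first_seen.setdefault(signature, path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already linked
            temp = path + ".sis-tmp"
            os.link(keeper, temp)    # make the new link first...
            os.replace(temp, path)   # ...then swap it in under the old name
            merged += 1
    return merged
```

Run from cron against a shared tree, this gives the "first piece" plus `ln`; what it cannot give is the part where a write to one name leaves the others alone.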
Innovative? My ass^H^H^H^H^H^H I don't think so
Oh, and I forgot...
Everybody (even Natalie Portman) knows that symlinks were invented by Al Gore
ln -s lnx ms
or
ln -s www.linux.com www.microsoft.com
-- ladies and gentlemen we are floating in space!
and when you look at it you get the very familiar Open With ... dialog box.
You'd think that they could have made a file with no extension default to notepad.
.
The AFS/DFS snapshots (which soft-updates might
include in a *BSD near you any year now =) are
only taken so often, so you can't unpack an archive, delete it, accidentally rm -rf the new
directory and get it back, unless someone did a
snapshot somewhere in between.
Snapshots _are_ cool, but they can't solve all
problems.
If you snapshot every 4 hours, someone will ask
for a 2 hour old file, you can count on that. =)
-- I'm as unique as everyone else.
shrimp writes, "Slashdot really does have journalistic integrity! See for yourself! " I can't decide if this is supposed to be a story or not. I mean, it's inaccurate, but I just can't tell. Perhaps I need to read the link before I try to post stories.
--
Here is the result of your Slashdot Purity Test.
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)
Certainly there's more to this than simply symbolic links, but a couple of things are puzzling. Firstly there's the claim it can free as much as '80 to 90 percent' of the space on the server - This seems insane unless people are copying the same file all over the place!
Secondly, there are some interesting security implications of a daemon that wanders around identifying matching files. I'd hope that file permissions were stored with the 'link', so that ownership and access doesn't get lost when files are coalesced. But on top of that, if the O/S allows you to know a file has been merged with other instances it could be used to leak information about files you can't see (If I know that my zip file of some prohibited software has been coalesced with something else, I know it exists somewhere on the server, even if I can't see it, for example).
Just my $0.02.
This triggered a thought. This is mostly supposed to save space, I imagine, in user directories that are on a file server. It would probably work great for stuff that people like to download and email around a lot (like that mpeg that somehow gets into everyone's mailbox in a company, taking up an extra 30 MB per user almost overnight).
If it were applied to documents, especially those created by our beloved MS Office applications, then it would be absolutely useless. It would also be useless for databases and other binary formats. Many of us are used to the *nix world where most things are text files, so the idea of consolidating files into one location and distributing patch formats makes sense to us. In the proprietary binary world (MS) this does not apply and would not work.
How many people here have seen a Word document grow by 30 KB and change binary format significantly because of one corrected misspelling? It happens all the time. The savings from a consolidated file system would immediately disappear in this case, and there is no way to effectively or efficiently make a patch format for a change like that, not in a proprietary binary world.
I caught the automatic portion of this as well, however I can't help thinking there will be something of a performance hit to achieve this - unless it's only run as a scheduled task.
;-)
Mind you I also liked the way they're trumpeting IPv6 - I mean it's not like any other OSs have it already, is it???
You're right, what this article talks about is not _just_ a symlink. What it really is is a system to make sure that all duplicate files are actually symlinks to a single copy of the file, all done automagically. I couldn't really tell from the article, but I think this process may be pretty opaque from the user's point of view. I don't know about you, but I can think of a whole host of ways that this can go wrong. Especially over a network. When I'm working with files on a network, I always make copies of important files to keep locally for times when the network is down. I don't think I would like it very much if I discovered that my local copies of files weren't copies after all.
Anyway, the usefulness of this is beside the point. The point of this discussion is that the MS article/abstract/whatever heavily implied, or even outright stated that all of the concepts involved were developed by MS employees "from the ground up". Most of this article is just so much hyperbole, marketspeak and self-congratulation.
Well I have written a throw away python script/shell script to find all the duplicate files in the system by computing md5sums and storing them in a hashtable.
Actually, if you've used one, they actually are really nice. Much better than the silly Sun and SGI optical mice with the metal mousepads that I've used. I can use it anywhere I can use a regular mouse, and a lot of places I can't use a regular mouse.
Besides, it has a standard interface (USB or PS2) and replacing it is easy... I had an SGI mouse die on me, and we had to buy a new one (much more expensive) and have a service tech deliver it to us.
It really annoys me, the religious fervor that a lot of people direct against Microsoft. Not that I'm defending them, but sometimes they make good stuff.
.. to appear as if they are an innovative company, for the antitrust trial and in the eye of the public. How many times did they use the word "innovate" in this article? It's nothing but a fluffed up marketing FUD piece packed with lies. It's just self-praising repetition of "This innovation that we've personally innovated here at the innovative Microsoft Research department shows once again how innovative our company's many innovations continue to be (innovate innovate innovate). Innovate innovate .. blah blah .. innovate .."
And this paragraph: The result is a feature that frees up as much as 80 to 90 percent of the space on a server, allowing users to store as much as five to 10 times the information as they could before. "The bottom line is that it saves the administrator time, which is why it's part of Zero Administration for Windows," Bolosky said. "It's designed to ease the lives of the technical support staff."
Is that supposed to be some sort of spontaneous testimonial? It sounds more phoney than the phoney testimonials in mail order catalogs. And I would bet my car on testing their implication that they are going to save 80 to 90% of disk space like that (and without any OS overhead at all, ha!). Even if this thing is a much fancier version of the ancient symlink (eg automatic detection of clusters of data that are the same) you still end up with nothing more than an overblown Stacker, with maybe 20 to 40 % compression if you're really lucky.
In spite of the fact that they spend the whopping sum of 5 Billion US$ a year on research, the best they've managed to do is repackage various ideas as their own, tout that they've "assisted with IPv6", and of course they came up with a dancing paperclip. You have to be pretty damn un-innovative to manage to throw away $5000000000 like that.
In Win2K an application has a choice of using the system DLLs, which are protected and can't be written over except by a service pack, or its own private version of a DLL. So if your app requires a specific version of msvcrt.dll, you can install it in the application directory and it will use that copy instead of the system copy.
Windows has always done this, or at least since Win95, which is as far back as I can remember. As developers we've encountered the dreaded MFC42.DLL (and family) incompatibility problems, and in some cases we "solved" this by dumping the DLLs in the app directory.
I've noticed in the past MS has announced and advertised a number of things as being available in their latest OS offerings, that were never heard of again, or seen. Whenever any new MS OS hits the markets, rumours and myths tend to run wild about the wonderful amazing things this OS is doing (things like "Windows 98 moves your most commonly used program files to the sectors closest to the center of your hard disk for quicker loading" and "Windows 98 has better crash recovery" (haha) and a variety of other zany claims.)
Anyway, despite the commotion around SIS on
This looks like it's done at installation, where you install a package and all it does is fill a tree full of \\host\share\whatever\path links apart from the read-write files.
If you had a pervasive NFS mount you could do the same in UNIX, but NT's ability to make arbitrary mounts just by specifying \\ at the beginning of the path is adding the value here.
Let's hope the server's reliable then, and that the system administrator doesn't move things around.
-- Don't believe everything you read, hear or think
The world really is backwards.
That blue screen thing... all Microsoft baby.
The very first implementation of Text-To-Speech I ever saw was for the Commodore 64, an Assembly library with BASIC bindings called SAM (I don't remember the company). I think it was around 1983 or so. Anyone else remember this cool toy? Their selling point was that it was the only one that was software-only at the time (thanks to the excellent sound synthesizer inside C64 machines).
:-)
:-)
:-)
I loved it and loved writing code for it. I can still hear the demo saying:
---------- Imagine a scratchy but very understandable voice here:
"Hello, my name is SAM, the software mouth for the Commodore 64 computer. I am the most versatile, understandable speech synthesizer on the market. And I am the lowest price of them all.
So what can you do with me? Why you can put me into your own programs. How would you like your business software to say "Please enter this week's purchases". Or imagine an adventure game that has this:
The elf was captured by the giant. He began to cry and he said: 'oh, no! Please don't hurt me Mr Giant!' but the giant was very mean, and he only said 'Ho-Ho-Ho!'."
-------- Imagination off
There was also a demo of SAM singing the Star Spangled Banner and reciting the Gettysburg Address.
There were two ways of getting it to talk: Natural and Phonetic. In natural mode, it would just interpret your typing, but it often made pronunciation errors. The best way was to use phonetic style, which involved a lot of "H"s to give the correct inflections.
A BASIC program that "talked" looked a little like:
10 PI 64:SP 128
20 SAY "H4EHLLO, H4AW R U?"
30 SAY "I H4OP UR F4Y4LLING f4AYHNE."
This would have said "Hello, how are you?" (with proper question mark inflection) and "I hope you're feeling fine". If you took out the punctuation mark it would stop "feeling" natural, because the slightly lowered pitch when you finish a sentence was not accounted for (it was my most common error).
It took a little experimentation, but once you got the hang of it, all you needed to do was to maintain two strings for each message, one the text, another the speech. Not bad.
If you're into emulation it might even still be floating around, these days the SID (The C64 sound chip) emulation is not that bad anymore; it would probably sound pretty close.
So, any more Microsoft "Innovations"?
- No Sig Today
Kind of like rm -rf, no? or a mistyped dd command?
God! I so much wanted to moderate in this story, but fuck. Why do you guys think that everyone at Microsoft is automatically stupider than you? You may not like their business practices, you may not like their marketing, you may hate their OSes, but their "What If..." department is as good as pretty much any other research center out there. They undoubtedly thought of every issue in this entire Slashdot story in about the first ten minutes. This is actually a pretty damned cool idea. Once some of you next-gen unix admins start operating in a lab/server environment you'll see how this can be a really fucking quality concept.
Then they've undoubtedly got a marketing department with the technical savvy of 99% of the Linux developers out there. Beans.
Disks don't usually fail in single sectors these days. Plus you don't use this on your workstation, you use it on your server(s).
Unless of course that engineer was a researcher. Then he might go spend some time working on something like this. And who knows, maybe he'd be able to come up with a workable solution. Wonder what these guys who developed this do for a living?
The parent article is wrong. With NT you certainly have the ability to install software on servers and have it properly shared out to users. Much the same way that on unix some luser can install his own private copy of e in his directory.
This is a non-issue. It's been solved for a long time in Comp. Sci.
Instead of duplicating a particular bit of memory, the copy is only made when a process wants to change the memory (hence, copy-on-write).
Here, you're doing the same with files. It checks the disk for identical files, then replaces duplicates with links to a canonical version. Presumably, if you write to one of the links, it is turned into a real file while the canonical version is unchanged. Again, copy-on-write.
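To make the "presumably" concrete, here is one way a user-space tool could get that behaviour on top of ordinary hard links: before writing, check the link count and give the name its own private copy if the file is shared. This is an illustration only; `cow_patch` is a made-up helper, and a real file system would do this inside the driver, not in application code.

```python
import os
import shutil


def cow_patch(path, offset, data):
    """Write `data` at `offset` in `path`, leaving other links unchanged."""
    if os.stat(path).st_nlink > 1:
        # Shared with other names: copy first, then swap the copy in.
        private = path + ".cow-tmp"
        shutil.copy2(path, private)   # real copy with its own inode
        os.replace(private, path)     # this name now points at the copy
    with open(path, "r+b") as handle:
        handle.seek(offset)
        handle.write(data)
```

The cost shows up exactly where other posters expect it: changing one byte of a large shared file pays for a full copy first.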
An interesting idea, not *that* innovative, but I'm amazed it took so much time and manpower to implement.
The ambitions are: wake up, breathe, keep breathing.
If you actually read the article (wouldn't that be novel), you would see that these are (basically) automatically created hard links. The OS searches the drive for duplicate files, and then sets up the link by itself. This functionality doesn't exist in Unix by default, but a script could probably be set up to do it in a matter of minutes.
Microsoft's shortcuts aren't exactly symbolic links. Most programs do not see them as the file they point to.
So much for "Plain Old Text". It thought the angle brackets around "linux/errno.h" were some weird sort of HTML tag, and stripped the whole construct out.
Slashdot - News for Herds. Stuff that Splatters.
Actually, it sounds (superficially) like rsync more than it does like symlinks alone.
[while I'm here and mentioning it, check out rproxy, which is truly qoole. You know how to use google.]
I think I know where the developers are coming from. In fact, I think it demonstrates one of the finest qualities of the "open-source" community and method - less reinventing of the metaphorical wheel. And no, GUI Linux installers don't count.
-- Post No Gravy
Interesting that this "invention" press release was issued right after the heavy campaigning of Al Gore, "Inventor of the Internet" in WA.
Gosh. I never knew there was an election on in Western Australia. Thanks for keeping me informed!
ObClue: No, I don't give a flying fucking fruitbat for discussions of US politics in ./. Aggggh, &c.
-- Post No Gravy
It is actually a reasonably good idea although I can't see how it could have taken so long to implement it. Now it is quite possible that this has been done before, but it certainly isn't just symbolic links!
Have a look here:
http://rsync.samba.org/
You may find it ... informative :P
-- Post No Gravy
So what happens if your computer is using a symlink to a system dll and the network fails? Or an application file? Does the client know to keep a backup locally? And once the network revives can the client manage to upload the replicated file? I don't know, this sounds pretty dangerous to me. If 2000 is as stable as they like to say it is, maybe I'd welcome an 80 to 90 % reduction of files on my client. I just don't have the faith in the networks at most companies. I'd be willing to have a local master copy if I owned the original, but wouldn't that spoil the symlink idea? Sounds scary to me. Trust the IT dept? No way.
And just how do you go about giving a PHP script root access?
compile php as a cgi and turn on suexec in apache. or do something really dangerous like running apache as root. either should work.
john
-- john
Wouldn't setting up a database of the entire filesystem use up more space than you're saving?
Just a thought...
--- If OS were buildings, then the first woodpecker to come around would erase 95 % of civilization.
About the innovation of this text-to-speech-feature:
I remember entering this command on the CLI of my Commodore Amiga 500 back in 1987.
df0:> type blahfasel.txt >speak:
And 'lo and behold, the Amiga was speaking.
--- If OS were buildings, then the first woodpecker to come around would erase 95 % of civilization.
>
:)
Text to speech and voice recognition were originally started by AT&T, and I wonder what OS they ran. Can you say Unix?
I have a friend who was part of that team from the late 70's to the late 80's. I get the sinking suspicion someone in MS switched innovation with porting. If you switch those words in their press release it makes sense.
Just another Techno-geek lost in cyberspace.
Could someone please _Update_ the posted article and correct it to Hardlink? Seems like half of the comments still think it has something to do with symlinks, which it hasn't.
Clearly the semantics of the Windows file system dictates a certain type of behavior. There's absolutely no ambiguity here at all: the OS would do copy-on-write. All of this talk about symbolic links is confusing you, this action is intended to be transparent to the user.
But certainly no leap of logic. And certainly not worth 1 and a half years of development. No matter what, this technology is probably not really that good of an idea. Every file creation or copy will need to do this lookup, adding significant overhead to each of these operations. Then, when a slight modification is made to one of the links, the system would have to copy the data over again so it won't modify all links. They may have spent some time streamlining the process, but there is an unavoidable and noticeable overhead. For businesses (which Windows 2000 is targeted at) hard drive space is typically considered "cheap", but the added overhead of the process, which consumes memory and CPU, will not be appreciated. The processing time and RAM consumed by this "innovation" will likely outweigh, in all respects, the hard drive space saved.
XML is like violence. If it doesn't solve the problem, use more.
Windows 2000 maintains a separate group of dlls for EACH program now? If .dlls are supposed to be shared libraries, it seems kinda silly.. might as well static link everything :)
XML is like violence. If it doesn't solve the problem, use more.
> It's more than suggested.
... right? And it's simple enough so that every program that accesses it does exactly what it's supposed to do and never messes anything up. And it's impervious to virus and trojan attacks, and always survives crashes completely intact!
Yep. It's copy-on-write. Been done: standard stuff for storing strings.
Sounds like an idea as bad as or worse than the Windows registry. Is it going to be like the registry in that it ignores creation and modification dates? If so, how does it interact with "touch"; or worse, what does it do to version control systems?
The registry is nice and stable, of course, and never collects useless garbage; but if it does, it's a simple matter to put everything right
Awesome file system havoc on the horizon.
Why would you trust an MS programmer? Just because they work for MS? Do they really have that stellar of a track record of writing solid bug-free code?
War is necrophilia.
I call them much worse things myself. The stupid morons actually work for them.
War is necrophilia.
OK. Lemme get this straight. The SIS feature is going to have to perform file monitoring to see if a file matches any existing files and then set up the symbolic links. It's still going to have to do a comparison search and relink every time a file anywhere is created/changed/deleted. If that isn't a resource-suck, I haven't seen one.
Oh, wait! I have! Can anyone say Fast Find? Most of us know what a performance boon to our machine *that* is. First thing I do when I have a fresh Office install is disable that CPU/RAM/disk-hogging piece of #*@*$.
If they've figured out a way of doing this without significant system overhead, I'd be impressed, but this just sounds like yet more bloatware coming out of Redmond...
What about the Amiga? How about:
type mytest.txt > SPEAKER:
It's a simple matter of complex programming.
I don't think it would have to be that inefficient. It could be implemented efficiently by using keys.
1) All they would have to do is set it up so when joeblow saves a file, Windows RC5's it (or uses some other inexpensive one-way cipher). Then it looks for something with the same key value. If it finds something with the same value, it saves it as a symlink. If not, it saves it as a "real" file, and puts the key into the db that can be searched whenever another file is saved.
2) When joeblow opens a file, the OS just has to intervene and perform some type of test to see if the file is actually a symlink. If it is, find and return the real file. If not, just return the real file.
I don't think it would be that hard-or even that expensive.
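The save-time lookup described above could be sketched roughly like this in Python. Some assumptions to flag: it uses SHA-256 as the key (a hash rather than a cipher like RC5 is what this job calls for), it uses a plain dict as the "db", it makes a hard link instead of a symlink, and the `save` function and file names are invented for the example. A plain hard link has no copy-on-write, so this only covers the "detect and link" half.

```python
import hashlib
import os
import tempfile

def save(index, path, data):
    """Write data to a new path, or hard-link path to an identical existing file.

    index maps SHA-256 digest -> path of the first file saved with that content.
    Returns True if a link was made instead of a fresh copy.
    Assumes path does not exist yet (os.link refuses to overwrite).
    """
    digest = hashlib.sha256(data).hexdigest()
    existing = index.get(digest)
    if existing is not None and os.path.exists(existing):
        with open(existing, "rb") as f:
            if f.read() == data:        # confirm; never trust the digest alone
                os.link(existing, path)
                return True
    with open(path, "wb") as f:
        f.write(data)
    index[digest] = path
    return False

workdir = tempfile.mkdtemp()
index = {}
report = os.path.join(workdir, "report.doc")
copy = os.path.join(workdir, "copy of report.doc")
notes = os.path.join(workdir, "notes.txt")

first = save(index, report, b"quarterly numbers")    # False: stored for real
second = save(index, copy, b"quarterly numbers")     # True: linked to report.doc
third = save(index, notes, b"something else")        # False: new content
```

The cost per save is one digest computation plus one dictionary probe, which supports the point that the lookup itself need not be expensive.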
Did I read that article correctly in that MS uses a 'social' development process?
"My philosophy is that technology transfer is fundamentally a social process," said Schofield. "So the people I hire spend a good amount of
time within the product groups, wandering the hallways..."
No wonder it takes years to develop product...
B
"Look, Smithers! I'm Davy Crockett!"
I just read through the release myself, and it appears that MS is using this only for SERVERS. While I'm not too much of a MS fan, I do know that routinely our servers are filled with multiple backups of workstations or data simply because someone's in a rush to get something done and they don't check to see if the stuff is already ON THE SERVER! (ugh)
Depending on the data stored on your servers, this actually could free up 80-90% (our service department uses their server solely for temporary customer backups, which are 95% identical)
Now the biggest question to be asked is:
Will this work reliably, or is it just another bug-ridden "feature" ?
you would think, as a normal linux user... in windoze, half of the \winnt dir is duplicate CRAP. then half of \Program Files is duplicate crap too.
I can see the figure realistically being something like 60-70%, actually...
remember, windows IS a complete moron.
From a motherboard manual, error beep codes: S-L-L-L-SS: Speaker Error
The wonder of this article is that it exposes the lack of verification by Slashdot editors and the frequency with which commenters do not even read the article in question. One wonders if a submission of the infamous goatske.** would be accepted and discussed by the Slashdot readership using the comment included with the submission as a basis. This reminds me of when an AC posted, regarding a development kernel release: "Linus' comments:" and then linked to goatske.**
Chris Hagar
"The price of freedom is eternal vigilance." - Thomas Jefferson
Maybe have a disclaimer?--NOT! In this diverse community we must accept a limited number of these H.I. types. I think most /.ers will get a good chuckle, I did.
1000 SlashDot sigs
Actually, if they are running Exchange Server, the email and its attachments are only stored once. So, if you do send the 4.7 meg "Chimp picks nose while skydiving naked.mpg" to everyone in the company, the contents are stored once and referenced from each person's mailbox.
mike (who visited Microsoft a couple of times during the Exchange beta and saw the "deer caught in the headlights incredulous look" when I asked them about single user mailbox restorations. "We decided that wasn't needed for the first release" was the answer. "Ok, so I have to restore the whole message store, ie: SYSTEM, to retrieve a single important email?" "Er...yea...But you'll never need to do that!" me: "Er..yea..right..uh-huh")
What's my Karma Mr. Burns? "Excellent"
It's another example of Windows taking too much control, when letting the user decide would be more appropriate. Plus, computing a "signature" to determine which files are identical sounds like the OS can't even be sure that two files are identical before destroying one of them in favor of a link! Let the user or program that creates the files decide!
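On the worry about signatures: a signature only needs to be a filter, with a byte-for-byte comparison making the final call. A small Python sketch of that two-step check follows. The checksum here is deliberately weak so a collision is easy to show; `weak_signature` and `safe_to_merge` are names invented for the example, and nothing here claims to be what Microsoft does.

```python
import filecmp
import os
import tempfile

def weak_signature(path):
    """Deliberately poor signature: sum of all bytes, modulo 65536."""
    with open(path, "rb") as f:
        return sum(f.read()) % 65536

def safe_to_merge(a, b):
    """Signatures must match AND the contents must compare equal byte for byte."""
    if weak_signature(a) != weak_signature(b):
        return False
    return filecmp.cmp(a, b, shallow=False)   # shallow=False forces a content compare

workdir = tempfile.mkdtemp()
paths = {}
for name, data in [("one", b"ab"), ("two", b"ba"), ("three", b"ab")]:
    paths[name] = os.path.join(workdir, name)
    with open(paths[name], "wb") as f:
        f.write(data)

collide = weak_signature(paths["one"]) == weak_signature(paths["two"])  # True
merge_different = safe_to_merge(paths["one"], paths["two"])    # False
merge_identical = safe_to_merge(paths["one"], paths["three"])  # True
```

Even with a signature this bad, no two different files get merged, because the signature is never the last word.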
I don't know if this is a joke or not. It seems much like the kind of thing MS would really try to pull. It's really rather astonishing when you consider the sheer number of linux boxes physically AT microsoft. I mean, a lot of their employees profess to screw around with linux.. no idea what they actually do with it, but how could they use anything UNIX-like for more than a day or two and not be familiar with links (soft and hard)? Maybe their research department aren't the people experimenting with linux, eh? Maybe they should be, heh. I'm just waiting for the full fledged press releases claiming MS the brilliant innovator once again pushes the envelope by inventing symlinks and will be patenting and licensing the technology to those interested.
As far as I can remember, there were disk compression utilities around long before MS 'innovated' them. Anyone remember Stacker?
Typically, I run
find . -name '*.eps' -print | linksame
to be sure the links it wants to make are sane, then
find . -name '*.eps' -print | linksame | sh
to actually do the linking.
Anyway, for anyone who'd like to have this `innovation' on their machine, here's the script:
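#!/usr/local/bin/perl -w
# linksame: read file names on stdin and print shell commands that
# hard-link identical files together. Check the output, then pipe it to sh.
use strict;

# Backslash-quote shell metacharacters so a name survives the shell intact.
sub shquote {
    my ($name) = @_;
    $name =~ s/([\s\\<>&;\{\}\[\]\(\)])/\\$1/g;
    return $name;
}

# Group regular, non-empty files by size; only same-size files can match.
my %sizes;
while (<>) {
    chomp;
    my $size = (lstat $_)[7];
    ++$sizes{$size}{$_} if (-f _ and $size > 0);
}

while (my ($size, $filehash) = each %sizes) {
    my @files = keys %$filehash;
    next if @files < 2;

    # Within one size class, group files by MD5 checksum.
    my %sums;
    foreach my $file (@files) {
        my $quoted = shquote($file);
        (my $sum, undef) = split ' ', `md5sum $quoted`, 2;
        unless (defined $sum and length $sum == 32) {
            warn "Couldn't sum $file\n";
            next;
        }
        push(@{$sums{$sum}}, $file);
    }

    while (my ($hash, $same) = each %sums) {
        next if @$same < 2;
        my $first = shift @$same;
        print "chmod ugo-w ", shquote($first), "\n";
        foreach my $other (@$same) {
            # A checksum match is not proof; compare the bytes too.
            # -s keeps cmp quiet so nothing stray lands in the command stream.
            if (system("cmp", "-s", $first, $other) != 0) {
                warn "files $first and $other look the same but aren't\n";
                next;
            }
            # -f replaces the existing duplicate with the link.
            print "ln -f ", shquote($first), " ", shquote($other), "\n";
        }
        print "\n";
    }
}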
That's been around a long time before Create did anything with it.
This post encoded with ROT26. If you can read it, you've violated the DMCA. Handcuffs please, sergeant.
Zero Administration for Windows
Stop! My sides hurt!
The Divine Creatrix in a Mortal Shell that stays Crunchy in Milk
The House Between - Original Sci-Fi Series
Of course it is not about just symbolic links. Despite the benefits it may bring in automatic space savings, which will certainly fall MUCH behind their alleged 80% economy, it is not nearly as functional as symbolic links.
It is clear from the article that one cannot create a symbolic link to a directory. And although it is not clear, I don't think it will be possible to manually create symbolic links, if one wishes to.
That, IMHO, reduces its functionality to much less than that of symbolic links.
-><- no
So has MSFT filed for a patent on this innovation yet?
'I ain't a liar, baby, and I ain't proud I just want what I'm not allowed.' -- Violent Femmes, 36-24-36
I wonder what they think - smells to me M$ is going to push this.
Le M.
P.S. What's a first post ? This is mine...
Research is what I'm doing when I don't know what I'm doing.
More importantly however, let's consider an algorithmic analysis of this "innovation". Imagine copying N files into a filesystem of this type - for example, restoring a filesystem from backup after doing a reinstall.
In a tree-based implementation, the running time is suddenly O(NlogN) on average ( O(N^2) in the worst case), and that would probably hit the disk a lot.
In a memory based implementation, say one that uses MD5 or SHA-1 to avoid lots of disk accesses, it seems the running time would be O(N^2).
So let's give them the benefit of the doubt for a sec, and say they have an incredibly well tuned hashing algorithm that makes it O(cN). But wouldn't this end up taking up just tons of memory, to have a hash table sparse enough to get that good of a running time? Imagine doing it for a news server's filesystems!
So it seems this may be the kind of innovation that sinks ships. And if it is, I hope they don't notice it until the "innovation" is already in the field and can't be retracted...
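A back-of-the-envelope check on the hash-table case, as a Python sketch. The 20 bytes is the size of a SHA-1 digest; the 44 bytes of per-entry overhead is a made-up placeholder, so treat the resulting figure as an illustration of the arithmetic and nothing more. The second function just counts table probes to show that a digest index is consulted once per incoming file.

```python
def index_size_bytes(n_files, digest_bytes=20, overhead_bytes=44):
    """Rough size of an in-memory digest index with one entry per file.

    overhead_bytes is a hypothetical allowance for pointers and bookkeeping.
    """
    return n_files * (digest_bytes + overhead_bytes)

def count_probes(digests):
    """Feed digests through a hash index; return (probes, duplicates found)."""
    index = set()
    probes = 0
    duplicates = 0
    for d in digests:
        probes += 1              # exactly one lookup per incoming file
        if d in index:
            duplicates += 1
        else:
            index.add(d)
    return probes, duplicates

estimate = index_size_bytes(1_000_000)   # 64_000_000 bytes under these assumptions
probes, duplicates = count_probes(["a", "b", "a", "c", "b", "a"])
```

Under those assumptions a million files cost tens of megabytes of index, and restoring N files costs N probes on average rather than N squared.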
This probably won't get read, since this article is down the list a bit, but here's my own unscientific test to have a go at the truth or falsity of this claim:
Just to check it out, I performed the following test on my linux workstation. It's a fairly vanilla Debian distro with KDE and lots and LOTS of source code. Oh, and I also have a minimal Debian GNU/HURD partition mounted. I cleaned out /tmp before I started, as I had been running some custom code which filled it up recently. I hope this doesn't skew the results too much.
Here's the output of df:
Filesystem 1k-blocks Used Available Use% Mounted on
The test I performed was:
Most of the duplicate files are in /usr/share/{zoneinfo,locale,doc} (no surprises there; I have two OSes with the same Debian infrastructure!). There is some duplication in /usr/share/apps thanks to KDE (lots of duplication in the i18n stuff). There's some duplication in the perl libraries between /usr/local/lib/site_perl and /usr/local/lib/i386-linux/. Interestingly there was some duplication in my Netscape cache. I also discovered that I still had an old kernel in source form sitting around, so all of the headers were duplicate.
Including all that, the final result is... 162536752 bytes, or 6.8% of used space. I therefore conclude that while it may help a little, the true savings are nowhere near 80-90%.
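The exact command used for the test did not survive in the post, so here is one way such a measurement could be done, as a hedged Python sketch and not a reconstruction of the original. It walks a tree, skips symlinks and names that are already hard links to a file it has counted, groups by size and then by SHA-256, and reports how many bytes would be freed if every duplicate became a link. The function names are invented for the example, and it assumes a POSIX-style filesystem with meaningful inode numbers.

```python
import hashlib
import os
import tempfile
from collections import defaultdict

def file_digest(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def duplicate_bytes(root):
    """Return (total_bytes, reclaimable_bytes) for regular files under root."""
    by_size = defaultdict(list)
    seen_inodes = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            if (st.st_dev, st.st_ino) in seen_inodes:
                continue          # already a hard link to a counted file
            seen_inodes.add((st.st_dev, st.st_ino))
            total += st.st_size
            if st.st_size > 0:
                by_size[st.st_size].append(path)
    reclaimable = 0
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue              # a unique size cannot have a duplicate
        counts = defaultdict(int)
        for path in paths:
            counts[file_digest(path)] += 1
        reclaimable += sum((n - 1) * size for n in counts.values())
    return total, reclaimable

# Small demonstration tree: two identical files, two unique ones.
demo = tempfile.mkdtemp()
for name, data in [("a.bin", b"x" * 100), ("b.bin", b"x" * 100),
                   ("c.bin", b"y" * 100), ("d.bin", b"z" * 50)]:
    with open(os.path.join(demo, name), "wb") as f:
        f.write(data)

total, reclaimable = duplicate_bytes(demo)   # (350, 100)
```

Pointing `duplicate_bytes` at a real mount point and dividing the second number by the first gives the same kind of percentage quoted above.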
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Yeah, that's a pretty funny story. What I want to know is .. what took so long? Seriously though, is this really gonna save 85-90% of your disk space? I don't know about anyone else, but very seldom do my programs install the stock DLLs into their own directory, without checking c:\winnt\system32.
> I don't want my computer to think for me, I want my computer to just do what it's told!
As someone else already wrote: make it optional to turn it off.
Just because you don't use it doesn't mean that it shouldn't be there. You don't tell your computer how to write data to the hard disk - the operating system does - and most probably you wouldn't want to take care of disk I/O every time you need it. Isn't it nice that it's already there?
If you never used the -l option on ls, would you take it out? Maybe someone else might want it...
If you want your computer not to do it, simply tell it so - that's what you want (you said). I think it sounds like a good idea (plagiarized from someone else?) for keeping scattered files in check. Windows especially needs this: once in a while (at least it happens to me) you save something into whatever was set as the default folder when you thought you were saving it somewhere else. This feature would consolidate things like that.
Do not forget - there are many people out there that use computers for work and don't want to know how it works - it's a tool, and if a hammer was a complicated thing I wouldn't care, as long as the nail is in the wall after I applied it.
What's next? Oxygen?
Uh yeah, and in a related note:
Microsoft Research Innovates Life....
Pretty soon they will claim all rights to inventing the wheel or maybe even to creating the earth. And you know that many people will believe it. I have to give it to Gates though. He sure knows how to run the BS-meter up.
Are you thinking of "shortcuts"? If so, then have you ever tried to shortcut a .dll? Just doesn't work, since shortcuts are NOT symbolic links. They don't represent a pointer to the inode. The bottom line regarding this "new" feature that M$ claims is so innovative, is that it isn't innovative. It may be a specific application of the generalized sym-link/hard-link concepts, but certainly not innovative. This sort of crap could be written into a command-line application easily in UNIX-land.
On the other hand, I believe Micro$oft will get its precious patent...but it won't be a broad sweeping patent, as it is only applicable for their file systems. Samba of course will have to support it...but as we've seen before with other M$ file system changes...that shouldn't take more than a few days to do...
That's why the Unix-gods invented user-level and group-level permissions. Hopefully the root user isn't typing rm -rf on a daily basis (unless they are REALLY sure of where they are in the file hierarchy.)
Last I checked...Win NT (at least 4.0) doesn't have this kind of file permissions. Maybe it's just that the admins here aren't using it...(which is just as bad...IMO) What good is file security if it isn't enforced by the OS?
And then they butchered it to create VFP. Providing no way to upgrade a large foundation read based system without a rewrite of the entire user interface.
As one of the chaps here at work said, they're not symlinks, but borglinks.
:)
:)
Whilst this article certainly reads like Mickey$oft have the cure for cancer, their real innovation is the magic behind it. Probably all of four lines of perl
How's this for a theory... The borglinks are kept in a hive (like the registry hive) and when it corrupts itself, you lose 80% of your filesystem.
Heh, yeah I'd put that on my production server!
I laughed so hard I cried. Three senior researchers spend 1 1/2 years on this idiocy yet. This is too much. Unix folks have been using symbolic links judiciously for an awfully long time and a lot more flexibly than this weird scheme implies.
Thanks for nothing Bill. Except good yuks.
So, if I understand the article correctly, they did not only have this great.. NEW idea, but they also went to Bill Gates, Bill liked the idea "harhar cool stuff let's build this before anyone else comes up with it" and then they spent 1.5 YEARS?? with it? omen, this is just sad.
I agree that they didn't simply recreate symbolic links, but, on the other hand, I don't know if I consider this 'innovation' either. To put it in a management summary form, basically, they took the manual process of symbolic links, and automated the process so it happens on a continual basis. If automation=innovation, then every sysadmin I know must be the innovator of the millennium, because each and every one of them has taken numerous manual processes and, via cron, or some other batch utility, removed the manual component.
Basically, they took two 'ideas' (sym-links, and automation), and married them.
To give the devil his due, I WOULD consider the compressed file system (like in NT 3.x/4.x) an innovation...
-- You can't idiot-proof anything, because they're always coming out with better idiots.
Maybe it's the past 12 factor kicking in, but I just thought of something...
So if it's automagically linked, then this opens up a new door for cracks...
A file can be brute-force written until it is linked, and the user then knows the file already exists. That way, one can get the contents of a file that one wouldn't otherwise have permission to...
Or does this not apply to files with strict permissions?
I'm no NT guru...someone else enlighten me pleez.
But if you were making this as a backup on a different disk?
You would make a backup as usual. Then your original disk crashes. But don't worry, you had a backup on your 2nd disk right? Nope, it got turned into a symbolic link???
Maybe we'll end up having to define complex rules on when to auto-sym-link and when not to... And it'll default to auto, and newbies will be caught out.
When they 'invented' Win95, they were given a lot of crap for having shortcuts instead of the sym or hard links in unix, macs, os/2 or whatever.
They retorted that symlinks are evil and hard and bad for newbies, easy to make mistakes on, etc. So now, they decide to do the same thing, *Automatically!*, behind the user's back, and also call it an innovation?!
It's not a bad thing to add to windows, but automatically... Is this like every time I create/alter a file, it will scan thru all network storage areas to see if there are any identical bitstreams? Or will it be like incremental garbage collection for hard disks?
What if I wanted to store multiple copies of a file to protect against bad sectors, because I don't have a proper backup system? That won't work anymore. Unless we set up some .ignore file or something?
I hope they don't make a wizard for it ^_^;;
I don't think that it'd be the end of backups. If you made a copy of a file it'd link it in, but as soon as you change one, I imagine that it'd spawn a separate copy. If it were smart, maybe it'd create a diff or something? Then when you move it off of the filesystem, it could grab the merged data and the diff, apply it and then copy. The only problem that I see is what happens when u frag yer drive - sometimes i get lucky and a backup file is left in an unfragged part of the disk; ud be outa luck here. Someone suggested that a simple daemon could be developed to do this in unix... but i don't think so. This has got to be smarter than just symlinks. Otherwise, it'd suck pretty hard. I wouldn't mind seeing this feature if it didn't cost too much in overhead. Anyone wanna implement this as a linux kernel module? hehe. It'd be easy to do without the diffs and stuff I think... just auto create 'symlinks' but tell the user that they're real files, and when the symlink gets changed spawn a new copy of the file to be stored. Also spawn new ones when moving/copying across filesystems. Could be pretty cool :) One note... MS says that for remote installs it would just do its linking on the client... what happens when you do this on a big lan, and yer server goes down? :P
yeah, so disable fastfind!
Having read the article, I think that these are really "smart" hard links implemented transparently to the user.
Like hard links, deleting one link does not delete the file, and 2 links refer to the same physical bits.
Unlike hard links, when you change one copy, the other copy remains unchanged. This would have to be done by making a new copy of the file when changes are applied to one linked copy. This is necessary for the process to be transparent to the user.
Also links are created automatically when one file is detected to be a bitwise copy of another.
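The difference between an ordinary hard link and the "smart" kind can be shown in a few lines of Python. This is only a sketch of the behaviour described, assuming a filesystem that supports hard links; `write_breaking_link` is a name invented here, and the write-to-a-new-file-then-rename trick is one common way to emulate copy-on-write in user space, not a claim about how SIS does it.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.txt")
linked = os.path.join(workdir, "linked.txt")

with open(original, "w") as f:
    f.write("version 1")
os.link(original, linked)             # two names, one set of bits

# Plain hard link: writing in place through one name changes both.
with open(linked, "w") as f:
    f.write("version 2")
with open(original) as f:
    shared_view = f.read()            # "version 2"

def write_breaking_link(path, text):
    """Emulate copy-on-write: put new content in a fresh file, then swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)             # path now names a brand new file

write_breaking_link(linked, "version 3")
with open(original) as f:
    after_break = f.read()            # still "version 2"
with open(linked) as f:
    linked_view = f.read()            # "version 3"
```

The first write leaks through to the other name; the second does not, which is the transparency the scheme needs.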
The only problem with this is there are times where we deliberately want 2 copies of a file. What happens if one of 2 files (one for backup) becomes corrupted? What happens if I am about to trash one partition and am deliberately copying stuff to another? What happens if I am copying stuff to another hard drive which I intend to physically remove?
This is another example of removing flexibility and power from the user with the minimal benefits of saving some space and making the system easier (??) to use.
Back in the nonexistent good old days, there were programs which allowed you to make a tradeoff, in exchange for a performance hit, you could compress the redundant bits out of a sector of data, and store more data on a hard drive. The magic was that it was transparent to the user of the system (outside of the performance hit).
Now we are presented with a new version of the same decision, a performance hit in exchange for transparently taking the redundant files out of a filesystem. The magic, again, is that it ends up transparent to the user. Which is why this is not the same as symlinks, which require user (or system) management.
As much as I dislike the monopoly, this is an interesting, apparently novel application of a metaconcept. The overall concept (data compression) is clearly old, this instance (file system compression) appears new, raising the question:
Should it be patentable?
--Mike--
See, no matter when they filed for the patent, whether it was 5 years ago even, soft links have existed under unix for WAY longer. Finding an example to invalidate their patent, should they apply for one, will be an absolute cakewalk.
Yes, it's using symlinks, they didn't even state that the symlinks were their invention. It's the idea that the system automaticaly creates the symlinks when it finds another file exactly like the one you're trying to store.
I've noticed more and more that if somebody finds a piece of "anti<your favorite evil entity(tm)>" news, the article gets posted with what seems like very little checking and with sensational claims by the editor who posted it. No, Mr. Rob "CmdrTaco" Malda doesn't have the monopoly on bad posts, I see others doing it too. Could it be that this new corporate take-over is killing slashdot? <conspiracy theory="Perhaps Mr. Malda and his cohorts are secretly trying to destroy Slashdot so that they can rebuild it later under an open, uncommercial, user supported system">
Power to the people! Hackers of the World Unite! Go Rob Go!
... but I digress
I seem to remember MS pushing DLLs, citing two benefits:
1. Standardization, look and feel is guaranteed to be consistent if you are using a lib to do the job.
2. (the relevant part) Disk Space, less repetition = less disk space.
-Peter
Along the topic of innovations, check out the title of their department:
...among the many innovations built from the ground up by Microsoft's research arm "Microsoft Research."
Now _that_ took some ingenuity!
-mparcens
~~~~~~~~~~~~~~~~~~~~~~~~~~
JavaScript Error: http://www.windows2000test.com/default.htm, line 91:
Isn't Excel a Lotus 1-2-3 rip off? They haven't "invented" anything. Everything they do is based off some other piece of software written years ago by somebody else. The only difference is a pretty interface and shitty code underneath.
--
"What do you want me to do? Whack a guy? Off a guy? Whack off a guy? Cause I'm married."
I love how they admit that they ignored memory efficiency and performance until the 3.5 release... even though they claim NT is faster than anything else on the market. What a crock.
--
"What do you want me to do? Whack a guy? Off a guy? Whack off a guy? Cause I'm married."
On the contrary, I don't think this would help much at all. While many files have identical chunks of data these are very unlikely to line up perfectly with block size.
As an example, say I have two identical text files of a large number of blocks. At the beginning these could be just the same set of links to the same set of blocks. But if I add one character, say thirty lines into the file ("oops, typed print instead of printf again") none of the later blocks will be identical anymore. So I think very little space would end up being saved.
Of course, if you could get away with variable block sizes that might help, but then you're stuck with the insanely difficult problem of how to break up a file so that you get the best match with existing blocks. No thanks.
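The alignment problem is easy to demonstrate. Below is a Python sketch with a toy 16-byte block size and synthetic data (both chosen for the example, not taken from any real filesystem): it counts how many fixed-size blocks of an edited file already exist among the blocks of the original.

```python
import hashlib

BLOCK = 16   # toy block size; real filesystems use far larger blocks

def block_digests(data, block_size=BLOCK):
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def reusable_blocks(original, edited, block_size=BLOCK):
    """Count blocks of `edited` that match some block already stored for `original`."""
    known = set(block_digests(original, block_size))
    return sum(d in known for d in block_digests(edited, block_size))

# 1024 bytes of non-repeating data: the numbers 0..511 as 2-byte values.
original = b"".join(i.to_bytes(2, "big") for i in range(512))

inserted = original[:100] + b"f" + original[100:]   # one byte added early on
appended = original + b"f"                          # one byte added at the end

total_blocks = len(block_digests(original))             # 64
after_insert = reusable_blocks(original, inserted)      # 6
after_append = reusable_blocks(original, appended)      # 64
```

A single inserted byte leaves only the blocks in front of it shareable, while the same byte appended at the end costs nothing, which is exactly the shifting effect described above.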
I'm sure glad they invented and implemented Text-to-speech in W2K!
I used to have a lot of fun back in 1988 with text-to-speech on my Amiga 500. Now that MS has actually invented this, I've come to realize that I must have been living in some other alternate reality back then.
So, for me, MS has done so much more than they modestly claim... they've defined REALITY!
There's a new God in town..
-- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
Read the article. This process automatically creates the links if the bits are the same. Why does Slashdot feel the need to post a negative article about Microsoft without reading it first? witz
I think it's something that would be nice to be able to turn off, but for some people's applications it could increase performance by increasing hits in the disk cache.
Sorry, you are wrong :-)
<quote>
Slashdot has it wrong today. Microsoft has had the equivalent to Unix symlinks for a long time--they're called
"shortcuts". Like a Unix symlink, a Windows shortcut is a small file that does nothing but point to another path where the
real file is
</quote>
Try to load an MS "symlink" in an editor like notetab: notetab file.lnk
What's loaded?
Nothing, something binary, or did you get an error?
MS links only work when you CLICK on a link in the EXPLORER.
They are neither sym nor hard links!
By the way: NT supports links, but I do not know whether they are hard or symbolic.
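To make the distinction concrete, here is a Python sketch assuming a POSIX system where creating symlinks needs no special privileges. The "shortcut" below is only a stand-in: real .lnk files use a binary format, so a text file holding a path is an analogy for "an ordinary file that records where another file lives", nothing more.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "real.txt")
with open(target, "w") as f:
    f.write("the real contents")

# A symbolic link is resolved by the OS, so any program that opens it
# gets the target's data without knowing a link is involved.
symlink = os.path.join(workdir, "link.txt")
os.symlink(target, symlink)
with open(symlink) as f:
    via_symlink = f.read()

# Stand-in for a shortcut: a regular file that merely stores a path.
# Opening it yields the stored path, not the target's data.
shortcut = os.path.join(workdir, "shortcut.lnk")
with open(shortcut, "w") as f:
    f.write(target)
with open(shortcut) as f:
    via_shortcut = f.read()
```

An editor handed the symlink sees the document; an editor handed the shortcut sees whatever the shortcut file itself contains, which matches the notetab experiment above.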
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
It was on the BBC micro before that.
This isn't much by itself, but it is key for the enhanced installation/setup engine Windows 2000 introduces.
Under the new installer, every application thinks it has a personal copy of all the .DLLs on the entire system -- no more .DLL version nightmares. The only way such a system would ever fit on a hard disk is to use an automatic symlink mechanism like this.
In typical Microsoft fashion, they've written software that tries to do the job of an intelligent administrator, without giving the administrator the ability to manually create links. That's great that they're creating the links automatically, but all it really accomplishes is saving diskspace by removing unwanted duplicates. That's a nice use of symbolic links, but it's certainly not the most powerful use of them. For instance, you may want to create a single document, and then create a symlink for each user to that document. In a Unix environment, you can update that central document, and the user symlinks continue to access the updated document, so each user now sees the updated version. With Microsoft's solution, the updated version now just sits there, while all the users still see the old version, which isn't what the administrator was trying to accomplish.
This is just another example of Microsoft's programmers thinking they know more about system administration than system administrators do.
---------------------------------------------
SERENITY NOW!!!!!!!!!!!!!!!!
It just means applying copy-on-write to files when they are copied within a filesystem. Maybe Microsoft's way of calculating checksums makes it more complicated, but it's the same basic idea. So these are actually sort of hard links, not soft :)
And it's not a new idea... or maybe my CS lab professor, who described this precise idea about a week ago, got it from M$? I don't think so.
With Microsoft OS's and programs, this feature means that when the file gets corrupted, it gets corrupted for everybody. Hooray.
NOSPAM@REMOVETHIS.NO.SPAM - you'll find the real address somewhere
Actually, you have it backward. NTFS (pre-v5) has built-in hardlink support, but nothing like symlinks. Also, no utility to create said hardlinks (as far as I've been able to tell, and I've been using NT since 3.5). Cygnus' cygwin environment has a working ln that creates hardlinks, though.
*rolls eyes*
that's not what i was saying. I was saying I wouldn't trust ONE administrator writing some script to do this opposed to hundreds of smart engineers, mathematicians and scientists from microsoft research.
And compared to other software and what Microsoft software does, yes I trust Microsoft engineers.
Any idiot who's ever developed large software knows it's impossible to develop something bug free. Look at how other companies are trying to reproduce the kind of work Microsoft has done. Sun's Java (Microsoft had COM), look at the HUGE bugs in Java, what about Star Office as opposed to MS Office? And Star Office doesn't even do half the things MS Office does, and then there's Gnome/KDE, which has been under 'open source' development for over 3 years and is still buggy. What about Netscape?
Yes I do trust Microsoft more than most other software companies, and certainly more than some administrator.
Bah don't be insulting. Chances are I know as much or more about Unix than you.
I happen to take a cosc course at a strictly Unix-only university, and I've been using Linux for 4 years. And I wouldn't trust any admin (NT or Unix) to write any script that involves critical files.
This isn't just symbolic links, it's smart automatic dynamic symbolic links. There's a lot of computer science and mathematics going into getting something like this to work efficiently.
If you're going to go around saying nothing Microsoft does is innovation 'cause something similar has been done before, then at least make sure it actually has been done before.
For anyone curious about (normal) symbolic links in Windows 2000, you can get the source example for making links from www.sysinternals.com.
Well EVERYTHING done in Unix can be implemented in Windows, and most prolly vice versa. The point is whether it's done or not.
And what's wrong with the OS doing these things for you? Would you rather have the filesystem do this auto symlinking automatically or a perl script your hacker administrator wrote up?
And it's obvious that this is not a simple symlink in the unix sense. Obviously if you change one file, then they will no longer be linked and will get split up into two different files.
And maybe you should try out the Windows 2000 troubleshooter, it's quite good. Especially when it says something like "open up network properties" or something, it'll have a hyperlink to network properties. Much easier than having to follow instructions on how to open up network properties (or even worse, not being told at all where it is).
Windows 2000 help is not as "heavy" as Unix man pages usually are (that's what MSDN is for), but it's certainly much more useful and handy for the average joe, and even the more experienced jack.
Uh, Windows 2000 does have symbolic links (yes you can mount disks to directories too).
Just cause THAT article didn't say so, doesn't mean it can't do it.
They never said they invented text to speech. Adding text to speech in an OS for accessibility is in their minds innovation. And besides, they were mentioning it as one of the developments contributed by their research division.
Uh huh, and name me something that isn't based on something else.
Amiga copied a process that has been around for thousands of years, it's called reading.
Anyway, I said nothing about Microsoft being the first to implement such a thing. It's innovative nonetheless.
Did the Amiga text-to-speech engine read out the UI to you? (And save the 'but I can see the UI for myself'; it's for disabled people and people who need more usability.) I also doubt the Amiga text-to-speech was extendable with plugins.
Look at Microsoft's enable site if you want.
This hasn't exactly grabbed the attention of the media, only the attention of /. 'cause someone read the first line and didn't think before they posted the story.
My point was, I would not trust any administrator to implement anything like this himself. If you've taken a course in software engineering, you'll see what I mean.
It's smart. If you have 10 files the same, and change one, it's hardly going to change the other 9. Think about it.
Just cause you think of symbolic links from your perspective, doesn't mean Microsoft's "links" and "Single Object Store" is the exact same thing.
I'd say that they'd probably be using a cluster level link, not symlinking.
Innovations developed by Microsoft researchers consistently find their way into company products. The most recent example of this is the number of innovations that were incorporated into Windows 2000, Microsoft's flagship operating system, which launched worldwide on Feb. 17.
That's right, they have produced the number of innovations directly from the void.
father, hatch, make, originate, parent, procreate, produce, sire, spawn
--mark
That is why I described hardlinks as two filenames pointing to the same inode (or substitute "starting FAT sector").
As a point of interest, the DOS implementation of FAT does not allow for safe hard links.
In Unix, every inode includes a reference counter, which identifies the number of directory entries pointing to it. There is no equivalent in the DOS FAT (I'd imagine this holds for FAT32 as well).
In Unix, if you delete a file that has multiple hardlinks, you basically delete the directory entry and decrement the reference count. Only when the reference count decreases to zero do you actually free up the disk space.
In DOS, the lack of a reference counter makes it tough to delete one hard link to a file without destroying the other hard links. A reference counter could probably be hacked in (a la VFAT long filenames), provided that you were willing to break existing disk utilities.
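The reference-count behaviour described above can be watched from Python on a Unix-like filesystem. This is a small sketch with made-up file names: st_nlink is the inode's link count, and removing one name only decrements it.

```python
import os
import tempfile

def link_count_demo():
    """Return st_nlink after create, after link, and after unlinking one name."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "w") as f:
            f.write("data\n")
        counts = [os.stat(a).st_nlink]       # one directory entry so far
        os.link(a, b)                        # second name for the same inode
        counts.append(os.stat(a).st_nlink)
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        os.unlink(a)                         # removes a name, not the data
        counts.append(os.stat(b).st_nlink)
        with open(b) as f:
            survived = f.read()
        return counts, same_inode, survived

if __name__ == "__main__":
    print(link_count_demo())
```

The data is only freed once the count reaches zero, which is exactly the bookkeeping a plain FAT directory entry has no field for.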
"Single Instance Store (SIS) is a file system filter driver that conserves disk space by removing multiple copies of a file and replacing them with links to a single shared copy in a common folder. These links differ from hard links in that if one copy of the file is changed, it then 'splits off' from the others and becomes a separate file."
So there you go, it is *not* a hardlink. It does a copy-on-write, so if you modify your file, you do *not* modify all the other instances of that file on the server, as has been fudded in this forum so far.
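To make the "splits off" behaviour concrete, here is a toy in-memory model in Python. It is only a sketch of the idea in the quoted description, not how the SIS filter driver is implemented; the class and method names are invented, and a real store would confirm a match byte for byte rather than trust a hash alone.

```python
import hashlib

class SingleInstanceStore:
    """Toy model: names map to content keys; identical blobs are shared."""

    def __init__(self):
        self.blobs = {}   # content key -> bytes
        self.refs = {}    # content key -> number of names using it
        self.names = {}   # file name -> content key

    def _key(self, data):
        return hashlib.sha256(data).hexdigest()

    def _release(self, name):
        """Drop a name's reference; free the blob when nobody uses it."""
        key = self.names.pop(name, None)
        if key is None:
            return
        self.refs[key] -= 1
        if self.refs[key] == 0:
            del self.refs[key], self.blobs[key]

    def write(self, name, data):
        """Store data under name. Writing never touches other names' data."""
        self._release(name)
        key = self._key(data)
        if key not in self.blobs:
            self.blobs[key] = data
            self.refs[key] = 0
        self.refs[key] += 1
        self.names[name] = key

    def read(self, name):
        return self.blobs[self.names[name]]
```

Writing "b" after "a" and "b" shared one blob leaves "a" unchanged and gives "b" its own blob, which is the difference from a hard link that the quote points out.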
-------------
The following sentence is true. The preceding sentence was false.
...it'll have a hyperlink to network properties...
Hmmm.. maybe it's just me, but this is a lot like a feature that MacOS had quite a while ago. The MacOS help pages would open a control panel for you and even circle in red on the screen where you needed to make the change. It's not a hyperlink to the control panel, but it was still a very slick way of doing it.
Not really the same, but I wouldn't doubt that this is where they got the inspiration.
-----
Can I Play With Madness?
It took them one and a half years to invent symbolic links. I guess when you have to hack stuff into 45 million lines of code, it will take a while.
They say it'll allow users to store up to 10 times the information on the server that they could have before this innovation. What the hell are Windows users doing that require so many duplicate files?
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Three years ago, Bill Bolosky and two Microsoft colleagues were brainstorming technology advances when an idea occurred to them -- why not save operating system disk space by storing duplicate files as links that point to a single file housed in a central location?
Not only would this save storage space, they reasoned; it would also result in substantial performance improvements. Moreover, it would make it faster for information technology (IT) managers to install computers for new employees since they'd no longer be required to copy massive amounts of data each time they set up a new desktop.
What does that sound like to you?
----------------
Overheard: "Aww, why'd you go and install Windows on a perfectly good machine?"
From a newspaper article thousands of years ago:
In related news, other members of the community denounced the "invention" of the wheel calling it mere FUD. They claimed, varyingly, that:
1. The wheel was the same thing as "walking" which they had originally invented thousands of years earlier.
2. MS just wanted to introduce new jargon like "wheel" to dupe innocent people into believing they were actually innovating.
3. Some people acknowledged that it was an interesting concept but said that they were sure MS screwed things up while inventing it and that it was in their humble opinion a bad idea and that users should get a choice about whether to use it or not.
4. They don't trust any product from MS and MS sucks and is evil and wicked and they are sure the wheel is just a mechanism to hurt people and extend their domination and own everyones souls.
Mmmm.. Donuts
>That's all well and good but what do the rest of us who load RedHat Linux do when, after a couple of hours, we come across a 'python' error and can't continue?
Again, back to researching what you buy/use.
I don't use RedHat.
I use something called BSD. And *I* don't have 'python' errors.
And, again, if you researched your 'python error', you might be able to get the SAME error under Windows.
>Microsoft does raise the bar
What bar is this? The one where people go to drink after a day of work where they re-load the OS to fix a problem?
Even Microsoft admits that NT crashes every 5 days. (the 2000 rollout) And you call this a 'raised bar'?
Unix has had stability for YEARS. Microsoft claims 2000 is stable. Microsoft also claimed NT was stable.
Guess your 'bar' is a whole lot lower than mine.
If it was said on slashdot, it MUST be true!
>Unix/Linux is still not ready for the desktop
Apple gets to be the next company to try to make unix on the desktop.
in 6 months or so, Mac OS X (non-server edition) will ship, and next year it will be pre-loaded on ALL boxes.
If Apple keeps selling boxes after moving to unix (Mac OS X) then unix makes a fine desktop.
>Can we still be friends?
*smile* we never were friends, nor enemies.
If it was said on slashdot, it MUST be true!
I don't know Joe....Unix has been around 'for the rest of us' for years.
Its a matter of if you do research as a consumer to find what you need, or do you just use whatever was pre-loaded on the box you bought. I guess I have a higher standard of what I expect from my computers than Microsoft has delivered.
If you are happy with crashes, data corruption, and re-loading your OS, then by all means keep that 'technology to the rest of us.'
Given that Apple (the computer for the rest of us) is moving to Unix, it looks like the only 'geek factor' to Unix is close-minded people such as yourself Joe.
All Microsoft has done is lower what people expect from software. They EXPECT lockups, re-loading and reboots. Exactly HOW does this help consumers?
If it was said on slashdot, it MUST be true!
SAM on an Apple ][+, via the speaker port.
If it was said on slashdot, it MUST be true!
The MKS toolkit has a working ln as well.
I wonder if one is included with the NT Server Resource Kit, too.
Simple answer: Because the users want them.
You could have mountpoint-like behaviour with DOS using the JOIN command,
you could have mountpoint-like behaviour with NT 4 using the DFS extension.
Nobody used it.
And that's why we still have drive letters.
I doubt this will work anything like unix hardlinks. Even with the lack of technical detail it seemed more like this could perform linking across network thus really freeing up duplicate space(although, contributing to network traffic).
Hopefully it uses some sort of intelligence to decide when to link the files. Maybe it won't link files with the "archive" flag set or something. Or maybe it'll only link files with the same filename. Either way, there had better be a way to tell it not to link specific files.
I think they finally came up with the core dump, too.
AFAICT, neither emacs nor vi get you a new inode. Try the example I gave above, this time using vi or emacs to edit the file. Seems to keep the hard link on my system.
What do you mean, you doubt many people edit config files with cat >>? I saw a slightly stressed sysadmin here edit /etc/passwd with cat> (he missed the 2nd >). Not a happy man!
Pardon?
$ cat > a
Hello...
$ ln a b
$ cat >> b
...mum!
$ cat a
Hello...
...mum!
Hard links don't get dereferenced, unless (in the example above), you do something like
$ cp a c
$ cat >> c
and Dad, too!
$ cat a
Hello...
...mum!
$ cat c
Hello...
...mum!
and Dad, too!
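The transcript above can be reproduced in Python, with one extra case added for illustration: a program that saves by writing a new file and renaming it over the old name gets a new inode and so drops out of the hard link. Whether a given editor saves in place or by rename depends on the editor and its settings; this sketch makes no claim about vi or emacs, and the file names are made up.

```python
import os
import tempfile

def append_vs_replace():
    """Append through a hard link, then replace one name by rename."""
    with tempfile.TemporaryDirectory() as d:
        a, b = os.path.join(d, "a"), os.path.join(d, "b")
        with open(a, "w") as f:
            f.write("Hello...\n")
        os.link(a, b)                      # like: ln a b
        with open(b, "a") as f:            # like: cat >> b
            f.write("...mum!\n")
        with open(a) as f:
            after_append = f.read()        # the append is visible through a
        # Save-by-rename: write a fresh file, then move it over b.
        tmp = os.path.join(d, "b.tmp")
        with open(tmp, "w") as f:
            f.write("replaced\n")
        os.replace(tmp, b)
        with open(a) as f:
            after_replace = f.read()       # a still has the old contents
        still_linked = os.stat(a).st_ino == os.stat(b).st_ino
        return after_append, after_replace, still_linked

if __name__ == "__main__":
    print(append_vs_replace())
```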
The thing here is that it's like Exchange. Now Exchange is a beast, but the idea is good: keep one copy of a mail that may be read by hundreds of people. In a large-scale system, all you need to do is share some files. I mean, instead of everyone on an NT system installing their own stuff, it keeps one copy, thus freeing up the space taken by the other crap that is there.
Innovation is not only coming up with new technologies, but also applying technologies. The screen reader in Win2000 isn't great, but it's a step in the right direction for all OSes.
--jay
Arriving shortly on your local news ticker...
"Microsoft announced today that they would be releasing their newest version of their Single Instance Store innovation. It incorporates a new and improved hashing algorithm which will save 85-95% of disk space on servers.
In related news, users of the original Single Instance Store (SIS) technology were astonished to find out that due to a bug in the "signature" calculation algorithm, the software was combining 80-90% of the files on the disk into a single file, rather than just combining equivalent files. This seems to have been due to the algorithm being accidentally combined with their "troubleshooting" algorithm, another of Microsoft's innovations. It seems that instead of linking identical files, the original SIS technology was linking all files which contained or caused errors. More as the story develops."
The "Top 10" Reasons to procrastinate:
10.
Is it just me or does this seem terribly reminiscent of the never-ending battle between M$ and Netscape? This is nothing new. Micro$oft is known for their exploitative behavior. They see something useful in another operating system or application and they assimilate it. How many of Netscape's innovations now reside in the M$ Internet Explorer? Unix is just the target this time, and it will continue to be. There are many innovations in Unix systems that will be absorbed by M$. I would suggest that the Unix world start taking some ideas from M$, but we were the ones who came up with them in the first place. It would be like pirating from ourselves! I don't know about anyone else, but I'm all for creating an exclusion in the GPL for Microsoft and all its employees.
I think they invented symbolic links back in the win95 days of shortcuts...
No, the sad part is that on /. you have to mention when you're telling the truth.
I'm not making this up!
I don't WANT my files automatically hard-linked or sym-linked together like that.. if I want them linked, then I'll link them, if I want them separate, then they'd better stay separate.. it's another case of Microsoft trying to make the computer think for you, but I don't want my computer to think for me, I want my computer to just do what it's told!
What struck me as a bad idea is that all this "symlink" information is stored in a central database, as opposed to using "symbolic link" files like Linux/Unix do. So if you happen to have a power failure when one of those database sectors is being written, your entire multi-gig disk could be essentially hosed.
I know that complete ignorance of the "enemy" is the in-thing on Slashdot, but if you must know, NTFS5 supports symlinks. And it's about time, too. Here's a free tool for creating them: http://www.sysinternals.com/misc.htm#junction
You, sir, are the embodiment of what I have (for years) called the Magical Vendor Syndrome. It appears that you are making the unquestionable assumption that MS (of all people), or any vendor, is somehow magically smarter, more clever, more knowledgeable... than anyone you could ever work with on a daily basis. What is the foundation of this assumption?
Unfortunately, you are not alone. I have run across too many of your kind in my 10 years of systems management consultancy (oh, I'm sorry, you called us "hacker administrators") that it makes my blood boil.
The problem with most sufferers of Magical Vendor Syndrome is that they tend to believe too much of what a vendor's PR people say, instead of accepting the fact that most vendors are filled with more yahoos than anyone would ever beg to realize.
Ohh... and, I would trust a Perl script written by any competent, professional systems manager (who intimately knows my local network and my company's business needs) long before I submit to some snot-nosed fortune teller in Redmond.
--
A PC without windows is like chocolate cake with no mustard.
Shhh! Keep it down! This is gonna be on the next "Windows NT vs. Linux" information sheet Redmond provides to the retailers....
Online gaming for motivated, sportsmanlike players: www.steelmaelstrom.org.
80% - 90% is a LOT, and I think the only reason it works is specific to the nature of NT.
In order to get this much reduction, and make a process with this much overhead worthwhile, users have to have virtually all of their files in common. Where does this happen other than NT? Suddenly everybody's default config files for every default application they run will only take up one user's worth of space, and only the unique files and modified config files will take up additional space. The only way for this to be efficient is if the bulk of the HD space a given user uses is made up of many default config files -- this makes a lot of sense for Windows, but not so much for *nix systems.
I was really shocked when I first heard this number, but when you think about it, it kinda makes sense.
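A back-of-the-envelope check of that reasoning, as a Python sketch. It assumes a simplified model that is not from the article: every user holds the same amount of data, and a fixed fraction of it is byte-identical across all users. The function names are invented for illustration.

```python
def dedup_savings(users, shared_fraction):
    """Fraction of space saved when shared files are stored only once.

    Sizes are in units of one user's data. Assumes shared_fraction of
    each user's data is byte-identical across all users.
    """
    before = users * 1.0
    after = shared_fraction + users * (1.0 - shared_fraction)
    return 1.0 - after / before

def capacity_multiplier(savings):
    """How many times more data fits once `savings` of the space is freed."""
    return 1.0 / (1.0 - savings)

if __name__ == "__main__":
    print(dedup_savings(100, 0.9))
    print(capacity_multiplier(0.8), capacity_multiplier(0.9))
```

Under this model, 100 users with 90% of their files in common save about 89% of the space. Freeing 80% to 90% corresponds to roughly five to ten times the capacity, which matches the figure quoted elsewhere in the thread, but only if nearly everything users store is identical.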
-ac
<Annoying, ain't it?>
A quick comment before I start - First Microsoft mimics Mac OS and now it appears that they have been attempting to mimic Unix. So if Amazon can patent a web page concept, how can Microsoft get away with this? Now on to the subject of the article - My first thought is how much additional overhead this database would take up. Unix references links through inodes, so the file is more of a pointer on the physical disk, whereas using a database would be more of a redirect. Not only is reading a database at the application level more system-intensive than reading the lower-level inode reference, but it can also be dangerous in different ways.
So what are you going to do? Bleed on me?
Backups could be a concern if you did not get a good backup of the link reference database but since the links are stored in a database this may have some rebuilding information included. I don't know.
So what are you going to do? Bleed on me?
Good point. We all knew about symlinks years/decades ago. I did see that the article talks about administrators having to copy large amounts of data to a new machine. Aside from the OS, what does a machine need to have installed on it? Every system I have used mounts everything important.
It's true, Unix doesn't exist. MSFT says so right here!
--
blinko - "the nail that sticks up gets hammered down"
>"We have hundreds of researchers working on tens of different research projects, and we're always searching
--
blinko - "the nail that sticks up gets hammered down"
From my reading of the article, the search order is not quite how you describe. System dlls are always loaded unless you:
I could be missing something here, but it seems to me that a tradeoff is being made. You are getting better compression of your data at the potential cost of reliability. Multiple instances of data = redundancy. If you have only one copy of something and for some reason something goes wrong with it, you have a problem everywhere it is needed. To some extent this parallels problems that the windows registry gives us. With the registry, you have a single point of failure, if it gets corrupted, very bad things start happening. Another example (as others here have pointed out) is DLLs. Multiple programs use the same DLLs. If they disagree over what version the DLL should be or if the DLL goes missing, you can end up with multiple simultaneous problems. I'm not saying this whole idea is a bad one or that it definitely cannot be made to work well, just that there is some risk here which makes the quality of the implementation very critical.
But here's an example. If I have 1GB drive C:, and 1GB drive D:, and copy a file from C: to D: for a backup - what happens?
- The file is copied and not linked.
- Then it can't be very valuable on a server that is likely to have several separate drives anyway.
- The file is linked.
- Then when the first disk goes bad and I want my back-up, all I have is a link.
Again, it's hard to argue semantics when we don't know the technical details, but it seems to me that something of this nature shouldn't be automatic, it should at least ask if you want to do it - and if it does that, then it's just one more annoying dialog box you'll have to put up with.
----------
Stupid sexy Flanders.
Here's an interesting thought: what if there are two files (foo.dat - bar.dat) which have the same name but are entirely different data files? Would this kill one of them and link it to the first, or would it be smart enough **chuckle** to realize that there is a fundamental difference between the files and ignore the matching filename?
I guess what I'm getting at is: does this utility check for multiple instances of a file by checking the filename, a checksum, the number of bytes used, a digital "footprint", or what?
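One plausible answer to that question, sketched in Python: ignore names entirely, group by size, then by a content hash, and finally confirm byte for byte. This is an assumption about how such a scan could work, not a description of what Microsoft's tool does; find_duplicates is a made-up helper.

```python
import filecmp
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Return groups of paths under root whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue                      # unique size, cannot be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            with open(path, "rb") as f:
                by_hash[hashlib.sha256(f.read()).hexdigest()].append(path)
        for candidates in by_hash.values():
            # A hash match is only a hint; confirm byte for byte.
            first = candidates[0]
            same = [first] + [p for p in candidates[1:]
                              if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(sorted(same))
    return groups
```

With this approach two files that share a name but differ in content are never merged, and two files with different names but identical bytes are.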
/me seriously doubts that Microsoft was able to pull this off in any way that i would not find to be more trouble than it's worth
May the forces of evil be confused on the way to your inbox.
This is applied to the file system in the following way: When you make a copy of a file, it simply points the copy to the original file. When _either_ of these is written to (or perhaps even upon first read), the file system breaks the link and actually does the copy before the modification.
I'm not sure about the hashing discussion. By the pigeonhole principle (summarized: if there are more pigeons than pigeonholes, some pigeonholes hold more than one pigeon), there will be some collisions between hash values. And because of the way the hash function is constructed (think of turning the file into a sequence of 8-bit integers and looking at the sum modulo some number), order is unimportant, so permutations of the same file have the same hash. These permutations can be found quickly - in fact, 1/2 have the first 8-bit sequence off, 3/4 the first two, etc.
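The order-independence worry applies to a purely additive checksum like the one imagined above. The Python sketch below shows such a checksum colliding on a permutation, next to SHA-256, which does not collide on this pair. Nothing in the article says which signature function SIS actually uses; additive_checksum and its modulus are made up for the example.

```python
import hashlib

def additive_checksum(data, modulus=65521):
    """Sum of byte values modulo `modulus`; ignores byte order entirely."""
    return sum(data) % modulus

def sha256_hex(data):
    """Order-sensitive cryptographic digest, for comparison."""
    return hashlib.sha256(data).hexdigest()

original = b"link"
shuffled = b"kiln"   # same four bytes in a different order

if __name__ == "__main__":
    print(additive_checksum(original) == additive_checksum(shuffled))
    print(sha256_hex(original) == sha256_hex(shuffled))
```

The pigeonhole argument still holds for any fixed-size hash, which is why a byte-for-byte comparison before linking is the safe design regardless of the function chosen.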
The idea in principle does not seem to lend itself to security holes - when any file is modified, it gets broken from the original file. Thus I couldn't replace explore.exe with a trojan horse by copying explore.exe elsewhere, and using a disassembler to add the code - when the disassembler writes, it breaks the link. However - how many exact duplicate files exist on the system? And what about the system making links between files of normal users and administrator-reserved areas - if the users see that their mail file is a link to one in the administrator's home directory, they're sure to know that the administrator is reading their mail!
"The romance of Silicon Valley was about money - excuse me, about changing the world, one million dollars at a time."
That's the thing we all love about UNIX: total control of our systems. If I don't want a process to run, I pull it out of the appropriate /etc/rc directory and yank it from crontabs. Voila!
Microsoft innovative? I guess flying donkeys have finally been discovered.
I thought Al Gore invented file links!
Oh wait a minute...isn't that what they've been doing since DOS? (Except for Windows, which was an innovation because they stole it from Apple [who in turn stole it from...etc....]).
And so the cycle continues...
ICQ: 49636524
snowphoton@mindspring.com
Got Rhinos?
ln -s microsoft a_little_late_on_the_uptake_aren't_we_boys?
Yeh Right....
I can imagine it now...
edit mygameslist.txt
{I add a list of 10 games}
CTRL S
{OS tries to find an identical file}
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
GRIND
{File saved}
Yes, sounds like a bonza idea to me....
They are talking thru their fucking hats again
Simon
The real linux_penguin has Slashdot ID 101961. Anyone else is an impostor. Including Bruce Perens.
READY.
LOAD "SAM64",8,1
SEARCHING FOR SAM64
LOADING
READY.
SYS4096
READY.
SAY "MICROSOFT INVENTED TEXT TO SPEECH"
READY.
You've got to be kidding me!
Simon
The real linux_penguin has Slashdot ID 101961. Anyone else is an impostor. Including Bruce Perens.
You are, of course, assuming that Microsoft have half a clue....
A bad mistake to make my friend.
Simon
The real linux_penguin has Slashdot ID 101961. Anyone else is an impostor. Including Bruce Perens.
It looks like the operating system enforces this policy (e.g. the users don't have to manually create links). It would function kind of like a really high-level form of file system compression. Files would become 'blocks', and the file system would keep one copy of each block with pointers to it for each duplicate instance of the file.
It's basically symbolic links enforced by the file system. Sparse files (where the free space is compressed with run-length encoding) is the next closest thing that comes to mind.
43rd Law of Computing: Anything that can go wr
Ha! This strikes me as obvious - "Look, we invented a way to muck up a file-system so badly that no other operating system would ever attempt to mount a drive like this, for fear of accidentally destroying it! Lets spend thousands of man-hours writing code to essentially encrypt a drive in such a way that it'll take a really long time for the open-source people to confidently say that they've figured out all of the ins and outs of our convoluted system! It'll make anyone who installs W2K fear ever installing another system!"
Education is the silver bullet.
The best part is where it claims:
"...allowing users to store as much as five to 10 times the information as they could before."
Which is of course the most ridiculous nonsense I've heard from Microsoft recently. Even DOS 6's laughable "Doublespace" utility comes closer to delivering on this outrageous claim.
My personal favorite line:
Altogether, nine separate groups within Microsoft Research contributed more than 15 innovations to Windows 2000
WOW!!! Can you believe that!!! In a mere three years they managed to come up with 15 innovations. Those guys definitely earned their 7 figures.
Never underestimate the power of human stupidity -RAH
Is there a patent? Mayhaps someone can write a filesystem which implements this. I'm really doubtful that this is anything that will more than marginally affect effective hard drive capacities, and at some cost in overhead, but it might be worth playing with on a UNIX.
I doubt that a filesystem is what is needed. It sounds like a daemon scanning in the background during file copies, etc. is more of what's going on, plus some sort of database to store the hashes.
Obviously Microsoft did not invent a particular software process to solve this. Anyone with a knowledge of design patterns would immediately recognize this as the GOF pattern, Flyweight.
Larry
Wait a minute. A thought. They claim: "...80 to 90 percent of the space on a server...", right? So, by my fairly limited understanding, this would mean that any files owned by any user that had duplicate content would be symlinked using this system.
Several problems I can see already:
Salamander - Since this probably won't get moderated up (I'm not really saying anything anyway, just asking questions, so it probably shouldn't be), could you respond? I'm quite curious about this....
-RickHunter
--"We are gray. We stand between the candle and the star."
--Gray council, Babylon 5.
FoxPro?
--- RFC 1149 Compliant.
Wow!
Not only did Microsoft invent symbolic links, they also invented text-to-speech, a tool which allows you to troubleshoot computer problems (whoah) and IPv6!
Very impressive work -- and now that we know that Microsoft has somehow kidnapped (penguinnapped?) Tux, we know who's responsible for all these wonderful additions to Windows.
Tux is changing Microsoft from the inside!
Eventually, the world will be forced to say "oh, forget this Windows garbage, let's just use ______ (name of your favourite *nix here)".
meisenst
Green's Law of Debate: Anything is possible if you don't know what you're talking about.
Am I just plain wrong or is this technique perfectly suitable for DLLs?
My impression: every application has its DLLs in its own directory.
At the same time, all instances of the same dll are symlinked and use no additional disk space.
When new versions appear, they could be tested
by the system and in case of a failure it just keeps the old ones.
Too good to be true?
-- Just compiling Roxen 2.0beta, cool.
Ceci n'est pas une sig
Can anyone tell me?
Thanks,
-jimbo
"Hold me Bob!" "I would if I could man!" -Larry and Bob in VeggieTales
I always thought it was pretty cool and it definitely was a significant step from 1-2-3, which after all had a pretty limited interface, no fonts, one document at a time, the typical for the time totally idiosyncratic command set (no edit menu with cut copy paste etc).
Excel was a logical step rather than a new concept but I think they were the first to take it (unless Jazz had all those features and predated it, which wouldn't surprise me).
The time when M$ was really abusing the Excel situation was in about 1994-1998, when they brought Apple to their knees with the dreaded Mac Office 6.0 featuring some of the worst ever programmed software for the Mac. Word and Excel had taken the lead from WordPerfect and Lotus and Apple still had some independence.
What are the most important programs to the Mac? Word and Excel. What happened to the Mac user who installed 6.0 (probably forced to cause of file formats) ? The programs were BUTT-ugly, slow, fat, flaky, crashed- M$'s answer- get a PC (ie a Windows license). Once they played the Office card to the max with the threat of stopping it entirely (see Jackson's findings of fact) Apple bent over, Jobs abased himself before Gates at MacWorld (the giant video head of BillyG), and Macs now ship with IE (even though it's an integral part of an OS that doesn't run on macs, not an application). And then Office 98 came out, and hey presto! let's get an iMac they're CUTE. and they come with IE, goodie! Back then Lotus was really a much more obnoxious company than M$ about intellectual property - copy protected, 300$ for a DOS spreadsheet, the whole suit against Borland and other 1-2-3 clone makers...funny that Kapor ended up doing all this EFF stuff.
Besides the inbuilt text-to-speech, which is kind of cool, the so-called innovations are doable in Unix. Of course, they actually involve making the administrator earn his pay by writing scripts that suit his needs, as opposed to adding bloat to the OS for features that can only cause trouble.
Besides, there may be a good reason for me to have two identical files on my machine, and two files may be identical bitwise but totally different (e.g. an ASCII textfile and a binary file). This so-called innovation calls for a process to be constantly running on a box that is hashing filenames and storing them in a database, performing database lookups and symlinking files. Now unless this process is constantly running (and hence slowing my machine down, making my 128 MB of RAM on a 450 MHz K-6 seem slow), what happens when 2 files are symlinked for being alike and then one is changed when the other shouldn't be? This is a recipe for disaster and not an innovation.
As for the statistical troubleshooting tool crap, all it is is a computer-generated FAQ that you don't have to read. After using Norton Utilities on my home PC and seeing what happens when you let a piece of software pick a solution to a problem for you, instead of reading for yourself and solving the problem as it applies to your situation, I cannot call that an innovation but instead another way to shoot yourself in the foot.
The most amusing side effect of this new "innovation" is that it will no longer be possible to back up your important files. Further, if a user wants to copy another user's file, and then edit it, both of them will end up with the same copy.
Thank you Microsoft, for inventing another feature that sucks
Pong & Visicalc are two. The mouse & pretty much anything that came from Xerox Parc are good examples too :)
Syllable : It's an Operating System
People don't read articles before wailing about them. Their idea is pretty good, except I didn't see where it said that when somebody tries to edit the file, it splits it. Looks to me like it would turn your .cshrc into a symlink, then when you changed it, your friend's copy would change as well, without you ever knowing about it.
But continuing on to other 'innovations'. Text to speech. It's already been on the MacOS for a while. MS must do most of their 'brainstorming' in front of a line of computers, one running Solaris, one running Linux, one running MacOS, etc...
So MS has finally invented links. Took them long
enough. Now we unix geeks can use them! I mean,
since they've been invented and everything.
But I wonder why they only mention that you don't have to have copies of data all over the place (incidentally, the 80-90% figure is ridiculous -- you only get that if you're a complete moron, I would think), but make no mention of the other uses of links.
I never really use links to save diskspace, but to make use of the other advantages, which we all know. Would this be a side effect of what they've written? Probably not, i guess.
Q
os.system("perl -e 'print \"My first Python Script.\"'")
BTW, that's not symbolic links they re-invented, it's more similar to hardlinks. Though there's actually something interesting in there: the "new technology" includes a program that automatically finds identical files and creates links instead. Not that you couldn't do that with a shell script on Unix, and there probably are such scripts available somewhere, but it's not standard procedure to use such a thing.
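For what it's worth, the "find identical files and link them" script mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names (`file_digest`, `link_duplicates`) are made up here, it uses plain Unix-style hard links rather than whatever Windows 2000 actually does, and it assumes everything under `root` sits on one filesystem. Note that hard-linked names share later in-place edits, which is exactly the hazard other comments here raise.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_duplicates(root):
    """Replace byte-identical regular files under root with hard links.

    Returns the number of files that were replaced by a link.
    """
    by_key = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_key[(os.path.getsize(path), file_digest(path))].append(path)

    replaced = 0
    for paths in by_key.values():
        keeper = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already two names for the same inode
            # A matching digest is only a hint; confirm byte for byte.
            if not filecmp.cmp(keeper, dup, shallow=False):
                continue
            tmp = dup + ".linktmp"  # hypothetical temporary name
            os.link(keeper, tmp)    # new name for the keeper's inode
            os.replace(tmp, dup)    # atomically swap it in for the duplicate
            replaced += 1
    return replaced
```

Running it a second time over the same tree should find nothing left to do, since the duplicates already share an inode.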
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger
It will probably work like Unix hardlinks, so the links can't transcend partitions. And a copy of a file on the same partition is, by no stretch of imagination, a "backup", since a HD failure usually frags the whole disk.
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger
As I established in an earlier discussion here: marketing is the science of lying and cheating...
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger
Any ideas who the article is targeted at? Doesn't seem technical enough for techies... maybe a bit too technical for general consumption. Managers, perhaps?
All that "innovation" and Windows 9x still doesn't work! I am sooo impressed!
If you have 32 bit checksums (4 billion numbers) then you only need about 65,000 files before you expect two to have a matching checksum by chance. (A similar point was made in the comments to a story about DNA matching from about a month ago.)
Imagine the damage if your annual report happens to have the same checksum as something from your porn collection, and they get linked.
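The birthday-paradox estimate above can be checked with a few lines. This is a sketch using the standard approximation 1 - exp(-n(n-1)/2N), and it assumes checksums are uniformly distributed, which real checksums only approximate. Under that assumption 65,536 files give roughly a 39% chance of at least one collision among 2^32 values, and even odds arrive at around 77,000 files, so the "about 65,000" figure is the right order of magnitude.

```python
import math


def collision_probability(n_files, n_values):
    """Approximate chance that at least two of n_files share a checksum,
    assuming checksums are uniform over n_values possibilities."""
    return 1.0 - math.exp(-n_files * (n_files - 1) / (2.0 * n_values))


if __name__ == "__main__":
    # Print the estimate for a few file counts against a 32-bit checksum.
    for n in (10_000, 65_536, 77_200, 200_000):
        print(n, round(collision_probability(n, 2**32), 3))
```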
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
My guess is that the feature described would be more analogous to making hard links (ln) than symbolic links (ln -s). (If you don't know what that means, you should go read the ln(1) man page, and possibly link(2) as well.) In your example, if I say "ln foo.conf foo.conf.old", no new storage is taken up, but it looks like there are two copies. Then when I edit foo.conf, the underlying file is de-referenced, but hangs around because foo.conf.old still references it. A new file is created to hold my edited version.
Of course this is all conjecture, as the article doesn't go into enough detail to tell what's really going on. However, I doubt that even Microsoft would do it in quite so brain-dead a manner as to cause the problem suggested.
CVS is teh suck. Use Vesta instead.
I admit, I oversimplified. It does depend on exactly how you manipulate a. However, most text editors (at least the ones I use on a regular basis) will get you a new inode when you save because they don't actually edit the file contents in place. (They're more like deleting the file and then writing the new one.) And I doubt many people actually edit their configuration files with cat >>.
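The two cases can be seen side by side in a small sketch. It assumes a filesystem with Unix-style hard links; the file names mirror the foo.conf example, and the two "editor styles" are simplified stand-ins for what real editors do.

```python
import os
import tempfile


def demo():
    """Show what the old name sees after each style of edit."""
    with tempfile.TemporaryDirectory() as d:
        conf = os.path.join(d, "foo.conf")
        old = os.path.join(d, "foo.conf.old")
        with open(conf, "w") as f:
            f.write("v1\n")
        os.link(conf, old)  # same effect as: ln foo.conf foo.conf.old

        # Editor style 1: write a new file, then rename it over the original.
        tmp = conf + ".tmp"
        with open(tmp, "w") as f:
            f.write("v2\n")
        os.replace(tmp, conf)
        with open(old) as f:
            after_replace = f.read()   # the old name still points at the old inode

        # Editor style 2: append in place, like `cat >>`.
        os.unlink(old)
        os.link(conf, old)
        with open(conf, "a") as f:
            f.write("v3\n")
        with open(old) as f:
            after_append = f.read()    # both names see the change
        return after_replace, after_append
```

With the rename style the backup keeps the old contents; with the in-place style it silently follows the edit.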
CVS is teh suck. Use Vesta instead.
im laughing so hard that im crying in comp. sci class.
thank you for saving my day ;-)
the real shiftaling has user number 5134
Karma: -43 and DROPPING!!!
Well think of a little database of md5sums of all your files and some interface directly under
open() (#ifdef WIN32\n#define open() _open()\n#endif) and
close() (#ifdef WIN32\n#define close() _close()\n#endif)
that examines the file in question on opening and closing. Not exactly a performance improvement, but it could be if you scheduled it.
No. It's a really sick joke, knowing you can just make a filesystem API based on inodes. But then again, this is just another typical case of NIIR (Not Invented In Redmond), isn't it?
Religion is what happens when nature strikes and groupthink goes wrong.
But the article had a scary bit about IPv6: Remember embrace and extend?
No, they bought that, too.
Actually, this algorithm is fundamental to the idea of "hash" anyway.
If you hash, you are bound to wind up with duplicate pointers sometime, so implicitly in the hash algorithm is a final "real data" check.
<grub> Reading
The thing is how many files are there that are redundant? In a typical harddisk, what are the chances that you have duplicate files all over the place? If a system is diligently admin'ed, there is very little chance of that at all!
Is the increase in code complexity and software bloat worth it?
This is just typical of MS. They invent a cool-sounding feature and never investigate if it actually would benefit anyone. If anything, this is a convincing demonstration of the failure of closed proprietary software: vendors forcing customers to adopt features that they will rarely use. I don't think such a feature would live long in the open-source environment. Anyone can plainly see that such a feature is pure cruft and will ignore it/not install it.
I am sure this is not an indication of the quality of research that Microsoft Research is putting out. But I could be wrong.
So what happens to your files when this database gets trashed?
I forgot, Micro$oft databases never get trashed. Seriously, so if the database gets corrupt we still have the 'symlinks' [They will have a different name] but now we have to grind through the 18Gig IDE harddrive to recreate the database?
Will this be less annoying than fastfind running constantly?
i've done my fair share of accidentally deleting files, and have been glad for keeping local backups, which according to this, would have only been links to the original.
i did read the article, but i might have missed something. if i were to delete the original, would all the links die? and how does it pick which file to base it on? or does it move a copy to a central location, and even link your original to a new master-copy
MS wants to re-invent another concept, fine. And some people will actually believe them. Whoop-dee-do, what's new. I'm reminded of that all-too-common question asked not so many years ago that sounded something like (gak) "Did AOL invent the Internet?" (Oh, you didn't have to listen to that one? Yer lucky.) This will be similar.
My problem lies in the fact that MS is guaranteed to screw this little algorithm up, or make it too easy to be screwed up by users, just like so many other things, and as usual, we're going to be stuck cleaning up their mess on a daily basis. GRRRR...
Here's a crazy idea: perhaps we could all take a lesson from Apple, or MS for that matter, and get UNIX made a high-school math requirement or something. By the time it makes a difference, the would-be Geeks won't even need to know MS exists, and the rest will have Eazel and other super-GUI's to use...
--------------------
Experience is a wonderful thing. It enables you to recognize a mistake when you make it again.
TangoChaz
"It's not enough to be on the right track -- you have to be moving faster than the train." -- Rod Davis, Editor of Seahorse Mag.
TangoChaz
--------------------
Wise men talk because they have something to say, fools because the
If they set up the links automatically, it begs the question of what happens when the file is opened for write access. Do we want the changes to appear in all incarnations of the data, or do we want to do a copy-on-write? Do you want your OS to decide this for you?
What happens ?
When opening the file a new copy has to be created. Then you can work on this copy. After closing the file the checksum is calculated (which requires reading the whole file). Suppose the checksum is the same, because you just read and wrote the same byte. Now you have to compare the two files bytewise: a matching checksum only makes it very improbable that these are really different files, but to be sure you have to compare them.
After that the second copy can be deleted. wow ... what kind of performance do you expect !?
Another problem is, that you might run out of diskspace, just by opening a file and writing one byte ... GREAT. I WANT TO HAVE IT ! NOW !
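The copy-on-open cost described above can be sketched like this. It is a rough illustration only: it uses the Unix hard link count as a stand-in for "this file shares storage", and `open_for_write` is a made-up helper, not anything Windows 2000 is known to expose.

```python
import os
import shutil


def open_for_write(path, mode="r+b"):
    """Open path for writing, first giving it private storage if shared."""
    if os.stat(path).st_nlink > 1:
        tmp = path + ".cowtmp"       # hypothetical temporary name
        shutil.copyfile(path, tmp)   # full copy: needs free space for the whole file
        os.replace(tmp, path)        # path now names its own inode
    return open(path, mode)
```

Even a one-byte write pays for a full copy here, which is where the disk space surprise comes from.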
What do they mean, innovation? If you look at it from an academic point of view, it's one of the first things they teach you in your 1st year, 1st semester database design modules. Redundancy is bad and has been bad for dozens of years. It is one of the founding principles of modern database design. Far be it from me to start slagging off M$, but the guys who built Access aren't renowned as world leaders in their field. They just waste valuable breathing space.
Amen brother!
Kudos to Microsoft for trying to improve their flagship OS with a feature that's definitely going to be useful.
BUT, they wouldn't NEED any of this, if they didn't have such a crappy model in the first place! WHY WHY WHY do we still have drive letters?!
It would just make life so much easier if I didn't have to sweat every time I move around some drives and reboot - praying that all my applications won't break because the registry has decided they're not where they're supposed to be.
They are called MicroSoft. Didn't you see Pirates of Silicon Valley?
:P
Q: Where do you want to go today?
Lars -
"WHY WHY WHY do we still have drive letters?!"
Amen, brutha. I YEARN for the day when Microsoft does away with this klunky kludge of kruft.
Q: Where do you want to go today?
Lars -
Not only would this save screen space, they reasoned; it would also result in substantial performance improvements. Moreover, it would make it faster for information technology (IT) managers to install computers for new employees since they'd no longer be required to type in such a long IP address every time they set up a new workstation.
The three sent a memo to Microsoft Chairman Bill Gates outlining the idea. Both Gates and the Windows Two Thousand product team liked the proposal, and the Windows team asked Bolosky if he could develop the feature himself. During the next year and a half, Bolosky, a researcher in Microsoft Research's Systems and Networking Group, and three of his researchers worked full time with a crack squad of numerologists to invent the following...
zero - 0
one - 1
two - 2
three - 3
four - 4
five - 5
six - 6
seven - 7
eight - 8
nine - 9
Have you read the comment text perhaps? If you had you might see why I came to this conclusion. Likewise the books I've read about Microsoft have led me to believe that their success is owed to their ruthless business strategy, and not their savvy programming skill. Their flagship product, DOS, upon which their entire fortune has since been built, was not written by them. Since then, their software has either been purchased, or implemented in a half-assed manner in order to get a leg up on the competition long enough to steal their features. I doubt I could match Bill Gates at crushing one's opponents in the business arena, but as a programmer he was at best mediocre (although it's impossible to grade him accurately due to his policy of not allowing people to see his work).
There are a few pro-Microsoft posts on this thread, but I replied only to this one because of the glaring inaccuracies contained therein (only one of which is mentioned in my original post). I call people trolls when I find it ridiculous to believe that, given they are mentally competent human beings, they are taking what they are saying seriously. Whether the original poster was not a troll or not a mentally competent human being, I will leave as an exercise for the reader.
I am from Europe, and we are always hearing about how the Americans like to reward success, hard work, achievement, and winning etc etc. Judging by the comments on Slashdot, this is not true. Don't be blinkered by jealousy.
Do all Europeans make gross generalizations about people in foreign nations :-)? I myself cannot speak for ALL Americans, or even any American besides me. Personally I don't think that money is that important -- I value personal satisfaction and morality. I'm sure that in your biased eyes I am longing to be just like the Bowmans (whoever they are), with a Porsche Carrera and a job at a soulless Silicon Valley manufacturer of bullshit, but that's not how I measure success.
Please have a pleasant day and a rewarding tomorrow,
-W.W.
"Well it should be obvious to even the most dim-witted individual who holds an advanced degree in hyperbolic topology...
You gave yourself away right here, Mr. Troll sir.
I give your trolling a "C-".
Sorry old chap, try not to be so obvious next time.
-W.W.
"Well it should be obvious to even the most dim-witted individual who holds an advanced degree in hyperbolic topology...
One of the most practical applications of SIS is to support Remote Installation Services. RIS stores copies of Windows 2000 Installation media files that are uploaded from the original CDs. This makes it take much less space for storing several OS versions and the relevant service packs (the need for such technology is complicated by Microsoft's new maintenance and release strategy which is quite ugly). The fact that RIS REQUIRES SIS may make it difficult to replicate the RIS store in a large enterprise, because I'm not sure that it is easily duplicated or backed up.
Oh my god. This is scary. It also makes me want to give my own BSOD.
---------- I laugh at a dumb SysAdmin.
I'm telling ya, this whole friggin company is in denial. I don't think they believe their own propaganda, just that if they repeat it enough, everyone else WILL. Can someone point me to something worthwhile (there has to be SOMETHING) that these guys DID invent? (Excel, maybe? I don't think they bought/stole that off someone else...)
mas cerveza, por favor politically incorrect stu
wow! that's a good one. I won't refer to the tech value of the 'innovation', but I will comment a bit on the spirit of this article. the yanks and other pals from the west cannot know, but I bet that ppl who grew up in a communist country in the 80s (like Romanians, Hungarians, East Germans, Russians, Bulgarians) recognize the classic propaganda pattern. It really makes me sick. If you translate it into Romanian and put it next to an old communist party document, you won't tell the difference!!! trust me, I WAS THERE!
"It's another example of Windows taking too much control, when letting the user decide would be more appropriate"
You have to remember though that it's a function of Remote Install Server. This isn't shipping as a built-in part of Win2k Pro - It's an option. And I can see the usefulness for it in certain situations.
signature smigmature
- James
Boy I wished my Linux box could have those. ;) D. Wulf
- Hard drive size calculated to be several hundred GB, thanks to infinite recurse.
- Explorer slowed down a lot when doing said infinite calculation.
- Folder could not be deleted without nuking the entire hard drive.
The real sad part is that I'm not making it up. Win95/FAT32 actually did this!!
Maybe it's a new application of an old process, but LOOK WHO'S TRYING TO DO IT! I certainly don't want Microsoft helping me administer that way. I can just imagine the headaches it would cause and I pray there's a way to turn that sucker off. They have enough trouble trying to build a web browser that's secure, much less a central repository that'll keep track of all my traffic. How will they deal with sharing conflicts? Probably through memory leaks. And anyone who has written any software can't help but laugh at the Research Division (in caps) and the overstated 'bigger is better' way the M does things (we have liaisons (with communications degrees prolly) for our programmers so that way no one really knows what they're talking about!).
"Thank you. Please spellcheck your genitalia references though.
"My philosophy is that technology transfer is fundamentally a social process,"
Sure, this sounds great! I am a productive member of society, can I join in and help code Windows 2000? Please send URL where I can download the complete Windows 2000 source code, thanks.
"`Ford, you're turning into a penguin. Stop it.'" -THHGTTG
Scenario:
1) User creates a document, priceless.doc; decides to take a copy for safe keeping.
2) copy priceless.doc priceless.bak
3) W2K notices it's a bit-for-bit copy; replaces priceless.bak with a link to priceless.doc (or vice versa).
...later...
4) Clumsy user damages/destroys priceless.doc
5) "No problem, I've got a backup."
6) User looks at priceless.bak
7) "AAARRRRGGGGHHHHH!!!!"
People have raised two problems that are caused by the single store mechanism: the need to keep backup copies in case one gets corrupted, and the surprising growth of disk usage when writing small changes to large shared files.
I thought of a variation in the single store mechanism that would solve both problems. While it is good to reduce many redundant copies of files, it is not essential that the number be reduced to only one for each file. Instead of a single store per file, how about a small number, on the order of two or three? This could be done even if there were only one use of a file, just to provide automatic redundancy. Putting the copies on different disks would make it even better - disk is fairly cheap after all. RAID makes a lot of sense for the same reason.
When a change is made to a file that has duplicates (maybe not the right word; "shared" is not right either; "mirrors", "clones", "replicas"?), then one of the duplicates can be used rather than making a new file. This reduces the number of duplicates by one, but if you are short of disk space, you can live with it for a while.
Storing differences would make sense too, though with even more complexity in the file system trying to make it transparent.
One other thing I haven't seen mention of: The article suggests that the process first looks for redundant copies, and then makes a signature of the file which is stored in a database. But the signature would not be useful (except for verifying integrity) unless it were compared to that of other files. So the process is most likely that it looks for duplicates by first computing the signature of a file and looking that up in the database to see if any other file has the same.
Looking for similar though not identical files could be done in a similar way, by computing signatures of blocks of the file. When there is an insertion in the middle of a file, we can keep data in blocks aligned by inserting new blocks rather than shifting all the content of following blocks.
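A minimal sketch of the block-signature idea, assuming fixed-size blocks and SHA-256 as the signature (both choices are assumptions of this example, not anything stated in the article). The function names are made up for illustration.

```python
import hashlib


def block_signatures(data, block_size=4096):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def shared_block_count(a, b, block_size=4096):
    """Count blocks of b whose signature also occurs among a's blocks."""
    seen = set(block_signatures(a, block_size))
    return sum(1 for sig in block_signatures(b, block_size) if sig in seen)
```

With fixed-size blocks, inserting a whole new block in the middle leaves every other block shareable, while inserting a single byte shifts all the following content and defeats the matching. That is why keeping data block-aligned on insertion, as suggested above, matters.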
I don't know much about compressed file systems, but I would guess they are approaching the same kind of solution from another direction. But I wonder if any compressed file system tries to build in some small amount of redundancy for reliability while it is saving you from grossly wasteful redundancy.
Daniel LaLiberte https://www.facebook.com/daniel.laliberte
No need to get all defensive. Linux is available for all people too, and Microsoft is passing this off as a great revolution. Actually, it's just a slightly new spin on a 15 year old idea.
What you really mean is 'Those who do not study Turing machines are doomed to reinvent {OS/360|Unix|Linux|W2K}, poorly'
I think everyone kind of missed the point here. Yes, this is very much like symlinks or Mac aliases, and yes, Microsoft probably lifted the tech from one of these, but the thing that everyone missed is that Microsoft has the most widely-supported OS out there, and if it were to actually get better, wouldn't that be nice for the rest of us? I don't care one bit what operating system I'm using, as long as it's usable. Maybe MS is getting there?
Interesting that this "invention" press release was issued right after the heavy campaigning of Al Gore, "Inventor of the Internet" in WA. Methinks Al and the Bill-ionaires had a "power lunch" together or something...
I think that a rather large drink would be in order. But it is a great invention, why not implement this in Linux, and call it "links" (although there a slight difference, win2k searches for duplications all by itself (what a great idea to improve performance)).
-------- And here's my sig: -rw-r--r-- 1 arco users 83 Mar 3 08:49
This goes right along with all the wonderful things Microsoft has done.
Remember the time they invented the mouse...yea for microsoft mice!
Or, how about when they reinvented the modem, the Winmodem...yea for microsoft!
Or that time when they invented the browser...yea for Microsoft!
big talk...that's their story..
This was meant to be a joke... I know it is incorrect. (yes... you have to tell people that these days!)
Did you notice the way micro$lot gets the attention of the *NIX community? Or are they trying to make their users believe in Microsoft as a company of innovators? I mean, did Linux developers do the same when they created Masquerading or other useful utilities for Linux? Course not! Linux developers aren't a marketing staff. They are the real innovators!
Did you remember when Apple sued microsoft for copying their GUI?
anybody still thinkin' about micro$lot as innovators?
Could someone with more knowledge clear up how you
are supposed to delete a file with this system? I
think with multiple links to the same file, you
could easily delete the links. What happens when
the last link is deleted? Does it remove the
original from the database? Does it stay there,
in what amounts to another recycle bin?
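For comparison, here is how plain Unix hard links behave on delete, as a small sketch (it assumes a filesystem with hard link support and says nothing certain about how Windows 2000 handles it). Every name is just a reference to the same data; removing a name drops the link count, and the data only goes away when the count reaches zero. There is no hidden recycle bin.

```python
import os
import tempfile


def link_counts():
    """Return the link count seen after each step of create/link/delete."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "w") as f:
            f.write("data")
        counts.append(os.stat(a).st_nlink)   # one name
        os.link(a, b)
        counts.append(os.stat(a).st_nlink)   # two names, one inode
        os.unlink(a)
        counts.append(os.stat(b).st_nlink)   # data survives under the other name
        with open(b) as f:
            assert f.read() == "data"
    return counts
```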
These people looked deep into my soul and assigned me a number based on the order in which I joined.
"[Microsoft Windows is an example of] true innovation..." -- Slashdot.org
Where do you get that Redhat's min install is 500 meg? I'm sure that I don't need the Swahili translations or any of that... Heck, I'm running Slack 7.0, and I KNOW that it isn't even eating up 500 meg right now, unless you count all of the images and stuff that I've been generating for webpages and stuff...
Eh...
The following is a posting I made today on the ReiserFS mailing list, in response to this Microsoft innovation. Maybe this will help some people pull information from the ashes, because there is something to be learned everywhere. I am not a platform bigot; I'm merely against Stuff That Sucks. And we have to always be on the lookout to find the diamond in the rough, even if that diamond is just a gem of an idea that you'd end up cutting yourself.
:-}
Heck, sometimes good ideas are arrived at by the process of eliminating the bad ones.
===
*beep* *beep* *beep*
That's the sound of my FUD-ometer, in light scan mode. It's not going
into red alert because this Microsoft article was merely rife with
unsubstantiated, abstract, sensationalistic, and wild claims, rather
than actual blatantly destructive FUD.
Many times when Microsoft's actions are observed by
non-Microsoft-oriented individuals, the reaction is rather extreme.
Microsoft as a whole basically lives in its own impenetrable,
self-contained, narcissistic universe, which they believe is
inherently objectively correct and good for all humankind. Sometimes
that fact is not directly evident in each case, but we can STILL look
at what they do and SOMEHOW contrive something pretty cool.
It sounds like this technique may be working AT LEAST in their own
environments, and it may be worth looking into even if JUST to provide
totally abstract inspiration or motivation to create our own product.
I'll just step into the Microsoft world here for a minute. I'll
attempt to make sense of some info so as to use it for our advantage,
not to vent sour emotions, or whatever. The truth about Microsoft is
out there, and it's hideously ugly, but once you clean it up, it can be
useful, as long as you then reimplement in a completely different way.
:)
I'm thinking that they're subversively referring to a few unwritten
facts:
* that the network is often an objective bottleneck in a file service
context
* aggressive data reduction techniques improve the performance of
things like caches and general throughput, but we usually don't do that
nearly as much as we can, such as extra conceptual layers like
compression, checksumming, and hard linking
* that their software is almost totally inherently non-multiuser and
grotesquely bloated, so as to likewise dramatically bloat the net
effect of any technological advance
So if you had a way to reduce the damage of the fact that, as another
esteemed reiserfs contributor pointed out, each small Microsoft
application installation will take about 100 MB, then you'd be doing
way better. -Especially- in a Microsoft-oriented system. Not good.
But better than worse.
Let their overall stated BENEFIT of their technological property be of
little consequence to us. It's under the governance of an alien
distortion field as far as we're concerned, which may or may not pan
out to reality. Unless you're re-engineering one of their own
products, as is the case of Samba and WINE, it's often a waste of time
to try to rationalize their claims.
Virtually all of their software is a monolithic monstrosity in every
sense of the concept of computing. That's quite a kind way to state it
once you've examined it from any one point of view, and to get any more
kind would require the blurring of many ethical and technical details.
So ANYthing you do to reduce that will be noticeable, and people will
consider anything to improve their Windows systems, even if it's
architecturally questionable.
I think you guys have the right direction, when you're talking about
normal concepts such as copy-on-write and other stuff like that built
right into reiserfs. But don't lose any sleep over deciphering their
claims, their implementations, motivations, or goals, unless you're
wanting to be an expert in group psychology.
And maybe I just made a complete fool of myself, but all I care about
here is that people on reiserfs@devlinux.com are having a tough time
deciphering the value of a product and information about that product,
which were engineered in an environment where time, space, and ethics
may or may not apply
The whole technology was created because of the flawed design of applications on Windows:
First, instead of installing an application once, in a place users may not write to, every user gets their own installation of a software package.
So, instead of fixing that problem, they fix the resulting disk space problem, and to top it off, they create "self healing" operating systems and applications so that if the user damages it (read: some buggy software screws it up), it can reinstall itself.
Linux has something for that. It's called: users cannot write to applications, and keep a backup of all the .rpm's/.deb's that you install. End of problem.
In case you didn't notice, this symbolic link technology (which has been around in various forms for a while) only seems to run on the file server. This means that system files would be unaffected by any increase in efficiency. True, this would cut down on redundant files on the server, but I would bet that nearly all of the critical DLLs for any application would be stored locally, so you wouldn't have problems on startup. (But who reboots NT machines anyway? They're SO stable!) About the only benefit I can see from this is that you don't have as many redundant mp3s sitting around on the network. Maybe it would apply to porn videos as well. In other words, media files, mostly read only. Databases wouldn't be covered, nor would any important application or OS files. Most documents would be single-user, or modified, anyway. This is really innovative. Uh-huh. Sure.
WARNING: there is a trojan on your
Words fail me ..... this is such a revolutionary idea, how come nobody ever thought of it before?
Why, when I want to sound sarcastic, do I fail miserably?
I did read it, the lunch stayed down - but only just :)
What you really mean is Those who do not study OS/360 are doomed to reinvent UNIX, poorly.
A well-crafted lie appears unquestionable - Dama Mahaleo
I got this back from Microsoft:
-----
*email address deleted*
Below you'll find a few points that will hopefully clear any confusion
resulting from the article "Microsoft Research Innovations Enhance Windows
2000" posted on PressPass on Monday, February 28, 2000
(http://www.microsoft.com/presspass/features/2000/02-28w2k.asp).
Microsoft's Single Instance Store (SIS), a new feature in Windows 2000
developed by Microsoft Research that helps improve file system performance
by reducing the amount of redundant data stored on a server, is similar to
the "symbolic link" feature implemented in UNIX and other operating systems.
Microsoft researchers were aware of symbolic links and other previous file
system innovations; with SIS, they developed a new file system feature that
differs from symbolic links in several fundamental ways:
1. If a user has two files sharing disk storage via SIS and someone modifies
one of the files, users of the other file do not see the changes. For
example, if files called "foo.txt" and "bar.txt" have the same content and
are sharing disk storage because of SIS, and a user makes a change to one of
these files, the change is not reflected in the other file. The two files
are "linked" only as long as they are identical. With symbolic links,
changes made through one of the links are visible to users of all the other
links or the underlying file.
2. The underlying shared disk storage that backs up SIS links is maintained
by the system and is only deleted if all the SIS links pointing to it are
deleted. In contrast, symbolic links can "break" if a user deletes the
underlying file.
3. SIS works automatically without any user involvement, in contrast to
symbolic links, which must be set up and maintained by the user.
-----
While symbolic links aren't necessarily set by the -user- (the sysadmin is more likely to be doing this sort of thing in a situation where read-only files are accessed by multiple users and/or apps), it seems they've come up with a worthwhile innovation for their filesystem.
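The two symlink behaviours the email contrasts with SIS (writes through a link reach the underlying file, and a link dangles once its target is deleted) can be checked with a small sketch. The file names are the ones from the email; everything else here (the temporary directory, the function name `symlink_demo`) is made up for illustration, and it assumes a platform where unprivileged symlink creation is allowed.

```python
import os
import tempfile


def symlink_demo():
    """Return (content seen via target after writing via link, link is dangling)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "foo.txt")
        link = os.path.join(d, "bar.txt")
        with open(target, "w") as f:
            f.write("original")
        os.symlink(target, link)  # bar.txt -> foo.txt

        # Point 1: a write through the link changes the underlying file.
        with open(link, "w") as f:
            f.write("changed")
        with open(target) as f:
            seen_via_target = f.read()

        # Point 2: deleting the underlying file leaves a dangling link.
        os.remove(target)
        dangling = os.path.islink(link) and not os.path.exists(link)
        return seen_via_target, dangling
```

Under SIS as the email describes it, neither of these would happen: the write would split the two files apart, and the shared storage would survive until the last link is gone.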
While reading the NTFS documentation a while back, I noticed that it has had fields available for symbolic and hard links since NT 3.51. If I remember correctly, there is specific mention of this in the NT 4.0 Resource Kit. The documentation states that symbolic links can be used if created, but that no tools exist that allow their creation.
The program isn't debugged until the last user is dead.
Nothing exists except atoms and empty space; everything else is opinion.
blah blah blah....
Something that occurred to me: OK, so you can make a backup of a file (e.g. foo.conf -> foo.conf.old), and if you edit foo.conf, then SIS splits them back into two separate files. Wonderful.
But in the meantime, before you've edited foo.conf and the two files are effectively hard-linked together, what happens when Windows crashes (as happens now and again) - foo.conf could get corrupted. And since foo.conf.old is still linked to it, my nice backup is now corrupted too. Wonderful.
This innovation (and yes, I do think it's a new idea - the automation and all), _must_ have an easy way to disable it.
My school has just this problem! We are running Win95*cough*Lose95 with Novell software to network us all. However, there are about 70 computers, and maintaining all of the software and adding new things keeps our system administrator up late. I have been begging them to switch to Linux because, as another side effect of running Lose95, about 1/3 of our computers are crashed, always!
---
What exactly are the commercial possibilities of Ovine Aviation?
Maybe these creative innovators are moonlighting at Amazon.
carlos
--
As a matter of fact, I am a lawyer. But I play an actor on TV.
[quote]During the next 1-1/2 years, Bolosky, ..., and three of his researchers worked full time ... to build the technology, now known as the Single Instance Store.[/quote]
.. that they are passing it off as innovative really sucks ..
1 1/2 years? Maybe there's a little bit more to it than I originally thought.
[quote]The Single Instance Store recognizes that there's duplication, coalesces the extra copies and stores the bits once instead of several times[/quote]
They already have soft links, don't they, in the form of shortcuts? So why are they bothering with this?
How long till some poor NT admin keeps a local backup of his 10Gb SQLServer data files and forgets to turn this new feature off. Sorry Mr. user, could you wait a couple of minutes while I just recopy this file you want to change....
Special Relativity: The person in the other queue thinks yours is moving faster.
I've been sitting here for 5 minutes trying to think of a witty/sarcastic thing to type, but words fail me.
I think you have the idea. My reading is that it will be registry based and will only affect the system protected files, which you can't change anyway. I think most /.'ers unfortunately don't understand the concept of system protected files; hence it could well be a good idea to store protected, never-changing files in one place only. Remember, the protected DLLs can't be changed. I also bet there is a heavy touch of Active Directory here; also note that AD becomes the file system IF you go to the full AD setup. System protected files also come with self-repair, so there is some logic in basing the files in one place, plus of course there is the 300 meg in .inf files that could be centrally located too. But God help you if the network collapsed. THEN this would be a very bad idea, and the sysadmin would be looking for a new job.
"Old Rallydrivers never die - they just fail to book in on time"
M$ also recently took credit for "developing" DHCP and mentions that you need their Active Directory to use dynamic DNS effectively. Check the exec summary at http://www.microsoft.com/techNet/showcase/w2kinfdd.asp Let's hope those engineers never discover RFCs or we'll never hear the end of the advances M$ is making.
Thank you for making this point. I get so irritated with MVS (which I haven't ever used before but like immensely). Thank You, Thank You, Thank you.
I thought about this for years (and implemented it three years ago, but not in an automatic way).
You scan a filesystem, compute MD5s for each file and hard link the identical ones. It can save a _vast_ amount of space in some cases. It only works for read-only files but, fortunately, that is where you have the most duplicates (archives of my dev projects, mostly; I was reluctant to tgz them).
I used it to have more compact archives. Sure, an automatic way of doing this coupled with copy-on-write would be really nice in specific cases. And no, it is not similar to symbolic links.
But the real way of doing it would probably be to do this just above the disk level (because at file level there are a *lot* of problems: when the same file is shared by 2 users, where do you store the owner?)
Cheers,
--fred
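The scan-hash-and-hard-link approach described in this post can be sketched roughly as follows. This is not fred's actual script: the function names (`file_md5`, `hardlink_duplicates`, `demo`) are invented here, and the byte-for-byte comparison before linking is an extra safety step that the post does not mention. It also assumes everything under the root lives on one filesystem, since hard links cannot cross filesystems.

```python
import filecmp
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """MD5 of a file's contents, read in chunks to keep memory use flat."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hard links.

    Returns the number of files that were relinked.
    """
    first_seen = {}  # digest -> path of the first file seen with that digest
    relinked = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            keeper = first_seen.setdefault(file_md5(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first occurrence, or already linked together
            if not filecmp.cmp(keeper, path, shallow=False):
                continue  # same hash but different bytes: leave it alone
            tmp = path + ".dedupe-tmp"  # hypothetical temp-name convention
            os.link(keeper, tmp)        # create the new link first...
            os.replace(tmp, path)       # ...then swap it in atomically
            relinked += 1
    return relinked


def demo():
    """Three files, two of them identical: expect exactly one relink."""
    with tempfile.TemporaryDirectory() as d:
        for name, data in [("a", b"same"), ("b", b"same"), ("c", b"other")]:
            with open(os.path.join(d, name), "wb") as f:
                f.write(data)
        relinked = hardlink_duplicates(d)
        a_b = os.path.samefile(os.path.join(d, "a"), os.path.join(d, "b"))
        a_c = os.path.samefile(os.path.join(d, "a"), os.path.join(d, "c"))
        return relinked, a_b, a_c
```

As the post says, this only makes sense for read-only files: hard-linked names share one inode, so an in-place edit through one name shows up under all of them, and they also share a single owner and permission set.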
I am sure that by automatic... they mean a defrag-esque program. BTW, defrag takes me about... oh 9 hours to complete, and all I have is a 10 Gigger in my winbox... 5 gigs are free... like I am going to trust them? Can you IMAGINE the datastructure they would have to create to do this? MEMORY HOG!!!!
Hashes are not good for this... I have 2 passwords at home that crypt to the same thing. Hashes (at least one-way ones... is there another kind?) can have collisions. I would hate to have my word.exe be replaced with a porn.gif
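How likely an accidental collision is depends heavily on the hash width. A rough estimate, assuming an ideal hash that spreads files uniformly over all outputs (an assumption, and one that says nothing about deliberately crafted collisions), is the birthday approximation p ≈ 1 - exp(-n(n-1) / 2^(bits+1)). The function name below is made up for this sketch.

```python
import math


def collision_probability(n_files, hash_bits):
    """Approximate chance that at least two of n_files distinct files
    share a hash value, for an idealised uniform hash_bits-bit hash."""
    exponent = n_files * (n_files - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)
```

With a 16-bit checksum, a thousand files are almost certain to contain a clash; with a 128-bit hash, a million files give a probability far below anything hardware failure rates would let you observe. Comparing the bytes before merging removes the remaining doubt either way.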
Why argue about where it came from or how it works? I think if they didn't have such a bloated OS they wouldn't have such a problem. I cannot see any real advantage to this. Like one of you guys said, a P3 800 on Windows 2000 with "features" will run like a P2 400 on Windows 95. I think it's inevitable; unix-style operating systems are far superior. The fact is that Microsoft's operating system is actually DOS and that Windows is the GUI with tons of bloat (all the drivers and such should be held outside of the GUI), until we see something like 'DOS NT'. With all their so-called innovations the thing is going to explode because it is so bloated.
""I'd much rather see the most reliable and usable operating system than the most whizzy-bang operating system," Cutler says. "To increase reliability we have to make choices. For every 10 bugs we fix, we may introduce three more. But do you want to ship with 10 bugs, or do you want to ship with three?"
"Do you want one more new feature," Lucovsky concurs, "or do you want to fix more bugs?""
Man, I guess I'd better quit using *nix now, huh? What a sham. How Clintonesque, or rather, Gatesesque....
So, M$ is taking credit for something that UN*X has had for decades. I suspect that this may be only the first story about their cloning old UN*X technology into W2K (they put support for mount points in there too, I hear).
Ignoring the damage this might do to a few idiots that believe everything M$ tells them (and I do know a few), this does illustrate an important point: Win2K is 10 to 20 years behind the development curve of other OSes. M$ is just now struggling to incorporate features that have been in UN*X since just about forever, because the rebirth of the UN*X design philosophy in Linux now represents a serious threat to the continued existence of the empire. The fact is that while Microsoft tries to catch up with long existing design features, Linux and the BSDs have been making very significant advances in things like process management, SMP utilization, scalability (1 OS instead of 3), ultra-high performance kernel-level web servers (khttpd), and the list goes on.
By the way, geeky technical things aren't all that M$ has to work on. The Windows GUI still sucks large portions of **edited for the kiddies** when compared to KDE, GNOME, the classic MacOS GUI, or Aqua. Heck, I'd rather use FVWM!
Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
Andy Grove: "Not Much."
I thought they reached their technological pinnacle years ago, when they invented a user interface based on graphical icons and a pointing device. Good to see they're still on the cutting edge.
http://www.farmerbob.org
The point of DLLs is that they are shared and not static libraries. If application [a] used qwerty.dll and is running then you fire up application [b], application [b] will use [a]'s version of the DLL.
This is NOT flamebait... just some comments. So, don't get all bent out of shape and don't kill my karma for wanting to have this clarified.
First of all, the "open source" world did not start 4 years ago. It started over 20 years ago. Programs like vi and emacs have roots that go back much further than Linux has been present in the popular press and trade rags. Making the assumption the model of giving away source and protocols is dangerous. Think TCP/IP, HTTP, SMTP, UNIX, and a thousand other tools, languages, applications, and OSes that were available on the pre-popular internet.
Second, as far as I know, Bill Gates and Steve Ballmer have never written bug-free code in any language, C++, assembler, or otherwise. And what makes a marketer qualified to cast judgement about the complexity of a language? C++ might have some baggage and is certainly not my choice for a development project, but Algol was arcane and needlessly complex. C++ was revolutionary in bringing some (albeit very small) order to the process of software engineering by allowing developers to code in the context of objects in a system rather than thinking in terms of procedures.
To get to the meat of your article... I don't think anyone believes that Joe Sixpack sitting at home with a cable modem will download Linux and hack himself out some kernel patches. I believe that you will find distributions that do not install the source during an install by default if you do some looking. Linux does not have to be closed source to be successful with the average user who trashes their autoexec.bat file (though don't you think there is a problem with having an autoexec.bat file in the first place?). Linux only has to hide the complexity from the average user. Microsoft does not have its stock price and market cap as a result of its closed source model. It has simply been the most exploitive user of the marketplace.
Next time, match "considerable marketing expertise" with equally considerable time doing background research.
This is like those "revolutionary optical mice" that they released late last year. They use a laser instead of a ball on the underside of the mouse. Only cool thing about it is that when you turn one over to look at the laser (!), it reduces its power output.
Funny, I remember having one of those as a kid (somewhere around the time of King's Quest and Zork so say 1985) when we had a 4 inch tall 20MB hard drive. The only difference is that now they don't use the grid for a mouse pad. And they cost $60 in the stores!
It's too bad the general public (and popular technology press) are so easily duped into believing that this is all cutting edge and new.
Yes, you're right. They are LEDs.
Don't get me wrong. I'm not directing religious fervor at MS. They have on occasion made some reasonable products -- the original flight simulator was cool. I was just bringing up the lack of real innovation. Optical mice have been around years...they just added a new interface (which is nice, esp the USB). It was lack of innovation that I was trying to reinforce.
The people at MS Research are doing some cool work (smart sym links isn't it).
Just didn't want to come across as bashing MS. Though judging by posts today, it seems that is pretty hard to believe. I'd much rather have better products and organizations stand on their own legs than have zealots out screaming injustice.
I don't think that Microsofties are stupid enough to do this; they will probably do what other people said and create a new copy of the file if you change it. Otherwise there would be a problem not only with backup files, as you explain, but also with security.
Let's just imagine that I copy /bin/login or its equivalent to $HOME/mylogin or its equivalent. If they don't copy it when I modify it, then I could modify it into my own version of login with, say, a few extra instructions to mail the guy's password to an anonymous Internet account if connected to the net, or to write it to a crypted file in a public writable folder (well, yes, directory, but keep remembering that we are in the MS embrace-and-extend environment) where I can fetch it and decrypt it to learn the passwords. Oh, I almost forgot, I can also add a few instructions that allow me to get roo^H^H^Hadministrator access.
I don't think MS guys are stupid enough to do such a dangerous thing, so they probably did implement something to avoid that. However, if they did do it in this straightforward and wrong way we will have some funny /. stories in the future ;)
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
Sounds like a good way for one program to fuck up the entire network, not just one machine. Now that's the kind of innovation M$ is best at...
If this is all automated by the OS, what happens when you want to back up files? Perhaps you have a reason for using multiple copies of something, and now this new 'feature' could send stuff to hell. On the other hand...
hm, how innovative. and weren't symlinks there long before the symlinks that microsoft's now inventing (sic) and publicising in the press pass section? an innovation for Windows, perhaps, but this is nothing new. and really, it's not new. microsoft has long been known to try to mislead the public. remember the linux myths article? when they said their OS has lower TCO when compared to UNIX, they actually compared with solaris, which is of course not right since linux != solaris (gee, i guess they were too dense to figure that out.) they made other stupid comparisons that just don't work out in reality. this one is newer. remember the dot truth is out there propaganda campaign? they want to make sun microsystems machines seem bad when compared with microsoft ones, when in fact we all know this is not true. numerous ISPs rely on Sun solaris for their servers. just let me try to recall _ONE_ isp using M$ windows NT/2000 ... mislead the public. twist the truth. deny the accusations. that's microsoft.
No come on people, we can make money from this.
I'm going to email BillG right now "Hey Bill, I got this great idea. How about if we make W2K such that we can load a different UI on it. I'm thinking of calling it 'W'".
or "Bill, I got this great idea. How about being able to load a printer driver without rebooting the OS".
Hell we could make a mint
AjR: An NT admin at work, and a Linux admin elsewhere. I know where I'd rather be right now...
...Upgrade now to Schrodingers Dog...
Lotus came up with this idea at least 4 years ago with the single copy object store in Lotus Notes. Not everyone likes it, as if you lose or corrupt your object store you are in deep doo-doo. This is nothing like symbolic links in *nix as this is a background operation. M$ turned symbolic links into 'shortcuts' and implemented them very poorly.
FF was the first thing to go after installing Office. Huge performance hit....scandisk restarting over and over....random HDD writes that brought everything else to a halt... Thanks but no thanks, I'll take care of my data myself.
What if you modify the file all the others are symlinked to? If there's no way to tell (in the name of 'user friendliness', M$ probably set it up so the avg. user cannot tell which is the link and which is the actual file), it would probably change the data there, while the links point to the same place on the FAT. Thus, utter chaos. Although this could have some use (ie everyone has the same copy of the file, namely the most recent) it seems to me that if I wanted this, I could just use *nix and set up symlinks. Thank god for M$ 'innovation' which ends up screwing up an excellent idea. (BTW, isn't this what 'shortcuts' in the 9x / NT4 interface were supposed to be for anyway?)
That's it. I'm no longer part of Team Sanity.
it took 4 microsoft engineers, 1.5 years to develop this revolutionary idea ... maybe they'll get the patent too :) I'm currently working on a break-through idea of my own -- why not make directories appear as folders. think of all the time and energy it'll save in explaining things to non-technicals. Should only take a few years (by myself). I smell a patent, baby. /Daemo
"Perhaps I need a cup of coffee before I try to post stories." you're in denial buddy... perhaps you need a bowl of hot grits down your pants!
*Not having any experience with WIN2K* Well, seems to me that SIS should see the registry files and the registry back up files as duplicates of the same file. So, if you damage the registry while installing software, or just poking around, you would corrupt the back up as well. End result - Reinstall. Sounds like a typical MS feature to me
Well this is scary. It's kind of an enforced symbolic link. You create a file, and if it's the same as another file, a link to the other file is saved instead of your actual file. What happens if you modify your file? Does your file get saved separately at that point, or does it change the file you're linked to? If the former, it's hardly a symbolic link (in the Unix sense), if the latter, it's just a Bad Idea (tm). That said, if implemented well, I think this could be a very effective way of saving disk space.
kiku wa ittoki no haji kikanu wa matsudai no haji
Well... if it's so easy to do in perl, someone really ought to do it ASAP before the patent lawyers show up... relevant reference: http://www.assurdo.com Look for the info on PerlFS. /Brian
It is an automated feature of the server that automatically merges identical files into links. This is done transparently to the user. The innovation isn't from the fact it's symlinks, but from the fact it's an automated system. I'd imagine a part of it is a "copy-on-change" as well. Automatically copying the file to a new location when a change is made.
So yes... Throw them a bone, it is an innovation. Now.... Let's get it running under Linux. :)
Poorly researched and obscenely biased. Tell me, editors, did you bother to actually read the article? What they did actually _is_ innovative. I'll be the first to admit, usually when MS brags about an innovation, it is a thinly disguised patent land grab, but not this time. It sounds like a very good idea that was slightly misunderstood by the reporter.
First, that this was probably necessitated by bloat and redundancy in the OS itself, rather than in expected redundancy of user files
Second, also in the article:
Translation: "We came up with some of our own ideas". Kind of an odd thing to be so self-congratulatory about.
When comparing files, the SIS will treat the strings "Microsoft" and "hot grits" as identical.
When saving duplicate data in the common store, 10 backup copies will be made, reducing the 80 to 90% savings to, say, -40%.
When saving data, any occurrence of the string "Micro" will be converted to "hot". Likewise, "soft" to "grits". (Yeah, like you didn't see that one coming from a mile away.)
Before saving any data to the common store, it will first be read aloud, using Microsoft's innovative text-to-speech engine. Of course, the system may lock up until the data transfer is complete...
Three words: "Reboot to continue."
SIS will automagically monitor your hard drive performance, identify the sectors most likely to experience imminent hardware failure, and put the common data store there.
Rather than risk data corruption on a local hard drive, the common data store will be kept on the net someplace, on an old retrofitted Mac 512 that's only up three hours a day. The system may lock up while waiting to write to remote disk.
Ideas, anyone?
normal(adj)- people who don't sit on slashdot all day wondering why everyone else isn't building robots [DECS]
Totally... It's insane. When the masses realize that they're paying for Bill's bunker in this cycle of bloat, hardware upgrades, bloat, hardware upgrades.... If they're capable of realizing this, maybe they'll realize it's crap. But, that's putting a lot of faith in the average person's ability to grasp concepts in general. Ugh...
"Would it kill you to put down the toilet seat?" -- Maya Angelou
"The result is a feature that frees up as much as 80 to 90 percent of the space on a server."
At first I didn't believe that 90 percent of server space was taken up storing duplicate copies of files. But then I looked at my own machine, and sure enough, I have 121,378 copies of the exact same naughty GIF image. By making 121,377 of them into Shortcuts, I freed up an enormous amount of drive space! Thanks for the idea, Microsoft!
When lusers send the 4.7 meg "Chimp picks nose while skydiving naked.mpg" to everyone in the company.
Scarce, scared, scarred, sacred... -Col. Bruce Hampton
Think of it: When do you get the problem of multiple equal files?
My answer: Only when you use Windows and its poor design forces you to.
In all computing there is a potential problem with multiple copies of the same data. One solution was to invent executables that use shared libraries so every executable doesn't contain their own copy of strcpy(). Another was to invent symbolic and hard links. Yet another was to invent network mounts. Etc.
These inventions are enough. How many of you really have a *problem* with multiple equal files on your Linux boxes? And how many have the problem on their windows boxes?
There is no need for this on UNIX, therefore nobody cared to invent it before. On Windows poor design does create a need. But this invention is a cure for the symptoms, not for the matter.
Is it just me, or is placing the file system into the registry, or something similar, a REAL bad idea?
But knowing MS, it's going to be a cold day in hell before they do something logical like actually placing the sym link somewhere on the disk. I'm assuming (you know what assuming does, right? it makes asses out of u and m$ :P~) that they will place it in a registry, or something equally senseless like that.
I really have to laugh at M$ and gates.
The idea of only having one of each file on a harddrive is hardly new - the Amiga (and loads of other computers) have had this system for eons.
A specific folder for different types of files (datatypes, fonts, drivers, etc) has been around since Amiga introduced them. One file, which every program accesses - at the same time if needs be - and which, if the installer knows its version is newer is replaced.
M$ is playing serious catchup to the rest of the industry. It may have nice new technology and up-to-date specifics but the way these items are used and implemented is straight out of the ice-age!
Do yourselves a favour M$ - take a look at the company Gateway Bought and then Sold on after taking the patents it wanted for a digital convergence machine, the Amiga!
Maw, haw, haw. Gutted now? No. Give it 6 months - you will be!
TTFN all, (and please excuse my raving)
Bifford the Youngest
*AMEN*
The question is, why do so many duplicate files exist? They exist, because MS did not have symbolic links before. So the "innovation" is just automatic compensation for previously missing symbolic links. They did it in the wrong order!
copy-on-write link "file system"
Then use Google.
Read some nice things about about file system research like File System Assimilation.
And finally find a post done in 96 in the Linux Kernel list. Its here and it discusses this subject on links and copy on write.
Enjoy, Xmal
NT (or more correctly NTFS) has had hardlink support for ages. This was needed for its "POSIX" subsystem (which is now decrepit). This feature is used heavily by the Cygwin project (http://sourceware.cygnus.com/cygwin/) to emulate unix hard links.
;)
;)
There is no equivalent of a symlink in NT though. Shortcuts are nothing like symlinks as people have pointed out
You can use sym and hard links within Cygwin, but it emulates the symlinks by creating a file with a pointer to the correct location. Everything works as planned if you use Cygwin hosted tools (bash, cp, mv, etc.), but it gets horribly confused when you move symlinked things around with Explorer.
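For contrast with the emulated symlinks, here is a small sketch of what a real hard link gives you: two directory entries for one file, a link count of 2, and data that survives deletion of the original name. The file names and the function name `hardlink_demo` are invented for illustration, and it assumes a filesystem with hard link support.

```python
import os
import tempfile


def hardlink_demo():
    """Return (names refer to same file, link count, data read after unlink)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, alias)  # second name for the same file

        same = os.path.samefile(original, alias)
        links = os.stat(original).st_nlink

        # Removing one name does not remove the data while another name exists.
        os.remove(original)
        with open(alias) as f:
            survived = f.read()
        return same, links, survived
```

There is no "real" file and "link" here, which is why a hard link cannot dangle the way a symlink (or a pointer file moved around by Explorer) can.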
--
Of course, there are other explanations for this as well; the most probable one is that someone's managed a DNS attack on www.microsoft.com, as the error message I get is "Non-existant host/domain" and nslookup fails to return an IP for www.microsoft.com. But still -- this proves that nobody's invulnerable; even the biggest of giants with an image to protect can still fall victim to well-known problems...
There's a lesson here. Don't be arrogant or overconfident with regard to security. There are more attackers out there than you know about, and they have far more time on their hands than you do. Just patch holes as quickly as you can, and don't try to cover up problems; deal with them, be honest about them, and move on.
I'm rambling. I'll shut up now.
-----
The real meaning of the GNU GPL:
"The Source will be with you... Always."
The biggest effect I can see this "innovation" having is to prevent the ntfsdos.sys driver and linux ntfs support from being able to work, at least for a while. Someone will presumably have to reverse engineer the hash function? Perhaps not for simple read access.
As far as space saving goes, what if your drive doesn't have a lot of duplication? Won't this database of signatures take up a significant amount of space? How big are these signatures going to be? If they are a fixed size, what happens with very small files, like some ini files that are less than 1K in size? I can see circumstances where this feature will actually increase disk usage.
Politas
And just how do you go about giving a PHP script root access? That's the only reason I started moving to Perl. You can't do anything truly administrative solely in PHP.
People make duplicates of files for good reasons (it's a bad idea to assume they are clueless). They may want to have a fair copy of a document before it is mangled by the committee process. They may wish to check out a document to create a version fork.
I imagine we're talking about the filesystem equivalent of copy-on-write memory management here.
When you copy the document, it makes a link. To the user it just looks like a copy. When you modify the "copy", the filesystem sees that it's no longer the same data, and it becomes a real copy.
You could begin to do this on levels smaller than the file level, I guess, so that files which were identical only in places would have the advantage of compression: I dunno whether MS do though.
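The copy-on-write behaviour guessed at here can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how SIS is implemented: the class name, the use of SHA-256 as the content key, and the per-blob reference count are all assumptions made for the example, and it treats equal digests as equal contents.

```python
import hashlib


class SingleInstanceStore:
    """Toy store in which files with identical contents share one blob."""

    def __init__(self):
        self._blobs = {}  # digest -> [data, reference count]
        self._names = {}  # file name -> digest

    def write(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        if self._names.get(name) == digest:
            return  # contents unchanged, nothing to do
        blob = self._blobs.setdefault(digest, [data, 0])
        blob[1] += 1
        self._release(name)  # drop this name's old blob, if it had one
        self._names[name] = digest

    def read(self, name):
        return self._blobs[self._names[name]][0]

    def delete(self, name):
        self._release(name)

    def _release(self, name):
        digest = self._names.pop(name, None)
        if digest is None:
            return
        blob = self._blobs[digest]
        blob[1] -= 1
        if blob[1] == 0:  # last reference gone: free the storage
            del self._blobs[digest]

    def blob_count(self):
        return len(self._blobs)


def demo():
    """A 'copy' shares storage until one side is modified."""
    store = SingleInstanceStore()
    store.write("foo.conf", b"setting=1")
    store.write("foo.conf.old", b"setting=1")
    shared = store.blob_count()
    store.write("foo.conf", b"setting=2")  # modification splits the pair
    split = store.blob_count()
    return shared, split, store.read("foo.conf.old")
```

In `demo()`, the two names share one blob until `foo.conf` is rewritten, after which there are two blobs and `foo.conf.old` still reads back the original bytes. Doing the same below file granularity, as the post suggests, would mean keying on blocks rather than whole files.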
--
What this is meant to be used in is a file system full of standard Windows system images. It's called a remote install server, so you would have a dozen copies of Windows, each of them for a different hardware configuration. So in this setup, it is possible to realize big gains with single instance storage, particularly because this would largely be a read-only type setup.
IMHO, this is Microsoft coming up with a solution to a problem that they created in the first place and calling it a breakthrough technology. Nothing could be further from the truth.
Why do you need a remote install server? One of the primary reasons is that companies have discovered it is far cheaper to support Windows machines by re-imaging them than by solving problems with the operating system or applications.
Why do you need many different system images? Because the OS is ultra-customized to the hardware, which means it's damn difficult and not worth the effort to ever migrate a system image to new or different hardware.
Other operating systems don't suffer from either of these failings. When something breaks, it is usually a pretty simple matter to check a log file, find the _useful_ error message and fix the problem and maybe restart a daemon. All this may be done without a reboot.
When you want to upgrade to faster hardware, you can move the disk, or even image the disk and lay the image down on the new disk, fire it up and through the magic of loadable modules be up and running in short order. I've done this with my home firewall and been back up and running in under an hour.
While this remote install server will certainly make life easier for Corporations using Windows, it would be better if Microsoft worked on the real problems and not the symptoms.
NT has had symlinks for a long time. No hardlinks though. This seems more like `automatic' hardlinks to me...
;)
Anyway, they added mountpoints to Win2K too. Finally the end of drive letters...
And they're skipping WINS in favour of DNS (with their own extensions though).
They're getting there. In a few decades NT might actually become a nice UNIX. Now if I could just see the kernel code for the syscalls I think behave funny, like in any decent kernel
This article looks almost as disgusting as this one which a co-worker emailed me with the subject "Don't read this after eating your lunch...you might lose it".
I can't decide if Microsoft is just that ignorant to computer history, or if they are that uncaring about the facts. Considering marketdrones run that place, my money is on the latter...
I was doing this in 1994 as part of a Usenet binary newsgroup unposting program. To save web server space, the program kept a table of checksums of every file (frequently caused by cross-posting) and replaced identical files with hard links to a single file. Should've patented the damn thing, just for the satisfaction of the privilege of sending a cease and desist order to Microsoft.
;-)
That would be the only use of a software patent that my conscience would allow...
Oh well. Mac people were doing network backups based on a similar concept almost a decade ago, which predates my work. Can't remember the name of the package, though.
Hmmm...I guess after 10 years, Microsoft gets to copy other people's good ideas in their own products, so the Single Instance Store is right on schedule. Gosh, I feel old now.
-- I avoid spam by accepting only OpenPGP encrypted or signed email at this address. Clear-signed, RFC2015, heck, even
just breezed thru but it sounds a bit like HSM to me - basically when online disk storage reaches a threshold, the oldest files are migrated to near line storage like rw optical disks or whatever and automatically replaced with a link - when near line storage reaches a threshold the older files are migrated off to tape libraries or something - the end user still sees all their files, just the older ones take a little longer to retrieve.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
now it sounds like 'common files' or shared libraries. Whatever. Remember what Tom Edison said about 'genius'? It's .1% innovation and 99.9% perspiration (actually 1% inspiration and 99% perspiration) so they do a lot of sweating in the marketing and self promotion dept, I'm sure. Msft is their own little, head-up-ass, mutual-admiration-society, self-congratulatory backslapping world, fer sure.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
My ideas were "ability to make folder names different colors"
I thought this was one of the innovations touted by Mac OS 8, for reals.
by Mike Buddha -- Someday the mountain might get him, but the law never will.
Not to mention that great invention of theirs: IPv6. It's almost ready!
-- Don't Tase me, bro!
Many backup systems eliminate duplicate files, for example, and some of them actually have file system interfaces. You can get scripts that will scour your file system and cross-link duplicate files under UNIX on Freshmeat (or write your own in Perl in a few minutes). The idea of copy-on-write for file systems is not new either.
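For anyone who wants to try the cross-linking idea, here is a minimal sketch in Python rather than Perl. It is not one of the Freshmeat scripts; the function names are made up for illustration. It assumes everything under the root lives on one filesystem (hard links cannot span filesystems), and it ignores ownership and permission differences between the duplicates. Note that plain hard links give no copy-on-write: editing one name in place changes every name.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def crosslink_duplicates(root: str) -> int:
    """Replace byte-identical regular files under root with hard links.

    Returns the number of files that were relinked.
    """
    seen = {}  # digest -> first path found with that content
    relinked = 0
    for dirpath, _dirs, names in os.walk(root):
        for name in sorted(names):
            path = Path(dirpath) / name
            if path.is_symlink() or not path.is_file():
                continue
            digest = file_digest(path)
            original = seen.setdefault(digest, path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked to it
            tmp = path.with_name(path.name + ".linktmp")
            os.link(original, tmp)  # create the new link first...
            os.replace(tmp, path)   # ...then swap it in over the duplicate
            relinked += 1
    return relinked
```

Running it a second time over the same tree should relink nothing, since the duplicates already share an inode.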
I think many people have thought about putting this into the file system, and probably many of them concluded that it wasn't such a good idea on UNIX systems. It complicates the file system implementation unnecessarily for an uncertain gain.
Brace yourself for the patent, however. Microsoft is sure to have patented this, and there is a good chance that the patent will stand, no matter how much other related prior art there is. The argument will be simple: nobody else has actually implemented this in a widely used commercial file system, and we are wildly commercially successful with it. That's generally enough.
_______
2B1ASK1
- Like another poster noted, why the hell do I want another M$ process running in the background on my machine.
- What if other non-MS software needs the file to exist someplace, even if it is a duplicate, and Win2K symlinks it out-ta there?
- What about data replication? I might actually want to store the file in two places -- even if it is bit for bit the same.
- Do I really trust their OS to check dependencies in anything other than MFC code (where I don't trust them at all by the way -- too many bad experiences)?
So even if this is Microsoft's idea of innovation, IMHO it's a bad one. Maybe instead they should have focused on things like "shared, thread safe libraries" and open standards, similar to 'Nix.
...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
Gee, maybe that's what it takes to be a Microsoft millionaire. Take an old idea to the newly announced "chief architect", who can then bless it and announce it to the world as innovation. Repackage it into a buggy OS, and sell it to the world...
By the way Rob, it's not just you. I almost choked on my drink this morning when I saw this story as well.
...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
Back when I dealt with herds of windows 3.1 boxes I used to keep duplicates of critical .dll files in a separate directory. If I installed special software on a workstation, I would make a quick backup of the .dlls on that workstation. It seemed like every so often there was a bit of harddrive corruption and it struck .dll files and my little backup caches of .dlls came in handy more than once.
This innovation might be nice for servers that have tape backups and raids, but for the average workstation this could be very bad. Sometimes you want the same file in 2 places for real, and not just for pretend.
--- If you don't want to know the answer, don't ask the question.
Moderate "funkman"'s comment up... I see the light! :)
It's very simple: Microsoft finally decided that having all DLL's in the same place was a Bad Thing. So they found an alternate solution, and that's what this innovation is: a way to recover the *increase* in space that will be used up by multiple copies of the same DLL. That's how they came up with the 80-90% figure.
This actually makes a lot of sense and ought to be made integral to our favorite *nix. Hard links almost do it, but there is no copy-on-write functionality. How hard would it be to add?
Ever hear of sparse files? Many Unix filesystems support them, as does Novell NetWare, and, I think, WinNT.
:-)
(1) Open a new file.
(2) Seek to location 4000000000 (four billion).
(3) Write a single byte of non-zero data.
(4) Close the file.
If your OS+FS supports sparse files, it will only allocate storage for the block holding the one non-zero byte. If not, you now have a 4 gigabyte file full of zeros. Either way the file length will always be reported as 4 GB.
Now go through and write a single non-zero byte for every disk storage block in the file. The file length will not change, but its disk usage will increase by roughly four billion bytes.
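The four steps above can be sketched in a few lines of Python. This is only an illustration; `make_sparse_file` and `allocated_bytes` are names invented here, and `st_blocks` is only reported on Unix-like systems.

```python
import os


def make_sparse_file(path: str, offset: int) -> os.stat_result:
    """Seek to `offset`, write one non-zero byte, and return the file's stat."""
    with open(path, "wb") as f:   # (1) open a new file
        f.seek(offset)            # (2) seek far past the start
        f.write(b"\x01")          # (3) write a single non-zero byte
    return os.stat(path)          # (4) the with-block has closed the file


def allocated_bytes(st: os.stat_result):
    """Disk space actually allocated, or None where st_blocks is unavailable."""
    blocks = getattr(st, "st_blocks", None)  # counted in 512-byte units
    return None if blocks is None else blocks * 512
```

Calling `make_sparse_file("hole.bin", 4_000_000_000)` and comparing `st_size` with `allocated_bytes(st)` shows whether the filesystem stored the hole or four gigabytes of zeros. Be careful where you try it: without sparse support it really will write the whole thing.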
In the real world, resources are very often oversubscribed. Get used to it.
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
The point that everyone seems to be missing is that they don't say that it is used throughout win2k but that it is used on the windows remote installation server. In this situation it makes more sense, for example several apps that have the same libraries would only result in one physical copy. In that situation the backup problems you described wouldn't be an issue.
If I'm wrong and it does exist in win2kpro perhaps someone could tell me how to turn it on!
-- "Outside of a dog, a book is a man's best friend. Inside of a dog, it's too dark to read."
Presumably it'd use MD5 or something, to determine the signature, each time a file is written, then use some decent database algorithm with a binary search, to look for identical signatures, then it'd mark the second file as a copy and use a link. So it'd only fingerprint a file once and when saving it'd just be a ram-based searching algorithm.
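A rough sketch of that guess, in Python. Nothing here is known about how Microsoft actually does it; `SignatureIndex` is a made-up name, and the "database" is just a sorted in-RAM list searched with binary search. Since MD5 collisions are possible, a real implementation would presumably compare the bytes as well before linking anything.

```python
import hashlib
from bisect import bisect_left, insort


class SignatureIndex:
    """Sorted list of (md5 hex digest, path) pairs kept in memory."""

    def __init__(self):
        self._entries = []

    def record_write(self, path: str, data: bytes):
        """Fingerprint `data` once, at write time.

        Returns the path of an earlier file with the same signature (a link
        candidate), or None after adding this file to the index.
        """
        sig = hashlib.md5(data).hexdigest()
        # ("sig", "") sorts before every ("sig", path), so this finds the
        # first entry carrying that signature, if there is one.
        i = bisect_left(self._entries, (sig, ""))
        if i < len(self._entries) and self._entries[i][0] == sig:
            return self._entries[i][1]
        insort(self._entries, (sig, path))
        return None
```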
DFS/AFS has had this concept of a backup filesystem for quite a while. It's not even a COW, (afaik) it actually tracks the binary deltas from a sync point, which if you extended it to individual files instead of a whole filesystem like M$ seems to have done really could save some more significant space. They always sold it in the DFS world as the online backup being "almost free" space wise.
Very cool, and comes in handy when you rm -rf the wrong file and realize it instantly. However, I would think automating this feature would create more problems than it solves.
Amen to that my brother.
I have to wonder about the claims of saving up to 80 - 90 percent of the space on the server.... uh...that can't be possible can it? They don't talk about the size of the database that contains the *file signatures* OR they don't talk about what happens when your box crashes and that database becomes corrupt... sheesh...
The real killer for me is how near the bottom of the article they *hint* that they are the ones who developed IPv6... AND...GOOD NEWS FOLKS... you can download it for free from our website.. yeah...like the *nix community hasn't had IPv6 support for some time now.
The MS marketing machine rolls on...
Just for fun check out Bill Gates and Paul Allen dumping MS stock like there is no tomorrow
Sure, they are automatic links - but only for files. There's still no way to do directory linking that looks like a part of the file system (current "shortcuts" to directories are not really usable by applications).
Since directory version management is mostly what I've used symlinks for, it seems that once again they have taken the most useless aspect of a nice idea like symlinks and expanded it into yet another annoying automatic feature that will, at some point, destroy some of my files.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
"Automatic symbolic links" were implemented in a Bell-Northern Research proprietary OS called SLIC(which ran in the SL-1 PBX) around 1985.
If the Linux community had an image of fairness and open mindedness they wouldn't be such a joke.
Ease up there Tonto. Looks to me like the shallowness of the article was pointed out fairly quickly and accurately (many eyes, shallow bugs). What you need to realize, before you flame an entire community, is that the most vocal and vehement among us, are also usually the youngest and dumbest. Much like society in general. You can't give the same credence to every post here, all posts ARE NOT created equally.
And I have a number of reasons to hold a circle jerk around M$, mostly for the years of aggravation with using their products, and then their "competitiveness" that made the only possible competitor one that wasn't a company and couldn't be bought, marketed, or FUDed into non-existence.
So back-off bizatch, or login so I can see a history of your posts and see if I'm replying to an idiot, a troll, or Big Billy G himself.
Oh, and my .sig is my small personal effort to combat the multi-million dollar Buzzword campaign that M$ is spewing forth. Is it just me or does the face of the main actor in all their commercials just scream Fear, Uncertainty, and Doubt?
hmm, time to change the .sig...
--
+&x
"
Let's say, at the end of the day, I copy a folder which contains files I have been working on to a backup folder on the same hard drive. The program deletes all of my copies, replacing them with symbolic links. The next day after a few hours' work, I realize I need to revert to one of my backup files, but it's been changed to a symbolic link to the file I'm using now. Presto! "
A simple solution to this problem would be to have an extension .backup (or a folder defined for backups or...) that exempts files from the autolinking.
LetterRip
I believe we are in violent agreement, however, about Unix hardlinks. That is why I described hardlinks as two filenames pointing to the same inode (or substitute "starting FAT sector").
Actually, in Unix, all normal files are hardlinked. The vast majority of them simply show one directory entry link per inode. We usually reserve the term "hard link" for an inode with multiple directory entries, like /usr/bin/vi and /usr/bin/ex.
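You can watch the link count change from Python on any Unix box. A small sketch; the names "vi" and "ex" are just stand-ins created in a scratch directory, not the real binaries.

```python
import os


def link_demo(directory: str):
    """Give one file two names.

    Returns (link count before, whether both names share an inode,
    link count after).
    """
    first = os.path.join(directory, "vi")
    second = os.path.join(directory, "ex")
    with open(first, "w") as f:
        f.write("one inode, two names\n")
    before = os.stat(first).st_nlink   # a normal file: one directory entry
    os.link(first, second)             # add a second directory entry
    a, b = os.stat(first), os.stat(second)
    return before, a.st_ino == b.st_ino, a.st_nlink
```

On a typical Unix filesystem this goes from one link to two, with both names reporting the same inode number.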
--The basis of all love is respect
YES! This is annoying as hell, especially when you are doing video capture on your IEEE 1394 card and cannot afford delays introduced into your disk writes. #%$#&^$&^$ that damn fastfind! I will be sooooo happy when I have all my video editing stuff moved over to Linux.
Isaac
The Bolachek Journals
Seems likely that they will be operating on an entirely different level here. Abstract the FS up again and some of this sorts itself out. It's also likely that this is mainly intended for use on servers where backups are done in a more rigorous fashion.
When you copy the document, it makes a link. To the user it just looks like a copy. When you modify the "copy", the filesystem sees that it's no longer the same data, and it becomes a real copy.
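A toy in-memory model of that behaviour, to make the semantics concrete. This is purely illustrative: `CowStore` is an invented name and says nothing about how Win2K implements it.

```python
class CowStore:
    """Names share one blob until one of them is written to."""

    def __init__(self):
        self._blobs = {}    # blob id -> bytes
        self._names = {}    # file name -> blob id
        self._next_id = 0

    def _point(self, name: str, blob_id: int) -> None:
        """Repoint a name, dropping its old blob if nothing else uses it."""
        old = self._names.get(name)
        self._names[name] = blob_id
        if old is not None and old not in self._names.values():
            del self._blobs[old]

    def write(self, name: str, data: bytes) -> None:
        """Copy-on-write step: the name gets a private blob of its own."""
        blob_id = self._next_id
        self._next_id += 1
        self._blobs[blob_id] = data
        self._point(name, blob_id)

    def copy(self, src: str, dst: str) -> None:
        """Looks like a copy to the user, but only adds a second reference."""
        self._point(dst, self._names[src])

    def read(self, name: str) -> bytes:
        return self._blobs[self._names[name]]

    def blob_count(self) -> int:
        return len(self._blobs)
```

Copy a file and `blob_count()` stays at 1; write to either name and it goes to 2, with the other name still reading the old contents. That second blob is also why modifying a "copy" can need fresh disk space.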
Thanks for the correction. If it works that way, it actually is pretty cool.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
>So then what happens when your file system is full? Modify a file, even make the file smaller than it once was, ...
;-)
You're right, it's a tricky case, but I don't think MS's trick makes it all that much worse than it has already been for ages. In general, you don't know when a write to a file might cause a new block to be allocated - creating the potential for an ENOSPC - where there was only a hole before. In order to know that, your program would have to know when it's crossing a block boundary, and building that kind of knowledge into a program is generally a bad idea. Of course, any modern OS/FS allows you to explicitly preallocate space, and any even non-idiotic implementation on MS's part would prevent files containing such explicit preallocations from being "coalesced". Of course, the folks at MS might well be idiots.
Another problem that has also existed for ages is that many programs write a file X by actually writing out tempfile Y, then renaming Y to X. There are actually some good reasons for doing this, but in the process the potential for an ENOSPC is increased. C'est la vie.
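For reference, the write-Y-then-rename-to-X pattern looks roughly like this in Python. It is a sketch with an invented function name; the point is that the old X survives if the write of Y fails, for example with ENOSPC, at the price of needing room for both files at once.

```python
import contextlib
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)  # tempfile Y, same filesystem as X
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the data to disk before the rename
        os.replace(tmp, path)      # rename Y to X in one step
    except BaseException:
        # Any failure (ENOSPC included) leaves the original X untouched;
        # just clean up the partial temp file and re-raise.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise
```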
The trick MS is using doesn't necessarily create any new opportunity for this type of error, and it's something any decent program needs to deal with anyway.
Slashdot - News for Herds. Stuff that Splatters.
>The other piece implements the links,
>...
>there are no copy on write
Here's the piece you're missing, Sparky: "implementing the links" might or might not involve copy-on-write semantics. You don't know, I don't know, the evidence isn't there for us to know.
Now, here's what we do know. If they just follow links without doing COW, you're right: it's easy to do, it's nothing special, etc. It's also pretty darn useless, and even worse than useless, in ways that other posters have pointed out. Maybe MS is stupid enough that it took them this long to do something so trivial, and that they're willing to release such a broken version of the functionality. You and I might _want_ to think they're that stupid, but maybe they're not. These are the same people who wrote a brand-new filesystem that doesn't have "undetected data corruption" problems if you turn on async metadata writes - unlike the most commonly used filesystem on Linux. Unlike you, they do have Clue One, even if they're not gods and the MS marketroids got a little carried away portraying them as such.
We can't tell from one marketing bumsheet which way they actually did it, but the theory that they actually took the time to implement COW - doing the right thing for once - fits the evidence we do have much better than your "MS sux, perl roolz, crontab=transparent" theory does.
Slashdot - News for Herds. Stuff that Splatters.
>Am I the only one who sees some possible big, huge, gaping security holes here?
No, you're not. Your concern is right on target; this is one of many cases where you simply cannot link the files. Metadata, especially security metadata, affects the interpretation of data, so the same data with a different owner or permissions is in a very important sense not really the same data after all. I hope and expect that the folks who implemented this are well aware of such concerns.
>Encrypted files. Sounds like these would break the system
It depends on where the link interpretation happens relative to where the encryption/decryption happens, but you're probably right. There is an obvious solution, though: don't use this feature on EFS.
>Swap space. From what I know, M$ systems store swapped data as "files." Suppose two of these had the same content?
Actually, this would work as long as the virtual memory subsystem was above the link interpretation - which is a tough call given the hairy interdependent way these things work on NT. Nonetheless, there are plenty of reasons to make swap files exempt from this sort of linkage, and it would be easy to do so.
>Speed considerations.
Yep. Major hog there.
Slashdot - News for Herds. Stuff that Splatters.
The name of a file is not stored *in* the file or even in the same physical location as the file on the hard drive. In fact, the file itself (due to fragmentation) may even be spread out over the span of the hard disk. But this is beside the point. The point is that the two entries in the filesystem (read: filenames) could point to the same blocks of data on the physical hard drive, allowing more than one filename to share the same bytes, bit for bit. This is essentially how links work under Unix.
-----------
"You can't shake the Devil's hand and say you're only kidding."
I could write a one-liner with find and perl which would do something *like* this with hard-links (they're really not talking about symlinks here, as deleting the original would result in a lost file).
But, I'm not sure that that's what they mean. If they mean that they have a copy-on-write system that manages duplicate files, that would be awesome, and I would praise MS for actual innovation (NetApp has something like this, but it's more manual... still the single biggest reason to buy a NetApp, though).
All in all, I'd like to see some more technical info.
As many have observed, W2K relies massively on disk caching because the physical write performance sucks abysmally. If they're checking for redundant files on each write, we now know why.
Lacking <sarcasm> tags,
The beauty of this program is of course that it uses _all_ available CPU when it does this. Multithreading & multitasking OS my a...
Hmm...so does this mean my OLE.dll version 1.0 and my OLE.dll version 5.0 will be conveniently munged for me? Or are they actually doing a binary comparison on file operations? And I assume this change will mean that Win2k++ will come with yet another file system, say...LARD64?
It's 10 PM. Do you know if you're un-American?
London --
A budding British inventor today unveils a stunning friction-circumventing invention that will ease moving heavy objects and revolutionise transportation.
The "Wheel" is a simple but clever idea involving sections cut from a cylindrical shape being employed to roll over surfaces. When attached to the end of a stick, which the inventor calls an "axle", wheels allow for speedy movement over a range of surfaces with none of the severe undertray ablation and huge energy output associated with pushing large lumps of stuff along the ground.
Investment from an unnamed company in Redmond, Wa. allowed for continued development of the wheel concept, and its bullish projections suggest that the old "Push the bastard thing along the ground" approach favoured in Redmond may soon be rendered obsolete by wheel-using devices.
This Windows 2000 Magazine article explains the system in detail.
Put simply, SIS was designed for NT administrators for the purpose of cloning OS installations to client desktops. With NT 4.0, you're forced to use a third-party disk imaging tool, or use NT's "unattended setup" mode where the installation program installs NT from a script. In large corporations, admins often roll out hundreds or thousands of identical NT installations, and even with just a few copies this process is a huge pain. (I'm saying this as a human being, not as an NT admin. I prefer Linux.)
With Windows 2000, however, you use the Remote Install Server software, which is set up to host all the images you want to copy to desktops -- for example, you can create an image containing a stripped-down NT with Internet Explorer and DHCP client; and another image containing an installation tailored to laptop use. Additionally (as I understand it), the RIS software generates a boot disk containing network drivers and the stub code to load the image from the RIS server.
However, when Microsoft designed this software, they discovered that images take up a lot of disk space -- and that most of the files are repeated across the images.
The solution, SIS, is a background service that employs a piece of code amusingly called the "groveler" to scan designated parts of your NTFS volumes. Duplicate files are moved to a special, hidden, top-level system folder called the "SIS Common Store", and a symbolic link (which Win2000 calls junctions) is left behind pointing to the actual file.
It's worth pointing out that SIS was not originally designed to work as an all-purpose, automatic symlinker. Since files are physically moved to a special location on each disk, it could potentially wreak havoc with backup software, as well as be extremely confusing to users and admins alike. You can set up SIS for all your files, but none of my documentation indicates that this is particularly wise -- also, there is little to gain from "compressing" regular volumes this way.
Laptops need this! The hundred or so people who work in my building use laptops, so we've got our office with us whether we're at our desks, out at customers, working from home, or on the road/train/airplane. That means I really *do* need my own copies of most of my software on my own machine, and I need to be able to back my stuff up on a file server so that when my laptop's disk gets crashed, I can restore all my stuff, and restore it efficiently rather than reinstall &^%*^% MSOffice and all my other software, and so I've got the version of everything that's on *my* laptop, not some server that may have newer or older stuff.
Would this be easier if MSOffice and other popular software packages had the decency to keep all their static content in one place (e.g. C:\readonly) and their changeable stuff somewhere else (ideally, somewhere else *standard*), so you only need to back up the changeable parts? Sure! But that ain't gonna happen, especially not at Microsoft, and not with a lot of the other software vendors out there today. It's much easier to build an interesting and occasionally useful admin tool than to fix corporate culture.
A lot of the non-software on my computer is training material and presentations in MSPowerBloat format - many of my coworkers have copies of identical material, but we really need them for portability.
Would all of this be easier if we used vi + LaTeX or HTML editors for word processing and GIFs/JPGs or Really Good Postscript for pictures, so the standard software was 5% as large and the presentations were browseable? Yup. But this is Corporate America :-)
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
I don't think it's as simple as that. If the user changes priceless.doc, then it will be stored in a different 'store' than the backup.
The only problem you'd have is if there was corruption in that area, then both files would be lost. Of course, there's also a chance that both files could have been corrupted at once anyway. Backing up on the same harddrive is usually to make sure changes can be undone, not to make sure the file isn't lost, so it's not a "big" problem.
And yes.. I don't much like microsoft but I hate FUD even more.
I think that the poster meant "hash" as in "associative array" where one key maps to one value (at least, in perl these are called "hashes" now.) You're talking about a "hash", like as in a one-way crypt of some data.
Of course, it's entirely possible that I'm out of my gourd on this, and everyone is talking about scrambled data, at which time I have NO clue as to how this would work WRT this symbolic linking scheme.
Corrections? Additions? Clarifications? Comments?
Much Love,
"S"HM
*****
(I refuse to spellcheck out of contempt for your belief system)
1. Like another poster noted, why the hell do I want another M$ process running in the background on my machine.
Well if they've got any sense it'll most likely be an optional process. I'd imagine you don't have to use it.
2. What if other non-MS software needs the file to exist someplace, even if it is a duplicate, and Win2K symlinks it out-ta there?
Well seeing as how this would be done transparently by the OS there shouldn't be any problem here for processes using the APIs. Processes relying on lower-level methods will obviously have to be rewritten to take this into account, but most apps will run perfectly.
3. What about data replication? I might actually want to store the file in two places -- even if it is bit for bit the same.
I'd imagine you can configure it to ignore certain files when it checks for duplicates. So if you have a file which you have to have duplicates then flag it in the options. If they don't allow this then I think that's a flaw.
4. Do I really trust their OS to check dependencies in anything other than MFC code (where I don't trust them at all by the way -- too many bad experiences)?
Why should it matter whether you use MFC, VCL or whatever? It'll all be done at the system level and shouldn't require any changes to existing code.
. All your libraries are on the C:/WINDOWS/SYSTEM directory. Everyone has their own copy of the Registry that points to where things are located. If you want to install a piece of software, in most cases you have to install it locally, or at least all the DLL files are local. Hence, if you have 1,000 people on the network using MS Word, you end up with 1,000 identical copies of MS Word installed! If you want to upgrade software for all those users, you have to do it 1,000 times.
This is absolute rubbish. How did this post get moderated up so high?
"Those who do not understand Windows are condemned to criticize it, poorly"
If same file instances are tracked in the registry, you could probably create many many instances of the same file and overload the registry. This could be a potential denial of service attack.
By the same token, knowing that your file is a link to another file/global file also means you know weaknesses that exist in that configuration.
If another Dufus wanted to make an exact copy of my configuration file, just add a few extra features to it.
if they are sharing my .cshrc file for example, why not add: chown -R myuser ~
This is an example and maybe a bad one at that.. But think of the possibilities.
So what happens when that one copy I have of a file becomes corrupt? It appears that I would be royally screwed and all the 'backups' that I thought I made are gone too. This would be most unfortunate, I wonder if they thought of that?
Things you think are in the Constitution, but are not.
If they have done it right, when you save the file foo.conf, it will initially be saved as a normal file - i.e. don't mess with it too quickly. Remember that this is a server-based file system, so it gets a certain amount of give and take in what it does - as long as the user is able to load and save files to it transparently, it's fulfilling its goals. foo.conf.old will still be symlinked, but it will be symlinked to a server copy that is held in a database under a unique key. If, on the save of foo.conf, the server recognises that this file was previously symlinked, it should run its hashing function over the file to see whether it has changed, and probably also do some sanity checking in the form of looking at the file length. Even if the symlink checking is done later asynchronously, there should not be a problem, as long as the hashing function can't be fooled into giving the same signature to different files of the same length.
Of course, there may be bugs in the system. I'd worry about making sure that there are no opportunities for a user to obtain a truncated file during a file request, just because the server has just switched the file from a real file to a symbolic one. I'd also be wary of trying to implement a scheme to save duplication on parts of files, although I can see that if you run some sort of binary diff on two files and save the diff and the base files you may also save a lot of space.
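The length-plus-hash check I have in mind could look something like the following Python sketch. The function names are mine, and SHA-256 is an assumption; I have no idea what the real system uses.

```python
import hashlib
import os


def fingerprint(path: str):
    """Return (length in bytes, SHA-256 hex digest) for a file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return os.path.getsize(path), h.hexdigest()


def still_matches(path: str, recorded) -> bool:
    """True if the file still has the recorded (length, digest) pair.

    The cheap length comparison runs first, so most changed files are
    rejected without being read at all.
    """
    size, digest = recorded
    if os.path.getsize(path) != size:
        return False
    return fingerprint(path)[1] == digest
```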
Cheers,
Toby Haynes
Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.
After having read this story, and then the press release, and then saying "huh?", I decided to make a contest out of this.
It's at <http://www.spinnwebe.com/spinn/mscontest.shtml>. You have to create your own innovative Microsoft innovation. The prize is a latte mug.
My ideas were "ability to make folder names different colors" and "can make a DVD spin counter-clockwise".
You could run out of disk space by deleting a byte from a file !
Ok so the file is, say, 800mb and is a duplicate of another.
I delete one byte from the end of one instance the os magically
tries to make a new copy and finds no disk space.
Great.
It's as annoying as compressing filesystems which lie about
the free space that they have.
I still don't think this is an 'innovation'. A coworker of mine regularly makes fully standards compliant CD-ROMs with 2 gigs on them (it uses basically the same exact method for getting the space, except using cross-linking and such tricks). The basic difference is that they do it on a live filesystem, something which I really don't think I'd like to see due to the increased risk of data loss with a single sector failure.
----------------------------
I don't really know how existing compressed filesystems are implemented, but since storing pointers to existing copies of data is an old compression technique, I always assumed this was part of it.
Now I suppose it's possible that existing compression only takes advantage of redundancy within a file. In that case extending this to the whole file system might be considered innovative - but I would ask why they didn't do that in the first place.
It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
What would I do, then? I'd take a tree-based system, such as ReiserFS, and turn it into a graph-based one. Instead of automatically setting up file-level symlinks, to entire files, I'd have block-level symlinks, chaining file components.
If a component in one file changes, just create a fresh copy, move the links for that file over, and save the revisions to it.
By doing this, I could actually see people saving 80%-90% on disk space, as there are lots of files on many computers with identical segments.
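To show what I mean by block-level links, here is a toy Python model. It is not ReiserFS and not a real filesystem; the 4096-byte block size and the `BlockStore` name are assumptions made for the sketch. Each file is a chain of block digests, and each distinct block is stored once.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


class BlockStore:
    """Files are chains of block digests; identical blocks are stored once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> list of digests

    def put(self, name: str, data: bytes) -> None:
        """Store a file, reusing any block that is already present."""
        chain = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            self.blocks.setdefault(digest, block)
            chain.append(digest)
        self.files[name] = chain

    def get(self, name: str) -> bytes:
        """Reassemble a file by following its chain of blocks."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        """Bytes actually held, after sharing identical blocks."""
        return sum(len(b) for b in self.blocks.values())
```

Two 8 KB files that share their first block cost 12 KB here instead of 16 KB. Rewriting a file just builds a new chain; this sketch never frees blocks that are no longer referenced.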
However, the only way that I could see Microsoft claiming that kind of saving, for the system they are describing, is if Windows 2000 has large numbers of identical files in it. Oh, it does? That would explain the 50 million lines of code.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
A shortcut is not equivalent to a symlink. A symlink is handled by Unix at the OS level, when path traversal is done. Handling of ".." is done by the shell.
In Windows, shortcuts are handled at the shell level, and they interact badly with path-name traversal.
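A quick way to see the OS-level handling on a Unix box, sketched in Python. The directory names "v1.2" and "current" are invented for the example. The program only ever calls plain open(); the kernel follows the link while walking the path.

```python
import os


def read_through_symlink(base: str) -> str:
    """Create base/v1.2/README, link base/current to v1.2, read via the link."""
    real = os.path.join(base, "v1.2")
    os.mkdir(real)
    with open(os.path.join(real, "README"), "w") as f:
        f.write("hello from v1.2")
    link = os.path.join(base, "current")
    os.symlink(real, link, target_is_directory=True)
    # No special API here: "current" is resolved during path traversal,
    # so any application opening this path sees the real file.
    with open(os.path.join(link, "README")) as f:
        return f.read()
```

A .lnk shortcut in the same position would just be a small data file as far as open() is concerned.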
The point being that W2K will automatically notice if multiple files are bit-for-bit copies of each other, and store the file once with symbolic links in other folders. It's the automatic part that makes this an M$ innovation.
BTW, I am *not* advocating M$ at all, just pointing out a yet-another-misconception in the Slashdot title...
Eric
Read the article. They also invented text-to-speech. I guess those programs I got with my first 8-bit SoundBlaster card were stolen from Microsoft by a future CreativeLabs employee with a time machine. He stole the programs from Windows 2005 and travelled back to 1993 where he relabeled the program and gave it away with overpriced sound cards.
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
I hate to be the guy to burst your ego (nah, I don't really -- but it sounds polite), but you're wrong.
... when the primary changes, the other does too.
Shortcuts are files that contain data about a file they want pointed to.
Hard links are actually pointing to the equivalent of a FAT entry for the file in question.
File starting at sector 301 = "blah"
/home/myfiles/blah.txt is a link to 301
/home/yourfiles/blah.txt can be a link to 301
... they don't (actually) link to "each other" but to the same space on the drive
... read up on linking.
- Michael T. Babcock (Yes, I blog)
Tongue in cheek:
If SlashDot were running on Windows 2000, all seven hundred copies of the following article would be coalesced into a single copy:
by Various (dont.spam.me.I.cant.run.spam.filters.myself@some
(User Info) http://winblows.sucks/
It seems to me that this is another example of the Slashdot Editors getting carried away again; I mean, clearly they didn't read the original article or check their facts.
The original article states that this is an automatic process, and finds identical file copies as candidates for symbolic linking plus copy-on-write.
Now, all we need is a semantic copy detector.
(Single Instance Store saves space on Slashdot! =anagram> Cheapness: so overloading tactless nastiness.)
Altogether, nine separate groups within Microsoft Research contributed more than 15 innovations to Windows 2000, including everything from the computer code that identifies bugs and security attacks to the underlying technology that enables computer applications to encrypt and decrypt confidential information.
...
Typically, Microsoft centers its research on innovations that will be ready for development three to seven years in the future.
So let me see this means that each group took three to seven years coming up with 1.5 innovations that already exist on Unix systems. No wonder people call them Microsloth.
My read of the press release is that the links are created dynamically and automatically. Keep in mind, this may be marketing-garbled mush, but it sounds like they are using a daemon to dynamically assign symlinks wherever duplication is found.
They claim this will save 80%-90% hard drive space. I'm very skeptical of that, even if it is all they are claiming it is.
Is there a patent? Mayhaps someone can write a filesystem which implements this. I'm really doubtful that this is anything that will more than marginally affect effective hard drive capacities, and at some cost in overhead, but it might be worth playing with on a UNIX.
I'm impressed. At this rate I reckon they must come up with one innovation roughly every 50000 man hours of coding.
Makes you wonder how any of these small companies do it. ;-)
Working for the (other) man
I wouldn't call it innovative, but it's clearly not symbolic links. For one thing, it's not explicit. It all happens under the hood. For another, it has this whole database of hashes to enable copy-on-write semantics. Please try not to bait people so much!!!
;-)
That being said, as a sys-admin, I would want to be able to disable this. While I'm sure the copy-on-write feature uses sufficiently unique hashes of the files to identify changes, there are definitely cases where I *want* to have physically separate copies, particularly if files are on different partitions (maybe this is only done at a partition level, who knows?).
I did like their other "innovations": IPv6 (gee, I can get that for Linux can't I?), text-to-speech (yup, also for Linux), a statistics based trouble-shooting tool (of course, the stats would have to be set up before Win2000 was deployed, so you can imagine how accurate they'll be), etc. I mean who do they think they're kidding? Not only does Linux have similar facilities to most of their "innovations", but ALL of these innovations are available elsewhere as add-ons to Windows! Oh, wait, I forgot, innovation is when Microsoft takes others' technology and bundles it with Windows.....
sigs are a waste of space
What actually happens is that when an object is stored in the system - the system checks to see whether it is identical to another object, and if so - just stores a reference to the other object. This is achieved using a "signature" or in non-M$ language - a hash.
It is actually a reasonably good idea, although I can't see how it could have taken so long to implement it. Now it is quite possible that this has been done before, but it certainly isn't just symbolic links!
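To make the "store a reference instead of a second copy" idea concrete, here is a toy in-memory sketch. It is not how Microsoft does it (the article gives no such detail); the class name, the method names and the choice of SHA-256 as the "signature" are all my own assumptions.

```python
import hashlib


class SingleInstanceStore:
    """Toy store: identical objects are kept once and shared by reference."""

    def __init__(self):
        self.blobs = {}   # signature -> stored bytes
        self.names = {}   # object name -> signature

    def put(self, name, data):
        # The "signature" is assumed here to be a SHA-256 hash of the contents.
        sig = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if no identical object is already stored.
        self.blobs.setdefault(sig, data)
        self.names[name] = sig
        return sig

    def get(self, name):
        return self.blobs[self.names[name]]


if __name__ == "__main__":
    store = SingleInstanceStore()
    store.put("alice/report.doc", b"same bytes")
    store.put("bob/report.doc", b"same bytes")
    print(len(store.blobs))
```

A real system would presumably also compare the bytes before trusting a matching hash, and would clean up blobs that no name refers to any more; this sketch does neither.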
--
Associated Press
2000-03-02
Today, in an unprecedented show of candor, top Microsoft spokesmen admitted that Windows 2000 ("W2K") is largely redundant.
"Yes, it's true. We have so much duplicated crap in W2K that we're developing new technologies to deal with the sheer volume of bloat", said David Spiker, in a Redmond, WA press conference this morning.
"I mean, honestly. We've incorporated two thirds of RedHat Linux and thirty percent of FreeBSD. They're the same thing, really, so why store two separate copies?"
Other industry figures weren't too impressed with Microsoft's (Symbol: MSFT) new direction.
Said Richard Stallman, a leading freeware author, "Come on. We've had gzip for years. They can just compress everything." ("Gzip" is a cryptic program that runs on older "Unix" systems, which are similar to Microsoft's innovative "DOS" operating system).
At least some insiders, though, cheered the move. An unnamed employee of Sun Microsystems said that "it's about time they squeeze the crap out of that pig", soundly endorsing Microsoft's creativity and initiative. "I mean, really, their competitors don't have a tenth of the [features] to squeeze, let alone a reason to come up with new [systems] to squeeze it."
Side note to Dave: That's what you get for leaving us. :)
Dewey, what part of this looks like authorities should be involved?
Now we know why Windows needs to be rebooted every time that a significant event occurs: the Single Instance Store collapses all solutions into a single answer of "REBOOT" to fulfil their goal of massive saving of storage space, so when the trouble-shooting tool from their Decision Theory and Adaptive Systems Group uses its advanced statistical model to deduce the most probable solution, naturally it returns the same result every time.
What hope does Tux have against such ingenuity!
;-)
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
Problem is, from the sketchy details in the article, this looks like a hack to fix a problem that traces all the way back to the broken disk storage model inherited from DOS - a problem that doesn't even exist under UNIX.
The UNIX model creates a single storage tree with the ability to mount new devices (disk partitions or NFS exported directories) at any point in the hierarchy. It's a model designed from the ground up around the concept of a network. The main advantage of this model is that it's simple to install a single copy of an application in a standard location on a network server (/usr/local or /opt or whatever) and simply mount this location on all the individual workstations. This way, you have one copy of the software (the "batch of bits") that can be used by any number of hosts.
With DOS, everything was assumed to be local. All your libraries are in the C:\WINDOWS\SYSTEM directory. Everyone has their own copy of the Registry that points to where things are located. If you want to install a piece of software, in most cases you have to install it locally, or at least all the DLL files are local. Hence, if you have 1,000 people on the network using MS Word, you end up with 1,000 identical copies of MS Word installed! If you want to upgrade software for all those users, you have to do it 1,000 times.
With a storage model like that, it's no wonder MS is looking for ways to save space on identical copies of software. If it's static data (like installed software), then backups are less of an issue. I would think that the backup would grab a copy of the copy and write it out as if it was a unique file.
Still, it's nothing more than a patch to a fundamentally broken architecture!
Your Servant, B. Baggins
It actually makes sense to install this on a machine that has a lot of user drive space.
The software will make symbolic links automatically, so when lusers all save that "Wassup" commercial to the user drive the machine doesn't run out of space.
It's not just symbolic links, it's automatic symbolic links.
Shaun Nelson - Bastard Operator (From Hell / For Hire)
I feel a certain ``sameness" in this discussion about how SIS duplicates UNIX symlinks -- a ``sameness" that recurs with every development that MS announces as a ``brand-new innovation."
Let's look at another much-heralded innovation from MS's past -- multitasking -- & compare how its reception mirrors the reception to SIS:
1) MS announces -- with much enthusiasm & pride -- a new development. In 2000 it was SIS; in 1992, it was multitasking under Windows.
2) Based on their enthusiasm & the amount of pride, many knowledgeable computer users expect it to be the same as -- or better than -- an existing useful feature under other OS's. In 2000 SIS is compared to UNIX symlinks; in 1992 people expected preemptive, multi-user multitasking.
3) After some examination, it is discovered that the MS innovation is not as useful as first thought. SIS replaces duplicate files with a pointer, & is not actually a symlink; Windows multitasking is co-operative -- a second program or process can't get its share of the processor until the first one decides it's finished.
4) The real, but marginal, added good of this innovation is soon countered by the flaws or bugs it introduces. SIS can frustrate a user making backup copies, & can reduce CPU performance as it checks for duplicates; a locked program in a co-operative multi-tasking system can lock the OS just as tight as under a single-tasking OS like DOS -- forcing a reboot.
5) Users have no way around these flaws. ``Every time I move a 40MB file to my Y2K box it slows down to a crawl because it's confirming this file is not a duplicate. Byte by byte." -- ``I lost two hours of work because a program GPFed & forced me to reboot the entire system."
6) Expectations of software written by a certain company are once again lowered.
Okay, I admit the last is speculative. But a lot less than it might seem.
Geoff
I think I see a trend here. Maybe for them it really would be easier to muzzle the entire internet than to produce p
I think if you actually read the whole article, this is an innovation. The program checks for files that are duplicates, then replaces the duplicates with a symbolic link to a single file. This happens automagically, so the user never notices, or knows. It's the second part of this that may cause problems.
Let's say, at the end of the day, I copy a folder which contains files I have been working on to a backup folder on the same hard drive. The program deletes all of my copies, replacing them with symbolic links. The next day after a few hours work, I realize I need to revert to one of my backup files, but it's been changed to a symbolic link to the file I'm using now. Presto!
I think this has potential, and I think it could be a good idea, but the gods live in the details.
This just confirms (paraphrased): "Those who don't understand Unix are doomed to reinvent it, poorly." This is just hilarious.
Actually, this is more than symlinks, in that it is done by a background agent on the user's behalf. Thus, they take a useful feature and make it extremely dangerous.
People make duplicates of files for good reasons (it's a bad idea to assume they are clueless).
They may want to have a fair copy of a document before it is mangled by the committee process. They may wish to check out a document to create a version fork. So, I see a great html file, and decide I want to use it as a template for my own document, throwing out the contents. I go home, and overnight the system replaces my copy with a link. Then I edit the file and blam -- I just changed somebody's web page. Yuck.
Of course there may be safeguards, and maybe the users should use a document management system. The problem is that document management systems are sometimes overkill; where they are used the problem being solved doesn't exist. As far as safeguards are concerned, they complicate a simple process to solve a problem that in the end is not very important at all.
The irony is that they are optimizing the wrong thing. Disk space for user file storage is cheap. Even if you have enough users to drop 20-30K$ on a hardware RAID, it is still cheap relative to the time to administer the system and cheaper still with respect to user time.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
My first thought was that they might have done something slightly different from a familiar old symlink by applying "copy on write" semantics. While I couldn't find anything in the article that unambiguously states whether they're doing this, I did find:
>The first piece searches for duplicate files, computes a signature for each file and stores these signatures in a database. It then compares the signatures in the database and merges duplicate files.
This is actually kinda nice. It's going out and looking for copies, instead of relying on you to create the links explicitly. In order for this to be truly transparent you'd have to use COW, of course. There's also a concern about how much CPU time and bus bandwidth will get used finding and tracking these duplicates. Lastly, let's not forget that it would be trivial to create a daemon to do the same thing on any UNIX.
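As a rough idea of what the "searching" half of such a daemon might look like on a UNIX box, here is a small sketch that only reports duplicate candidates and merges nothing. The function names are mine, SHA-256 stands in for whatever "signature" MS actually computes, and a careful tool would still byte-compare the candidates before touching them.

```python
import hashlib
import os
from collections import defaultdict


def file_signature(path, chunk_size=65536):
    """Hash a file's contents in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose size and signature match."""
    by_sig = defaultdict(list)   # plays the role of the signature database
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_signature(path))
            by_sig[key].append(path)
    return [sorted(paths) for paths in by_sig.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_duplicates("."):
        print(group)
```

Note that this hashes every file on every pass, which is exactly the CPU and bus cost worried about above; a real daemon would want to cache signatures and rescan only files that changed.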
In short, it's an interesting idea, but perhaps still a bad one. In any case, I don't think it's "just a reinvention of the symlink" even though MS has tried to take credit for a lot of rather ancient computing ideas.
Slashdot - News for Herds. Stuff that Splatters.
>Symlinks are great, but not ALL duplicate files should become them.
I strongly suspect that the real innovation here is using the ancient virtual-memory trick of "copy on write" to files. In your example, foo.conf and foo.conf.old would be links to the same data at first, but the moment you write foo.conf the link gets broken and a fresh copy of the data is created automagically so your updates to foo.conf don't affect foo.conf.old. Problem solved.
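You can fake the "link gets broken the moment you write" behavior in user space on top of ordinary hard links. This is only a sketch of the idea, not what SIS does internally; `open_for_write` is a hypothetical helper, and it assumes every writer is polite enough to go through it.

```python
import os
import shutil
import tempfile


def open_for_write(path, mode="r+"):
    """Open `path` for modification, first giving it a private copy of its
    data if it currently shares an inode with other names."""
    if os.stat(path).st_nlink > 1:
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        os.close(fd)
        shutil.copy2(path, tmp)   # copy the data and metadata to a new inode
        os.replace(tmp, path)     # `path` now names the private copy
    return open(path, mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        conf = os.path.join(d, "foo.conf")
        backup = os.path.join(d, "foo.conf.old")
        with open(conf, "w") as f:
            f.write("setting=1\n")
        os.link(conf, backup)                  # both names share one inode
        with open_for_write(conf, "a") as f:   # the write breaks the link first
            f.write("setting=2\n")
        with open(backup) as f:
            print(f.read())
```

The catch is the one mentioned above: doing this transparently for every program, including ones that open the file directly, needs support in the filesystem itself, which is presumably where the hard part is.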
For reasons described in my earlier post I'm not sure this is a great idea, and it certainly isn't likely to "free up as much as 80 to 90 percent of the space on a server, allowing users to store as much as five to 10 times the information" as MS claims, but I'd be amazed if even MS would allow the problem you describe to occur. It's too obvious even for them. In fact, this may provide the answer to the "why did it take them so long" question. Plain old symlinks would be easy to add (been there, done that, on an MS platform) but adding the COW behavior could be tricky.
Slashdot - News for Herds. Stuff that Splatters.
If this is combined with copy on write it is a good idea. If it isn't, it is obvious that it creates the possibility of things being changed identically when only a local copy should be changed.
Another obvious issue is that it clearly isn't as flexible as the Unix combination of copies, hard links and symbolic links. I can choose whether to make a copy which will thenceforth be a separate file, create a hard link which will be a separate name for the same data, or create a symbolic link which is a reference to another file which may or may not exist. This is typical of the differences between Windows and Unix. The Windows approach is powerful, but does not leave as much flexibility in the hands of the user or programmer. Unix assumes that if you know what you are doing, you can make the choice for yourself.
The net will not be what we demand, but what we make it. Build it well.
Well, the article sort of suggests that they set up the links automatically, which would be an innovation perhaps.
This article isn't terribly technical, so I can't be sure, but as I read it, this has one major flaw: If it automatically detects duplicate files and symbolically links them, it will ruin your ability to back up a file by creating a copy of it somewhere.
Symlinks are great, but not ALL duplicate files should become them. If I have file foo.conf, and I back it up to foo.conf.old, but don't change it right away [Or maybe that doesn't matter, if this "feature" is constantly running], their SIS program will symlink foo.conf and foo.conf.old to be the same file. Then when I change foo.conf, and hose my system, I can't restore it by using foo.conf.old, because that file was changed when I changed foo.conf!
Can anyone give more details about how this works?
So your app uses MSVB500.dll? Then your target machine will get an extra copy of it. Have 5 different applications (from 5 different vendors) which each use MSVB500.dll, then you will have 5 copies of MSVB500.dll. This sounds like the road to true bloat but at least you end up with a more robust application because other peoples' installations of common dlls won't break your app. With this automatic symbolic linking, much of the hard drive space which could have been wasted by redundant dlls can now be reclaimed. This could be a real good thing.
.lnk files are worthless unless you're doing 1 of 2 things:
1) launching an executable
2) launching something that is a registered filetype (i.e., C:\>foo.lnk where foo.lnk points to foo.txt)
This is extremely limited.......
for example, you can't make a shortcut to a library if that library isn't in your path.
just for kicks, make a text file in win32, then make a shortcut to it.
then:
c:\>type foo.lnk
you get a bunch of high ascii garbage with the path of the target mixed in there.
compare
$ echo this way works > foo.txt
$ ln -s foo.txt foo.realsymlink
$ cat foo.realsymlink
OR, drop the -s and then try moving the target file around. Try that with your shortcut.lnk
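If you'd rather script the "move the target around" experiment than type it, here is a small sketch. It assumes a POSIX-ish system where unprivileged users can create symlinks, and the file names and the `move_target_demo` helper are invented for the example.

```python
import os
import tempfile


def move_target_demo():
    """Rename a file and report what happens to a symlink and a hard link to it."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "foo.txt")
        soft = os.path.join(d, "foo.realsymlink")
        hard = os.path.join(d, "foo.hardlink")
        with open(target, "w") as f:
            f.write("this way works\n")
        os.symlink("foo.txt", soft)   # the link stores the *name* foo.txt
        os.link(target, hard)         # the link refers to the *data* itself
        os.rename(target, os.path.join(d, "moved.txt"))
        # os.path.exists follows symlinks, so it is False for a dangling one.
        symlink_still_resolves = os.path.exists(soft)
        with open(hard) as f:
            hard_link_data = f.read()
        return symlink_still_resolves, hard_link_data


if __name__ == "__main__":
    print(move_target_demo())
```

The symlink dangles after the rename because it only remembers a name, while the hard link keeps working because it never depended on the name in the first place.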
The behavior described in the article is neither a Unix symlink, nor a Unix hardlink. It is something I have never heard of in Unix, an automated symlink. Frankly (and I am a Unix weenie), this looks like a true innovation.
In Unix filesystems, each file has an "inode" number unique to the filesystem. The directory entries all point to inodes. Thus, two different directory entries can have the same inode, and thus the same bits are accessible from multiple places. Note, for example, that the vi and ex programs are hardlinks to the same executable--the editor simply reads the name it was called with to determine whether it should behave as vi or as ex.
Hardlinks do not really exist to save space, they exist to link two directory entries at the hip. If one file (inode) has two links (filenames), then grabbing it by one filename and editing it will cause changes which will be visible when you pick it up by the other filename. Note that, because of this, Unix hardlinks are manual. The filesystem doesn't spontaneously create hardlinks; it takes a user process to do this.
Microsoft's scheme is implicitly handled by the filesystem code.
This implies that this is happening automagically, without user interference. At worst, this means that the SIS is creating hardlinks on the fly. I doubt this because it would create Mothra-sized bugs as two files get "married" as links and never "divorced". Think about it: users often copy a file byte for byte (causing SIS to link them together), and then edit one and use the other as an unchanging backup.
My guess is that SIS is linking files on the fly when it recognizes them as equal, and then unlinking them (copy-on-edit) as a file is edited to be different from its linkmates.
This is simply Microsoft eliminating redundancy in its filesystem. Compression algorithms eliminate redundancy all the time--that's how they save bytes.
Some Unix flavors do a similar thing in core. When loading up a program, the bits of the binary can be stored once in memory no matter how many invocations the program currently has. If eight people are running Emacs, memory is storing eight Emacs data segments, but only one copy of the Emacs binary.
This is something one could implement in Linux filesystem code. Each inode would need its own checksum, and there would have to be a one-or-more-to-one relationship between inodes and hardware representations--that is, two different inodes would be able to share the same sectors.
When a file got edited, the FS would determine whether the sectors were shared with one or more other inodes--if so, you have to "divorce" by copying the sectors elsewhere and pointing the inode to the new sectors.
When the edit finished, the FS would recalculate the checksum, then look for all other inodes with the same checksum. For any matches, do a byte-for-byte diff to make sure--if so, then point the inode at the same sectors as the old inode and mark the new inode's sectors for reaping.
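Here is a toy in-memory model of that scheme, just to check that the "divorce" and "remarry" steps hang together. Everything in it is an assumption made for illustration: the `ToyFS` class, whole-file writes instead of sector edits, SHA-256 as the checksum, and one simplification where a write always goes to fresh "sectors" rather than first checking whether the old ones were shared.

```python
import hashlib


class ToyFS:
    """Inodes point at shared extents (stand-ins for sectors on disk)."""

    def __init__(self):
        self.extents = {}     # extent id -> bytes
        self.refs = {}        # extent id -> number of inodes pointing at it
        self.inodes = {}      # inode number -> extent id
        self.checksums = {}   # inode number -> checksum of its data
        self._next_extent = 0

    def _release(self, ext):
        # Drop one reference; reap the extent when nobody points at it.
        self.refs[ext] -= 1
        if self.refs[ext] == 0:
            del self.extents[ext], self.refs[ext]

    def write(self, ino, data):
        old = self.inodes.get(ino)
        # "Divorce": the new data goes to a fresh extent, so any inodes that
        # shared the old extent are left untouched.
        self._next_extent += 1
        ext = self._next_extent
        self.extents[ext] = bytes(data)
        self.refs[ext] = 1
        self.inodes[ino] = ext
        if old is not None:
            self._release(old)
        # After the edit: recompute the checksum and look for a match.
        csum = hashlib.sha256(data).hexdigest()
        self.checksums[ino] = csum
        for other, other_sum in self.checksums.items():
            if other == ino or other_sum != csum:
                continue
            other_ext = self.inodes[other]
            # Byte-for-byte comparison to make sure before merging.
            if self.extents[other_ext] == self.extents[ext]:
                self._release(ext)            # reap the new extent
                self.inodes[ino] = other_ext  # share the existing one
                self.refs[other_ext] += 1
                break

    def read(self, ino):
        return self.extents[self.inodes[ino]]


if __name__ == "__main__":
    fs = ToyFS()
    fs.write(1, b"hello")
    fs.write(2, b"hello")     # merged with inode 1
    print(len(fs.extents))
    fs.write(2, b"goodbye")   # divorced again
    print(fs.read(1), fs.read(2))
```

Even in this toy the cost shows up where you would expect: every write pays for a hash plus a scan for matches, while reads are a plain lookup.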
The tradeoff is between filesystem space and write performance (read performance is probably unchanged). It takes better minds than mine to determine under what circumstances the tradeoff is worth it.
--The basis of all love is respect
In Win2K an application has a choice of using the system DLLs, which are protected and can't be written over except by a service pack, or its own private version of a DLL. So if your app requires a specific version of msvcrt.dll, you can install it in the application directory and it will use that copy instead of the system copy.
For a complete explanation of this: Check out this article