The point is that some companies add value or "wealth" to the economy ("creating new things that people can use"), while others don't (typically either entertainment or 'maintenance' jobs such as replacing a blown light bulb). Candy falls under "entertainment", even though it is tangible. The difference between tangibles (real manufacturing) and "virtual property" is that tangibles have a much higher ratio of variable costs to fixed costs, i.e. the total cost of producing them is (generally speaking) proportional to the number of units produced. With electronic "virtual" goods, the fixed costs may be similar, but the variable (manufacturing) costs are close to zero - it's almost like being able to 'print money'. (Of course this is also generally true for most software products, e.g. Windows.)
The real question though is whether these "service" guys are adding value to the economy (i.e. adding value to society) by creating artificial scarcity in an electronic arena where there is no natural scarcity. If artificial scarcity is all they provide, then they're not adding anything new to society - they're just artificially inserting themselves as a middleman where there doesn't actually need to be one, and trying to create new laws to justify and back up their presence, when society doesn't actually need them at all. The only effect of that is to make a few people rich for doing nothing, and to make the economy run less efficiently (i.e. it costs more to produce the exact same output) than it could without these guys.
Some might argue that you can never add value to society by artificially creating scarcity, as artificial scarcity must always mean something becomes more expensive than it needs to be, and thus will make the economy less efficient. In theory, the money being used to keep people in jobs by artificial scarcity alone could instead be used to create new jobs that actually add new value/wealth to society. But then, would software companies be able to exist if they weren't able to cover their fixed costs? Would the game companies have an incentive to create the "virtual space" that allows virtual goods to develop a perceived financial value? I think this is comparable to the art world, btw, where a painting can get a very high perceived dollar value in some domain by virtue of various factors such as good marketing, supply and demand, etc. In truth most paintings probably have about as much practical value as a sword in an online game. The current laws are good enough for the art world, and I think the current laws are at least adequate to protect suppliers of 'virtual goods'. We have plenty of very rich suppliers (and overpriced virtual goods) to prove that. If the laws need to change, it should be to protect consumers more, not suppliers.
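To put rough numbers on the fixed vs. variable cost point, here is a small sketch. All figures are made up for illustration (a hypothetical 1,000,000 in fixed costs, 5.00 per unit to manufacture the tangible good, nothing per unit for the virtual one), and `average_cost` is just a helper name I picked.

```python
def average_cost(fixed, variable_per_unit, units):
    """Average cost per unit: the fixed cost spread over all units, plus the per-unit cost."""
    return fixed / units + variable_per_unit

# Hypothetical figures, chosen only to show the shape of the curve.
FIXED = 1_000_000
for units in (1_000, 100_000, 10_000_000):
    tangible = average_cost(FIXED, 5.0, units)   # per-unit cost never drops below 5.00
    virtual = average_cost(FIXED, 0.0, units)    # per-unit cost heads towards zero
    print(f"{units:>10} units: tangible {tangible:10.2f}  virtual {virtual:10.2f}")
```

The tangible good's cost per unit bottoms out at its manufacturing cost, while the virtual good's keeps falling as volume grows, which is the 'print money' effect described above.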
Yes, exactly, that's my entire point. ??? What are you trying to say? A lot of people here on /. seem to think that OS X for Intel is unlikely to be able to run on AMD.
Uh, congratulations, there are about 6000 languages in the world. Only 5976 to go now!
Seriously, the complexity of morphological analysis has NOTHING to do with the ENGINES. There are very good engines available, but an engine is useless without a collection of rules for each language. And morphological rules are not only HIGHLY language-specific, but also extremely complicated - it takes a lot of resources to compile accurate and comprehensive rules for each language. There are literally thousands of rules that must be compiled and tested for each language. That's why only a few of the major languages are covered.
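To illustrate the engine vs. rules distinction, here is a toy sketch (not any real product's engine). `split_suffixes` is a made-up, language-neutral "engine"; `TURKISH_SUFFIXES` is a hypothetical and drastically incomplete rule table. A real rule set also has to deal with vowel harmony, consonant changes, ambiguity and so on, which is where the thousands of rules come from.

```python
def split_suffixes(word, suffixes, min_stem=2):
    """Toy engine: repeatedly strip a known suffix from the end of the word.

    The engine knows nothing about any language; everything it can do
    comes from the suffix table that is passed in.
    """
    parts = []
    stem = word
    changed = True
    while changed:
        changed = False
        # Try longer suffixes first so a short one doesn't pre-empt a long one.
        for suffix in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
                parts.insert(0, suffix)
                stem = stem[:-len(suffix)]
                changed = True
                break
    return [stem] + parts

# Hypothetical, tiny rule table; a usable one would be vastly larger.
TURKISH_SUFFIXES = ["ler", "lar", "iniz", "den", "dan"]

# "evlerinizden" = ev + ler + iniz + den ("from your houses")
print(split_suffixes("evlerinizden", TURKISH_SUFFIXES))
# Same engine, no rules: the word comes back untouched.
print(split_suffixes("evlerinizden", []))
```

The engine is a dozen lines; the value is entirely in the per-language table, and that table is the expensive part.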
Moreover, it's still very limited. It doesn't handle text with errors in it very easily, for one thing (e.g. missing letters, typos, grammar errors; text written by non-native speakers especially is full of such errors). It becomes orders of magnitude more complex if the text has many errors in it. Modern language use is also very "mixed", e.g. Chinese or Arabic text often mixes in bits of other languages, place names, person names, etc.
Then there are Unicode processing issues, e.g. surrogate pairs for characters outside the Basic Multilingual Plane, which is where for example thousands of characters used in Chinese proper names are located.
That engine you point to doesn't do much apart from stemming.
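For anyone who hasn't run into surrogate pairs, a quick sketch of what happens in UTF-16. U+20000 is the first ideograph of CJK Unified Ideographs Extension B; `utf16_code_units` is just a helper name used here.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    raw = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

bmp_char = "\u4e2d"         # a common ideograph inside the BMP
ext_b_char = "\U00020000"   # an Extension B ideograph outside the BMP

print([hex(u) for u in utf16_code_units(bmp_char)])    # one code unit
print([hex(u) for u in utf16_code_units(ext_b_char)])  # two code units: a surrogate pair
```

Any code that assumes one 16-bit unit per character will cut such a character in half when it indexes, truncates or reverses a string.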
Oh, and none of this does any disambiguation for you --- oops. Although some of them do limited tagging - whoopee.
But nice try at debunking my post. If not rather lame, uninformed and totally incorrect.
I'm not justifying the guy's spin or his methods - I don't agree with them either --- I'm just saying your counter-arguments are bogus, and I'm sure you could have come up with better counter-arguments to diss his claim if you wanted to sound less like someone equally zealously eager to dismiss his claims.
There are many reasons why different people's mileage may vary for things like application startup times. I remember I once had a Win2K system on which Internet Explorer always took about a minute to load. Mozilla loaded in only a few seconds. Not unlike this guy's discrepancy. But eventually I discovered the cause: that system had a few hundred fonts installed, and IE enumerates (and presumably analyses) all fonts on the system each time you open it. Deleting most of the fonts solved the problem.
The toolbar is separate to OSA.exe - you're thinking of the "Office Shortcut Bar".
The Office functionality resides mostly in ActiveX object libraries, not in DLLs. The actual windows and other user-interface elements like toolbars and menus that appear when you open Word, are actually small/quick to create ... you seem to be under the impression that the user-interface elements are what constitute the bulk of Office from a processing perspective.
You didn't read very carefully - GP saved the document as a .doc file from OOo, not in OOo's native gzipped XML format.
Actually in some older versions of Word, the size of a .doc file could fluctuate randomly even for the exact same document, just saved several times in a row. I've seen cases where a half-meg .doc file would suddenly save as a one-meg file, and then shortly thereafter as a 400k file, then a bit later as a 600k file, and so on. The problem may have nothing to do with metadata.
OfficeXP doesn't seem to have this problem anymore.
Oh, you're wrong btw, Office doesn't save the "revision history" in .doc files by default --- I call BS/astroturfer/clueless/all-of-the-above.
It's more likely because OpenOffice was freshly installed, but MS Office was "installed more than a year ago".
So what? That will be the case for 90+% of Microsoft's customers, so that sounds perfectly fair to me. Or do you think MS customers should do total clean reinstalls several times a year? In which case I suggest you've been drinking Bill's Kool-Aid.
If he doesn't even do a clean install, he surely doesn't defrag his HD...
Firstly, the Microsoft Office files don't BECOME fragmented, because almost none of them change. Secondly, Microsoft claimed in their own marketing that Windows includes automatic background defragmentation and optimisation of applications for fastest possible load time, which obviates both your "points".
There has never been a utility to keep Office in ram
I call BS.
From Microsoft's own site: "What Are the Advantages of Running the Osa.exe File?" "When you use the Osa.exe file to initialize shared code, the Office XP programs start faster."
Voila - that's why Word loads so fast, and you don't need to take my word for it.
Spelling it "M$" is required for paid astroturfers; it makes readers 'trust them', when in fact their posts promote MS software zealously. TripMaster Monkey is a known astroturfer.
Considering that (according to Steve Jobs) they've been building OS X on Intel for the last five years, it seems like a pretty obvious move to also compile on an AMD box while you're at it ... so I would be surprised if it didn't run on AMD, even though they obviously wouldn't have announced that since they're so buddy-buddy with Intel for the PR.
It was so uninteresting to you that you clicked on "Read More" and replied? Honestly, if it's not interesting to you, why not just skip over the headline/blurb on the front page, like the rest of us do for topics that don't interest us?
So just to add to that --- given that these technologies can only work primarily for English and a few other major European languages --- who are they truly intending to watch with this? The terrorists, or their own populace?
- Many languages are conjunctive/agglutinating in nature (e.g. Turkish, Finnish, Swahili). This means that the words of a sentence aren't isolated (as they are in most European languages) but are in fact formed from 'parts' that change depending on the surrounding words. Moreover, modifying pre-/suffixes are used as inflections for e.g. verb paradigms. This results in languages that effectively have billions or even an infinite number of possible "words". It is impossible to do keyword-based analysis on such languages without a full morphological parser for each language to break a word into its 'parts' - and building such a parser is a massive task.
- Chinese is the opposite: it is a totally "isolating" language, meaning each word is distinct with no inflections, and because different characters are used for different words there are NO SPACES between words. So you cannot begin to analyse Chinese data at all unless you have a full "Chinese segmenter" to locate word boundaries.
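A minimal sketch of what a segmenter has to do, using greedy forward maximum matching, which is about the simplest approach there is. I'm not claiming this is what any real system uses. `LEXICON` is a hypothetical four-entry dictionary; a real one needs a huge lexicon plus handling for names and unknown words.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy segmentation: at each position take the longest lexicon entry that fits."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words

# Hypothetical toy lexicon.
LEXICON = {"北京", "大学", "北京大学", "学生"}

sentence = "北京大学学生"   # "Peking University student"
print(sentence.split())                      # whitespace splitting finds one giant "word"
print(forward_max_match(sentence, LEXICON))  # the segmenter finds the word boundaries
```

Even this toy shows the problem: a keyword filter that splits on whitespace sees a single token, and the quality of the segmentation depends entirely on the lexicon.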
The need for disambiguation complicates all of this analysis even further.
There is pretty much no way for this type of analysis to be really accurate at the current level of written-language analysis technology.
Actually, you'd be surprised to know that you find more racial hatred travelling round the US than travelling around Africa. And most of the obvious exceptions, such as the Rwandan genocide, were in fact primarily fuelled by the colonialists (e.g. before Belgium's interference in Rwanda, the Hutus and Tutsis lived peacefully together for centuries). And the spillover violence into places like the Congo is actually a secondary conflict caused by the Rwandan genocide.
I know you think you sound smart commenting on Africa, but unfortunately you reveal that you know nothing about the place.
Just a thought about free markets.
In a truly free market, yes. This isn't the case though when a monopoly provider illegally violates anti-trust laws to control the market and block access to it.
Even when much of the money was made illegally?
Hmm, yes, I probably was :)
Uhm .. that can't be, an int on DOS is 16 bits (sizeof(int) is 2).
Unless you meant the Watcom compiler for DOS4GW? BIG difference.
DOS = 16-bit real-mode. DOS4GW = 32-bit protected mode.
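A rough illustration of why the difference matters, using Python's ctypes fixed-width types as stand-ins for a 16-bit and a 32-bit C int. This only models the integer width; it is not DOS or Watcom code, and the helper names are mine.

```python
import ctypes

def as_int16(value):
    """Store value in a 16-bit signed integer and read it back (wraps silently)."""
    return ctypes.c_int16(value).value

def as_int32(value):
    """Store value in a 32-bit signed integer and read it back."""
    return ctypes.c_int32(value).value

print(ctypes.sizeof(ctypes.c_int16), ctypes.sizeof(ctypes.c_int32))  # sizes in bytes
print(as_int16(40000))  # does not fit in 16 bits, wraps around to a negative number
print(as_int32(40000))  # fits comfortably in 32 bits
```

Code written assuming a 32-bit int can behave very differently when int is only 16 bits wide.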
OK, thx, I think I see then. So the 'MD5 weakness' comes into it then because the Wang and Yu paper gives them a way, they say, to calculate the two different blocks with the same hash relatively quickly, i.e. in a few hours. According to the article: "Based on [WY05], we implemented an attack to find random collisions for the MD5 compression function. It took just a few hours on a customary PC."
Microsoft's own technology being used by Google to loosen Redmond's deathgrip on the market
This is nothing new to Microsoft - they have long been creating tools for their own competitors. Microsoft create a platform and sell development tools for the platform. However MS is also an ISV for that same platform (e.g. MS Office). Hence there is an inherent conflict of interest - the platform side is developing and selling the tools that will be used by competitors to compete with the ISV side.
This is one of the reasons MS makes such bad APIs, and why their API documentation is frequently outright incorrect - it slows down competitors, so they can stay "one step ahead".
Not a bad list, thx. But will it offer anything that, uh, competitors can't already do?
I'm not quite sure how Microsoft plans to sell the OS
Same way they sold XP, and same thing that made XP so successful: by getting OEMs to bundle it with all new PCs/laptops sold. The vast majority of XP sales were because "it came with the new computer" - only a very tiny percentage of users actually upgraded existing computers to XP, and XP offered virtually no additional value (i.e. little incentive) to upgrade.
Most people will buy Longhorn simply as a result of the natural process of buying new laptops/PCs.
Longhorn will be a (financial) success regardless of whether it has any new features, so I don't think MS care too much, which is why they're never really in a hurry to add new features. Strategy is more important. A new "look and feel" is all that's necessary to distract people (and the media) from looking at anything else, and to make people think that it's really a brand new OS. And most people think computers = Windows anyway.
Microsoft cannot "become irrelevant" in the desktop market so long as virtually every new computer sold has Windows on it by default. Longhorn will be a great success, financially.
Unless you had already scientifically proven it though (i.e. worked out the MD5 problems yourself), your "prediction" amounts to nothing more than a lucky guess. You had no way of knowing, and you could just as easily have been wrong. Historical rates of finding crypto flaws have no predictive value whatsoever because each "event" is completely statistically independent. In other words, you may have been right, but you knew nothing. You were right by chance.
Which part of the above is MD5-specific? It sounds like the above would be possible regardless of the hash algorithm used? Or is it that they've found a method to find X and Y that both hash to the same value quickly enough with MD5. From your description it sounds like X and Y can be very small though, in which case it shouldn't take too long to find X and Y "brute-force" for any hash algorithm?
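On the brute-force question, a small sketch of a generic "birthday" collision search. It works against any hash function, but the expected work is roughly 2^(n/2) hash computations for an n-bit hash. To make it finish quickly, this deliberately truncates MD5 to 24 bits; `truncated_md5` and `birthday_collision` are names made up for the example.

```python
import hashlib
from itertools import count

def truncated_md5(data, bits):
    """Top `bits` bits of the MD5 digest, as an integer."""
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)

def birthday_collision(bits):
    """Hash the messages b"0", b"1", ... until two share a truncated hash."""
    seen = {}
    for n in count():
        message = str(n).encode()
        h = truncated_md5(message, bits)
        if h in seen:
            return seen[h], message, n + 1
        seen[h] = message

x, y, tries = birthday_collision(24)
print(x, y, tries)  # two different inputs, same 24-bit value, found after `tries` hashes
```

At 24 bits this takes on the order of a few thousand hashes. At MD5's full 128 bits the same generic search needs on the order of 2^64 hashes, which is nowhere near "a few hours on a customary PC", so a few-hours collision has to be exploiting something specific to MD5 rather than brute force.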