While I suspect in this case the situation is that Yahoo are talking crap, I'd observe that the study is flawed.
It assumes that if a page is indexed by both Google and Yahoo then a given search string will either return the page in both search engines or in neither; this is not necessarily true. Both Google and Yahoo can find pages which do not contain a search string (e.g. 'failure' famously finds GWB's bio, a page which doesn't contain 'failure'). If Google's indexing algorithm were more generous than Yahoo's with this sort of thing then you'd expect to find Google returning more results for a given search even if Yahoo had indexed more pages. Extreme example: I set up PDAllenSearch which has indexed only a few million pages, but returns any page containing the search string or directly linked to one which does. That probably returns more results than Google for most queries, and with a sensible ranking algorithm it won't look too bad.
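A rough sketch of the 'generous' search idea, to show how result counts can come apart from index size. The page data and function names here are made up for illustration, and I'm assuming 'directly linked' counts in either direction:

```python
# Hypothetical toy index: page name -> (text, outgoing links). Not real data.
PAGES = {
    "a": ("notes on failure modes", ["b"]),
    "b": ("a biography page", ["c"]),
    "c": ("unrelated cooking tips", []),
    "d": ("links page", ["a"]),
}

def strict_search(query):
    """Return pages whose own text contains the query."""
    return {name for name, (text, _) in PAGES.items() if query in text}

def generous_search(query):
    """Return strict matches plus any page directly linked to or from one."""
    hits = strict_search(query)
    expanded = set(hits)
    for name, (_, links) in PAGES.items():
        if name in hits:
            expanded.update(links)      # pages that a match links to
        if hits.intersection(links):
            expanded.add(name)          # pages that link to a match
    return expanded
```

Same four indexed pages, but the strict search returns one result for 'failure' and the generous one returns three. Counting results tells you about the matching policy as much as about the index.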
First off - presumably the police 'wanting to access files' means 'when they have a search warrant' (if not, then this really would be stupid). So there is not much difference between an encrypted file and a locked safe in this case: you must supply police with keys to the safe, you must supply the key to the file. That's really just a common-sense extension of what exists. The only difference is that if you refuse to supply police with keys to a safe, then you get done for obstruction and the police apply sufficient force to the safe; if you refuse to supply the key to a file the police cannot access the data (assume decent crypto). In either case, you're only going to try to obstruct if you have something to hide, and the law shouldn't try to protect you.
However, it might not be your safe, it might not be your encrypted file. In which case, you have to explain why there is an encrypted file on your computer which you claim to be unable to decrypt.
If you say someone asked you to hang on to it, then you will get done for aiding and abetting if the police feel like it, or possibly holding stolen goods if the data turns out to be valuable. Same as with a physical object you're holding for someone.
If you say it came to be on your computer via a freenet type program, you might get away with it if you can convince people that you didn't request the file, but you'd have to hope the jury was technically competent (which they most likely aren't) or it would be easy for prosecution to bullshit. You'd also probably be lying unless it was a small encrypted file.
If it came to be on your computer via a virus (or you can argue that convincingly if possibly mendaciously) then you may well get away with it, there is some precedent for that.
Hydrogen in a pressure tank is more likely to produce the giant fireball than petrol; sodium is less likely to do so, you'd have to crash your car in a lake, and even then the reaction wouldn't produce much of an explosion. It would make clean-up a bit harder after a crash, though.
Storing hydrogen in large quantities is a pain - you need a big pressure tank, which is fine if you're looking at fixed objects but not if you want a hydrogen powered car. Then you have to worry about accidents - if the tank breaks in a crash, you'll likely have something set off the hydrogen and cause an explosion.
Whereas sodium doesn't need to be stored under pressure and is less dangerous - partly because even if the container cracks most of the sodium will stay put, partly because sodium will tend to just burn in air.
Yes, and in that same column we see:
"I'm fairly sure that the PowerPC, too, has an individual CPU ID. Every high end microprocessor does, just as every network device has its unique MAC address."
Being as everyone knows MAC addresses are unique and hard-coded into the network device. There are no routers with flash memory containing the MAC addresses. Nor have certain companies ever produced several exactly-identical network devices with the same hard-coded MAC address, and this has never resulted in great amusement when two ended up on the same net.
Bright And Early Monday Morning:
PDA: Hi Boss, I was reading /. over the weekend and this guy made a great point. As an engineer I have the choice not to work on MS-based projects. So I'm going to write web apps which don't bother fixing the broken MS implementations of standards, they'll work fine on Firefox.
BOSS: WHAAAT?!
PDA: It's great news - I can get the code out in a quarter of the time and with fewer bugs. In fact, I rewrote the existing site just over the weekend to prove my point!
BOSS goes red.
BOSS: You mean YOU are the reason why our sales are down 90% over the weekend? You're fired! SECURITY!!! Escort this person off the premises immediately!
Because your boss cares about selling product to IE users. If the users find bugs when your site is in their browser, they will not think 'If only I was using Firefox'. They will think 'What a piece of crap this site is, if the website is this bad I can't trust the product either, better find an alternative solution'.
In any case, when you're talking memory leaks you have a serious case of the pot calling the kettle black - hopefully the FF leaks will be sorted sooner than the IE ones (since FF seem to care about them and God knows MS don't) but right now FF is at least as leaky as IE.
Would you care to take a guess at how many web apps get written for non-IE intranets, as compared to how many get written for everything else?
You might as well say that cars don't need handbrakes because you and your mates never ever drive anywhere that's not flat.
It's kind of nice that the moz developers are extending the Javascript specs; and it'd certainly be nice if MS were to follow their lead.
But anyone who thinks that this will have any impact on browser use needs to think again.
OK, there will be a few websites that will use the new stuff and either break IE or put up a 'get a decent browser' page. But most people can't afford to throw away a vast majority of their market. So they need to write code that will work in IE; and if you've already written code that works in IE and Mozilla you won't bother redoing it just for Mozilla; even if the Mozilla-specific code could be half the length and easier to follow. So users won't have any incentive to switch to a Mozilla browser.
You have to wait because this guy would like to at least give people a little while to patch the problem, before standing up and telling the world and all its skript-kiddiez how to exploit it?
If you want a guess - either Sniffer Thread A can read registers used by Crypto Thread B running on the virtual second processor, or the branch predictor doesn't clear fully and can be made to disgorge its contents to thread A.
Reading TFA, it seems people voted for 'less dependence on vendors' because MS et al release version 5 and stop supporting versions 1-3.
Which is all very well, and no doubt it matters when your version 3 software causes you a problem and the tech support smiles at you and says 'we don't do that anymore'.
But if I run the Linux 1.0 kernel, am I going to find anyone who can be bothered to remember how to sort out my problem? No. Everyone's moved on, I will get no support.
Difference: I have a simple solution, namely download the up to date kernel, sorted in six hours and for free. Whereas the guy with version 3 proprietary software has to open his wallet and wait for a few days for the CD.
Ah, I see.
'Countable' does not mean finite. There are infinitely many Turing machines, but it is the smallest possible infinity - there are as many Turing machines as there are counting numbers (positive integers). Whereas 'uncountable' does not just mean infinite; it means a larger infinity than the countably infinite. The things I said were uncountable happen to be the same size as the set of real numbers.
Suggest you use Google.
But you might be able to go after Coca-Cola under trade descriptions - after all, the cola is no longer coca.
Why not sue Tiger Woods - or maybe his parents - for it?
I can see why a company might complain about a competitor who uses the same / a very similar word for the name of their competing product, but I don't see how a sports team can complain that an operating system by the same name is a business problem.
Rename the .txt file to .doc, and MS Word will open it happily enough. The clueless HR people just want the nice .doc extension, they don't care whether it's actually .doc format.
So how is that untrue?
If the OSS package screws up and loses your data, then you can't do anything.
If the MS package screws up and loses your data, then you _might_ be able to get some recompense, although in practice you'd better be a big company willing to spend a lot on landsharks.
Suppose I offered you a free pedal bike which would never need repairs or maintenance for your 50 mile daily commute.
Would this ultimately prove to be better value for you than buying a new car every few years, paying insurance and petrol, et cetera?
Probably not - you don't have time to cycle 50 miles every day, you don't want to get wet when it rains, et cetera. So the pay solution is better than the free one. Can easily happen. MS would like to convince you that training people to use OSS is expensive, that OSS is unreliable and will lose your valuable data, et cetera.
Of course, there are lots of OSS packages that _are_ as good as or even better than the pay solution. MS would like you not to believe that.
No, there are only countably many Turing machines (unless you're using some odd non-standard definition).
There _are_ uncountably many possible tapes, same reason as uncountably many bit sequences. But there are only countably many Turing machines:
Any Turing machine T can be simulated by a universal Turing machine, taking as input a _finite_ sequence which describes T. It follows that there are at most as many Turing machines as there are finite sequences. There are only finitely many length n sequences, for any given n. So the cardinality of the set of all Turing machines is a countable union (over all n in the positive integers) of finite sets (Turing machines represented by length n sequences). It's trivial to show that a countable union of finite sets is countable.
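If it helps, the 'countable union of finite sets' step can be made concrete. This is only a sketch over bit strings (I'm assuming descriptions are written in binary; the function names are mine): list the strings shortest first, and every finite string gets a definite position.

```python
from itertools import count, islice, product

def finite_bit_strings():
    """Yield every finite bit string exactly once, shortest first."""
    for n in count(0):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def index_of(s):
    """0-based position of s in the enumeration above."""
    # There are 2**len(s) - 1 strictly shorter strings; among strings of
    # equal length, s is ranked by its value read as a binary number.
    return (2 ** len(s) - 1) + (int(s, 2) if s else 0)
```

Since index_of gives each string its own counting number, the set of all finite descriptions is countable, and the Turing machines sit inside it.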
If you prefer, observe that a Turing machine which halts uses only a finite portion of the tape; a Turing machine which does not halt never produces an output so we can ignore it.
First question: no. The number: \sum_{n=1}^{\infty} 10^{-n!} is transcendental; it's a Liouville number. But the digit string is all zeroes (in base 10) except for 1 at position 1!, 2!, 3!,...
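For anyone who wants to see the digit string, here is a small sketch that writes it out directly from the definition (the function name is my own):

```python
from math import factorial

def liouville_digits(num_digits):
    """First num_digits decimal digits after the point of sum 10**(-n!)."""
    ones = set()
    n = 1
    while factorial(n) <= num_digits:
        ones.add(factorial(n))   # digit positions 1!, 2!, 3!, ... hold a 1
        n += 1
    return "".join("1" if k in ones else "0" for k in range(1, num_digits + 1))
```

The ones get sparse very quickly: in the first thousand digits only positions 1, 2, 6, 24, 120 and 720 are non-zero, which is about as far from 'every digit sequence occurs' as you can get.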
pi and e may be absolutely normal (i.e. every possible digit sequence in any base occurs about as often as you'd expect if the digits were random) but this is AFAIK not proven. It's also conjectured that every irrational algebraic number is absolutely normal.
It's very simple to define what random (unbiased) data is. If you are given the first n bits of the random data, and told to guess the n+1'th bit, then you have a probability 0.5 of being right, for all n.
Any pseudo-random number generator fails this test. A PRNG is a program, which has finite length and produces an output (I don't care about its input, consider this part of the program if you want). So you run the following (slow, but this is a thought experiment) algorithm: use the PRNG to generate k bits. Run through all programs of length at most k/2, in length-then-lexicographic order, and discard those which do not generate the same k bits as the PRNG. Take the first program which was not discarded. As k becomes larger, eventually you will see the chosen program is the same one every time, and you may become more confident that that program is the PRNG program. That allows you to guess the next bit correctly with high probability.
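A toy version of that search, heavily simplified. Instead of 'all programs' I'm assuming the generator comes from one tiny hypothetical family (linear congruential generators with a 4-bit state), so that the brute force actually finishes. The names and parameters are illustrative only:

```python
from itertools import product

MOD = 16  # hypothetical tiny state space, chosen so brute force is feasible

def lcg_bits(a, c, seed, num_bits):
    """Bits from a toy linear congruential generator: top bit of each state."""
    bits, state = [], seed
    for _ in range(num_bits):
        state = (a * state + c) % MOD
        bits.append(state >> 3)  # top bit of the 4-bit state
    return bits

def first_consistent_program(observed):
    """First (a, c, seed), in a fixed order, that reproduces the observed bits."""
    for params in product(range(MOD), repeat=3):
        if lcg_bits(*params, len(observed)) == observed:
            return params
    return None

def predict_next(observed, extra=1):
    """Guess the bits following observed, using the first consistent program."""
    params = first_consistent_program(observed)
    return lcg_bits(*params, len(observed) + extra)[len(observed):]
```

The program it finds need not be the one that produced the bits, only one that agrees with everything seen so far. With enough observed bits that is good enough to call the next one.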
However, there does exist data which you can't compress like that. For example, google 'Chaitin's constant'.
Alternatively, observe that there are uncountably many infinite sequences of bits, but only countably many programs. Hence there are sequences (almost all sequences) which are not the output of any program.
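The reason the sequences can't be listed is the usual diagonal trick. A sketch, assuming each listed sequence is given as a function from position to bit (names are mine):

```python
def diagonal(sequences, n):
    """First n bits of a sequence that differs from sequences[i] at position i."""
    return [1 - sequences[i](i) for i in range(n)]

# A hypothetical attempt at a list of bit sequences.
listed = [
    lambda k: 0,             # 0000...
    lambda k: 1,             # 1111...
    lambda k: k % 2,         # 0101...
    lambda k: (k // 2) % 2,  # 0011...
]
```

Whatever list you hand it, the diagonal sequence disagrees with entry i at position i, so it was not on the list.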
A string of random alphanumeric characters has just over 5 bits of randomness per character (ignoring capitalisation; log2(36) is about 5.17). A sentence in English has IIRC about 2 bits of randomness per character, even though it uses the same set of characters; most strings of characters are not English words.
So you can reasonably say that a string of 100 random characters is more random than a 100-character English sentence.
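The per-character figure for uniformly random characters is just log2 of the alphabet size, which is easy to check. A quick sketch (the function name is mine):

```python
from math import log2

def bits_per_char(alphabet_size):
    """Entropy in bits of one character drawn uniformly from the alphabet."""
    return log2(alphabet_size)

lower_alnum = bits_per_char(36)   # a-z plus 0-9, about 5.17 bits
mixed_alnum = bits_per_char(62)   # a-z, A-Z, 0-9, about 5.95 bits
exactly_six = bits_per_char(64)   # a 64-symbol alphabet gives exactly 6 bits
```

So 100 random case-insensitive alphanumerics carry roughly 517 bits, against something like 200 for an English sentence of the same length if the 2 bits per character figure is right.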
Probability is not as easy to define as you might think - try to explain what you mean by an event having probability x, without talking about other probabilities.
Documenting code's not that hard.
First you work out what you're going to do, and draw it out on a bit of paper. Then you start coding. When you have a function/class/whatever finished, then you write down what it's meant to do in a separate document. Then you comment the function from what you've written. Commenting every line is crap, commenting only the start of the function is crap unless the function is both short and simple.
Then when you finish you should have code that makes sense. When you need to change a function, remove all the comments from that function, re-code, rewrite comments doc, recomment.
As far as function and variable names go - try to make names descriptive when possible mod not having to type twenty letters for a variable you use once every three lines or so through the whole code. On the other hand, if you write a function which you'll use maybe four times in your whole code, then there is no reason why the function name shouldn't be twenty letters or more if that helps.
With indent styles - someone has to start the project, if there is a clash of indent styles, then stick a README in the directory stating that the following style... will be used. If anyone then plays games, they will get bollocked by the boss. If the boss doesn't like it, of course, then you should have used his approved indent style.
You often will find that you have a piece of code which does something clever in a non-obvious way, and you can't see any way to reduce it to code which you can easily document as above. In which case, either write a good description of what is happening and put the whole lot as a comment immediately before the clever bit, or preferably leave it as a text file Desc_FlibFn_FooClass.txt and put a reference to that file in the code comments.
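Something like this is what I mean, as a sketch only. The function and the description file name are invented for the example:

```python
def count_set_bits(value):
    """Return how many bits are set in a non-negative integer."""
    count = 0
    # Clever bit: value & (value - 1) clears the lowest set bit, so the loop
    # runs once per set bit rather than once per bit position.
    # Longer write-up: Desc_CountSetBits_BitUtils.txt (hypothetical file).
    while value:
        value &= value - 1
        count += 1
    return count
```

One comment at the top saying what the function is for, one block right before the non-obvious step saying why it works, and a pointer to the longer description for anyone who needs it.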
>Why the hell would anyone go to "Start" to logout?
In Win95, this was confusing and no doubt caused problems. In Win98 a few people had trouble. By now, everyone knows that to log out you have to go to Start. Yes it is stupid, yes it is unintuitive. But everyone now knows that you do that. If MS removed logout/shutdown from the Start menu and put it somewhere else many people would click Start, look, curse, then go find where log out was moved to. It's not really something that would be worth changing. Very much like the QWERTY layout is not all that efficient, but everyone knows it so forcing everyone to change would not be clever. You might possibly like Dvorak and be willing to buy a Dvorak keyboard (or pull the keys and rearrange a standard one), but if Dell decided it would sell all its computers with Dvorak as standard it would lose money.
Frankly, if someone produces an OS which meets your specs (and doesn't cost a million quid) then I couldn't care less whether it's free / open source. One off payment of a couple hundred isn't worth worrying about.
On the other hand, in the world of operating systems that actually exist, it's nice to be able to think that the annoying bug can be fixed - and if you want to you can fix it yourself.
DL source... check
Add a couple of bits of code from Bagle and Netsky... check
Compiles... check
Crack the server, upload the new release...
Welcome to 100 million Firefoxes overnight!