One of my favorite pieces of anti-Israeli propaganda that I read explained that the IDF used Blitzkrieg tactics. My reply to this was to explain that whoever didn't use them deserved to lose the war. Just because the Nazis did something doesn't mean it was wrong - at least on the engineering/technical side.
As a statistician - first rule of thumb - distrust any estimate that doesn't come with a probable range.
As an avid news junkie - remember the millions that were going to starve in Afghanistan? Didn't happen. Tend to distrust similar estimates because of that.
Finally, look at the quality of life of the Kurds. If the US/UK is halfway decent in post-war administration (no guarantees), the rest of Iraq should be brought up to a similar level pretty quickly (i.e. enough food and medicine). They stop being hostages to world opinion.
The UN would be democratic when Denmark (population about 6.5 million, iirc) has the same say on policy as the Bay Area (population about 6.5 million, iirc). As it is, Denmark has a vote, and the Bay Area only has influence through the US (of which it is about 1/40th). The UN is not in the slightest bit democratic.
This, however, does not mean that the US shouldn't have tried harder to get UN cooperation - France's compromise seemed reasonable....
My guess would be that it has more to do with governmental contacts than expertise about computers - Gore must have a pretty good list of contacts throughout government by now, and if that can help Apple, why shouldn't they tap him to be on their board of directors? Beats another lawyer....
And, if you are teaching about science, would you rather people remember the science and how the science fits together to describe something, or a bunch of life stories that have nothing to do with the science?
Now, going through science by chronological development, and talking about Aristotle's theory of the elements, and the problems therein, and the advantage of basing science on math, and the developments of the Renaissance would be an interesting approach. It also gets across that science is a set of theories, constantly in flux.
Knowing Einstein was a poor student doesn't help to understand Relativity, though. The experiments to find the "Ether" help to understand the need for Relativity.
It's more an issue of how much noise you can make. IIRC, you can broadcast a low-strength signal whenever you want - this is akin to saying I can talk wherever I want, but I can't use amplified sound wherever I want.
The government already regulates your ability to talk at a very loud volume (i.e. broadcast) at certain places. And I'm glad they do :)
Evolution has been observed. Take some E. coli, look at how well it digests lactose and glucose. Then grow it for a few thousand generations on glucose. It now digests glucose more efficiently, and has lost much of its ability to digest lactose. That's called evolution, albeit on a small scale.
A friend of mine was between quality assurance jobs - tried interviewing with a game company - walked through a warehouse of 18-22 y.o. male game testers, being the only female in the building (to her knowledge). Did not enjoy the feeling of every head turning to watch her walk to the manager for her interview.
Imagine the class action suit that could follow. Even if the lawyers take a 50% commission, how many people do you think would sign up for $250 for no work, just needing to produce an e-mail they were sent? The 10,000 recipients would net the lawyers up to 2.5 million dollars if the company could pay....now the few thousand dollars in court costs to go through discovery are peanuts.
One also wonders if you could get an injunction kicking any mail sent through a server off the 'net in the US....though the implications of this would be kinda scary....
I mentioned this earlier in the discussion - I'm repeating myself because it also applies here...
Using a list of the spam-sending IPs and Bayesian methods, one could assign a high prior probability of a message being spam. The effect would be to slow down the connection on less evidence if it's from a suspect IP address, and to require more evidence if it's from an IP address that you trust. Thus you preferentially slow down suspect computers, and allow your friends to get away with more spam-like messages before tarring them.
Also, forgot to mention before, it's not the traffic that is being analyzed, but the spamminess of the message.
Bayesian methods would work well for this (mind you, I'm a pretty staunch frequentist on most issues). You could set up a prior probability of a message being spam based on where it is being sent from (one could even create a centralized list somewhere, such as already exist for which IPs send a lot of spam) - if the message is from a suspect server, start off suspecting it's spam - if it's from your friend's mail server, be more skeptical. Then, taking any of the piece-by-piece approaches, update your probability of spam, and act accordingly. This should help minimize the deleterious effects on innocent servers that just happen to send the odd piece of mail that looks like spam.
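A minimal sketch of that prior-then-update idea in Python. Everything named here is an assumption for illustration: the sender classes, the prior values and the per-word likelihoods are made up, and the word-by-word update is plain naive Bayes standing in for "any of the piece-by-piece approaches".

```python
import math

# Hypothetical per-word likelihoods: (P(word | spam), P(word | ham)).
# These numbers are invented for illustration, not fitted to real mail.
WORD_LIKELIHOODS = {
    "viagra": (0.20, 0.001),
    "free": (0.30, 0.05),
    "meeting": (0.01, 0.10),
}

# Hypothetical priors P(spam), keyed by how much the sending server is trusted.
PRIOR_BY_SENDER = {"blocklisted": 0.95, "unknown": 0.50, "friend": 0.05}


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def spam_probability(sender_class, words):
    """Posterior P(spam): sender-based prior, then one update per known word."""
    log_odds = logit(PRIOR_BY_SENDER[sender_class])
    for word in words:
        if word in WORD_LIKELIHOODS:
            p_spam, p_ham = WORD_LIKELIHOODS[word]
            log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))


# The same mildly spammy message, judged under two different priors.
message = ["free", "meeting"]
p_friend = spam_probability("friend", message)
p_suspect = spam_probability("blocklisted", message)
```

The same message ends up with a low spam probability from a trusted server and a high one from a listed server, which is the point: the evidence needed to act depends on where the mail came from.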
As I understand the system, it is meant for those receiving spam, not those unwittingly relaying it. The basic idea is that the laggier the network, the longer it takes to send a message. So if your mailserver pretends to be laggy, it will take more time for a computer to send spam. Thus, less spam is sent. It has the added advantage that, since it accepts every message (though it takes longer if it thinks the message is spam), there is no cost to the user for false positives. Set up the system on enough mailservers, increase the time it takes to send spam, and you decrease the volume of spam that can be sent from one computer, thus increasing the cost of sending it. As an additional benefit, systems with open relays will be slowed down significantly if they are being used for spam, hopefully getting the sysadmin to do something about it.
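A rough sketch of why the delay matters, in Python. The linear delay rule, the 30 second cap and the 1 second base handling time are all assumptions of mine, not anything taken from the article.

```python
def reply_delay(p_spam, max_delay=30.0):
    """Seconds to stall before accepting a message; grows with spam probability."""
    return max_delay * p_spam


def messages_per_hour(p_spam, base_seconds=1.0, max_delay=30.0):
    """Upper bound on messages one connection can push through in an hour."""
    return 3600.0 / (base_seconds + reply_delay(p_spam, max_delay))


ham_rate = messages_per_hour(0.02)   # a message that looks legitimate
spam_rate = messages_per_hour(0.98)  # a message that looks like spam
```

Every message is still accepted, so a false positive only costs the sender some waiting, while a connection pushing spam-like mail gets through far fewer messages per hour.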
This, of course, assumes I'm reading the article correctly:)
And there are those of us with taste who recognize that Law and Order is an excellent radio drama with optional visuals (had a reputation for watching at least 2 eps. a day in college), compulsively watch Buffy (I've missed one episode to date), and have a life.
Several former roommates derided me for watching Buffy. Most of them became fans. One flagged an article on Buffy for me which was my introduction to Salon.
I've never heard of Stata, so I can't compare. SAS is another standard package - it's used more in industry than academia. I've never used it, so I can't compare to R...but most of the methods I read about are implemented in R much more quickly than in SAS or MATLAB.
As a statistician, I prefer R. MATLAB's approach to statistics is to implement a bunch of formulas one could look up - R (or S-PLUS - I prefer the open source version) gives an interface that is closer to doing statistics. R has far more routines implemented than Minitab (or MATLAB, if one sticks to statistics). Additionally, most of the interesting applied statistical research that I've seen is implemented in R.
So the Hurst exponent works if you are assuming that what you are looking at resembles a random walk, and you want to know how far from a random walk you are.
What, in my opinion, this is missing, for determining randomness in a time series:
Robustness. This only works for a small set of time series. For instance, an MA(1) series (X(t) = e(t)+e(t-1), e(t) is i.i.d. for all t) will, if I'm understanding Hurst exponents properly, have a Hurst exponent of 0. However, an MA(1) series is almost completely random - in fact, you know less about future values from the present in an MA(1) series than in a Brownian Motion.
The ability to test a hypothesis. Sure, you can have a sense of how far your series is from a Random Walk, but how far do you have to be for it to be more than chance variation, so you can say you are (1-p)% sure it isn't a random walk?
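To make the robustness point concrete, here is a small Python sketch. It uses one common shortcut estimator (the slope of log standard deviation of lagged differences against log lag, applied to the series itself) rather than a full rescaled-range analysis; the estimator choice, the lags, the seed and the sample size are all assumptions of mine.

```python
import math
import random


def _std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def hurst_exponent(path, lags=(2, 4, 8, 16, 32, 64)):
    """Least-squares slope of log(std of lagged differences) against log(lag)."""
    xs, ys = [], []
    for lag in lags:
        diffs = [path[i + lag] - path[i] for i in range(len(path) - lag)]
        xs.append(math.log(lag))
        ys.append(math.log(_std(diffs)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    return slope_num / slope_den


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20001)]

# Random walk: cumulative sum of i.i.d. steps.
walk, total = [], 0.0
for e in noise:
    total += e
    walk.append(total)

# MA(1) series from the text: X(t) = e(t) + e(t-1).
ma1 = [noise[t] + noise[t - 1] for t in range(1, len(noise))]

h_walk = hurst_exponent(walk)  # should land near 1/2
h_ma1 = hurst_exponent(ma1)    # should land near 0
```

Under this estimator the random walk comes out near 1/2 and the MA(1) series near 0, even though the MA(1) series is built from nothing but fresh noise. Note that it returns a point estimate only; it says nothing about how far from 1/2 is too far.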
Thank you. Hadn't heard of Hurst Exponents before. A question:
How far must the Hurst exponent deviate from 1/2 for you to consider something non-random? You need to be able to answer this to claim the ability to test for randomness.
Also:
Consider a random walk generated by a step of a size generated by a pseudo-random number generator. This will appear to be made of independent incremental steps of a set variance. Thus the Hurst exponent will be indistinguishable from 1/2. However, this is an entirely deterministic system. So the Hurst exponent can be tricked, and my original observation that the determination of randomness is dependent on your model holds.
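A short Python sketch of that thought experiment. The generator is a hand-written linear congruential generator with commonly quoted constants; the constants, the seed and the use of the high bit to pick the step direction are my choices for illustration only.

```python
def lcg_stream(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator: each state is a fixed function of the last."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


def lcg_walk(seed, n_steps):
    """Random-looking +/-1 walk whose every step is dictated by the seed."""
    position, path = 0, []
    gen = lcg_stream(seed)
    for _ in range(n_steps):
        step = 1 if next(gen) >= 2**30 else -1  # use the high bit of the state
        position += step
        path.append(position)
    return path


first = lcg_walk(seed=42, n_steps=10_000)
second = lcg_walk(seed=42, n_steps=10_000)

# Count how many of the steps went up.
ups = sum(1 for prev, cur in zip([0] + first, first) if cur > prev)
```

The two walks are identical step for step, and roughly half the steps go up. Anything that only looks at the increments sees what looks like a fair coin, while anyone holding the seed can predict the whole path.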
I would also like to take this opportunity to thank you for introducing me to Hurst Exponents. I found a couple of interesting sites (here and ) because of this discussion.
Quack!
After a quick websearch:
There are other tests, such as examining the series of 1's and 0's representing whether a given value is above the series' mean. This can be tested against a known distribution if the data is independent and identically distributed. However, a pseudo-random number generator would pass this test on reasonable sample sizes, and yet is entirely deterministic.
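The above/below-the-mean test is usually called the runs test (Wald-Wolfowitz). A minimal Python sketch using the normal approximation; the seed and sample sizes are arbitrary choices of mine.

```python
import math
import random


def runs_test_z(series):
    """Wald-Wolfowitz runs test z-score for values above vs. below the mean."""
    mean = sum(series) / len(series)
    signs = [x > mean for x in series]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    # A new run starts every time the above/below indicator flips.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n_above + n_below
    expected = 2.0 * n_above * n_below / n + 1.0
    variance = (2.0 * n_above * n_below * (2.0 * n_above * n_below - n)
                / (n ** 2 * (n - 1)))
    return (runs - expected) / math.sqrt(variance)


rng = random.Random(1)
z_prng = runs_test_z([rng.random() for _ in range(2000)])
z_alternating = runs_test_z([0, 1] * 10)  # far too many runs to be chance
```

A strictly alternating series gets a large z-score because it has too many runs, while the output of Python's built-in generator gives an unremarkable one, despite being fully determined by its seed.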
Quack!
What you might be thinking about is looking at the mutual information of the time series - that is, looking at the Kullback-Leibler divergence between the joint distribution of the series and itself at a lag of k, and the product of their marginal distributions. The larger the divergence, the further the two are from independent, and hence you have some measure of predictability. This does not give you any insight into what the relationship is, only that there is a relationship. However, to the best of my knowledge (I could be mistaken) there are not yet methods of hypothesis testing developed for this - so you can say there is the least randomness at a certain lag, but that doesn't mean you can tell if it is sufficiently non-random to be considered more than chance.
There is no one equation that will tell you if something is totally random or if it is predictable - predictability depends on your model. There are a number of non-linear systems that will appear completely random if analyzed with linear models (e.g. pseudo-random number generators of the WWII era).
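A rough Python sketch of lagged mutual information, estimated by binning. It treats mutual information as the Kullback-Leibler divergence between the joint distribution of (X(t), X(t+k)) and the product of the marginals. The equal-width bins, the bin count, the seed and the AR(1) test series are all assumptions of mine, and the histogram estimate is biased upward for small samples.

```python
import math
import random
from collections import Counter


def lagged_mutual_information(series, lag, bins=8):
    """Histogram estimate (in nats) of I(X_t; X_{t+lag})."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series
    codes = [min(int((x - lo) / width), bins - 1) for x in series]
    pairs = list(zip(codes, codes[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed cells.
    return sum(
        (count / n) * math.log(count * n / (left[a] * right[b]))
        for (a, b), count in joint.items()
    )


rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# AR(1) series: each value leans heavily on the previous one.
ar1, prev = [], 0.0
for e in noise:
    prev = 0.9 * prev + e
    ar1.append(prev)

mi_noise = lagged_mutual_information(noise, lag=1)
mi_ar1 = lagged_mutual_information(ar1, lag=1)
```

The strongly dependent series scores well above the independent noise at lag 1. The estimate for the noise is small but not zero, so without a null distribution for the estimate you still can't say how large is large enough.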
Quack!
Being a statistician, I'd like to point out that you can define a random walk with any distribution that you want. The simple random walk is the sum of random variables that are either +1 or -1 (think flipping a coin, take a step to the left on a heads, and a step to the right on a tails). If each of your steps has a Gaussian distribution, then you are observing a Brownian Motion on a lattice (i.e. at integer times). No reason why you can't use a heavy-tailed distribution to define your random walk.
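A quick Python sketch of the same point: the walk is just a running sum, and the step distribution is whatever you plug in. The seed, the walk length and the symmetrized Pareto with shape 1.5 as the heavy-tailed example are my choices.

```python
import itertools
import random


def random_walk(n_steps, draw_step):
    """Partial sums of n_steps independent draws from draw_step()."""
    return list(itertools.accumulate(draw_step() for _ in range(n_steps)))


rng = random.Random(7)

# Simple random walk: a coin flip decides between -1 and +1.
coin_walk = random_walk(1000, lambda: rng.choice((-1, 1)))

# Gaussian steps: Brownian motion observed at integer times.
gauss_walk = random_walk(1000, lambda: rng.gauss(0.0, 1.0))

# Heavy-tailed steps: a Pareto draw with a random sign (infinite variance).
heavy_walk = random_walk(
    1000, lambda: rng.paretovariate(1.5) * rng.choice((-1, 1))
)
```

All three are random walks in the same sense; only the size of the individual steps differs, with the heavy-tailed version making the occasional very large jump.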
Insofar as this is my first post, I think I have something to say for the mass of lurkers on forums such as /.
I rarely post to forums because I rarely have any opinion or piece of knowledge that hasn't already been mentioned that I think is worth several hundred people's time to read. In a typical thread I find 2-4 comments I feel meet my criteria of being worth posting (if I were the poster). Not wanting to be taken for a troll, I'll quickly add that I enjoy a substantial portion of the posts I read - just most of them don't meet the high bar I've set for myself for posting to a popular forum. It's not that I don't want to share my opinions, but that if everybody shared their opinion we'd have a lot of noise about stuff most of us don't care about, so I set a high threshold for myself.
That, and I'm not in computers so I rarely have much knowledge to add to the discussions :)
Now the politics....Ashcroft scares me....
Now, does that justify the war......
I'm pretty sure this wasn't encrypted - 'h' is appearing far too often....
Maybe me? By some miracle I still don't get any spam, despite having my e-mail address on several web pages....
Thanks for the feedback - makes me feel better sticking to R when my datasets can always fit on a CD with plenty of room to spare:)