tabbed paradigm
Who do you think you are?
Surely not Kuhn
You probably thought it deep
When Neo said "no spoon"
I'd finish this poem
but there's no word to rhyme
with so pretentious a concept
as 'tabbed paradigm'
Why would it raise these questions? I don't think anyone would disagree that computers are far better at matrix algebra than humans could ever be, so why isn't that the test? The ability to invert matrices differentiates us from the other orders more so than language does anyway. Why this arbitrary test? It doesn't seem to have anything more to do with 'consciousness' than an ATM does. I'm not trying to discredit the hard work and progress here, but jumping to consciousness is probably not going to happen in software.
Would a security expert really be "stunned" by this? Sounds like business as usual to me.
Can someone write an emulator so that we can WIZ on our Wiis?
If the statistically significant difference is a clinically insignificant measure.
Then you were underpowered for a certain outcome, sure. But that just begs the question, and my initial claim was that post-hoc power analysis is worthless, which I stand by. Also, remember that a study failing to show an effect does not imply it is underpowered.
And with all due respect, I am well aware of surrogate outcome trials, and the trial in particular that you brought up, assuming you are talking about WHI. My boss was on the DSMB for that study.
Not awfully shabby, small study though. No power analysis (how many patients would be needed to validly determine if an 18% difference in 'outcomes' was real). Note the hedging on outcomes - here is the real problem with the study.
This is statistical nonsense. Post-hoc power analyses are a one-to-one mapping to the observed p-value, and serve no real purpose. One of many references to this is found here:
http://www.childrensmercy.org/stats/weblog2005/PostHocPower.asp
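To make the one-to-one mapping concrete, here is a minimal sketch in Python. It assumes a two-sided z-test at level alpha, which is my simplification and not the design of the study under discussion, and the function name observed_power is made up for illustration. Notice that the only data-dependent input is the p-value, so the "post-hoc power" cannot tell you anything the p-value did not already.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def observed_power(p_value, alpha=0.05):
    """Post-hoc ("observed") power of a two-sided z-test.

    Assumption: the observed effect is plugged in as if it were the true
    effect, which is what post-hoc power calculations typically do.
    """
    # Recover |z| from the two-sided p-value.
    z_obs = STD_NORMAL.inv_cdf(1 - p_value / 2)
    # Critical value for the chosen significance level.
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of rejecting if the true standardized effect were z_obs.
    return STD_NORMAL.cdf(z_obs - z_crit) + STD_NORMAL.cdf(-z_obs - z_crit)


for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.3f}")
```

Under these assumptions, a p-value sitting exactly at alpha always maps to an observed power of about 50%, whatever the study was.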
I believe you may be confused about what power is measuring. Power is, in English, the ability to detect a hypothesized effect given certain statistical properties of the experiment. The p-value is a measure of whether or not the difference in outcomes is real, to wit, it is the probability that we see results as extreme or more extreme than these given the null hypothesis is true. In this case, it's very small (I assume the p-value is missing a leading decimal point in your quotes.) Therefore, we reject the null hypothesis. Power analysis has nothing to do with this really, so your first paragraph does not make sense to me.
Importantly, all that increasing the sample size would do is to make the CIs around the odds ratio smaller really. How can a study that shows significance have too small a sample size? That makes no sense whatsoever!
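A rough sketch of the confidence interval point, in Python. The 2x2 counts below are invented for illustration, and I am assuming the usual Wald interval on the log odds ratio, which may not be what the paper used. The second table is the first with every cell multiplied by four, so the odds ratio is identical and only the interval changes.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for the 2x2 table [[a, b], [c, d]]."""
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log_or)
    upper = math.exp(math.log(or_hat) + z * se_log_or)
    return or_hat, lower, upper


# Hypothetical counts: same proportions, four times the patients.
small = odds_ratio_ci(30, 20, 20, 30)
large = odds_ratio_ci(120, 80, 80, 120)

for label, (or_hat, lower, upper) in (("N=100", small), ("N=400", large)):
    print(f"{label}: OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The point estimate does not move; the extra patients only buy a tighter interval around it.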
And your claim that improving the correct treatment decisions makes no difference to the patients is an interesting one if it is true...Could this part of the study be underpowered? Post-hoc analysis won't help...
Finally, how do you define a "lousy" p-value? If these were predefined hypotheses and the level of significance was reached, then these are statistically valid conclusions...
For instance, I don't think most people refer to sending email as using the web.
You must not get out much.
OK, I'm sick of this. Some pedant who probably doesn't know UMVUE from UMP always chimes in when someone mentions the words 'average' and 'median' within 1000 syllables of each other.
I have a Master's degree in Statistics, a BS in mathematics, and work as a statistician.
There is really no strict mathematical definition of 'average'. There is a concept of averages as measures of central tendency. However, I've just consulted three of my theoretical statistical inference texts, and not a single one of them has an index entry for the word 'average'. They of course have index entries for 'mean' and 'median'.
Both mean and median are types of averages, neither inherently 'better' than the other. You won't find the word 'average' used in much technical literature because of this. You specify your statistic more precisely than that.
So the next time you see the word 'average', don't freak out about it. If someone doesn't specify what they mean, ask them, that's an important question, and something you should think about. You're just arguing semantics and come off as uninformed, if not a bit annoying.
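If it helps, here is a tiny Python illustration of why it matters which 'average' someone means. The numbers are made up; any skewed data set would show the same thing.

```python
from statistics import mean, median

# Hypothetical skewed data: four small values and one large outlier.
values = [1, 2, 2, 3, 100]

print("mean:  ", mean(values))    # pulled upward by the outlier
print("median:", median(values))  # middle value of the sorted data
```

Both numbers are legitimate measures of central tendency for the same data. They just answer different questions, which is why you ask which one was meant.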
A similar method of attack, layer 1 hijacking, has been around for at least 10 years now.
An interesting idea to be sure but I know I certainly am not that consistent when I type, so I'm skeptical of how well this may work.
That's precisely what some statistical methods are designed to do: find patterns in the inconsistencies. I haven't read this proposal, so I can't comment more, but 'learning' in the presence of variation is basically what modern statistics is all about.
Every few years, I take a stab at installing whatever 'user friendly' distro of Linux exists at the time. I actually just installed Ubuntu 7.10 on a laptop of mine two nights ago. Overall, the experience is much improved. Actually, drastically improved over my last attempt several years ago. My wireless card just worked, which used to be the main hassle (I know why).
The only problem I now have is with dual monitor support. It seems like a hodge-podge of ideas, nowhere very clearly defined. I don't know if I need Xinerama, TwinView, or both. I've tried countless combinations of "vsync to blank" (3 different locations) and of the vertical refresh rate (3 different values depending on where I look), none of which are 60 Hz. There are many lockups while trying to change these settings through the nvidia driver settings.
I realize none of this is Ubuntu's fault, per se. Still, my multiple monitors work flawlessly in Windows without any fuss. It just seems obvious to me what to do there.
So while there have been great strides, I am excited to see the continual improvement in areas like these.
I did keep Ubuntu on the laptop and plan on using it, just with only one monitor for now.
Some things are way off: 'The car accelerates to 150 mph in the city's suburbs, then hits 250 mph in less built-up areas'
Speak for yourself...
"free access to its entire iTunes music library in exchange for paying"
You keep using that word. I do not think it means what you think it means.
If 7 dollars is too much for an unenhanced SNES game, what do you think a fair price is, 6 dollars? They can only go so low. I spent 7 dollars on a McDonald's value meal at O'Hare this weekend. I spent 6 bucks on a coffee this morning. THOSE are outrageous prices. Getting to play ActRaiser/SMW/Mario Kart/etc. again sure as hell gives me more value for my dollar than that Big Mac did. If you won't get 5-7 dollars worth of enjoyment out of it, don't buy the game.
This is only 'not possible within nature' if you make some weird divide when defining nature between humans and everything else in the world. I realize that in the past this was a common thing to do, especially in many religions. But can someone explain what is 'not natural' about humans? Why are the structures we build in cities any 'less natural' than a bird building a nest?
This article http://www.phds.org/reading/elites.html always seemed good to me. It's been 15 years since it was written.
Why is it that every single time someone mentions 'sample size', they get modded up? Look, the reason you calculate sample size for a study is so that you have an adequately powered trial to show some hypothesized effect size. If their paper is well written, they will have a small section on what they were trying to prove, and why N=270 would give them enough power to do it. All you have is one number and a gut feeling. As someone else said, what should their sample size have been then? It's completely dependent on what they were measuring. If they were able to reach statistical significance on a prespecified hypothesis, then obviously N=270 was enough!
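For anyone wondering what an a priori sample size calculation actually looks like, here is a minimal Python sketch. It assumes a two-sided comparison of two proportions using the standard normal approximation. The event rates 0.50 and 0.65 are invented, not taken from the paper, and the function name n_per_group is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect p1 vs p2.

    Assumption: two-sided test of two independent proportions,
    normal approximation with unpooled variances.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # significance level
    z_beta = std_normal.inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical planning scenarios: the answer depends entirely on the
# effect size you hypothesize and the power you want.
print("0.50 vs 0.65, 80% power:", n_per_group(0.50, 0.65))
print("0.50 vs 0.80, 80% power:", n_per_group(0.50, 0.80))
print("0.50 vs 0.65, 90% power:", n_per_group(0.50, 0.65, power=0.90))
```

Change the hypothesized difference and the required N changes dramatically, which is why a bare "N is too small" means nothing without knowing what they set out to detect.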
I just heard that the governor has signed a law in Alabama raising the drinking age to 35. He wants to keep alcohol out of the high schools.
just a joke.
The Netflix Prize FAQ says how they currently do it:
"Straightforward statistical linear models with a lot of data conditioning."
The Netflix programmers shouldn't necessarily get special recognition for using least-squares modeling, but feel free to pass on your praise to Gauss, Legendre, Galton, and Fisher.
What's amazing is how hard it is to improve drastically on these 150-year-old statistical techniques.
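For reference, the core of least squares fits in a dozen lines. This is only a sketch of ordinary least squares for a single predictor on toy data I made up. It is certainly not Netflix's actual model, which the FAQ describes only in the one sentence quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Toy data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```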
At my school, classes called 'critical thinking' were logic courses. You learned the difference between valid and invalid arguments, the classical logical fallacies, etc. In that sense, it can certainly be taught.
'Predict' has a specific meaning in statistics and machine learning. It definitely does *not* mean accurately predicting outcomes in every situation. Not to belittle this group's work, because it is no doubt important and complicated, but it is not going to magically 'predict the unpredictable'.
This is actually the type of story I love to see on Slashdot. A nice break from yet another "YRO" story.
Also, within medical research, a clear divide must be appreciated between randomized controlled clinical trials and epidemiology. A well-run clinical trial is about as good an experiment as you can do. Patients and doctors remain blinded to the actual treatment so biases are not introduced.
Epidemiology is almost always done retrospectively, and while it may have its uses, there are *always* going to be possible confounding variables when patients are not randomized before receiving a treatment.
So please don't confuse clinical trials with epidemiological studies. The former are regarded as the "gold standard" for showing efficacy of new drugs and devices, while the latter would never serve that purpose.
I'm not certain as I am still kinda new here...
You must be new here.
20 years from now, people are going to be laughing just as hard at, and reminiscing about, our current technology and the ads for it.
"4 GB of memory, lol, amazing they could do anything with that!! Coders must have been gods back then to get any performance out of those machines. I miss those days! Sigh...."