Interesting article, but this is something that has been happening and will continue to.
Technology being put to use to seek out enemies of the state for the world governments is nothing new.
At least it is a good thing that companies are making good money in the process. Your privacy? That was lost long ago.
It was only a matter of time before this happened. At least be glad that we've not yet reached the stage where they'd bother having your entire genome sequence to create solutions and replacements for you :-)
Perhaps the author of the article has just read Cryptonomicon or something.
Get over it, companies will track you, governments will monitor it. And there will be people who will beat both, and people who will be susceptible to both. Unfortunate, but hey, paranoia does not help either.
And oh, first post?
Reminds me of...
by
gpinzone
·
· Score: 5, Interesting
...how the Bayesian spam filters operate (on a much smaller scale). They find predictors of "spam" like these guys find predictors of "terrorists."
If the false positives of this system finding terrorists are as low as the ones that identify spam, is it really unreasonable to consider that probable cause for an investigation? At least, until the 0.000001% slips by and causes a lawsuit for wrongful arrest.
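One rough way to test the "as low as spam filters" argument is Bayes' rule. The sketch below is only an illustration: the population size, the number of real targets, and both error rates are made-up assumptions, not figures from the article or from any real system.

```python
def posterior_given_flag(population, true_targets, hit_rate, false_positive_rate):
    """Probability that a flagged person is a real target (Bayes' rule)."""
    innocents = population - true_targets
    true_flags = true_targets * hit_rate
    false_flags = innocents * false_positive_rate
    return true_flags / (true_flags + false_flags)


# Hypothetical numbers, chosen only to show the base-rate effect.
POPULATION = 280_000_000
TRUE_TARGETS = 100
HIT_RATE = 0.99              # share of real targets the filter flags
FALSE_POSITIVE_RATE = 0.001  # 0.1% of innocent people flagged

expected_false_flags = (POPULATION - TRUE_TARGETS) * FALSE_POSITIVE_RATE
precision = posterior_given_flag(
    POPULATION, TRUE_TARGETS, HIT_RATE, FALSE_POSITIVE_RATE
)

if __name__ == "__main__":
    print(f"innocent people flagged: {expected_false_flags:,.0f}")
    print(f"chance a flagged person is a real target: {precision:.4%}")
```

With these assumed inputs, roughly 280,000 innocent people get flagged alongside about 99 real hits, so even a very low false positive rate leaves almost every flag pointing at the wrong person when the thing being searched for is this rare.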
Re:Reminds me of...
by
Anonymous Coward
·
· Score: 2, Interesting
With a spam filter, the penalty for false positive is perhaps a lost sale or an annoyed friend/coworker.
With a terrorist classification filter, the penalty for a false positive could cost some innocent person days/weeks in prison and thousands of dollars in lost wages and legal fees. And that's assuming they are a US citizen. A non-citizen could be held indefinitely, completely destroying any career they might have.
Re:Reminds me of...
by
gpinzone
·
· Score: 3, Interesting
Yes, but remember that the current methods aren't much better. I mean, right now there are lots of complaints about how the USA is racially profiling Middle Eastern men. Whether or not this profiling is justified could be based on a report of such a filter.
The issue isn't whether or not we should use data mining to profile individuals or groups. Profiling will occur no matter what. What these methods do is help find parameters that more accurately identify candidates rather than just assume all Middle Easterners are automatically guilty until proven otherwise.
to what end?
by
loveandpeace
·
· Score: 2, Interesting
the more i read about data mining, the more it seems to provide a connectivity and interaction leap, a step we are really due, in a technological sense.
when the internet was new and all (shortly after Al Gore invented it), there was much talk of how Big Brother would swoop in and turn us into ones and zeros, monitor our every move, and control us through the new portal. that hasn't happened yet (though Ashcroft is trying).
does it seem that data mining is more harmful (making us all into terrorists for buying fireworks and seeing born on the fourth of july in the same day) than good (allowing better prediction of supply and demand to lower costs and raise productivity)?
profiteering?
by
SHEENmaster
·
· Score: 5, Interesting
Today, however, companies that excel in connecting the data dots are finding a lifeline in a customer whose IT ineptitude is matched only by its means: the U.S. government, which will spend $53 billion on information technology this year. The Federal Government's inability to share and analyze information became clear in the months after the 9/11 attacks.
While I won't argue against the government's inability to do anything but waste money, I do think that these "anti-terrorism" dealies are going too far. We know that they are spending $53 billion on information technology. When they spend it on a hammer or a toilet seat I know that something is getting done, but "information technology" makes me suspicious.
Granted, my opinion is largely a result of window flags selling in excess of twenty dollars and not hearing the results of such spending. In fact, I haven't heard of a single terrorist act averted since 9/11. It couldn't hurt to inform us when the spending pays off, could it?
Is this information actually getting results, or is it just profiteering of the corporations that we so love to slander and libel?
-- You can't judge a book by the way it wears its hair.
Data mining for consumers?
by
Anonymous Coward
·
· Score: 1, Interesting
"Throughout the '90s, data mining spread from one industry to the next, enabling companies to know more about customers' needs and to zero in on the characteristics that distinguish the customers they want from those they do not. A credit-card company using a system designed by Teradata, a division of NCR, found that customers who fill out applications in pencil rather than pen are more likely to default. A major hotel chain discovered that guests who opted for X-rated flicks spent more money and were less likely to make demands on the hotel staff, according to privacy consultant Larry Ponemon. These low-maintenance customers were rewarded with special frequent-traveler promotions. Victoria's Secret stopped uniformly stocking its stores once MicroStrategy showed that the chain sold 20 times as many size-32 bras in New York City as in other cities and that in Miami ivory was 10 times as popular as black. Aspect Communications, based in San Jose, Calif., sells a program that identifies callers by purchase history. The bigger the spender, the quicker the call gets picked up. So if you think your call is being answered in the order in which it was received, think again."
Couldn't the consumer use such information to get a better deal? Also, of course, there are the "abuse" aspects for the businesses and governments that use this.
Makes me think of Bowling For Columbine
by
flopsy+mopsalon
·
· Score: 2, Interesting
I couldn't help noticing the Time.com article made reference to crime and terrorism, particularly the September 11 WTC/Pentagon attacks (which happened over a year ago), and to the recent Washington Sniper killings (which ended months ago), in spite of the fact that this article would have been just as fascinating if they had simply used the business examples as illustration.
In the movie 'Bowling For Columbine' Michael Moore speculates that one of the root causes of gun violence in the US is the type of fearmongering the US media engages in in an effort to keep their sales/ratings up.
It looks like Time.com's gratuitous exploitation of US fears of crime and terrorism might be an example of this.
Open Source DataMining!
by
cosmosis
·
· Score: 4, Interesting
Ok, I've been annoyed for years at the disparity between corporations and customers in who knows what about whom. I think it's time someone came up with a P2P, open source reputation system in which we can turn the lens of datamining back on them. Technologies like Cuejack, combined with the efforts of groups like Transparency International, can help bring about Participatory Capitalism.
Data Mining as used by Colombian Drug Cartels ...
by
Anonymous Coward
·
· Score: 4, Interesting
Here is a real life story about data mining and its potential for brutal consequences. This was a very early application. Those who were fingered were killed. Of course, they adopted our new (lack of) due process rules a decade ago...
Data mining companies
by
MrWa
·
· Score: 2, Interesting
So "Data-mining companies have been among the hardest hit in recent years" is claimed by Time.com, which goes on to use MicroStrategy as a prime example of a company that skyrocketed in value and plummeted in the "tech crash" later. Oh, and by the way, they also overstated earnings. What these articles about the "tech crash" need to do is normalize the comparisons, because these companies that ballooned in value so much, then crashed, probably just experienced a slight correction due to the stupid values they attained to begin with!
As for datamining itself: more power to them. The government gaining the ability to mine the data it already has should mean that we don't need more organizations, more intrusive investigations, etc. Every report or credible news item about post-9/11 studies indicates that we already had enough information, so there should be no need to create new laws that allow for more information to be collected. Just use what you have already, kthx.
What would be nice is if this data-mining allowed Muslims living in the U.S. to stop having to worry whenever they go outside. Look at the information publicly available, which may provide patterns of "nonobvious" connections, and let people live their lives in peace, regardless of background.
As a consumer, everything I do in public I consider public information. If a business uses this to better serve me, all the better. Maybe this will mean I don't have to watch feminine ads on TV, or the phone gets answered faster when I call. Maybe it just means that the customer rep knows my name and what I bought already.
Digging For Autism Correlations
by
Baldrson
·
· Score: 2, Interesting
This is a case where what was "mined" was not just the raw data but various arithmetic combinations of statistical variables derived from the data. There needs to be some additional work to make the figure of merit not just correlation but statistical significance. I couldn't find Perl modules that provide "alpha" (the probability of a correlation this strong arising if the null hypothesis is true) for correlations.
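For what it's worth, a significance figure for a correlation can be estimated without any special module by using a permutation test. The sketch below is in Python rather than Perl, uses only the standard library, and runs on made-up toy data; the function names are hypothetical and this is not the method used in the analysis described here.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Share of random pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy data, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]

r = pearson_r(xs, ys)
p = permutation_p_value(xs, ys)

if __name__ == "__main__":
    print(f"r = {r:.4f}, permutation p-value = {p:.4f}")
```

One caveat: when thousands of variable combinations are screened, some will clear any fixed threshold by chance alone, so the per-correlation p-value would still need a multiple-comparison correction before it says much.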
At the end of the article, it mentions data mining helping to catch the DC snipers. Whoooooooa.
The cops had profiled a white male Christian terrorist, and that's all they were looking for. You didn't catch it in the article, but the real perps were stopped **10** times at roadblocks; they were in custody that many times.
And they were let go because their skin color contradicted what the data mining told the cops. They weren't caught until a Maryland state trooper leaked the license plate, then a trucker at a rest stop made the collar.
Data mining won't solve the stupidity of leaders like Chief Moose.
Re:Before You Jeer...
by
arasinen
·
· Score: 2, Interesting
Another good book that explains the basics of data mining is Principles of Data Mining by Hand et al.
It is perhaps not the most simple book around, but it covers a lot of important issues. Furthermore it doesn't ignore the role of computer science, as two of the authors have a CS background.
You won't find explicit instructions about how to build your own Google, but it surely does wonders for your insight.
-- [ Antti Rasinen ]
Data Mining is the wrong term
by
nrobert
·
· Score: 2, Interesting
The term data mining is misleading. Mining is more a matter of sifting through lots of junk to get at the valuable material. That's not exactly what 'data mining' is about.
If you want valuable information and you know what you're looking for, you just query. Find X in pile of data. That's mining. I know it's a semantic comment, but mining's not what we're talking about doing here.
Data mining is more like what geneticists searching for a genetic cause for a cancer are doing. Finding usable correlations and meaningful precursors. We don't call cancer-fighting biologists 'gene miners'. I think the term mining belittles a more complicated activity.
A better term? Data Correlating? Mining also just sounds brutish.
-- ---
Programmers do it with their digits!
Re:The problem with automatic identification
by
nrobert
·
· Score: 2, Interesting
I'm not sure the goal is to have the miner spit out names of confirmable terrorists with that kind of accuracy. Your comment is fair if you're looking for that kind of entirely automated solution, but that's not the goal. It doesn't need to be 100% accurate in order to mitigate risk and pay for itself. Neither does the J Crew web site product predictor.
The goal is definitely to help single out people that are worth further investigation. By motivated, thinking, observant humans. That's all.
I also think you might be a little bit reductionist in your estimate of 100 terrorists. It's quite possible that there are many more, though I suppose it doesn't matter because even if you're looking for just one person, it's still worth doing.
Given that you're looking for a reasonably good filter to find qualifiers for a round of investigation, a better metric to use might be the number of people you're willing to investigate as a ratio against those you hope to positively I.D. You might argue that you'd be happy to investigate 5,000 people just to find one 'terrorist'. If so, and you're looking for an estimated 100 terrorists, you can multiply to get the number of 'persons of interest': 500,000, or 0.19% of the USA population. This % is much more achievable, and besides, then you use a different algo to ID which of these you should interview first or do MORE research on first.
It seems pretty manageable to me. I also think your assessment of the 50% false negative rate is too rosy. It seems to me that the risks would be serious enough of even 1 getting away (as in scanning baggage for instance) that you'd want to cast the widest net possible and then narrow those carefully. False negatives may be more costly than you are suggesting.
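The persons-of-interest arithmetic above can be spelled out as a tiny calculation. The 5,000 and 100 come from this comment; the population figure is an assumption picked so the result matches the 0.19% quoted, and a figure nearer 288 million would give about 0.17% instead.

```python
def persons_of_interest(estimated_targets, investigations_per_target):
    """Size of the pool you are willing to investigate."""
    return estimated_targets * investigations_per_target


def share_of_population(count, population):
    """Fraction of the population that falls inside the pool."""
    return count / population


ESTIMATED_TARGETS = 100            # from the comment
INVESTIGATIONS_PER_TARGET = 5_000  # from the comment
ASSUMED_POPULATION = 263_000_000   # assumption, not stated in the comment

pool = persons_of_interest(ESTIMATED_TARGETS, INVESTIGATIONS_PER_TARGET)
share = share_of_population(pool, ASSUMED_POPULATION)

if __name__ == "__main__":
    print(f"persons of interest: {pool:,}")
    print(f"share of population: {share:.2%}")
```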
-- ---
Programmers do it with their digits!
The Beast
by
macdaddy357
·
· Score: 3, Interesting
Does this data mining stuff remind anyone of the old urban legend about "The Beast?" A super computer in Antwerp or Brussels that knows everything about everyone? Is that idea still as ridiculous as it was back in the day?
Power to the people!
Planet P Blog - Liberty with Technology.
www.enthea.org
2 06 ,00.html
http://www.business2.com/articles/mag/0,1640,41
So, I decided to mine almost 200 by-State demographic variables for correlates to autism by running through every combination of 2 variables via multiplication or division under a polynomial, exponential or null transformation -- then sorted them by their correlation to autism in the year 2000.
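A search like the one described might look roughly like the sketch below. It is one interpretation, not the original code: "polynomial" is stood in for by squaring, the variable names and toy data are invented, and the helper names are hypothetical.

```python
import itertools
import math


def pearson_r(xs, ys):
    """Pearson correlation, or None if either list is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# "identity" plays the role of the null transformation.
TRANSFORMS = {
    "identity": lambda v: v,
    "square": lambda v: v * v,  # assumed stand-in for "polynomial"
    "exp": math.exp,
}


def derived_columns(variables):
    """Yield (label, values) for the product and ratios of every variable pair."""
    for a, b in itertools.combinations(variables, 2):
        yield f"{a}*{b}", [x * y for x, y in zip(variables[a], variables[b])]
    for a, b in itertools.permutations(variables, 2):
        if all(y != 0 for y in variables[b]):  # skip ratios that divide by zero
            yield f"{a}/{b}", [x / y for x, y in zip(variables[a], variables[b])]


def rank_candidates(variables, target):
    """Return (label, r) pairs sorted by descending |r| against the target."""
    ranked = []
    for label, values in derived_columns(variables):
        for name, transform in TRANSFORMS.items():
            try:
                transformed = [transform(v) for v in values]
            except OverflowError:
                continue  # exp() of a large value; drop this candidate
            r = pearson_r(transformed, target)
            if r is not None:
                ranked.append((abs(r), f"{name}({label})", r))
    ranked.sort(reverse=True)
    return [(label, r) for _, label, r in ranked]


# Toy data, invented for illustration; the target is built as a*b on purpose.
variables = {"a": [1, 2, 3, 4, 5], "b": [2, 1, 4, 3, 6], "c": [5, 3, 4, 1, 2]}
target = [2, 2, 12, 12, 30]

ranking = rank_candidates(variables, target)

if __name__ == "__main__":
    for label, r in ranking[:5]:
        print(f"{label:20s} r = {r:+.3f}")
```

Even three toy variables produce 27 candidates here, so with almost 200 real variables the candidate count runs into the hundreds of thousands, which is why the top of such a ranking needs a significance check before anyone reads much into it.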
Seastead this.
How ya like dat?