Wal-Mart's Data Obsession
g8oz writes "The New York Times covers Wal-Mart's obsession with collecting sales data.
Fun fact: 'Wal-Mart has 460 terabytes of data stored on Teradata mainframes at its Bentonville headquarters.
To put that in perspective, the Internet has less than half as much data, according to experts.'
That much information results in some interesting data-mining. Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"
and shopping there means your income has dropped 7-fold
Who says how much data the Internet has available?
I, for one, would like to welcome our new (evil) data collecting overlords.
"Sanity is not statistical", George Orwell, "1984"
My company alone has over 50 terabytes of data available for download on the internet. Whoever thinks there's that little data on the internet is very poorly-informed.
you fools have no idea that I would never let you hurt the Wall-Mart
Someone at Walmart has A LOT of pr0n!
Walmart probably doesn't even know what all that data means. Think of the processing power needed to make sense of it all. I'm sure there are countless interesting trends that are lost in that data ocean.
-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-
MaxPower (2263)
"I got it from a hair dryer."
When you have 460TB of data, how the hell do you even begin to search it?
Seems like they'd need to license MapReduce from Google or something. (That's a distributed data correlation engine, with extremely high fault tolerance to boot.)
the Internet has less than half as much data, according to experts
What's the word I'm looking for? Oh yeah - it's bullshit
...Microsoft has an astonishing amount of information collected from Windows Update users (none of it personally identifiable, of course).
I highly suspect Wal-Mart didn't get into the position it's in of being the largest retailer by being stupid, at least business-wise. This is the sort of project that allows them to stock a 120,000 square-foot big box store from JIT shipments every night, and why every Wal-Mart in a region looks the same. Though I would be interested to read more on the pop-tart to hurricane correlation...
They're storing them on a huge cluster of their $200 Lindows systems. ;)
Marge, get me your address book, 4 beers, and my conversation hat.
Correlation doesn't imply causation!!!!!
I mean what if a third factor caused both the hurricanes and strawberry Pop Tart sales to increase 7-fold????
Somebody was going to blurt that bromide out at that statement, so it may as well be me.
Seastead this.
As a guest of WalMart I was able to enter their data center and see this Terraplex first hand. It's massive. It's thousands upon thousands of disks in ~8' frames, rows upon rows of racks. I walked down it and across it and up it and was simply awestruck by the idea of that many disks in one spot.
The gentleman who gave me the tour indicated they have something like 72 weeks (1 year plus 2 weeks) of purchase data on LIVE disk arrays, plus huge archives of the same data on tape. If you buy anything and use your credit, debit, or whatever card, they can figure out your sales history obscenely quickly. Be afraid. Be very afraid.
I also got to see Walmart.com (Sun E15k) and Samsclub.com (A bunch of HP boxes in a smallish frame), they were creepy, in a sense... all those sales going on at once, converging on a spot not a few feet from me.
If Walmart created a web interface for their data, would the amount of data on the Internet suddenly triple?
I think the expert they got their information from was full of baloney.
--
RumorsDaily
I've been reading the comments
I forgot, are we supposed to hate Wal-Mart?
On one hand they are a large corporate empire, and on the other they promote cheap Linux computers.
Arg, I'm so confused.
Yes I did. God help me!
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
A few years ago when I worked in retail, everything was going smoothly. Every night the managers would go around with electronic guns and see what needed ordering the next day. Except for the busiest times of the year the backroom was pretty much empty of stock, and on top of the aisles the extra stock was minimal.
Then one day, the managers were really excited, as we were going to have a computer order everything for us, from records of sales from before and it would "predict" what we would need. They said the extra stock on top of the aisles would be eliminated. We would be able to concentrate on customer service.
Well, the day came, and for a few months you could tell the computer was fighting with limited data. Some weeks we would be ridiculously overstocked on a few items; other weeks the leading sellers in the store would have empty shelves. When it finally settled down after a year, it was worse than before the computer.
The tops of the aisles were jammed to the ceiling with stock, there was never any room to put anything up there, and getting to the bottom for something you needed cost a lot of time. Plus, the backroom was packed with stock. You could hardly move around, and trying to find the last box of something buried underneath these huge piles was a task that killed your morale. During the slow months, one stocker for the whole store used to be enough for a night; now 3 were common to deal with all the stock.
Yeah, all that evil marketing data is really oppressing the masses and restricting the free flow of ideas.
Mathematics is made of 50 percent formulas, 50 percent proofs, and 50 percent imagination.
My brother sells mangoes to the Wal Mart Beast. He says it's all computerized, beginning with an order for the fruit, following the trucks, even the rotation of the ripening process in the warehouses is computer related. It's as close to virtual management as any company comes.
Anyone seen my jagged little pill?
The Law of truly large numbers.
Basically, the more data you have, the more likely you'll find weird coincidental correlations.
I guess these kinds of 'statistical findings' will become more and more prevalent in the future, given that we're living in an age where we're collecting ever-larger amounts of data, and have the resources to process all this data automatically.
It would be a good thing if people were a bit more sceptical of this kind of stuff. Correlation isn't causation.
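A rough way to see the point above for yourself: generate pure noise and go looking for correlations in it. This is only a minimal sketch; the series counts, the series length, the seed, and the function names are all made up for illustration, and none of it has anything to do with Wal-Mart's real data.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(num_series, length, seed=0):
    """Build unrelated random series and return the largest |r| over all pairs."""
    rng = random.Random(seed)
    series = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(num_series)]
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


# Hypothetical sizes: 5 "products" versus 200 "products", 10 weeks of noise each.
few = strongest_chance_correlation(num_series=5, length=10)
many = strongest_chance_correlation(num_series=200, length=10)
print(f"strongest correlation among 5 random series:   {few:.2f}")
print(f"strongest correlation among 200 random series: {many:.2f}")
```

With 200 series there are 19,900 pairs to compare, so some pair will look strongly related even though every number is random. That is the whole trick behind a lot of "did you know" statistics.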
Did you know?
EVERY TIME A LOAF OF BREAD IS BAKED,
APPROXIMATELY
150,000,000 YEASTS ARE
KILLED
Come to the award-winning 1987 film,
"The Very Small and Quiet Screams"
-- a cinematic electromicrograph of yeasts being baked.
A must for those who care about yeast, and especially for those who don't.
SPONSORED BY
Brown Anaerobe Rights Coalition (BARC)
Student Bakers for Social Responsibility
Coalition for the Elevation of Life (CELL)
Defend all life: "From greatest to least, from human to yeast!"
Help Fight SPAM today!
People who call themselves "experts" but are really just talking out of their asses do. Consider that The Internet Archive alone contains more than a petabyte (1024 terabytes) of data, all of it accessible, and that they are adding on the order of 20 terabytes a day, and you start realizing how much bigger the Web is.
Perhaps non redundant DATA?
I hate to sound like some pro-totalitarian next generation Big Brother, but it's not as if they are collecting personal information on customers without the customer's consent. Wal-Mart are just doing some major (I agree with obsessive though) market research so that they can optimise their stores to maximise profits, exactly the same as every other business in the world.
Fat people are hard to kidnap
Coworkers who have worked with Wal-mart IT tell me that Wal-mart does indeed have mountains of data. However, they have so much data that they do not know what to do about it. They can't interpret it all because there is just too much of it.
This makes me wonder... there must be some ideal point where a certain amount of data collected is worth the most money because you can act on that data. After that point, collecting additional data is increasingly more costly and counterproductive unless you invest in an infrastructure that lets you process more data. How does one figure out that ideal point? Just a thought.
Wal-Mart employees who use their employee discount cards have every purchase tracked and monitored.
Activity of the cards is ACTUALLY monitored for discrepancies in buying habits to find abusive employees who buy things for their friends?
Did you also know Wal-Mart's employee name badges have RFID tags (and have had for many years) that allow Wal-Mart to track where an employee is at any given time?
Another interesting tidbit, did you know at Wal-Mart's jewelry warehouses they actually WEIGH the amount of metal in your body when you enter and leave? (And I don't mean they ask you to put things in a dish and weigh the dish - they scan YOU)
Another interesting thing, Wal-Mart has a fallout facility in Oklahoma that has a near-real-time backup of each BIT of that 460 terabytes of data?
Wal-Mart could survive a direct nuclear blast and still keep on a truckin'.
And, of course, if you're in a Wal-Mart home office - ISD building - distribution center - et al... and dial 911 - BOOM - you get Wal-Mart's private security? Niiice, hope it's not a real emergency, you first have to explain it to them - then if they deem it necessary THEY will call the REAL 911!
Did you know hurricanes increase strawberry Pop Tarts sales 7-fold? ...and if you needed a 460 TB data array to tell you that then you're too stupid to live.
You're using her as bait, Master!
We learned a lot about Walmart and Data mining in my database 101 class. And the professor asks "Why do you think Walmart is so successful?"
And everyone says something about leveraging technology and JIT delivery, etc.
Professor Liu says "Nope. Location."
Walmart chose most of their initial locations in cities/regions where there was no other competition. Places where there was no Kmart, no department stores, no malls. And they flourished.
In the future, I would want to not be isolated from my friends in the Space Station.
The Internet definitely has more data than Wal-Mart. Consider this old 2002 study. The "deep web" alone, made up mostly of databases, contains 91,850 TB of data. And this was a couple of years ago. It doesn't include email or P2P either.
The definition they used for "Internet" was probably "web pages indexed with a search engine", which is definitely not the entire Internet.
My company has 300,000 employees each of whom has about 40GB on their desktops. That's 12,000,000 GB which is 12,000 TB most of which is junk.
For which it stands, one store under God, indivisible, with sales and product for all.
From the article:
"You can see the pattern of Wal-Mart's mandates, and as Wal-Mart grows in power, it is getting more dictatorial.....Wal-Mart lives in a world of supply and command, instead of a world of supply and demand."
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
Political parties are using consumer shopping patterns to figure out who to reach with 1-to-1 political messages.
Stuff like: women who buy from catalogs, eat "crunchy" peanut butter, own a cat and drive a minivan are 87% more likely to react positively to prayer in schools as a "motivating issue."
I just made that up, but it's the sort of thing they find out. No tin-foil hats here - corporations and pollsters are shelling out millions of dollars for this stuff.
Here are a few Google search links to get you started:
Acxiom
Seisint
Uh, except that Google hasn't indexed all of the publicly available WWW. It's only indexed a small fraction of it. And the WWW isn't the Internet. They're different. Secondly, the Internet Archive alone has archived 1 petabyte of data so the figure of 230 terabytes of data on the Internet is obviously wrong.
Support the First Amendment. Read at -1
... do they have a freezer big enough for 460TB worth of drives?
Perhaps you should switch to Wal-Mart. I hear Wal-Mart Pharmacy has the cheapest anti-psychotic medications in the US.
Hugh Hefner?!? Dude, didn't think you'd be posting anonymously! Share the wealth, man :)
Condemnant quod non intellegunt.
That means that the internet has well over a petabyte of information on it. Much of the information is probably the same, but it is on the internet.
Also, don't forget that the internet includes Usenet and other services under the protocol, which has TONS of additional data. Chances are, the internet is not 230 terabytes large and the idiot who made that claim...is an idiot.
A blog like any other.
1.5 megabytes of data at walmart
Understanding your method of assessing the data includes lumping data about vendors, data about shipping, inventory status (alone, a huge category), etc., 1.5 MB "per person" isn't huge. The error is in your model as most of the system contains data about things other than customers.
That said, you would be surprised what /is/ tracked about customers. I worked with a Fort Lauderdale company a few years back that provided the back-end processing and data warehousing for many grocery discount card programs. They would routinely demonstrate that of the three-hundred data points they collected on a given consumer, one of them was the time of the month a woman had her period. Men weren't exempt either, as they tracked items such as condom sales and kept a score for us as well.
The best thing a consumer can do to counteract this consumer surveillance is to toss junk into the system. Here are a few suggestions:
- borrow your mom's/mother-in-law's card and go on a shopping spree for frozen pizzas, candy corn, condoms and saran wrap.
- apply for new cards all the time. provide creative answers as to your address, occupation (animal disposal officer is one of my favorites - someone must be puzzled how many dead animals there are in my city from all the people with this occupation). BE SURE TO ONLY USE CASH with these cards so they don't get an identification anchor.
- spike the data with sustained purchases of one product for a period of time. this is especially fun at smaller retailers that use inventory management - keep buying them out of one product (preferably low cost and low shelf inventory so it is easier and cheaper to do). keep it up for 90 days. then stop buying it and go to another store.
The more you can junk up purchases (especially on anchored cards like friends, in-laws, etc. that have different buying habits), the less valuable the database is.
That much information results in some interesting data-mining. Did you know hurricanes increase [non-perishable food item] sales 7-fold?
It took them 460 terabytes of data to figure out that hurricanes make people buy more non-perishable food than usual?
Wow, data mining is "useful"...
You can't take the sky from me...
Your number is wrong, from their faq:
The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is currently growing at a rate of 20 terabytes per month.
That's 20 terabytes per month, not per day.
They got their Internet statistics from the Chinese government.
First of all, most Walmarts don't primarily sell food, they primarily sell loads of other stuff. In fact, what they sell is a lot of stuff that people might need to survive a hurricane, including various kinds of hardware, containers, lights, reading material. So a hurricane would naturally drive lots of people into Walmart. Naturally those people will buy food products while they're in there, and the standard Walmart sells mostly junk food. So it's not as if people are seeking out pop-tarts in hurricane season, but the massive influx of people buying all kinds of things will also increase the number of people buying non-perishable junk food.
Consider also that people will not be worrying about their diets when they're primarily worried about not being killed by their own rooftops...
Combine a bunch of these factors together, and yes, I can easily believe 7x.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
That's 20 terabytes per month, not per day.
Even with that number, I wouldn't want to be the Hard Drive specialist...
Interviewer: Would you care to describe your previous job?
-Installing HDs 24/7.
I live in Soviet Canuckistan you insensitive clod!
As mentioned by a friend when referring to his video clip collection (but it doesn't help the videos/films he makes):
"Oh, I have a few frigabytes of data."
"Frigabyte? What's that?"
"Oh, that's a friggin lot of data."
WalMart's 460 TB of data, shared among about 300M Internet users, would spread about 1.5MB to each person. That is, of course, a tiny amount of data - probably just the indices on each person's inbox, let alone their email data itself. Each of those people has an average storage capacity of over 20GB on a new computer, excluding upgrades, which are probably usually about 80GB. So just typical end user computers alone account for at least 10,000 - 40,000 times WalMart's big data dump. And then of course there are all the other servers on the Internet, like the SABRE airline reservation system, the US Federal databases of publications, Google's image cache, all the albums and other MP3/SHN/FLACs in P2P, and of course the endless stream of porn.
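For anyone who wants to check the back-of-envelope numbers above, here is a quick sketch. It assumes decimal units (1 TB = 10^12 bytes) and takes the 300M user count and the 20GB/80GB disk sizes straight from the comment; the variable and function names are just for illustration.

```python
# Decimal storage units (an assumption; binary units shift the results slightly).
MB = 10**6
GB = 10**9
TB = 10**12

walmart_bytes = 460 * TB        # figure quoted in the article
internet_users = 300 * 10**6    # the comment's estimate of Internet users

# Wal-Mart's warehouse spread evenly across every Internet user.
per_user_mb = walmart_bytes / internet_users / MB


def desktop_multiple(gb_per_user):
    """How many times Wal-Mart's 460 TB fits into all users' disks combined."""
    return gb_per_user * GB * internet_users / walmart_bytes


print(f"Wal-Mart data per Internet user: {per_user_mb:.2f} MB")
print(f"All desktops at 20 GB each: {desktop_multiple(20):,.0f}x Wal-Mart")
print(f"All desktops at 80 GB each: {desktop_multiple(80):,.0f}x Wal-Mart")
```

The per-user share comes out at about 1.53 MB, and the desktop multiples land at roughly 13,000x and 52,000x, the same ballpark as the 10,000 - 40,000 range quoted above.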
WalMart is trying to make itself look like it is turning its customer data into success, and benefits for its customers. That serves to downplay its reliance on labor exploitation, monopolistic competition when it enters local markets, and political favors that structure labor and market laws to give it a competitive edge. And WalMart might just be believing the IT sales hype that it spends millions of dollars on. But that's no reason we should buy their IT BS as much as we seem to buy their wares.
--
make install -not war
... More than 640 Terabytes anyway, right?
(did I just say that out loud?)...
[RM101's mind boggles]
Dude, do you seriously have nothing better to do than spend this crazy amount of time feeding junk data into a supermarket computer? Go outside. Breathe the air.
I dunno, maybe you WILL lay on your death bed, not thinking of your wife, or children, but you'll be proud of how many hours you spent contaminating some database.
Sometimes it's best to just let stupid people be stupid.
Why should we be afraid of Wal-Mart? They're using their data to be more responsive to their customer. They want to make sure that if you want something, it's in-stock and ready to go.
What could they do with their data, really, that would hurt anyone? It wouldn't be like "Bob Smith is buying condoms again." It would be more like "there's a condom spike in area code 78750 every Thursday, let's ship more out."
People who are afraid of data aggregation are jumping at shadows. Nobody cares what you in particular are buying. An individual as a data point is useless, unless you're an exemplar or something like that (which would be unusual).
Let's face it, individuals just aren't that interesting. More importantly from Wal-Mart's point of view, there's no return on looking at individuals.
A friend who worked briefly @ a local Walmart during the downturn in tech employment told me about the huge datacenters. Evidently he was told this in training, or a manager filled him in. Basically they are an IBM shop from what he said.
The systems have the layout of every Walmart store in them, and the stores respond to orders from the main office to move products around on the shelves. The systems will tell various stores to move products into different places, and analyze the results. If a store is making more money with XYZ sitting near the entrance, then the WOPR tells more stores to move that product into place, but still plays games against shoppers with a few more. It's basically an insanely well oiled statistical war against the shoppers to squeeze every last penny out of them. I hate to say it, but it doesn't work on me when I go there. But overall, it's creepy, and impressive at the same time.
PS- I had this evil idea. If anyone is into the hacktivism role, embed a voice recorder IC into a telephone set that matches your local WalMart's phones. Get the code to get on the PA system, and set up your "rogue" telephone to bump onto the PA every 5 hours or so. Be sure to include sounds to make it sound like someone is picking up the phone, and hanging it up. It will drive them nuts. Some stores seem to use Lucent sets on the wall (MLX-xxx) which are most likely ISDN on the back. Other stores seem to have analog ports on a Lucent system. Just remember to give me props. Feel free to announce all shoppers a winner of a contest where they get everything they can stuff into a cart for free. Or remind them about the $700,000 in taxes the minimum wage making people cost the community at every WalMart.
Southeastern Virginia REPRESENT!
Walmart has been doing this for a long long time. One of the things they discovered is that people who buy diapers usually also buy beer (in states where walmart can sell beer), and vice versa. So, they moved the beer and diapers to the same aisle, and ended up increasing their sales by like 7 times on both of these items.
Virtually everyone who keeps track of this sort of thing is looking for their own beer and diapers revelation. I used to run a data warehouse which tracked the paths users took through websites in order to lay them out better to increase revenue on ads or purchases. Mine only had 6TB of data though.
Target has been getting quite good at this, since it seems every time I walk into their store to buy one little thing, I walk out of there with a cart full of crap I didn't really need but thought would be nice to have.
'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.'
Apparently the "experts," overlooked alt.binaries.*
https://www.eff.org/https-everywhere
Actually the grandparent is correct. Walmart puts so much pressure on their suppliers to actually drop prices every year (inflation is for sissies) that they drive small manufacturers out of business. Not to mention the small businesses that it suffocates. There are towns that literally shop themselves out of a job. Heck, Walmart single-handedly put Vlasic in bankruptcy by forcing them to sell a gallon of pickles for $2.97. This is a fascinating article about why we should all boycott the place.
"I can not bring myself to believe that if knowledge presents danger, the solution is ignorance" - Isaac Asimov
I graduated from the Sam M. Walton College of Business at the University of Arkansas with a B.S.B.A in Information Systems. Wal-Mart was nice enough to donate a big chunk (~1 Terabyte) of information for us to datamine. It's pretty interesting stuff and very CPU intensive, as you can probably imagine; we tried not to do any CD burning while waiting on our results ;)
IIRC, It seems like one of the strange correlations we found is that the two items most commonly purchased together were beer and baby diapers. Go figure...
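For the curious, the "bought together" kind of finding mentioned above boils down to counting item pairs across receipts. Below is a minimal sketch of that idea; the baskets are invented toy data and the function name is hypothetical, so this is not how Wal-Mart or the class actually did it, just the simplest version of pair counting.

```python
from collections import Counter
from itertools import combinations


def count_item_pairs(baskets):
    """Count how often each unordered pair of items shows up in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("beer", "diapers") and ("diapers", "beer")
        # are counted as the same pair.
        items = sorted(set(basket))
        pair_counts.update(combinations(items, 2))
    return pair_counts


# Invented toy receipts, purely for illustration.
baskets = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["milk", "bread"],
    ["diapers", "beer", "milk"],
    ["bread", "chips"],
]

pair_counts = count_item_pairs(baskets)
top_pair, top_count = pair_counts.most_common(1)[0]
print(f"most common pair: {top_pair}, bought together {top_count} times")
```

On real data you would also want to compare each pair's count against how often the items sell on their own, since two popular items will show up together a lot just by chance.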