Wal-Mart's Data Obsession
g8oz writes "The New York Times covers Wal-Mart's obsession with collecting sales data.
Fun fact: 'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters.
To put that in perspective, the Internet has less than half as much data, according to experts.'
That much information results in some interesting data-mining. Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"
Looks like Wal-mart is hiding something
and shopping there means your income has dropped 7-fold
Who says how much data the Internet has available?
Get your own free personal location tracker
would like to welcome our new (evil) data collecting overlords.
"Sanity is not statistical", George Orwell, "1984"
My company alone has over 50 terabytes of data available for download on the internet. Whoever thinks there's that little data on the internet is very poorly-informed.
I agree,
Wouldn't Walmart's records constitute some part of the internet also? It has to be connected at some point to the internet, and given some clever haXing skills... one could access it.
It really depends on your definition of the bounds of the internet, but I think someone is being hyperbolic.
I'd be highly surprised if the internet combined didn't reach the exabyte mark ...
Sunny Dubey
you fools have no idea that I would never let you hurt the Wall-Mart
Someone at Walmart has ALOT of pr0n!
Think how much porn you could fit on 460 Terabytes!
:/
Maybe I'm obsessed with data too
Walmart probably doesn't even know what all that data means. Think of the processing power needed to make sense out of it all. I'm sure there are countless interesting trends that are lost in that data ocean.
-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-
MaxPower (2263)
"I got it from a hair dryer."
George Orwell got it all wrong! It wasn't 1984, it's 2004!
When you have 460TB of data, how the hell do you even begin to search it?
Seems like they'd need to license map-reduce from Google or something. (That's a distributed data correlation engine. With extremely high fault tolerance, to boot.)
...that is...extremely....lame... I wonder, does rain increase Halo 2 sales?
the Internet has less than half as much data, according to experts
What's the word I'm looking for? Oh yeah - it's bullshit
...Microsoft has an astonishing amount of information collected from Windows Update users (none of it personally identifiable, of course).
I highly suspect Wal-Mart didn't get into the position it's in of being the largest retailer by being stupid, at least business-wise. This is the sort of project that allows them to stock a 120,000 square-foot big box store from JIT shipments every night, and why every Wal-Mart in a region looks the same. Though I would be interested to read more on the pop-tart to hurricane correlation...
they're storing them on a huge cluster of their $200 Lindows systems. ;)
Marge, get me your address book, 4 beers, and my conversation hat.
To put that in perspective, the Internet has less than half as much data, according to experts.
According to other experts, "In June, an average of 8 million P2P users were online at any one moment, with 1 petabyte of data available to share."
http://digital-lifestyles.info/display_page.asp?section=cm&id=1396
Correlation doesn't imply causation!!!!!
I mean what if a third factor caused both the hurricanes and strawberry Pop Tart sales to increase 7-fold????
Somebody was going to blurt that bromide out at that statement, so it may as well be me.
Seastead this.
Uh... It doesn't have to be connected at any point to the net. Seeing as how they want to keep it to themselves....
stuff
The moderation on this guy amuses the hell out of me. Instead of saying "Why can't you be nice? -1 Troll" you say "Yeah, I know. -1 Redundant."
"Never attribute to malice that which can be adequately explained by stupidity." -- Hanlon's Razor
From summary "To put that in perspective, the Internet has less than half as much data"
Unless the mainframes are connected to the internet, in which case they're part of it. Does data have to be broadcast from a service to count?
As a guest of WalMart I was able to enter their data center and see this Terraplex first hand. It's massive. It's thousands upon thousands of disks in ~8' frames, rows upon rows of racks. I walked down it and across it and up it and was simply awestruck by the idea of that many disks in one spot.
The gentleman who gave me the tour indicated they have something like 72 weeks (1 year plus 2 weeks) of purchase data on LIVE disk arrays, plus huge archives of the same data on tape. If you buy anything and use your credit, debit, or whatever card they can figure out your sales history obscenely quickly. Be afraid. Be very afraid.
I also got to see Walmart.com (Sun E15k) and Samsclub.com (A bunch of HP boxes in a smallish frame), they were creepy, in a sense... all those sales going on at once, converging on a spot not a few feet from me.
Great, maybe they even have data on the average slashdotter; for instance, for every 3 people that read the article, a hurricane destroys northern Taiwan. ... now notice northern Taiwan isn't being hit by hurricanes... CONSPIRACY!!! PUT ON THE TINFOIL HATS!!!
Acxiom, who in my mind are far worse than data hoes... they sell your information to the highest bidder.. and thats their business model.. Wallyworld would never give up their data... for their own self interest of course
By its own count, Wal-Mart has 460 terabytes of data stored on Teradata mainframes, made by NCR, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.
What experts?
The NYT doesn't say.
Want more information? You can buy some more from the New York Times.
Everyone go to the back of your Wal-Mart and smash the mirror behind the little door... NOW!
;) *evil laughter*
Oh, and don't forget to shop at Target
If Walmart created a web interface for their data, would the amount of data on the Internet suddenly triple?
I think the expert they got their information from was full of baloney.
--
RumorsDaily
The people making that estimate are probably only counting 'legitimate' data on the WWW. They probably don't include, for example, data made available via file sharing, which would make 460TB look minuscule.
I've been reading the comments
I forgot, are we supposed to hate Wallmart?
On one hand they are a large corporate empire and on the other, they promote cheap linux computers.
arg, Im so confused
Yes I did. God help me!
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
A few years ago when I worked in retail, everything was going smoothly. Every night the managers would go around with electronic guns and see what needed ordering the next day. Except for the busiest times of the year the backroom was pretty much empty of stock, and on top of the aisles the extra stock was minimal.
Then one day, the managers were really excited, as we were going to have a computer order everything for us, from records of sales from before and it would "predict" what we would need. They said the extra stock on top of the aisles would be eliminated. We would be able to concentrate on customer service.
Well, the day came, and for a few months you could tell the computer was fighting with limited data. Some weeks we would be ridiculously overstocked on a few items; other weeks, the leading sellers in the store would have empty shelves. When it finally settled down after a year, it was worse than before the computer.
The top of aisles were jammed to the ceiling with stock, there was never any room to put anything up there, and getting to the bottom for something you needed cost a lot of time. Plus, the backroom was packed with stock. You could hardly move around, and trying to find the last box of something buried underneath these huge piles was a task that killed your morale. During the slow months, one stocker for the whole store was enough for a night, now 3 were common to deal with all the stock.
The article does not mention if that is compressed data or not. It seems like inventory & sales data should compress really well.
460 terabytes? chargen would seem to disagree.
i've seen single DC hubs that store more than Walmart
Wow, I didn't realise they still made mainframes. Ever since the DBC/1012 I thought they just ran Teradata software emulated under Unix or NT.
Now the DBC/1012's, with the hardware AMPs ... things of beauty :)
Google has 8E9 web pages and documents indexed. If the average document is 20 kB in length, then we have 160 TB of publicly available data on the internet, not including pictures and filesharing. The latter probably has a great deal of duplicate data anyway.
Avantslash: low-bandwidth mobile slashdot.
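For anyone who wants to replay the Google estimate above, here is a minimal sketch of the arithmetic. The 8E9 page count and the 20 kB average come straight from the comment; treating "TB" as decimal terabytes (10^12 bytes) and the variable names are assumptions made for illustration.

```python
# Back-of-envelope check of the "160 TB" figure from the comment.
# Assumption: decimal units (1 kB = 10**3 bytes, 1 TB = 10**12 bytes).
PAGES_INDEXED = 8_000_000_000      # "8E9 web pages and documents"
AVG_DOC_BYTES = 20_000             # commenter's guess of 20 kB per document
WALMART_TB = 460                   # figure quoted in the story summary

web_tb = PAGES_INDEXED * AVG_DOC_BYTES / 10**12
walmart_vs_web = WALMART_TB / web_tb

print(f"Indexed web (text only): {web_tb:.0f} TB")
print(f"Wal-Mart holds {walmart_vs_web:.3f}x that amount")
```

The result is only as good as the 20 kB guess; doubling the assumed document size doubles the total.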
I know walmart does an amazing amount of business, but I still don't see how their CRM system needs 400 terabytes. How much space do you need to say, "person A bought pop tarts, a CD, and milk on 11/14/04"
"brxref
They're also one of the most successful businesses in the country next to Microsoft. Maybe the data is working.
Wal-Mart has 460 terabytes of data
The Internet Archive has 100 terabytes of data.
My brother sells mangoes to the Wal Mart Beast. He says it's all computerized, beginning with an order for the fruit, following the trucks, even the rotation of the ripening process in the warehouses is computer related. It's as close to virtual management as any company comes.
Anyone seen my jagged little pill?
Imagine what evil could be done with this data: how about a service where you can track your spouse's/SO's buying habits? See if they buy condoms and flowers every night they work late for example. Imagine what would happen if they started keeping track of fingerprint data off of cash/checks that people use in stores too. Well I am off to go buy some tin foil now (with cash, wearing gloves) :-)
I Am My Own Worst Enemy
The Law of truly large numbers.
Basically, the more data you have, the more likely you'll find weird coincidental correlations.
I guess these kinds of 'statistical finding' will become more and more prevalent in the future, given that we're living in an age where we're collecting ever-larger amounts of data, and have the resources to process all this data automatically.
It would be a good thing if people were a bit more sceptical of this kind of stuff. Correlation isn't causation.
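The point about coincidental correlations can be illustrated with a small simulation. This is only a sketch under stated assumptions: every series is pure random noise, the 20-week length and the 2000 candidate "products" are made-up numbers, and `pearson` and `strongest_chance_correlation` are hypothetical helper names, not anything Wal-Mart uses.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(target, candidates):
    """Largest |r| between the target and any candidate series."""
    return max(abs(pearson(target, c)) for c in candidates)


rng = random.Random(0)  # fixed seed so the run is repeatable
WEEKS = 20              # assumed length of each sales series

# One "event" series and many unrelated "product" series, all random noise.
target = [rng.gauss(0, 1) for _ in range(WEEKS)]
candidates = [[rng.gauss(0, 1) for _ in range(WEEKS)] for _ in range(2000)]

best_of_10 = strongest_chance_correlation(target, candidates[:10])
best_of_2000 = strongest_chance_correlation(target, candidates)

print(f"strongest |r| among 10 random series:   {best_of_10:.2f}")
print(f"strongest |r| among 2000 random series: {best_of_2000:.2f}")
```

None of the series are related, yet searching more of them turns up a stronger-looking "finding", which is why a striking correlation pulled from a huge dataset deserves a second look.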
Did you know?
EVERY TIME A LOAF OF BREAD IS BAKED,
APPROXIMATELY
150,000,000 YEASTS ARE
KILLED
Come to the award-winning 1987 film,
"The Very Small and Quiet Screams"
-- a cinematic electromicrograph of yeasts being baked.
A must for those who care about yeast, and especially for those who don't.
SPONSORED BY
Brown Anaerobe Rights Coalition (BARC)
Student Bakers for Social Responsibility
Coalition for the Elevation of Life (CELL)
Defend all life: "From greatest to least, from human to yeast!"
Help Fight SPAM today!
The Internet Archive has a lot more info than that. And it grows by a lot each month. If they think Walmart's 460 TB of data is greater than the internet, I'd wager that they're wrong.
People who call themselves "experts" but are really just talking out of their asses do. Consider that The Internet Archive alone contains more than a petabyte (1024 terabytes) of data, all of it accessible, and that they are adding on the order of 20 terabytes a day, and you start realizing how much bigger the Web is.
Perhaps non redundant DATA?
I would assume this data is more than just shopping trends. I guess it includes surveillance photos, employee data, backups of it all, etc. If it is all shopping trends, they are either very observant or stalkers.
The "Internet" has a hell of a lot more data than what the article stated. I don't know about you, the last time I checked, the Internet is a collective of Web Pages, Usenet, IRC, Sharing Networks, etc.
Hell, DC++ (Direct Connect Client/Server) has had more than 500 terabytes of shared data in several of my favorite hubs.
My guess is that the "expert" is Al Gore.
internet website search for poptarts... ... ... ...
looking up geography of ip address....got it
purchase of poptarts within 20 minutes at walmart 5.3 miles from website search.
ip address also searched for toaster ovens but there was no purchase...better send an order for more ovens to that store.
contacting ip provider...got it
assimilating customer data...got it
sending snail mail to address about new toaster ovens at local walmart with 10% off ad...
Why read the article when I can just make up a snap judgement?
I hate to sound like some pro-totalitarian next generation Big Brother, but it's not as if they are collecting personal information on customers without the customer's consent. Wal-Mart are just doing some major (I agree with obsessive though) market research so as they can optimise their stores to maximise profits, exactly the same as every other business in the world.
Fat people are hard to kidnap
Coworkers who have worked with Wal-mart IT tell me that Wal-mart does indeed have mountains of data. However, they have so much data that they do not know what to do about it. They can't interpret it all because there is just too much of it.
This makes me wonder... there must be some ideal point where a certain amount of data collected is worth the most money because you can act on that data. After that point, collecting additional data is increasingly more costly and counterproductive unless you invest in an infrastructure that lets you process more data. How does one figure out that ideal point? Just a thought.
I'd say google would have a pretty good idea...
Wal-Mart employees who use their employee discount cards have every purchase tracked and monitored.
Activity of the cards is ACTUALLY monitored for discrepancies in buying habits to find abusive employees who buy things for their friends?
Did you also know Wal-Mart's employee name badges have RFID tags (and have had for many years) that allow Wal-Mart to track where an employee is at any given time?
Another interesting tidbit: did you know that at Wal-Mart's jewelry warehouses they actually WEIGH the amount of metal in your body when you enter and leave? (And I don't mean they ask you to put things in a dish and weigh the dish - they scan YOU)
Another interesting thing, Wal-Mart has a fallout facility in Oklahoma that has a near-real-time backup of each BIT of that 460 terabytes of data?
Wal-Mart could survive a direct nuclear blast and still keep on a truckin'.
And, of course, if you're in a Wal-Mart home office - ISD building - distribution center - et al... and dial 911 - BOOM - you get Wal-Mart's private security? Niiice, hope it's not a real emergency, you first have to explain it to them - then if they deem it necessary THEY will call the REAL 911!
What constitutes the internet anyway? I know some dc hubs on the internet that have over 100TB, sure it's p2p, but what about archive.org? I know they have at least a few dozen TBs by themselves. That number in the article can't be right at all.
Wal Mart has the most sophisticated retail and inventory control programs in the world. This is the reason they have eaten everyone's lunch.
Anyone seen my jagged little pill?
How the hell can they estimate that? Assuming "less than half" means about 45%, that gives us about 207 TB. Let's just round that up to 240.148445 TB to make it a nice, even number.
Google is searching 8,058,044,651 "webpages"* -- who knows what that means. Now, Google isn't searching every single page on the internet, certainly. But also, they can't be searching pages that don't exist. So the 8bn Google pages certainly aren't all of the internet. But Google isn't double or triple counting pages. Still, at 240.148445 TB (my rough estimate), we come up with a page size of exactly 32KB per page.**
Is this just counting the text? The code for this page right here (comments.pl) weighs in at about 14KB. Wal-Mart, in no way, has twice as much info as the internet. I would say the "internet" should be measured in at least petabytes. Archive.org itself already has 1PB, and I consider any of that content available to me "on the internet".
* I'm not even counting the Google cache.
** Which means Mr. Gates over-estimated by a factor of 20 when considering how much memory we all needed!
Small potatoes make the steak look bigger.
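The "exactly 32KB" punchline above can be checked with a few lines. This is a sketch that assumes binary units (1 TB = 2^40 bytes, 1 KB = 1024 bytes), which is the reading under which the figure comes out even. The 640 KB value is also an assumption: the footnote does not state it, but it is the number usually attached to the Gates remark and it reproduces the factor of 20.

```python
# Figures quoted in the comment.
PAGES = 8_058_044_651        # pages Google reported searching
ESTIMATE_TB = 240.148445     # the commenter's "nice, even number"

# Assumption: binary units, i.e. 1 TB = 2**40 bytes and 1 KB = 1024 bytes.
BYTES_PER_TB = 2 ** 40

bytes_per_page = ESTIMATE_TB * BYTES_PER_TB / PAGES
kb_per_page = bytes_per_page / 1024

# Assumption: the second footnote refers to the oft-quoted 640 KB figure.
GATES_KB = 640
overestimate_factor = GATES_KB / kb_per_page

print(f"{kb_per_page:.4f} KB per page")
print(f"640 KB is about {overestimate_factor:.1f}x one page")
```

With decimal terabytes the per-page size no longer lands on a power of two, so the joke only works in binary units.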
Did you know hurricanes increase strawberry Pop Tarts sales 7-fold? ...and if you needed a 460 TB data array to tell you that then you're too stupid to live.
You're using her as bait, Master!
We learned a lot about Walmart and Data mining in my database 101 class. And the professor asks "Why do you think Walmart is so successful?"
And everyone says something about leveraging technology and JIT delivery, etc.
Professor Liu says "Nope. Location."
Walmart chose most of their initial locations in cities/regions where there was no other competition. Places where there was no Kmart, no department stores, no malls. And they flourished.
In the future, I would want to not be isolated from my friends in the Space Station.
The Internet definitely has more data than Wal-Mart. Consider this old 2002 study. The "deep web" alone, comprised mostly of databases, comprises 91,850 TB of data. And this was a couple years ago. It doesn't include email or P2P either.
The definition they used for "Internet" was probably "web pages indexed with a search engine" which is definitely not the entire Internet.
Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?
These are hurricanes in Candyland and Structural Strawberry Pop Tarts which they use like plywood.
My company has 300,000 employees each of whom has about 40GB on their desktops. That's 12,000,000 GB which is 12,000 TB most of which is junk.
between their corporatism and conservative Christian roots, it's all underage third world children, and they're conflicted over whether to sell it, or feel guilty.
For which it stands, one store under God, indivisible, with sales and product for all.
From the article;
"You can see the pattern of Wal-Mart's mandates, and as Wal-Mart grows in power, it is getting more dictatorial.....Wal-Mart lives in a world of supply and command, instead of a world of supply and demand."
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
"Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"
Did you know that correlation does not imply causation?
what?
Political parties are using consumer shopping patterns to figure out who to reach with 1-to-1 political messages.
Stuff like: women who buy from catalogs, eat "crunchy" peanut butter, own a cat and drive a minivan you are 87% more likely to react positively to prayer in schools as a "motivating issue."
I just made that up, but it's the sort of thing they find out. No tin-foil hats here - corporations and pollsters are shelling out millions of dollars for this stuff.
Here's a few google searches links to get you started:
Acxiom
Seisint
Pointless comparison. There's hardly that much non-redundant, noiseless or meaningful data in the walmart database either.
...about 6784 bits assuming 128 bit ASCII
...Had this been an actual emergency, we would have fled in terror, and you would not have been informed.
Uh, except that Google hasn't indexed all of the publicly available WWW. It's only indexed a small fraction of it. And the WWW isn't the Internet. They're different. Secondly, the Internet Archive alone has archived 1 petabyte of data so the figure of 230 terabytes of data on the Internet is obviously wrong.
Support the First Amendment. Read at -1
... do they have a freezer big enough for 460TB worth of drives?
Funny!
Hugh Hefner?!? Dude, didn't think you'd be posting anonymously! Share the wealth, man :)
Condemnant quod non intellegunt.
Well. The body was redundant with the subject. Aka funny.
Except that the Internet Archive archives the same data over time. So, for example, a single website might be archived 100 times in slightly differing forms.
So while the amount of data predicted might still be wrong and probably is, it is not OBVIOUSLY wrong owing to the size of the Internet Archive.
Sunny
Be my Friend
To put that in perspective, the Internet has less than half as much data, according to experts.
someone realized that the DB servers are actually accessible from the internet and then bam, instant 2x increase in the amount of data on the internet.
"Not knowing when the dawn will come, I open every door." - Emily Dickinson
Um.
Owing to the size of the Internet Archive is obviously much better (4-fold better) than the prediction made in the article, as the Archive is a very tiny fraction of the collected Internet.
I don't see why storing the same site a 100 times yields less data on the web.
That means that the internet has well over a petabyte of information on it. Much of the information is probably the same, but it is on the internet.
Even discounting the P2P stuff, I can't believe that the WWW has less than 230TB, when the little bitty company that I work for has over 4TB online (and we _still_ run out of space!) and almost 30TB on tape. Especially when you consider that only 10TB of that was generated before I started 4.5 years ago.
Adherence to the truth is a form of disloyalty.
Everybody knows that the Internet contains more data than Walmart... So all that really happened here, I think, is that the NYTimes guy was irresponsible with his statistics. He probably just used that statistic off one study he happened to find that he thought was accurate. He probably doesn't know much about the internet, then, but we can't really know that until we look at his history.
So, how can they calculate the internet's terabyte amount? They can't. They just found a statistic and used it. Welcome to the national media.
While the internet archive archives the same data in slightly different forms, it's entirely internet accessible, therefore the size of the internet must be, at a _minimum_, larger than the archive itself.
WalMart can have a PetaByte OR two online ... and it's still tiddly-winks.
.... Space Imaging have more online imagery than everything Walmart "has" online.
NASA, EROS, ESA or anyone with LEO imagery.......say
Now - combine "just the names above"!
1/2 a PetaByte these days is like a bragging about your new 5MB disk drive back in '85......
Yeah... the Internet fits in a PetaByte. Get real.
From TFA: The experts mined the data and found that the stores would indeed need certain products - and not just the usual flashlights. "We didn't know in the past that strawberry Pop-Tarts increase in sales, like seven times their normal sales rate, ahead of a hurricane," Ms. Dillman said in a recent interview. "And the pre-hurricane top-selling item was beer."
Thanks to those insights, trucks filled with toaster pastries and six-packs were soon speeding down Interstate 95 toward Wal-Marts in the path of Frances. Most of the products that were stocked for the storm sold quickly, the company said.
Such knowledge, Wal-Mart has learned, is not only power. It is profit, too.
Now, imagine that in addition to stocking more of the products they know will see a sales jump in the stores in the affected area, Wal-Mart also bumps up the prices for those items by 5 or 10 cents in those stores-- maybe even more. Surely they could use their data mining techniques to find the 'sweet spot.'
It would almost be like profiteering in advance of the hurricane.
Some of the direct connect hubs I use have more than 230 terabytes EACH...
The more data they have about me, the more precisely they can meet my needs. When I walk in, they can detect my RFID and make a pile of the things I might want. I'll choose which I'm willing to pay for and walk through an RFID portal which adds up my bill and auto-deducts it from my debit account.
If I call them, their caller ID will recognize me and present me with some things they think I want (press 1 to pay 4.50 for a basketball; press 2 to pay a dollar for a kilo of ramen noodles; press 3 to pay 250 for an xbox with four controllers and Halo 2). I'll make my selections, authorize payment, and wait for delivery to my archived address.
If they're so damn efficient why not let them supply us with everything?
Also, don't forget that the internet includes Usenet and other services under the protocol, which has TONS of additional data. Chances are, the internet is not 230 terabytes large and the idiot who made that claim...is an idiot.
A blog like any other.
Fight Club man....we need to take notes from that movie.
Burn that data warehouse. Yaaahhh!!
Life is not for the lazy.
Having all of your eggs in one basket (or complex) seems risky. I wonder if walmart stores its backup tape data in a secure bunker? I wonder if walmart would survive if a disaster struck bentonville headquarters? Some how I doubt it...
"Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"
The sentence tells all.
That much information results in some interesting data-mining. Did you know hurricanes increase [non-perishable food item] sales 7-fold?
It took them 460 terabytes of data to figure out that hurricanes make people buy more non-perishable food than usual?
Wow, data mining is "usefull"...
You can't take the sky from me...
I'm not even going to address "noiseless" and "meaningful" since that's completely in the eye of the beholder - and, with Walmart holding this much data, there's a fairly good chance they believe it to be noiseless and meaningful.
However, as to redundant, I would wager lots of money on the idea that, excluding backups (imagine the length of that tape!), Walmart has no redundant data in that datastore. Even if two people bought precisely the same things at the same time, both with cash, they are still independent transactions, which mean something different because there are two of them than they would if there were only one such transaction. Contrast this with 67,000 P2P users having the identical copy of the latest song from Madonna - the data is redundant unless you're trying to extrapolate meaning (that is, data mining).
Hello. McFly.
[man walks away with tail between his legs]
Sunny
Be my Friend
Ok, people are going to be without power for a while, possibly a long while, and Walmart predicted the sale of nearly unperishable dry goods would rise? My God, the sheer genius of it baffles me!
Call me when they can mathematically prove which flavors are most popular in a hurricane.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
The story I heard was about Walmart and Roach spray. It seems that there was a particular brand of "Ant and Roach spray" that was doing brisk business in the south, but just sat on shelves in the Midwest. Walmart investigated to see why it wasn't selling. Turns out it was a cultural difference. Roaches are no big deal in the south, but in the Midwest, having a can of "Roach spray" in the cupboard is equivalent to admitting you're a dirty slob.
Walmart talks to the manufacturer, and gets them to create a new packaging - with the same product - without any mention of roaches. Put it on shelves in the Midwest, and it sells.
Marketing is good if it can give you what you need, even if you may not be aware of what that is. It's evil when it forces a product upon you for the sole purpose of sucking your wallet dry.
Your number is wrong, from their faq:
The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is currently growing at a rate of 20 terabytes per month.
That's 20 terabytes per month, not per day.
This is just being overly pedantic. Yes, the words "internet" and "WWW" (using a very loose definition of "word" here) are not interchangeable. However, to the general audience reading this FA, the two words have the same meaning. Most people don't grasp the difference between "web" and "internet" - asking someone about email, they'll often use the word "web" somewhere in the description of how they get it. "I download my email from the web into Outlook Express." Bzzt, wrong. But when TFA is speaking to the less technical, communication has been achieved, and, oddly enough, the exact meaning the author intended to get across, with wildly inaccurate terminology, is the exact meaning that the average reader interprets - the definition of perfect communication (with that average reader).
They got their Internet statistics from the Chinese government.
Though I would be interested to read more on the pop-tart to hurricane correlation...
Of course pop-tart sales go up in the light of an oncoming hurricane. And if you look, I'd bet water bottles, candy bars, and similar foods skyrocket similarly.
No-preparation or simple-preparation foods go up in the face of an emergency. Complex-preparation foods go down. In this context, "complex" can mean "anything that requires a stove" or even "foods requiring electricity or water to prepare".
People are stocking up to handle a few days without services, ffs.
Why is this the least surprising?
Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?
Post hoc ergo propter hoc
Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
Wal-Mart can probably do anything with that data. I doubt there are any legal impediments. Even if there are, Wal-Mart's got enough cash to ignore American law. Who knows where this information can go? Certainly a lot of people would pay truckloads of money for it.
I have never made but one prayer to God, a very short one: "O Lord, make my enemies ridiculous." And God granted it.
First of all, most Walmarts don't primarily sell food, they primarily sell loads of other stuff. In fact, what they sell is a lot of stuff that people might need to survive a hurricane, including various kinds of hardware, containers, lights, reading material. So a hurricane would naturally drive lots of people into Walmart. Naturally those people will buy food products while they're in there, and the standard Walmart sells mostly junk food. So it's not as if people are seeking out pop-tarts in hurricane season, but the massive influx of people buying all kinds of things will also increase the number of people buying non-perishable junk food.
Consider also that people will not be worrying about their diets when they're primarily worried about not being killed by their own rooftops...
Combine a bunch of these factors together, and yes, I can easily believe 7x.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
That's 20 terabytes per month, not per day.
Even with that number, I wouldn't want to be the Hard Drive specialist...
Interviewer: Would you care to describe your previous job?
-Installing HDs 24/7.
I live in Soviet Canuckistan you insensitive clod!
Think of the processing power needed to make sense out of it all.
nothing a beowulf cluster of linux xboxes couldn't handle!!
My problem? I was perfectly gruntled, until some numbnuts came by and dissed me.
If you are concerned about all this consumer information being used as 'big brother', maybe you ought to start doing something about it. Lying on the census or your income taxes is illegal, but marketers are fair game. The easiest way to mess with them is to tell them the opposite of the truth. Or, camouflage your true interests by entering a lot of junk. I.E. if you are pissed off that you didn't get a refund you were due from MicroCenter (notorious refund scammers) just fill out several hundred bogus refund forms. Jam the system.
If you're willing to break the law, you can even do worse harm. But I don't condone that.
Using legal methods to increase the entropy is the best way to fight the marketing databases.
300k employees all with desk jobs?
The internet has substantially more data than that. Heck, my ultra-small hobby company alone has around 1 TB on the internet. And privately I have around 0.5 TB shared over the internet from home. Then add all the other small hobby companies, billions of webpages, colocation servers, communities, p2p "seeders" etc. etc., and it will quickly pass 230 TB of data, many thousand times over.
but it's impossible, since posting the number in this comment would add to the total.
-- v --
No problem, drop on in!
As mentioned by a friend when referring to his video clip collection (but it doesn't help the videos/films he makes):
"Oh, I have a few frigabytes of data."
"Frigabyte? What's that?"
"Oh, that's a friggin lot of data."
I still think the comparison would be fairer if you fed the database to, say, bzip first. You're essentially compressing the internet as well by "encoding" similar files as pointers to each other.
WalMart's 460 TB of data, shared among about 300M Internet users, would spread about 1.5MB to each person. That is, of course, a tiny amount of data - probably just the indices on each person's inbox, let alone their email data itself. Each of those people's average storage capacity is over 20GB on new computers, excluding upgrades, which are probably usually about 80GB. So just typical end user computers alone account for at least 10,000 - 40,000 times WalMart's big data dump. And then of course there are all the other servers on the Internet, like the SABRE airline reservation system, the US Federal databases of publications, Google's image cache, all the albums and other MP3/SHN/FLACs in P2P, and of course the endless stream of porn.
WalMart is trying to make itself look like it is turning its customer data into success, and benefits for its customers. That serves to downplay its reliance on labor exploitation, monopolistic competition when it enters local markets, and political favors that structure labor and market laws to give it a competitive edge. And WalMart might just be believing the IT sales hype that it spends millions of dollars on. But that's no reason we should buy their IT BS as much as we seem to buy their wares.
--
make install -not war
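For anyone who wants to check the per-user arithmetic in the comment above, here is a quick sketch. It only uses the figures quoted there (460 TB, 300M users, 20GB and 80GB disks) and assumes decimal units; the variable and function names are made up for illustration.

```python
TB = 10**12
GB = 10**9
MB = 10**6

walmart_bytes = 460 * TB
internet_users = 300 * 10**6  # user count quoted in the comment

# Wal-Mart's data spread evenly over every Internet user, in MB.
per_user_mb = walmart_bytes / internet_users / MB


def times_walmart(disk_gb):
    """How many times 460 TB fits into all users' disks combined."""
    return internet_users * disk_gb * GB / walmart_bytes


print(round(per_user_mb, 2))
print(round(times_walmart(20)), round(times_walmart(80)))
```

With 20GB disks the combined capacity comes out around 13,000 times Wal-Mart's figure, and with 80GB disks around 52,000 times, so the quoted range is the right order of magnitude. This counts raw disk capacity, not unique data.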
But since Archive.org is archiving Internet stuff, that's just duplicates. What I'm interested is the unique data on the Internet compared to Walmart's own DB.
Correct, but if a sovereign state has access to all these data files from all the wal*marts of the world (for "security" reasons), then isn't that even more powerful?
who does the wal*mart off site backups?
If it was say, 'iron mountain' they have the entire voting records of the uk....so you're right when you mention the socks on their own, but combine that with *everything* and it is 1984.
I thought Britney was the Pop Tart...
Ok, so walmart has 460TB of data. What would this data actually be? Is it just 460TB of text documents that were compiled from every sale/inventory/stock order ever made? Or is it even larger like holding every world statistic of anything to do with anything that affects consumerism?
And how would one pull a pop-tarts statistic out of a 460TB database? Ctrl+F ? I'm serious. I really don't know.
Fussen
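To the "Ctrl+F?" question: a warehouse like this is queried, not searched by hand. You ask it an aggregate question in SQL and the database does the scanning. Below is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and every number in it are hypothetical; nothing here reflects Wal-Mart's actual Teradata schema, which the article does not describe.

```python
import sqlite3


def poptart_lift(rows):
    """Average weekly units in hurricane weeks divided by the average
    in normal weeks, for one (hypothetical) product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "week INTEGER, product TEXT, units INTEGER, hurricane INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    cur = conn.execute(
        """
        SELECT AVG(CASE WHEN hurricane = 1 THEN units END) * 1.0 /
               AVG(CASE WHEN hurricane = 0 THEN units END)
        FROM sales
        WHERE product = 'strawberry pop-tarts'
        """
    )
    (ratio,) = cur.fetchone()
    conn.close()
    return ratio


# Made-up rows: (week, product, units sold, hurricane week? 1/0)
ROWS = [
    (1, "strawberry pop-tarts", 100, 0),
    (2, "strawberry pop-tarts", 100, 0),
    (3, "strawberry pop-tarts", 700, 1),
    (3, "flashlights", 50, 1),
]
print(poptart_lift(ROWS))
```

On this toy data the ratio is 7.0, by construction. At 460 TB the same kind of query would run in parallel across many nodes, but the question you ask looks much the same.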
Something to watch.
Is Wal-Mart good for America?
It has to be connected at some point to the internet
I don't know why you would say that.
Walmart's storage breakdown (where the 460 TB goes)...
Illicit Pornography 200 TB
Hidden Toilet Camera archive footage 100 TB
Sys admin's private warez collection 80 TB
Previous employees' records 60 TB
CIO's MP3s 15 TB
Sales Records 3 TB
Records of Returned / Faulty Products 2 TB
... but that makes for only 40 bits, and so can only address 1 TB of data (plus epsilon from the carries).
I hereby place the above post in the public domain.
What is the point of so much data? How would you ever find anything in that mass of "stuff"?
... More than 640 Terabytes anyway, right?
(did I just say that out loud?)...
"...Microsoft has an astonishing amount of information collected from Windows Update users (none of it personally identifiable, of course)."
What would be interesting is the correlations coming from the NSA database?
"I mean what if a third factor caused both the hurricanes and strawberry Pop Tart sales to increase 7-fold????"
Why would people buy hurricanes?
Walmart can but hold a candle to Major League Baseball....
Insanity is a gradual process; don't rush it.
"If people are creeped out by these invasions of privacy they should shop elsewhere.(2)"
Actually I find it amusing. Why? Simple. One, people gave this information to Wal-Mart (1). There's a post elsewhere asking why he didn't destroy all the data? Well, why did you give it to them in the first place?
Also, I bet a lot of that data is conclusions drawn from the raw numbers (surprisingly accurate ones; any marketing students out there?).
(1) There are also no laws being broken here. Not legal, not moral, not ethical.
(2) I would argue that they're not invasions of privacy. If someone observes you in public, and draws accurate conclusions from that (conclusions that make you uncomfortable)? Is that an "Invasion of Privacy"?
Do you realise the volume of items Wal-Mart stores WORLD WIDE sell?
If anything, 460 TB seems like an understatement. Not to mention the claim that the Internet contains less than half of that. I alone have over a terrabyte of shit downloaded from the Internet. I seriously doubt there is only 229 more terrabytes to download.
Well, if you learn to spell terabyte properly, you'll save about 11% storage space for that word!
Why should we be afraid of Wal-Mart? They're using their data to be more responsive to their customer. They want to make sure that if you want something, it's in-stock and ready to go.
What could they do with their data, really, that would hurt anyone? It wouldn't be like "Bob Smith is buying condoms again." It would be more like "there's a condom spike in area code 78750 every Thursday, let's ship more out."
People who are afraid of data aggregation are jumping at shadows. Nobody cares what you in particular are buying. An individual as a data point is useless, unless you're an exemplar or something like that (which would be unusual).
Let's face it, individuals just aren't that interesting. More importantly from Wal-Mart's point of view, there's no return on looking at individuals.
I wouldn't be surprised if this whole article was bullshit, likely thrown together by a lazy editor who happened to catch CNBC's 2-hour documentary The Age of Wal-Mart this week. This pop tart tid-bit was one of the more interesting bits of trivia in the show, I suspect this fluff piece was written in reaction to it.
Anyway, the major news is about Wal-Mart. It's interesting to know such things. Now what really matters is what information they gather, and how.
Remember such stories about RFID at Wal-Mart? I remember a story about Wal-Mart illegally using it on test products.
My 2 cents...
Way to completely rip off the TV show about this that aired about a week ago. At the very least you could have added some new information...
You're nothing; like me.
Ah, but the Internet Archive makes new copies of the same site when they're updated. Frequently updated sites may have 20+ different versions of themselves archived.
It would be pretty badass to have a distributed file system(via Active Directory or similar) of 30 of those gigs per desktop(10 for core system/OS stuff). 9,000TB, just for the hell of it...
Try running Google Desktop Search on THAT thing.
Sooo, they are adding 20 TB/Month just to store additional duplicates of data that they already have? That's pretty silly.
"Remember, there never were pineapple-almond cookies here."
"With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
RFC 1925
"Actually, they don't need to be toasted. As a matter of fact, while PopTarts have been a staple of my diet for years, I almost never toast them... in fact, I can't remember having ever cooked a PopTart."
Try spreading a light amount of butter on the unfrosted side, and then toasting in a clean pan for a few minutes. Yum.
You think they don't compress it?
"Remember, there never were pineapple-almond cookies here."
The internet has well over a petabyte of data in it.
It has far less actual information...
Wow, you are stupid.
Do you even know what archive.org is?
A friend who worked briefly @ a local walmart during the downturn in tech employment told me about the huge datacenters. Evidently he was told this in training, or a manager filled him in. Basically they are an IBM shop from what he said.
The systems have the layout of every walmart store in them, and the stores respond to orders from the main office to move products around on the shelves. The systems will tell various stores to move products into different places, and analyze the results. If a store is making more money with XYZ sitting near the entrance, then the WOPR tells more stores to move that product into place, but still plays games against shoppers with a few more. It's basically an insanely well oiled statistical war against the shoppers to squeeze every last penny out of them. I hate to say it, but it doesn't work on me when I go there. But overall, it's creepy, and impressive at the same time.
PS- I had this evil idea. If anyone is into the hacktivism role, embed a voice recorder IC into a telephone set that matches your local WalMart's phones. Get the code to get on the PA system, and set up your "rogue" telephone to bump onto the PA every 5 hours or so. Be sure to include sounds to make it sound like someone is picking up the phone, and hanging it up. It will drive them nuts. Some stores seem to use Lucent sets on the wall (MLX-xxx) which are most likely ISDN on the back. Other stores seem to have analog ports on a Lucent system. Just remember to give me props. Feel free to announce all shoppers a winner of a contest where they get everything they can stuff into a cart for free. Or remind them about the $700,000 in taxes the minimum wage making people cost the community at every WalMart.
Southeastern Virginia REPRESENT!
"Did you also know Wal-Mart's employee name badges have RFID tags (and have had for many years) that allow Wal-Mart to track where an employee is at any given time?"
I dunno about any of the rest of this, but I know that's false. My mom is a store manager at Wal-Mart, and their badges don't have any RFID capability in them. Not yet, anyway. It wouldn't surprise me if that's coming. But not right now. Care to tell us where you get your information?
"Badges? We don't got no badges. We don't need no stinkin badges!"
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Yes I did. How? I watched the same CNBC show last week.
kimanaw: "Except it takes 8 Teradata DBAs to manage the 460 TBytes, and 23 Oracle DBAs to manage 1 Gig."
Tim C: "Where I work, we have two dozen or more active Oracle databases, and 2 DBAs."
Ummmm, that's nice. How big are they? kimanaw was talking about database size, and you are talking about database instances.
'course, I'm pretty sure this whole conversation is bullshit, but I just felt like pointing that out.
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
50 TB, is that all???? a place i used to work, the EROS DataCenter, USGS. had over 1 petabyte of publicly available over FTP satellite imagery. oh, and if you need DDS4 Tapes or DLT Tapes, if you order the free data on the tapes, they will ship them to you for free, you don't pay for the tapes, and you get to keep them. :)
"In a world without walls and fences, who needs Windows and Gates?"
Walmart has been doing this for a long long time. One of the things they discovered is that people who buy diapers usually also buy beer (in states where walmart can sell beer), and vice versa. So, they moved the beer and diapers to the same aisle, and ended up increasing their sales by like 7 times on both of these items.
Virtually everyone who keeps track of this sort of thing is looking for their own beer and diapers revelation. I used to run a data warehouse which tracked the paths users took through websites in order to lay them out better to increase revenue on ads or purchases. Mine only had 6TB of data though.
Target has been getting quite good at this, since it seems every time I walk into their store to buy one little thing, I walk out of there with a cart full of crap I didn't really need but thought would be nice to have.
Need Free Juniper/NetScreen Support? JuniperForum
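The "beer and diapers" hunt described above is usually called market basket analysis. A rough sketch of the three numbers people typically compute for an item pair (support, confidence, lift) follows. The baskets are invented for illustration and the function name is hypothetical; this is not anything Wal-Mart or the poster is known to run.

```python
def basket_stats(baskets, a, b):
    """Support, confidence (a -> b) and lift for one item pair."""
    n = len(baskets)
    has_a = sum(1 for items in baskets if a in items)
    has_b = sum(1 for items in baskets if b in items)
    has_both = sum(1 for items in baskets if a in items and b in items)
    support = has_both / n          # share of all baskets with both items
    confidence = has_both / has_a   # share of a-baskets that also hold b
    lift = confidence / (has_b / n) # > 1 means bought together more than chance
    return support, confidence, lift


# Invented receipts, one set of items per checkout.
BASKETS = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "wipes"},
    {"beer", "pizza"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
    {"diapers", "beer", "milk"},
]
print(basket_stats(BASKETS, "diapers", "beer"))
```

A lift above 1 is the signal that two items land in the same cart more often than their individual popularity would predict. Whether moving them to the same aisle actually raises sales is a separate question that only an in-store experiment answers.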
Regardless of what their 'estimate' of the internet's size is, the thought that a retailer has that much data stored is a bit scary, I think.
---- Booth was a patriot ----
Exactly. It also depends on what they mean by "Internet," because if they're going to count not only the WWW, but also FTP, filesharing and IM, I'm going to bet that the "Internet" has way more data than Wal*Mart's DB.
"Dad, how does walmart sell everything so cheap?"
"Well son, it's simple economics. Which I don't understand one bit."
If it's a telco or an IBM, even the ones who don't have desk jobs would be carrying around laptops for diagnostics (at the very least).
However, just because you have a 40GB hard drive doesn't mean it's full, or that its data is unique from the next guy's.
Rod Taylor
"If wisdom grew on trees, you Sir, would be a bush".
Shouldn't "bush" be capitalized in that sentence?
"Fun fact: 'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters.'" Those sick fscks. Why would anyone need that much p0rn in their office, and why aren't they sharing it? The Internet is only 230 TB? I have a little over 2TB of data myself; does this mean I've already collected almost 1% of the Net's p0rn? That can't be right.
I've been all over the USGS site for many years, always wanted to get some of the stuff on tape but never knew how to order it. If you've got a few spare minutes and can let me know how to order the data on tape, it would save me lots of time doing downloads. psillyjim AT gmail DOT com Thanks!!
Stupid? Hardly. If archive.org is not running each new copy of all the sites they archive through some kind of diff utility, then THEY are stupid.
Did you know that Wal-Mart anally implants a tracking device into each employee?
/conspiracy
It's to measure the stool production of each employee. You see, Wal-Mart realized that you can only eat so much during your breaks. Excessive stool production implies that your breaks are too long. Any employee with excessive stool production is flagged and actively monitored by store management.
It also is linked into the in-store McDonalds'. If an employee is producing too little stool, the employee is forwarded to the McDonalds for a quick snack, increasing the blood sugar of the employee and boosting productivity.
Except that the Internet Archive archives the same data over time. So, for example, a single website might be archived 100 times in slightly differing forms.
Here's a clue for you:
$ man diff
Not good for pictures?
$ man xdelta
Tell me, are they seriously archiving complete copies of the same website instead of diffs? If so, their admins are unbelievably stupid and need to be shot.
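To make the diff argument concrete, here is a small sketch with Python's standard difflib: keep the first snapshot whole and store only a unified diff for the next one. The page contents are invented, and the thread does not establish whether the Internet Archive stores deltas or full copies; this only shows how much smaller a delta can be when little has changed.

```python
import difflib


def delta_between(old_text, new_text):
    """Return a unified diff (one string) turning old_text into new_text."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="snapshot-1",
        tofile="snapshot-2",
    )
    return "".join(diff)


# An invented 1000-line page where a single line changes between crawls.
page_v1 = "".join(f"line {i}\n" for i in range(1000))
page_v2 = page_v1.replace("line 500\n", "line 500 (updated)\n")

delta = delta_between(page_v1, page_v2)
print(len(page_v2), len(delta))  # full second copy vs. stored delta, in characters
```

The trade-off is that reconstructing an old version means replaying deltas, and text diffs do little for images or other binary files, which is where something like xdelta comes in.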
Don't worry, the extra r is recycled.
When I read the mainframe bit, at least I knew someone there had some smarts.
They may not have k3wl fading-window interfaces like Windows, but at least mainframes "just work".
Sure, the power bill for their big iron probably gets hidden deep in their shareholder reports (it probably consumes enough power to keep half the Eastern seaboard powered during the summer months), but its uptime is measured in scientific notation.
Oh, and best of all, no three finger salute.
----
Running 'Nix is like owning a Lightsaber. It's "a more elegant weapon for a more civilized time."
If any of that data is indexed, I'd imagine the indexes alone could require a substantial amount of storage space. When you factor out the indexes and other interim data (used for management or aggregation), how much REAL data is left?
It also makes me wonder if Walmart isn't either buying or aggregating data from other sources. I can't say it would surprise me, since no legislator seems to give a rats ass about protecting consumers.
Well, I don't think they are that simple minded. I would suppose that they only store the changes between the versions... and that they compress everything too. Of course, please realize that I am talking completely out of my ass... I am just throwing ideas into the air. It is possible that they have some form of a version control system set up, just to minimize redundancy.
It is really quite simple (and in the best
interests of USA's national security, too).
Require that each and every container cargo
box from overseas be inspected prior to
entering US territorial waters -- as in,
at the 12 mile boundary and not at the
point of origin. It is the only way to
prevent WMD (WalMart Merchandise Dumping)
from entering the country.
'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.'
Apparently the "experts" overlooked alt.binaries.*
https://www.eff.org/https-everywhere
Look at suprnova.org. The number of unique data sets is the number of torrents. They don't publish the total size of all torrents, but suppose you have an average 300 MB. Multiply by the number of torrents (bottom of page), and you get about 100 TB right there.
If instead you look at the number of seeders, it is like 2 PB, just not all unique.
You think they don't compress it?
What does that have to do with anything? The issue is not the amount of storage, it's the sheer volume of data available.
If I compress a PB of data, I may be only using about..say... 100TB of storage, but it's still a PB of data.
"This calls for a very special blend of psychology and extreme violence" - Vyvyan "The Young Ones"
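The storage-versus-data distinction in the comment above is easy to demonstrate with Python's standard zlib: the compressed blob takes less space, but decompressing it gives back every original byte. The sample record below is made up and extremely repetitive, so the ratio it shows says nothing about how well real sales data compresses.

```python
import zlib


def stored_vs_logical(data: bytes):
    """Return (logical size, stored size after zlib compression)."""
    compressed = zlib.compress(data, 9)
    # Round trip: the logical data is fully recoverable from what is stored.
    assert zlib.decompress(compressed) == data
    return len(data), len(compressed)


# Invented, highly repetitive records (a best case for any compressor).
sample = b"store=0001,item=strawberry pop-tarts,qty=1\n" * 10_000
logical, stored = stored_vs_logical(sample)
print(logical, stored)
```

So when someone quotes a figure like 460 TB, it matters whether they mean bytes on disk or bytes of data, and the thread does not say which one the article used.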
It's called: "The Age of Wal-Mart: Inside America's Most Powerful Company". It has been on MSNBC lately and TiVo says it will be coming on again 11-25-04. It has a lot of info on Walmart and their business practices. Also, there is a section on their technology and how they use it.
In other words, the Petabyte figure represents non-redundant data, and the hypothesis that the size of their archive proves that the internet is much bigger than 230TB is correct. ;-)
"Remember, there never were pineapple-almond cookies here."
Actually the grandparent is correct. Walmart puts so much pressure on their suppliers to actually drop prices every year (inflation is for sissies) that they drive small manufacturers out of business. Not to mention the small businesses that it suffocates. There are towns that literally shop themselves out of a job. Heck, Walmart single-handedly put Vlasic in bankruptcy by forcing them to sell a gallon of pickles for $2.97. This is a fascinating article about why we should all boycott the place.
"I can not bring myself to believe that if knowledge presents danger, the solution is ignorance" - Isaac Asimov
I assume that we are discussing The Internet Archive's storage capacity, which would reflect the post-compression size.
"Remember, there never were pineapple-almond cookies here."
I graduated from the Sam M. Walton College of Business at the University of Arkansas with a B.S.B.A in Information Systems. Wal-Mart was nice enough to donate a big chunk (~1 Terabyte) of information for us to datamine. It's pretty interesting stuff and very CPU intensive, as you can probably imagine; we tried not to do any CD burning while waiting on our results ;)
IIRC, It seems like one of the strange correlations we found is that the two items most commonly purchased together were beer and baby diapers. Go figure...
Dammit why does lamb cost so much anymore?
This issue is a bit more complicated than you think.
Let's keep this all in perspective. I've seen Direct Connect hubs with about 250 TB stored among about 5,000 users. If 5,000 people with nothing more than ordinary PCs can hold half of Wal-Mart's data, the internet as a whole must have far more available.
I think they have more than 230 terabytes of Grateful Dead live shows alone! ;-)
You have that much disk space, true.
But,
1) Not all the disks are full
2) Data on them is semi-useless (for example, you count terabytes of swap space and applications/OS)
3) Data on them is mostly useless for all practical purposes because it's unstructured and cannot be accessed at will (by data mining or other programs that you might have).
All of the data are precious to Wal-Mart.
Nasty little geeksesses, you all want my data. You all WANT THE PRECIOUSSS!
There are 10 types of people in the world. Those who know binary, and those who do not.
I suppose if someone has a 460TB data warehouse, that's something to crow about, but even at that I've got to imagine there's some TLA (three letter agency) out there with a manageable data warehouse that holds 1 petabyte.
Why did I come into work today anyway?
Also, don't forget that the internet includes Usenet and other services under the protocol, which has TONS of additional data. Chances are, the internet is not 230 terabytes large and the idiot who made that claim...is an idiot.
You, sir, are the idiot. We all know now that there are, in fact, MULTIPLE internets....Dubya has leaked classified information. You fools speak of "the internet" and its terabytes of data. The real experts know that there are, in fact, internets that must be storing millions of exabytes of data!!!
Believe it or not, during the last big hurricane I actually bought a big box of strawberry pop tarts for the first time in years (but not at WalMart). I remember munching on them while watching the hurricane reports late at night.
No, you're wrong.
All you need to do is break the heart of wal-mart, the mirror in the back.
If my speelings is going to be called into question, then I will proceed to cite the entire Internet as a whole. Maybe if there were no spelling errors there, then perhaps the entire contents could fit under 230 TB.
And to think I almost worked my way up from stockman to IT there.
Scarey as hell.
"Power corrupts. PowerPoint corrupts absolutely."
If Walmart's hard drive went down, how big a freezer would it take to get them up and running again?
"Where did this apple come from?"
--Alan Turing
I kinda miss K-Mart. Mom would buy me the "Ketch" brand dress shirts for Christmas, the one with the sailboat emblem on the collar label, and these shirts had only half as much fabric as a regular dress shirt. I could wear them for maybe 6 hours before I would go into "deodorant failure" and have to be careful to keep my arms by my sides for the rest of the day.
Cash.
for the link lazy
psillyjim@gmail.com
As per normal with the media
But zero Information. ;-)
dinner: it's what's for beer
As afraid as I am of such huge amounts of data being collected on individuals, this kind of stuff gives me the butterflies. I love technology, but I feel obligated (as should you all) to use technology to advance mankind, as opposed to shitting on it.
This is always a slashdot crowd pleaser.
This issue is a bit more complicated than you think.
I like to buy strange things together.
Extra large condoms, fruit, and K-Y jelly.
Have a lot of fun,
-Steve
I like the term they used in the end credits of Lord of the Rings: Data Wrangler. I just imagine some guy in buttless chaps with a cowboy hat and a lasso. A hard drive fails? He just lassos it out of the hot swappable array and another picks up the task.
Andrew
PS: I know the Archive doesn't have hot spare arrays, they use JBOD, but I was talking about the guy for LotR that has that title.
Just so everybody knows, Wal-Mart has effectively leveraged its fantastic database of American consumerism into the world's most accurate society predictor. By carefully analyzing when you buy your Chips Ahoy cookies and Diet 7-Up, Wal-Mart now has a bead on your personality. By controlling the webposts you read (by strategically placing on sale, thus causing your bladder to reach capacity at the crucial moment) and the food you eat, they are seizing the country from beneath our noses! Through careful manipulation, the peoples of Earth will be subdued by their own laziness and fondness for low, low, everyday prices! Only then will the true founders of Wal-Mart be known: the Wal-Martians! They will invade our planet in force and we won't notice until it's too late because Wal-Mart will have alcoholic beverages on sale at 75% off and everyone will be too inebriated to notice! Doom awaits us all! DOOOOOMM I SAY! ...okay... I'm ...uhh... going to sit in the corner in a fetal position with some aluminum foil over my head, okay? yeah...*incoherent mumbling ensues*
They're 16bit each, not 32bit
(spoiler for those that didn't get it: this is the segmented addressing architecture of 8086, the first 'x86' chip from Intel)
-Kz-
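For those who didn't get the spoiler: on the 8086 the physical address is the 16-bit segment shifted left by 4 bits plus the 16-bit offset, which is why two 16-bit values reach only a bit past 1 MB rather than 32 bits' worth. A small sketch of that calculation follows; the function name is made up for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode address: segment * 16 + offset, both 16-bit."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


highest = physical_address(0xFFFF, 0xFFFF)
print(hex(highest))           # highest value the sum can produce
print(highest + 1 - 2**20)    # bytes past the 1 MB mark (the "epsilon")

# Many segment:offset pairs alias the same physical byte.
print(physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000))
```

The sum tops out at 0x10FFEF, i.e. 65,520 bytes beyond 1 MB. The original 8086 has only 20 address lines, so on that chip the carry is dropped and those addresses wrap around to the bottom of memory; the extra region only became reachable on later processors.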
data
group noun [U]
information, especially facts or numbers, collected for examination and consideration and used to help decision-making, or information in an electronic form that can be stored and processed by a computer:
data
pl.n. (used with a sing. or pl. verb)
1. Factual information, especially information organized for analysis or used to reason or make decisions.
2. Computer Science. Numerical or other information represented in a form suitable for processing by computer.
3. Values derived from scientific experiments.
4. Plural of datum.
define data
conclusion:
data = information
Read up on OLTP versus OLAP and shut your ass.
The last part of the article mentions RFID tags. Take that and the graphic of the girl with the barcoded forehead and I think wal-mart is trying to implement the mark of the beast. The data mining is just the begining...when will the stores become self-aware? ;-)
Did you know ice cream sales cause rape?
Naw.
Playboy's sites (cyberclub and free site) are under 20 gigabytes total. (i was root@playboy.com in a prior life).
Spice, on the other hand...
Although I'm sure the /. crowd will ridicule me for this...
I was under the assumption that the WWW is the same as the internet. WWW stands for World Wide Web, meaning a collection of computers connected together in a "web" like fashion.
I believe what most people, and apparently a few /.ers, call the WWW is just the HTTP protocol... servers that serve pages using the HTTP protocol are just called WWW by default because it is easier to remember just the domain name and assume that it is preceded by www.
Let us not forget that you can run a FTP server on a computer whose DNS name is www. You can also run a NNTP server on that same computer.
Or Lizard Wrangler, in reference to some coders for Mozilla.
Zodiac Survey
First of all, most Walmarts don't primarily sell food
Super Wal-Marts sell groceries. You see those in places like Florida. I was in Orlando and it was frustrating that there was nowhere else to buy groceries where I was at. OK, there was a Winn-Dixie just across the parking lot, but its prices were insane and the quality of the produce was not so good. There were other grocery stores and a Costco, but all were about 15 miles away. Trust me, I did my best to stock up with Costco goods, but for staples like milk, bread, and eggs Wal-Mart was the only practical solution.
Regular Wal-Marts I don't believe sell groceries. I don't honestly know because I don't shop there. Super Wal-Marts have a very respectable grocery.
There is no sanctuary. There is no sanctuary. SHUT UP! There is no shut up. There is no shut up.
I think people I know alone have that much MP3 data on their computers which they downloaded over the internet:-). The information does not add up.
O this learning! What a thing it is - William Shakespeare
World wide?
In my whole country I don't know of a single Wal-Mart...
NCR hosts the information in California. I work for NCR and thus am Wal-Mart's slave. NCR's Teradata division does all the storing and processing, we just hand the crunched numbers to Wal-Mart. So, it is not in Bentonville.
so cool! i've never been to a wal-mart... ... ... ... we all shape the world ... :P
all databases should be publicly accessible,
the world would be so much cooler.
now consider wal-mart opening their
database to the public, how much meta data would
you get? meta-data being data on "search data",
much like most searched for term on google
it's really sad, so much research material
and the public generates it, but doesn't have
access to it
one other note, i don't believe in owning
alot of stuff but quality stuff, so i'm
prolly not somebody who needs wal-marts.
also think about ALL the thing you can buy
cheaply, but are not just cheap but CHEAP!
they don't last and don't contribute to
a better future, just an overall proof,
that we humans produced alot of garbage,
in the form of little people owning nearly
everything.
in essence by keeping products cheap wal-mart
contributes to low quality.
i mean WHY can't everyone have a FAT house,
CAR, SWIMMING POOL, etc? i mean it's not
like we have to buy all this workforce
from some alien race in exchange for water
from our oceans
and should start thinking about how, what
and how much we buy. else wal-mart wins
and everybody else is still living in
the slum
"Let us not forget that you can run a FTP server on a computer whose DNS name is www. You can also run a NNTP server on that same computer."
And you could run a Web (HTTP) server on a computer named ftp.somethingorother.com. The DNS name has nothing to do with what kind of server is on the machine (except to give an indication).
To my understanding the internet is composed of multiple subnets who are all connected together (via the internet) and communicate using IP (the Internet Protocol). Above that protocol they can use other protocols (generally TCP and sometimes UDP) that are a bit more specialised and atop these protocols you have even more specialised protocols like HTTP (for the Web), SMTP (for mail), FTP (for file transfer)... and you even sometimes have protocols on top of these, like SOAP which is a remote procedure call protocol on top of HTTP.
The Web in World Wide Web is not the computer being connected in web-like fashion but the webpages being connected to each other in web-like fashion via hypertext links.
Does that explain things a bit?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
Maybe you should get out more often instead of nit-picking people on /.
I think it's closer to 9.1%.
Nah. Just let it run long enough and it will come up with the complete works of Shakespeare.
The rest will be uploaded as the Next Big Thing and be called "The Intarnet". We will tell generations of young hackers that the data's just encrypted and use the time it takes them to realize what's going on as a geek test.
Who is General Failure and why is he reading my hard disk?
As I actually did RTFA, I can now issue a smackdown: "It also takes pains to keep the information secret. Some of the systems it uses are custom-built and designed by its own employees, the better to keep competitors off the trail. Companies that sell equipment and software to Wal-Mart are bound by nondisclosure agreements. Three years ago, Wal-Mart summarily announced that it would no longer share its sales data with outside companies, like Information Resources Inc. and ACNielsen, which had paid Wal-Mart for the information and then sold it to other retailers."
In the 1990s some economists were wondering if hundreds of billions of dollars invested in IT by business was going down the toilet, because it didn't seem to be increasing productivity or profits. Well, WalMart is the counterexample. It invested early and still is at the forefront. It manages to keep prices low (among other things). It is the largest productive force in the world, and may control a third of the world's retailing in a few decades.
I saw the TV special, too. Their computer system is the "second largest next to the Pentagon" according to the show.
There is a Universal Life Value Check it
They'll sell loans to the guys and girls getting the welfare benefits.
Buying or aggregating data is antithetical to protecting consumers?
Oh, wait, I see your knee is jerking over there. Good luck with that.
Karma: Excellent (mostly the sum of moderation done to users' comments)
Sure, if you scrolled through enough random characters, eventually you'd find the complete works of Shakespeare, but:
1. You'd already need to know what the complete works of Shakespeare look like in order to recognize them (and therefore, you would be getting zero information you didn't already have) and
2. In order to address the starting location of said works in the random stream, you would need, on average, as many bits as are in the whole works to begin with!
Thus, the random stream still contains zero information.
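Point 2 can be checked on a small scale. The sketch below is a simulation under stated assumptions, not a proof: the stream is a fair random bit stream, the "works" are a short random bit pattern, and the function names, trial count, and seed are arbitrary choices of mine. It measures how many bits it takes to write down the offset where the pattern first shows up.

```python
import itertools
import math
import random

def first_index(pattern, rng):
    """Offset where pattern first appears in a stream of fair random bits.

    Only practical for short patterns: the expected wait grows like 2**len(pattern).
    """
    window = ""
    for pos in itertools.count():
        # Keep only the most recent len(pattern) bits of the stream.
        window = (window + rng.choice("01"))[-len(pattern):]
        if window == pattern:
            return pos - len(pattern) + 1

def average_index_bits(n_bits, trials=200, seed=0):
    """Average of log2(offset + 1) over random n_bits-long patterns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pattern = "".join(rng.choice("01") for _ in range(n_bits))
        total += math.log2(first_index(pattern, rng) + 1)
    return total / trials

for n in (4, 6, 8):
    print(n, "bit pattern ->", round(average_index_bits(n), 2), "bits of offset")
```

In this setup the offset should cost roughly as many bits as the pattern itself, and the cost should climb by about one bit for every bit added to the pattern, so pointing into the stream saves nothing over just storing the data.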
dinner: it's what's for beer
They got their Internet statistics from the Chinese government.
Or Dan Rather.
Negative. The corporation doesn't even own every single store
That's why it's called a franchise - and it's completely beside the point.
It's possible to own land independently of the building (and even the business(es) within the building) that occupies the land.
Here's an example:
I own some land. I *rent* the land to a developer who builds a building. The developer rents the *building* to several businesses.
Saying "I own the land" does not mean that I own the businesses or the buildings the land sits on. It just means I own the land.
It's perfectly possible that McDonald's does own the land, and rents it to the franchisee as part of the franchise. (Note that a franchise contract typically states where your business will be located - you're not allowed to move without permission of the head office.)
They have this huge database of marketing gold and you think they won't capitalize on it in some nefarious way? You truly don't understand corporate mentality if you think they'll just use that data for inventory purposes.
They're going to use every scrap of that data and wring every bit of profit out of it for any and every purpose it can possibly be used for. We could come up with scenarios until the cows come home and still not hit on all the devious ways marketers (and probably government agencies) will make use of this database.
You surely understand that a series of purchases can be used to create pretty accurate psychological profiles of the consumers who make those purchases. Suppose that the government wanted a list of all the liberals out there for "security" purposes (you know, those people that "hate America.") That way they can round them up, real quick like, during the next national emergency -- for the safety of all the "good Americans", after all.
How about a corporation that wants to market more effectively to promiscuous women so would like a list of them? What if an evangelical group wanted the same data?
Surely you can't be that naive about the way corporations work. Human reason and decency don't get columns on their spreadsheets.
- Hail to our fearless misleader! Fool speed ahead!
A Web site I run contains over 1 terabyte of non-redundant scientific data, mostly text, all accessible directly through hyperlinks, no Deep-Web searches required. I really doubt that our site comprises as much as 0.5% of the non-redundant, shallow Web even.
These are probably "pundits", not "experts" making such claims. Maybe I too can add "pundit" to my business card and portray my subjective opinion as fact on a variety of subjects!
Hmm, that's a lot of data for one company. Anyone else get the feeling Wal-Mart has teamed up with M$?
it is increasingly being used to answer discount retailing's rabbinical questions, like how many cashiers are needed during certain hours at a particular store.
I know there are never enough cashiers at the Wal-Marts by me. I end up going elsewhere just because the checkout is such a horrible experience. I know you're reading this, Sam.
Cheap storage VM.
You're right. If it's unique information, the internet is only 2 bits big.
There is a good idea there.
What about corporate situations where users' data files are getting bigger and grander, and file servers can't keep up? You could have users store their data seamlessly on other users' machines, in the unused space.
Problem being that people tend to shut down or reboot at inopportune times. Also, relying on consumer hardware for important data would be a bad idea.
Not very well thought out, but I still think there is an idea in there somewhere.
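To put rough numbers on the shutdown problem: one standard workaround, which is my addition and not part of the idea as stated, is to keep several copies of each file on different machines. The sketch below assumes every machine is online independently with the same probability, and the 0.7 figure is invented purely for illustration.

```python
def availability(p_online, copies):
    """Chance that at least one of `copies` machines holding a file is online.

    Assumes machines go up and down independently, which desktops that all
    shut down at 5 pm clearly violate.
    """
    return 1 - (1 - p_online) ** copies

# Hypothetical office where any given desktop is on 70% of the time.
for copies in (1, 2, 3, 5):
    print(copies, "copies ->", round(availability(0.7, copies), 4))
```

Each extra copy eats into the "unused" space being reclaimed, so the scheme trades capacity for reliability.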
Pretty Pictures!
Assuming you don't find some lost Shakespeare plays in the set :)
In July O7, I got a mac pro. There's no punchline. Just endless joy and wonder.
That's my mom.
At first I thought you were crazy for thinking that 460 TB is an understatement, but when you think about it, Walmart has over 4000 stores. That's only like 115 GB per store, which doesn't even take into account warehouses, central offices, etc. So I think you are right and 460 TB is a huge understatement.
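The per-store figure is easy to check. A quick sketch, taking the 460 TB and the round 4000 stores from the posts above and assuming decimal units (1 TB = 1000 GB); the function name is just for illustration.

```python
TOTAL_TB = 460    # figure quoted in the story summary
STORES = 4000     # "over 4000 stores", rounded down
GB_PER_TB = 1000  # assumption: decimal units; pass 1024 for binary

def gb_per_store(total_tb, stores, gb_per_tb=GB_PER_TB):
    """Average storage per store in GB."""
    return total_tb * gb_per_tb / stores

print(gb_per_store(TOTAL_TB, STORES))  # 115.0
```

With binary units the figure comes out slightly higher, but it stays in the same ballpark.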
I alone have over a terabyte of shit downloaded from the Internet. I seriously doubt there are only 229 more terabytes to download.
You are probably right on this too but your logic is flawed. The internet is not a stable entity. Most of what you have downloaded probably does not even exist on the net anymore.
Time makes more converts than reason
Sure it does. It's in my public_html/stuff/ directory.
They can't interpret it all because there are no technologies currently available to do so. (BTW - a database/datamart is not "a technology" for this discussion; it's just a place to hold information.)
Data-mining and data-visualization technologies that can handle a petabyte of data do not yet exist in the business world. I guarantee you that advanced research projects are tackling just such problems, and advancing the state of the art. It won't be long before useful nuggets of information can be gleaned from these vast seas of numbers.
An advanced relationship-visualization tool can be found on the web at - TheyRule.net
Another one can be found at - Map of the Market
It's only a matter of time before all that data will yield useful clues to Total World Domination(tm). And who better than WalMart to exploit these clues to subdue the dominant world power, and move its base of wealth to a communist, human-rights-ignorant country, leaving a vast wasteland of low-wage, no-benefits, tax-roll-supported service-oriented jobs in its wake?
God Bless (what's left of) the USA - Made in China
The Internet DOES have more data than that.
Every Microsoft Windows user with a home computer connected directly to a broadband (cable/DSL) router inadvertently shares their hard drive capacity (and its contents) with the Internet. This, alone, adds petabytes and petabytes of capacity to the Internet, which - according to the RIAA - is even now being used to illegally store and download billions of dollars worth of copyright-protected national treasures, such as Roy Orbison's greatest hits.
"Microsoft - Where Do We Want You To Go, Today?"(tm)
http://www.transparen.com/
Figured I may as well put it out there for everyone who's interested. Go to http://lpdaac.usgs.gov/main.asp, then go to the EOS Data Gateway. This has media options plus the FTP downloads. Otherwise the Datapool is full of free FTP data.
"In a world without walls and fences, who needs Windows and Gates?"
sheesh.
Here is a trick that seems to work fairly well. If you ever fill out a survey, go ahead and use your real mailing address so you can get junk mail. When they ask your income, tell them it is extremely high. For your profession, choose whatever you want free magazines in. In a few months you will start to get the same junk mail as your boss's boss does, which is a hell of a lot better than the junk mail you _should_ be getting. Basically you will start getting some free magazines.
Glen Pepicelli http://www.glenp.net