Slashdot Mirror


Wal-Mart's Data Obsession

g8oz writes "The New York Times covers Wal-Mart's obsession with collecting sales data. Fun fact: 'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.' That much information results in some interesting data-mining. Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"

14 of 581 comments

  1. economies of scale by man_ls · · Score: 4, Insightful

    When you have 460TB of data, how the hell do you even begin to search it?

    Seems like they'd need to license MapReduce from Google or something. (That's a distributed data correlation engine. With extremely high fault tolerance, to boot.)
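    For readers who haven't met the idea, here is a minimal single-machine sketch of the map/reduce pattern the comment mentions. It is not Google's implementation; the sales records, the `map_phase`/`map_reduce` names, and the chunking are all made up for illustration. In a real system each chunk would live on a different machine.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map step: turn each raw sale record into a (product, units) pair."""
    return [(record["product"], record["units"]) for record in chunk]

def map_reduce(chunks):
    """Run the map step per chunk, group pairs by key, then sum each group."""
    grouped = defaultdict(list)
    for chunk in chunks:  # each chunk could be handled by a separate worker
        for product, units in map_phase(chunk):
            grouped[product].append(units)
    # Reduce step: collapse each product's list of values into one total.
    return {product: sum(values) for product, values in grouped.items()}

# Hypothetical data: two "stores", each holding its own slice of the sales log.
chunks = [
    [{"product": "pop tarts", "units": 7}, {"product": "beer", "units": 3}],
    [{"product": "pop tarts", "units": 5}, {"product": "beer", "units": 2}],
]
totals = map_reduce(chunks)
print(totals)
```

    The point of the pattern is that no single machine ever has to scan the whole data set; only the small per-key totals get combined at the end.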

    1. Re:economies of scale by Sexy+Bern · · Score: 5, Insightful

      More to the point - how do they back it up?

  2. And in other news... by wesmills · · Score: 4, Insightful

    ...Microsoft has an astonishing amount of information collected from Windows Update users (none of it personally identifiable, of course).

    I highly suspect Wal-Mart didn't get into the position it's in of being the largest retailer by being stupid, at least business-wise. This is the sort of project that allows them to stock a 120,000 square-foot big box store from JIT shipments every night, and why every Wal-Mart in a region looks the same. Though I would be interested to read more on the pop-tart to hurricane correlation...

  3. Correlation doesn't imply causation!!!!! by Baldrson · · Score: 5, Insightful
    Did you know hurricanes increase strawberry Pop Tarts sales 7-fold

    Correlation doesn't imply causation!!!!!

    I mean what if a third factor caused both the hurricanes and strawberry Pop Tart sales to increase 7-fold????

    Somebody was going to blurt that bromide out at that statement, so it may as well be me.

    1. Re:Correlation doesn't imply causation!!!!! by krymsin01 · · Score: 4, Insightful

      It makes sense though. If you are going to ride out a storm, you are going to need lots of food that requires neither refrigeration nor cooking.

      Beer makes sense also. There are always a hell of a lot of hurricane parties in Florida whenever a hurricane comes 'round.

      --
      stuff
  4. even the mango is tracked by loid_void · · Score: 4, Insightful

    My brother sells mangoes to the Wal Mart Beast. He says it's all computerized, beginning with an order for the fruit, following the trucks, even the rotation of the ripening process in the warehouses is computer related. It's as close to virtual management as any company comes.

    --
    Anyone seen my jagged little pill?
  5. There's a name for this.. by k98sven · · Score: 4, Insightful

    The Law of truly large numbers.

    Basically, the more data you have, the more likely you'll find weird coincidental correlations.

    I guess these kinds of 'statistical findings' will become more and more prevalent in the future, given that we're living in an age where we're collecting ever-larger amounts of data, and have the resources to process all this data automatically.

    It would be a good thing if people were a bit more sceptical of this kind of stuff. Correlation isn't causation.
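    A rough way to see the effect is to correlate one random series against more and more unrelated random series and watch the best "match" improve. This is only a sketch: the series are pure noise, and the function names, the 20-point series length, and the seed are arbitrary choices for illustration, not anything from the article.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def strongest_chance_correlation(n_series, n_points=20, seed=0):
    """Largest |r| between one noise series and n_series unrelated noise series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best

for n in (10, 100, 1000):
    print(n, round(strongest_chance_correlation(n), 3))
```

    Nothing here is related to anything else, yet the strongest correlation found can only go up as more series are searched. Same thing with a warehouse full of product sales.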

    1. Re:There's a name for this.. by sql*kitten · · Score: 4, Insightful
      It would be a good thing if people were a bit more sceptical of this kind of stuff.

      Ermm, RTFA.
      1. They predicted that pop tart sales would increase
      2. They shipped additional pop tarts in anticipation
      3. The pop tarts sold like, umm, hot pop tarts

      You can be skeptical all you want. Someone at Walmart made the call, and they were right.
    2. Re:There's a name for this.. by k98sven · · Score: 4, Insightful

      I did RTFA.

      And, firstly: that's not exactly a proper test.
      (Supply does create demand. Why do you think stores like building big pyramids of merchandise, and so on? Hint: It's not just because it looks pretty.)

      Perhaps you should read my comment again and try to get the point. I wasn't necessarily being sceptical about pop-tarts. I was being sceptical about the method in general.

      Obviously some of the correlations they'll find are real too. That's not what I was referring to.

      What I was referring to was that it's very easy to become blind to the statistics. To fall into the trap of seeing correlations where there are none. The human brain has a remarkable pattern-finding ability. Unfortunately that ability does lead us astray sometimes.
      (For instance reading human faces into natural formations, and so on)

      Besides this, the Wal-mart people probably aren't very interested in talking about the times their fancy new method failed, are they?

  6. Re:So, if Walmart put up a web interface... by Frnknstn · · Score: 5, Insightful

    Firstly, there is no way they can be talking about all the data available on the internet. Filesharing networks alone have WAY more data than this, and when you add all the FTP servers and mirrors, the webmail archives, the home Windows users with insecure shares...

    There is no way this can be true. Even if you ONLY take publicly available WWW pages, it would far exceed their measly estimate.

    --
    If it's in your sig, it's in your post.
  7. Re:Seen it! by SamMichaels · · Score: 4, Insightful

    The gentleman who gave me the tour indicated they have something like 72 weeks (1 year plus 2 weeks)

    According to Google:

    1 year = 52.177457 weeks

    So 72 weeks is 1 year plus 19.822543 weeks.
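    The subtraction is easy to check. A quick sketch, taking the weeks-per-year figure quoted above as given; `split_weeks` is just a name made up here:

```python
WEEKS_PER_YEAR = 52.177457  # figure quoted in the comment above

def split_weeks(total_weeks, weeks_per_year=WEEKS_PER_YEAR):
    """Split a span of weeks into whole years plus leftover weeks."""
    years = int(total_weeks // weeks_per_year)
    remainder = total_weeks - years * weeks_per_year
    return years, remainder

years, extra_weeks = split_weeks(72)
print(f"72 weeks = {years} year plus {extra_weeks:.6f} weeks")
print(f"1 year plus 2 weeks = {WEEKS_PER_YEAR + 2:.6f} weeks")
```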

  8. Re:I would have thought that the Internet had more by Brynath · · Score: 5, Insightful
    But the Internet Archive is on the internet right?

    That means that the internet has well over a petabyte of information on it. Much of the information is probably the same, but it is on the internet.

  9. WalMart BS by Doc+Ruby · · Score: 4, Insightful

    WalMart's 460 TB of data, shared among about 300M Internet users, would spread about 1.5MB to each person. That is, of course, a tiny amount of data - probably just the indices on each person's inbox, let alone their email data itself. Each of those people's average storage capacity is over 20GB on new computers, excluding upgrades, which are probably usually about 80GB. So just typical end user computers alone account for at least 10,000 - 40,000 times WalMart's big data dump. And then of course there are all the other servers on the Internet, like the SABRE airline reservation system, the US Federal databases of publications, Google's image cache, all the albums and other MP3/SHN/FLACs in P2P, and of course the endless stream of porn.
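    The back-of-the-envelope numbers can be reproduced in a few lines. This sketch assumes decimal units (1 TB = 10^12 bytes) and takes the 300M users and the 20GB/80GB capacities straight from the paragraph above; the variable and function names are only for illustration.

```python
TB = 10**12  # decimal units assumed throughout
GB = 10**9
MB = 10**6

walmart_bytes = 460 * TB
internet_users = 300 * 10**6

# Wal-Mart's data divided evenly among all Internet users, in MB.
per_user_mb = walmart_bytes / internet_users / MB

def end_user_ratio(capacity_gb):
    """Total end-user disk capacity as a multiple of Wal-Mart's data."""
    return internet_users * capacity_gb * GB / walmart_bytes

ratio_20gb = end_user_ratio(20)
ratio_80gb = end_user_ratio(80)

print(f"{per_user_mb:.2f} MB per user")
print(f"{ratio_20gb:,.0f}x at 20GB each, {ratio_80gb:,.0f}x at 80GB each")
```

    With these inputs the per-user share comes out at roughly 1.5MB, and the 20GB case lands a bit above the 10,000x lower bound; the 80GB case comes out somewhat above 40,000x, which still fits the "at least".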

    WalMart is trying to make itself look like it is turning its customer data into success, and benefits for its customers. That serves to downplay its reliance on labor exploitation, monopolistic competition when it enters local markets, and political favors that structure labor and market laws to give it a competitive edge. And WalMart might just be believing the IT sales hype that it spends millions of dollars on. But that's no reason we should buy their IT BS as much as we seem to buy their wares.

    --
    make install -not war

  10. Be Afraid? Why? by mveloso · · Score: 4, Insightful

    Why should we be afraid of Wal-Mart? They're using their data to be more responsive to their customers. They want to make sure that if you want something, it's in-stock and ready to go.

    What could they do with their data, really, that would hurt anyone? It wouldn't be like "Bob Smith is buying condoms again." It would be more like "there's a condom spike in ZIP code 78750 every Thursday, let's ship more out."

    People who are afraid of data aggregation are jumping at shadows. Nobody cares what you in particular are buying. An individual as a data point is useless, unless you're an exemplar or something like that (which would be unusual).

    Let's face it, individuals just aren't that interesting. More importantly from Wal-Mart's point of view, there's no return on looking at individuals.