Slashdot Mirror


Internet Data Mining for Investment Analysis

CaroKann writes "Reuters is reporting on a Wall Street investment research company, Majestic Research, that is using web crawling techniques to track business performance. Instead of attempting to estimate business conditions by talking to company management, or pounding the pavement visiting stores, this company uses data mining systems to collect real-time sales data and other information on companies that have a web presence. Using this data, Majestic attempts to estimate company earnings more accurately than traditional research outfits."

4 of 74 comments

  1. Now that the companies know that... by drgonzo59 · Score: 2, Insightful

    They can create bogus pages to feed to the Majestic bot like in the BMW vs. Google case.

  2. Cue the web spam... by Rob T Firefly · Score: 5, Insightful

    We can expect yet another huge rise in fake blogs, fake product reviews on Amazon and such, and paid shills in chats and message boards. Swell.

  3. I call Bull by spectrokid · Score: 3, Insightful

    TFA mentions data about drug prescriptions by hundreds of physicians. Is that lying around unorganised on the net? Tell me which algorithm you are going to use to predict, by web crawling, how many Xbox 360s are going to get sold next month. You think supermarkets post their sales figures to public web pages? Walmart is said to have more data offline than is available on the entire public section of the net. Now give me access to that. On the other hand, if you work for the sales-tax administration (in Europe) and all the big companies file their invoices weekly, that is also a good starting point...

    --

    10 ?"Hello World" life was simple then

  4. the rise of the machines... by DeveloperAdvantage · Score: 2, Insightful

    This is interesting stuff. I would like to learn more about the algorithms they use to analyze their data - the article has very few details. It is neat how systems like this are becoming favored over traditional human analysts (or at least reducing the need for people).

    Back in grad school in the late 90s, I worked on a major project to design an intelligent-agent-based system with the same functionality. In addition to pulling information off the internet, it could also take into account whatever other information could be gathered and fed into it (for example, there is a lot of content on TV that could be used alongside the online data). It was a design project, though, and was never implemented. Perhaps I will need to resurrect it!

    I do think the whole area of quantitative, or at least semi-quantitative, analysis of information, both textual and numerical, is going to explode over the next few years, driven by vast amounts of incredibly cheap computing power and bandwidth. Computer applications do amazing stuff right now, but five years from now truly "intelligent" applications will exist. The term "artificial intelligence" has fallen out of fashion, perhaps a sign of how commonplace these systems have now become.

    As an example, our local phone company has a voice recognition system that actually works reasonably well, and much, much better than anything available 5-10 years ago. We are certainly making progress.

    --
    FREE - Java, J2EE and Ajax Audiobooks for Software Developers - www.DeveloperAdvantage.com