Slashdot Mirror


Ask Slashdot: Choosing a Data Warehouse Server System?

New submitter puzzled_decoy writes The company I work for has decided to get in on this "big data" thing. We are trying to find a good data warehouse system to host and run analytics on, you guessed it, a bunch of data. Right now we are looking into MSSQL, a company called Domo, and Oracle contacted us. Google BigQuery may be another option. At its core, we need to be able to query huge amounts of data in sometimes rather odd ways. We need a strong ETL layer, and hopefully we can put some nice visual reporting service on top of wherever the data is stored. So, what is your experience with "big data" servers and services? What would you recommend, and what are the pitfalls you've encountered?

9 of 147 comments

  1. First step by Anonymous Coward · · Score: 5, Insightful

    The first step is to ask Slashdot a really vague question about a highly technical and expensive undertaking.

  2. KISS by Anonymous Coward · · Score: 0, Insightful

    AWS Redshift. Don't bother with old-school chores like operating servers, patching OSes, etc. Just focus on data + business logic. That's where you really add value, right?

    1. Re:KISS by Zarmvenius · · Score: 1, Insightful

      This. Redshift is far and away the cheapest and most straightforward solution. It hooks up nicely with Tableau to help analysts, and ingestion is efficient.

    2. Re:KISS by segedunum · · Score: 2, Insightful

      Ahh, yes. Cloud stuff. Where you are processing a lot of data and where your processing and I/O resources are not your own. I always laugh at people who say "Oh, we don't need all that infrastructure stuff" and start moaning "Oh, why does it cost so much and why do we have to spend so much more when we add data?" Not to mention putting your important data on a platform that is financially questionable, has outages that providers simply don't care about and where it's going to be one hell of a PITA to move at any time later owing to the amount of data.

      Sounds like a recipe for success.

  3. Check out Amazon Redshift by Anonymous Coward · · Score: 2, Insightful

    Pretty easy to try it out immediately... http://aws.amazon.com/redshift

  4. Re:Dear Slashdot, by Sesostris III · · Score: 5, Insightful

    Maybe. However I would also be interested in any answer (especially any answer involving FLOSS software). Interested not because it's my job or my company is looking to use such software, but because I'm curious and like to expand my knowledge.

    In general I don't mind such questions on Slashdot, as they're usually interesting and informative to the rest of us. And if they're not, then I (we) don't read the article!

    --
    You never know what is enough unless you know what is more than enough. - Blake
  5. That isn't big data by thogard · · Score: 4, Insightful

    If the data fits in a database, it is not Big Data.

  6. But what do you need? by zmooc · · Score: 4, Insightful

    Sounds like you're very good in the buzzword department but have no idea what you're doing at all... What kind of data are we talking about? Lots of writes? Lots of reads? Is the data suitable for splitting up? What kind of queries will you need to run? Do you need uptime? Or consistency?

    Also if you're looking at MSSQL or Oracle, you obviously DO NOT HAVE Big Data. Big Data is data that cannot be dealt with using regular RDBMSes. Do you really have or plan to have multiple terabytes of data? If not, you don't have big data.

    Based on the information you've given us we cannot give you any advice at all apart from stopping what you're doing and hiring an expert.

    --
    0x or or snor perron?!
  7. You must follow the correct process. by codepunk · · Score: 4, Insightful

    1. Hire some bonehead that is expendable and ask him to make the decision.
    2. Fire him when the project fails.
    3. Nobody will ever bring this up again.

    --
    Got Code?