Slashdot Mirror


Information on OLAP Databases?

oh-lap asks: "One relatively new topic in computing that hasn't seen much mainstream press (yet) is OLAP or multidimensional databases. OLAP is based on the principle of organizing your data along a number of dimensions which can be pivoted and drilled into. OLAP sounds like a neat idea and represent an abstraction of several problems which I've seen tackled by custom coding jobs. The biggest problem with OLAP right now, seems to be a dire lack of expert knowledge in the area and no standardized API or query language for OLAP servers. But the lack of decent introductory reading material is particularly bothersome. If there are any experts (or even dabblers) in OLAP out there, what are the sources of information (books, websites, forums, webblogs, etc) that you go to for info about OLAP? The issues I'm interested in are wide ins scope and include everything from querying and data modeling to actual design of an OLAP server. Any pointers"

34 comments

  1. Wikipedia: OLAP by Anonymous Coward · · Score: 2, Insightful
  2. MS SQL Analysis Services by MaxQuordlepleen · · Score: 3, Insightful

    If you happen to have a Microsoft SQL Server lying around, try the Analysis Services tutorial. Takes about 4 hours to do all the tutorials...

    1. Re:MS SQL Analysis Services by omibus · · Score: 3, Informative

      That is the nice thing about Analysis Services. It is "Free" (as anything is from Microsoft). Basically, if you own a copy of MS SQL Server (a non-MSDE version) then you have Analysis Services. Note: you can buy the Developer Edition of SQL Server for $50.

      It comes with some sample cubes to check out, and you can browse the data using Excel.

      As for books: Microsoft(r) SQL Server(tm) 2000 Analysis Services Step by Step
      and MDX Solutions: With Microsoft SQL Server Analysis Services

      --
      Bad User. No biscuit!
    2. Re:MS SQL Analysis Services by kiwimate · · Score: 2, Informative

      If, on the other hand, you don't have MS SQL Server on hand, you can go here.

  3. Mostly Product-Specific by Bravo_Two_Zero · · Score: 3, Informative

    We've had similar issues wrapping our minds around OLAP and justifying a budget for the analysis services. Most of the documentation is, as you noted, product-specific.

    http://www.cio.com/research/data/data_mining.htm l

    http://www.datawarehousingonline.com/rdetail_dw. as p?ID=857&sub_category=OLAP

    They aren't technical links, but they might point you to something useful. ORA has a book on accessing OLAP servers from Java that might have some useful background information.

    For us, we've more or less settled on MSSQL and Crystal Analysis as our most-likely-to-be-viable-and-affordable platform. It's not best of breed, but it is the cheapest. We've looked for OSS/OLAP resources, but they aren't plentiful at this point (as far as we've found... someone please prove me wrong!).

    We're mostly concerned with analysis of a data warehouse, so we may differ from what you want to do with OLAP. I'm sure an Oracle or DB2 shop would have more to say on OLAP as it pertains to their products.

    --


    Amateurs discuss tactics. Professionals discuss logistics.

  4. MDX by Zardoz44 · · Score: 2, Informative
    Have you looked into MDX, the SQL for OLAP?

    MSDN MDX

    There's also some books being sold on the major websites like Amazon.

    1. Re:MDX by IntlHarvester · · Score: 1

      Watch out though, coming from a traditional SQL 2D set perspective to nD MDX can seriously warp your mind. It took me a couple weeks just to completely conceptualize what was going on. Also, the MSDN docs/example are a little thin, at least for what we were trying to accomplish. Of course, now that I've learned it, I can't seem to find someone to hire me to use it.

      --
      Business. Numbers. Money. People. Computer World.
  5. Ralph Kimball by lal · · Score: 4, Informative

    Check out Ralph Kimball's site. A good place to start is his book, The Data Warehouse Toolkit. I haven't read the 2nd edition, but the first edition is full of good practical examples of dimensional modeling.

  6. expert here by NumLk · · Score: 4, Informative

    I'm going to preface this by saying I'm an expert in this field, so my comments may be biased.

    OLAP concepts have been around for quite some time, although as a mainstream product only since the early '90s. Today the two major players in the market are Microsoft and Hyperion Solutions, with Cognos a distant third. (To be fair, there are plenty of other players, these are just the three largest). I personally have worked extensively with both Essbase and Analysis Services, and can honestly say that both have very strong points, but generally speaking I have found Hyperion's Essbase to be superior as a whole to Microsoft's Analysis Services.
    The original thread commented on how little mainstream press OLAP has received, which I wholeheartedly agree with. However, it is worth noting that many, if not most, large businesses have OLAP implementations (I can't find a source, but if I remember correctly, 499 of the Fortune 500 companies have an implementation.). Considering the market is about $3.5 billion, the products are definately being used extensively. Granted that is a drop in the bucket of the total relational marketplace, but it is hardly insignificant.
    Personally, I strongly believe the lack of knowledge outside of the Finance departments of the world about the concepts is directly tied to the lack of exposure the concepts receive at the collegiate level, which continues to focus on relational databases as the primary storage of enterprise data (I am not implying that OLAP database can replace relational databases, they are truely a complimentary product tailored for specific use). The lack of a standardized API & querying languages is definately a problem for the industry, one which the large players have tried to address, with limited success. On the flip side, the industry has pushed (rightly or wrongly) toward selling the total package, with both the backend database and a suite of front end data entry & reporting tools all in one. This makes for an easy sell to many customers, but doesn't encourage much innovation from the outside (IMHO, the front ends to OLAP products are the weakest link, and really do need the most development).
    Ironically, one of the best online resources I've found is IBM's web site. They repackage Hyperion's Essbase as DB2/OLAP, so their Redbooks on the concepts are slanted toward Essbase, but they have the most extensive source of free information on the concepts I've seen from a commercial vendor.

    --
    Children in the backseats don't cause accidents. Accidents in the back seats cause children.
  7. Yukon - MS SQL Server 2003 new features by frooyo · · Score: 3, Informative

    Microsoft is implementing a lot of important functionality that is heavily used in OLAP into Yukon.

    Probably, most important - Yukon t-sql will have the ability to PIVOT. This is a huge time savor for anyone doing reports, which is where most of the functionality of OLAP comes into play.

    For more info on the new features of Yukon, see the link below:

    http://www.sqljunkies.com/Tutorial/F73E1FFE-0577-4 630-B92E-C41CB7C5088F.scuk

    1. Re:Yukon - MS SQL Server 2003 new features by omibus · · Score: 2, Interesting

      That is a bit simplistic. Probably the biggest advantage of OLAP over SQL servers (any of them) is SPEED! OLAP does agregation querys hundreds of times faster than SQL servers.

      I've seen SQL querys that took hours to compute only take a matter of seconds for an olap cube.

      But yes, OLAP is used for reporting, but really its strength lies in analysis, which means you are not relying on predefined reports.

      --
      Bad User. No biscuit!
    2. Re:Yukon - MS SQL Server 2003 new features by AndyElf · · Score: 1

      hmmm, isn't is because that same OLAP cube has been building for a few hours before?

      --

      --AP
    3. Re:Yukon - MS SQL Server 2003 new features by Anonymous Coward · · Score: 0

      Yes, but if you due more than one query, you win.

  8. Datawarehouse is the generic term by martin · · Score: 1

    For all this stuff. An OLAP cube is just one way of implementing the DW. As a previous post mentions Ralph Kimball is *the* guy here and his books are very readable.

  9. Check out Cognos by illusion_2K · · Score: 3, Informative

    While I personally work in a Cognos/SQL Server shop and am therefore probably a bit biased, I seriously think you should take a look at Cognos' offerings. Their software is great and probably about as leading edge as you can get in the OLAP world. While we're using their PowerPlay and Impromptu products, the direction they're going at the moment is towards ReportNet which is a new product they just released a couple of months ago. Browing their website I came across this online demo that might help you in coming up with a justification for such a system (registration required, but worth it IMHO).

    Anyway, if you're looking for help from someone who deals with OLAP systems - drop me a line. ;-)

    1. Re:Check out Cognos by Anonymous Coward · · Score: 2, Informative
      I'm going to clarify your statements, keeping in mind that I work at Cognos.

      PowerPlay is the OLAP analysis tool. It came originally in a desktop application, and now in a web application; both use the same OLAP engine. We have our own proprietary cube format, but we support Essbase and MSAS OLAP as well.

      Impromptu is the relational database reporting tool, which can report on data from all the major DBs, but does no OLAP.

      Likewise for ReportNet. There is some OLAP technology built into ReportNet, but that's under the covers. There is currently an integration between Powerplay Web and ReportNET, but future releases will have a seamless Relational + OLAP environment; i.e., a combination of Impromptu and Powerplay with modern technology.

    2. Re:Check out Cognos by illusion_2K · · Score: 1

      Thanks for the clarification - I probably should have mentioned those specifics rather than just linking to the product webpage, but what are you going to do?

      The reason I mentioned ReportNet specifically is because, according to a Cognos rep. I spoke to a while ago, the long-term vision over there (as I understood it) was to replace PowerPlay, Impromptu, possibly Visualizer and whatever other applicable applications over to ReportNet after version 8 or 9. If that's indeed the truth then it seems like a good idea just to bite the bullet and go with ReportNet today - but what do I know, I'm just a customer. :-)

    3. Re:Check out Cognos by Anonymous Coward · · Score: 0

      I don't know the schedules, but you're correct about the ReportNet plans. ReportNet is currently a next-generation Impromptu (and Cognos Query). The next-generation Powerplay is in BETA (or close to it), and it will become another piece of RN. Over time, other major products will be tied into the overall framework as well. The idea is to allow the consumer to report on data without getting bogged down in details of whether it is relational or OLAP, or anything else.

  10. OLAP, ROLAP, MOLAP, HOLAP, Shmolap .... by Circuit+Breaker · · Score: 3, Informative

    Recommended reading:

    Data Warehousing for Cavemen by Phil Greenspun (of ArsDigita fame) -- some background, and implementation using SQL.
    A dimensional modelling manifesto.
    Wikipedia has good coverage.
    MDX is the query language (look it up in MSDN). Personally, I don't like the syntax but who cares.
    And then, when you're convinced OLAP is complex, have a look at Stevan Apter's Drilldown example. The source code is here, all of one printed page, including the GUI and generation of random data.

  11. Some historical perspective by gtrubetskoy · · Score: 1
    One relatively new topic in computing that hasn't seen much mainstream press (yet) is OLAP or multidimensional databases

    I am by no means an OLAP expert, but back in 1995 (I think) I worked on a project where we analyzed huge amounts of antimicrobial resistance data, and I looked at a lot of multidimentional databases, read a lot of magazine articles on the asubject, etc. I remember that back then, the traditional database was considered obsolete. I remember Informix went on a shopping spree acquiring Red Brick and a few other non-traditional database vendors. They also had this hot new product which would give you a completely new perspective on databases, and I can't even remember the name of the darn thing... The database market seemed as hot as the .com in 99.

    The funny thing is that almost 10 years later, Informix, having gone through a mini-Enron financial crisis has been bought by IBM, noone even remembers startups like Red Brick and Oracle pretty much rules the db marketplace, and their product is a very traditional database primarily interfaced with via SQL... Not an object-oriented, not a multi-dimentional, not an OLAP database.

    As far as multidimentional analysis - Microsoft Excel (of all applications) used to provide a pretty capable tool back then and still does now - it's called pivottable.

  12. This is actually a very old database idea. by Linux_Bastard · · Score: 2, Interesting

    New as of 1968
    Can anyone say MUMPS?

    Massachusetts General Hospital Utility Multi-Programming System.

    MUMPS origin

    The latest version of the MUMPS language/database is Cache

    There is even a free version called GT/M

    Sanchez GT/M

    The story of MUMPS is actually a sad tale of a bad language with a great Database.

    Origin in 1967
    ISO standard 11756 (1991).
    ANSI standard: "MUMPS Language Standard", X11.1 (1977, 1984, 1990)
    Effectivly killed in the late 90's by Intersystems.

    The hallmarks of M were the very terse sytax and fast eficient databases.

    99 bottles of beer program example

    R O,C,B,b U 0 F X B,O,C,B,C W "Take one down, pass it around,",! S b=b-1 X B,O W ".",! H:'b W !
    W " on the wall"
    W ",",!
    H 1 W $S(b:b,1:"No more")_" bottle" W:b-1 "s" W " of beer"
    99

    My signature is a complete DSM program that prints out a formatted, justified report of all the prime numbers between 1 and 1000.

    --
    F X=0:1:9999 F D=2:1 Q:((X>2)&(X#D=0)!((D>X/2)&(X'=1))) I D>(X/2) W:$X>75 ! W X,?$X+5-$l(X) Q
    1. Re:This is actually a very old database idea. by Anonymous Coward · · Score: 0

      Your signature is one of the worst examples of "chicken scratch" I've seen in a long time. Perhaps the C obfuscation people should migrate to MUMPS, it seems ideal.

  13. What has not been mentioned so far ... by kbahey · · Score: 1

    OLAP is not a new concept. I evaluated OLAP products back in 1996 as part of a project for a customer's data warehouse.

    As you say, there is no standard, and different vendors advocate different solutions.

    The main issue is that the Cube (as OLAP databases are often called) has to be refreshed every day from the data source (the main system). It becomes a nightmare to sync the data, specially when the OLAP was started by departments, without IT involvement in the first place.

    ROLAP, the relational version has some merit here, but performance can be a problem. Many databases cannot cope with such a load of ad hoc queries, and therefore the database vendors advocate a separate OLAP server.

    Some data warehousing specific databases (e.g. NCR's Teradata [Disclaimer: I work for NCR, but not in the Teradata division. I did consulting on data warehousing with NCR for customers a few years back]) can handle such a load and a relational OLAP or departmental systems that feed from the main store is a good solution, since you get one version of the truth, data cleansing and scrubbing is done once, and the data goes in one direction only (from source system(s) to data warehouse, to OLAP servers).

    Here are some products that were not mentioned:

    - Microstrategy DSS Agent, DSS Architect, ...etc. These fall under the Relational OLAP category. Setting it up is tricky, but gives you OLAP functionality without a separate data store (if the database can handle the load, Oracle often cannot with large databases).

    - Red Brick is a company that makes an OLAP specific database (or used to).

    - Business Objects used to have a product too, can't remember its name.

    Others have mentioned Cognos Powerplay and Impromptu, and ESS Base already.

    It has been quite a few years since I did data warehousing, so the above may be a bit dated.

    1. Re:What has not been mentioned so far ... by Kardamon · · Score: 1

      Sybase also has a dedicated data warehouse server: ASIQ (Adaptive Server Intelligent Query). I have never used this product, so I cannot comment on it (Disclaimer: I used to work for Sybase).

      --
      -- Qu'est-ce que la propriété intellectuelle? It is thought control.
  14. OLAP is old news and useful by salesgeek · · Score: 1

    The driver behind the "data warehouse" concept is reporting and analysis. OLAP like capabilities have been available to end users for some time via tools like:

    * Crystal Reports
    * MS Excel Pivot Tables
    * Cognos
    * Brio Enterprise

    They usually have little appeal to software developers but are incredibly useful to managers. I've always wondered why as OLAP lets end users quickly pivot and drill through data without having to bug IT people to get reports written...

    Adding OLAP type analysis is a great idea for any software that is database driven. It gives users the ability to get more knowledge out of data.

    --
    -- $G
    1. Re:OLAP is old news and useful by boogy+nightmare · · Score: 1

      Not to forget the biggest platey around Business Objects reporter/developer...

      And newish on the scene is Siebel Analytics (hmmm still a little iffy about this one)

      --
      Kingdom of Loathing (www.kingdomofloathing.com) Addicted is me
    2. Re:OLAP is old news and useful by salesgeek · · Score: 1

      Siebel Analytics (hmmm still a little iffy about this one)

      DEVELOPERS LISTEN UP: The trend in OTS software is to add OLAP capabilities and call it analytics. From a technical standpoint, it's whoop-de-doo. From the end user standpoint it's what they want. If you have a VAR channel, they want nothing of it because they will lose revenue writing custom reports.

      Siebel's product is a case study in this trend.

      --
      -- $G
  15. Open source OLAP by arberya · · Score: 1

    Java, Open Source OLAP. Mondrian

  16. TDWI by bdbarnesatl · · Score: 1

    For coverage on OLAP and more generally, business intelligence, visit The Data Warehousing Institute. In addition to their online articles, you can subscribe to a weekly email newsletter which provides timely info about BI and OLAP developments. If you really want to get into it, they have some hands-on courses offered fairly regularly around the US. I haven't taken any of them, but based on the course outline the OLAP course looks pretty informative and I think TDWI has a good reputation.

    Finally, you might check out DM Review's website as they have some good content on data warehousing, BI and OLAP as well.

  17. Relatively new? by Anonymous Coward · · Score: 0

    OLAP isn't new at all. OLAP, ROLAP, warehousing, datamarts, datamining/dipping have been around for some time.

  18. OLAP by Anonymous Coward · · Score: 0

    Sure, why don't you hope on my lap and well talk about the first thing that pops up.

  19. Oracle and MS SQL Server OLAP by gentlewizard · · Score: 1

    Although both Oracle 9i Release 2 and Microsoft SQL Server 2000 have OLAP features, they differ greatly in their approach.

    Microsoft has a separate analysis engine (Analysis Services), with separate storage for the cubes. This has the benefit of offloading processing onto another server, but carries with it the burden of transferring the data, usually with the Data Transformation Services (DTS) utility. But DTS doesn't scale well in the current version, and is being totally rewritten in Yukon to correct this and add functionality as an Extract, Transform, Load (ETL) utility.

    Oracle has chosen to implement their OLAP engine in the main database, subject to the same security model. The benefit here is that the data isn't moved outside the security boundary of the database. Their ETL solution is not a separate utility; they chose instead to add functionality to SQL such as external tables (flat files that pretend to be Oracle tables), multi-table INSERT statements, and the new MERGE statement that either INSERTS or UPDATES depending on whether a row already exists. Also, the parallel processing features built into the database engine and SQL can be used, making it scalable.

    It all depends on whom you believe, but data warehouses tend to be BIG databases, and Oracle knows BIG, so I'd give them the edge right now, though they are a pricier solution and the tools aren't as nice as Microsoft's.

  20. instantOLAP by okock · · Score: 1

    Hi, if you don't like the complexities and/or prices or installation hazzles of "the big" OLAP enabled servers, try instantOLAP with a low footprint and easy web access to your data.
    This is "low footprint" in terms of installation issues and requirements for backend databases: The engine enables you to talk OLAP to any relational database without first needing to create cubes in your dedicated OLAP server. This way you can issue OLAP queries to your production database instead of the one month old cube.