Open, Web-Based OLAP Clients?
Zoloft asks: "I'm looking for a web-based OLAP client; something with lots of nifty features that PHBs find appealing. The normal available offerings are proprietary, expensive, and closed, closed, closed. Open-source would be nice but not required. Money is not an issue. What's important to me is: it's UNIX native - not NT native and ported with one of those bloated NT-to-UNIX layers - and its data formats are *open*, *readable* and *programmable* - whether it sits in files or a database. I have been beaten down for 10 months with this product that was forced upon me, which shall remain nameless of course. The only way to "develop" a custom app was through its piggish graphical front-end binary, obfuscated file formats and no programming or scripting hooks. I could go on and on, but you know the deal."
I've often thought that if I ever write ONE piece of open source software, OLAP would be it.
:)
I Love Multidimensional Databases... ESSBASE and Microsoft Plato being the ones I know best.
I'll tell you about the database I worked on. It was for a County government, and the dimensions were Theme, Program, Package, Job Class, Department, Year, Account, and Version.
So, if you wanted to know how much the County Hospital Nurses planned to spend on Pencils in '97, the database knew.
It pulled data from MS Access, Oracle, SAP, IBM DB/2, and old Prime minicomputers, piled it all into a 2 gigabyte database, and allowed real-time browsing and editing.
2 gigs is what the consultants set it up to be; I optimized it down to 500 megs, which, since the server only had 200 megs of RAM, made queries go 10 times faster. They liked me there.
Interestingly, if I extracted the 'level zero' information, the database schema, and the attached Access database, I had everything needed to reconstruct the entire 2 gig database... those files, pkzipped, fit onto two 1.44 meg floppies.
Basically, an OLAP database pre-calculates every possible combination of the data, and stores it for fast (on-line) retrieval.
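To make the "pre-calculates every possible combination" idea concrete, here is a minimal Python sketch, not how ESSBASE or any real product does it. The dimension names, the sample figures, and the helpers `build_cube` and `lookup` are all made up for illustration. It builds every aggregate from a handful of level-zero rows, which is also why the level-zero data plus the schema is enough to rebuild the whole thing.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimensions and level-zero facts (invented sample data).
DIMENSIONS = ("department", "account", "year")
LEVEL_ZERO = [
    ({"department": "Hospital", "account": "Pencils", "year": 1997}, 120.0),
    ({"department": "Hospital", "account": "Paper", "year": 1997}, 300.0),
    ({"department": "Sheriff", "account": "Pencils", "year": 1997}, 80.0),
    ({"department": "Hospital", "account": "Pencils", "year": 1998}, 150.0),
]


def build_cube(facts, dimensions):
    """Pre-calculate a total for every combination of dimension values."""
    cube = defaultdict(float)
    # Every subset of the dimensions, from the grand total (no dimensions
    # fixed) down to the fully specified level-zero cells.
    for r in range(len(dimensions) + 1):
        for dims in combinations(dimensions, r):
            for coords, amount in facts:
                key = tuple((d, coords[d]) for d in dims)
                cube[key] += amount
    return dict(cube)


def lookup(cube, **where):
    """Answer a query with a single dictionary fetch, no scanning."""
    key = tuple((d, where[d]) for d in DIMENSIONS if d in where)
    return cube.get(key, 0.0)


cube = build_cube(LEVEL_ZERO, DIMENSIONS)
print(lookup(cube, department="Hospital", account="Pencils", year=1997))
print(lookup(cube, account="Pencils"))  # pencils across all departments and years
print(lookup(cube))                     # grand total
print(len(LEVEL_ZERO), "level-zero rows ->", len(cube), "stored cells")
```

The stored cube is much larger than the input, which is the trade: disk and load time are spent up front so that each query is a lookup rather than a calculation.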
I've been programming SAS for seven years, and have begun exploring open source alternatives. I don't know of any general solutions, but components exist.
What specific features do you need in an OLAP tool? Among those I can think of:
I'm not saying that a solution should have all of these features, but they are, in rough ranking, the ones I'd be looking for. My preferred model is to build a solution from existing components, or at least structure it from multiple modules, rather than look for a single integrated system. One thing SAS has taught me is that a single integrated system isn't the best way to fly.
Anyone else have thoughts on relative importance, unnecessary items, or other features they would want to see?
What part of "gestalt" don't you understand?
As for "PHB," that is the acronym that Scott Adams, famed for the comic strip Dilbert, coined to describe the Pointy Haired Boss. Usually used to describe someone who is completely devoid of practical knowledge but who is "in charge."
If you're not part of the solution, you're part of the precipitate.
OLAP usually stands for On Line Analytical Processing. (Footnote: the OLAP Council website claims to intend to provide common definitions, but does not actually provide a definition for OLAP...)
Datamation describes it thus:
OLAP is pretty strongly associated with the common buzzword, "Data Warehousing."
More precisely, what it is about is the notion of taking the data created by an online transaction processing system, and collecting this into a big database that you then want to do "analysis" on.
The point here is that the analysts that are looking for patterns need to have a separate copy, as the things they do may hit a DB server hard, and are probably not friendly to the transaction-oriented operations of "Entering Invoices," "Processing Sales," "Paying Bills," and such.
SAS is pretty big on OLAP, as they have been building powerful statistical software for many years now.
More info about Applix's TM/1 can be found at
http://www.applix.com/applixware/linux/prodovertm1.cfm
Bye, LenZ
Multidimensional refers more to the physical data layout than the actual logical model, which can still be relational.
In a conventional RDBMS, a tuple of values (a,b,c,d), say, can end up in any free block in the table (and indexes are needed for each column searched). In a multidimensional db, tuples are laid out as if their values were array subscripts, the way a multidimensional array would be in, say, FORTRAN. It's a good deal easier to find data that way, since it's the same as an array lookup. The downside is that the data has to be fairly dense, or else you'll end up with a lot of empty holes and wasted space.
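A rough Python sketch of that array-style layout follows. The dimension members and the helpers `offset`, `store` and `fetch` are invented for illustration, and it uses row-major order for simplicity (FORTRAN itself is column-major, but the principle is the same). The position of a cell is computed directly from the tuple's values, so no index is needed.

```python
# Hypothetical dimension members; a member's position in its list is its subscript.
DIMS = {
    "year": [1996, 1997, 1998],
    "department": ["Hospital", "Sheriff"],
    "account": ["Pencils", "Paper", "Travel", "Salaries"],
}
SHAPE = [len(members) for members in DIMS.values()]
INDEX = {name: {m: i for i, m in enumerate(members)} for name, members in DIMS.items()}

# One flat block of storage with a slot for every possible combination.
cells = [None] * (SHAPE[0] * SHAPE[1] * SHAPE[2])


def offset(coords):
    """Row-major offset of a tuple, treating its values as array subscripts."""
    pos = 0
    for name, size in zip(DIMS, SHAPE):
        pos = pos * size + INDEX[name][coords[name]]
    return pos


def store(coords, value):
    cells[offset(coords)] = value


def fetch(coords):
    return cells[offset(coords)]


store({"year": 1997, "department": "Hospital", "account": "Pencils"}, 120.0)
store({"year": 1998, "department": "Sheriff", "account": "Salaries"}, 90000.0)

print(offset({"year": 1997, "department": "Hospital", "account": "Pencils"}))
print(fetch({"year": 1998, "department": "Sheriff", "account": "Salaries"}))

# The downside mentioned above: every empty combination still occupies a slot.
filled = sum(1 for c in cells if c is not None)
print(f"density: {filled}/{len(cells)}")
```

With only two of the 24 slots filled, most of the block is holes, which is why the dense-data caveat matters.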
In a data warehouse, the array is usually organized along dimensions that are well-populated, such as time, location, cost, etc., so the sparsity is manageable. There's also a high priority on looking up things by virtually any dimension or combination of dimensions, so the array format is particularly useful.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
I have been beaten down for 10 months with this product that was forced upon me, which shall remain nameless of course. The only way to "develop" a custom app was through its piggish graphical front-end binary, obfuscated file formats and no programming or scripting hooks. I could go on and on, but you know the deal
... Unfortunately, the suits love it :)
Ho! You mean Microsoft Access! Yeah, I know about that, I've been through it too. A brain-damaging experience.
:wq
OLAP stands for On-Line Analytical Processing. From what little I know, it's basically a program that can take information in the form of a (usually multidimensional) database and figure out stats and trends. Basically it's one of those programs that figure out things like 20-40 year old males being the primary consumers of home electronics. (Like we didn't already know that.)
The problem is that the voodoo most OLAPs perform on the data is very complex and proprietary.
IBM has OLAP for DB2. Here's the link.
http://www.software.ibm.com/data/db2/db2olap/
Plus IBM is Open Source friendly. (although DB2 OLAP server is not yet available on Linux)
-Jay
And you'll keep getting assorted answers in various directions unless you're more specific...
dbProbe? Web OLAP. Commercial. Java.
Why didn't you just use Cistron RADIUSD?
It's fast, stable, and open-source. We've used it for the past year (with and without NIS) and never had a problem with it.
One of the best examples of OSS I can think of.
On-Line Analytic Processing is one reading. Try 'data warehouse' or 'data mart' as well. Cheers, Drieux
...the easy way is -always- mined...
This is one of the best references I've seen for OLAP terminology.
Basically, OLAP, data mining, etc. are about optimizing databases for analysis (i.e., data warehousing) and then applying machine learning algorithms to find interesting, useful bits of information which you weren't aware of before. This process goes by many different names, including OLAP, data mining, KDD (knowledge discovery in databases), and intelligent data analysis, among others.
While some of this is pure hype, there is some very cool, interesting work going on.
I had a lot of fun with Dynamicube from www.datadynamics.com. I applied it to 6 million record files with up to 10 dimensions. Wish I could find the same for Linux.
There are several datacube efforts, some in actual practice, which are not labeled as such. One is analog, which builds a datacube from httpd access logs.
Once one has well defined categories, all it takes is to add one dimension for the sum, form the cartesian product of the categories and sum all the combinations of the zeroth member in each category. The process is reversible so that records can be added and deleted.
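One reading of that recipe, as a small Python sketch: each category gets an extra "all" member (the zeroth member), and every record is added into each cell of the cartesian product of its own values and that "all" member. The `DataCube` class, the `ALL` marker, and the log-style sample data are made up for illustration, not taken from analog or any other tool.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # the extra "zeroth" member added to every category


class DataCube:
    """Running sums for every combination of category values and ALL."""

    def __init__(self):
        self.sums = defaultdict(float)

    def _cells(self, record):
        # Cartesian product of (value, ALL) per category: 2**n cells per record.
        return product(*((value, ALL) for value in record))

    def add(self, record, amount):
        for cell in self._cells(record):
            self.sums[cell] += amount

    def remove(self, record, amount):
        # Reversible: deleting a record subtracts it from the same cells.
        self.add(record, -amount)

    def total(self, *cell):
        return self.sums.get(cell, 0.0)


# Hypothetical access-log records: (status, file type) -> bytes served.
cube = DataCube()
cube.add(("200", "html"), 1000)
cube.add(("200", "gif"), 500)
cube.add(("404", "html"), 200)

print(cube.total("200", ALL))   # all successful requests
print(cube.total(ALL, "html"))  # all html, any status
print(cube.total(ALL, ALL))     # grand total

cube.remove(("404", "html"), 200)
print(cube.total(ALL, ALL))     # grand total after deleting one record
```

Because each record touches 2**n cells for n categories, the cost per update grows quickly with the number of dimensions, and the dictionary only holds cells that have actually been seen, which is one way of coping with sparse data.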
It can also be generalized to more than sums and there are some parallels to n-dimensional correlation matrices.
The problem gets a bit knottier when the data is sparse. I suppose an index of keys would ameliorate the problem.
I would be willing to help in such an effort and can help lay out specs.
1. Pointy-Haired Boss...see that "boring" layer of computers known as Dilbert.
2. On Line Analytical Processing. Data Warehousing, Data Mining [insert other buzzword here]. Not my cup of tea, certainly. But I'm not sure that you can call it the "boring" layer!
PowerPlay isn't particularly "open", and has the limitations talked about above. Largely graphical user interface, little back end programming ability, and no adaptive interface to speak of. Their next generation software looks a lot better, but of course it isn't out yet. The current release has had, in my opinion, some problems scaling. Check my reply to the main topic for more.
OLAP stuff is tough to write from scratch; even if you've got a good underlying database that's designed for OLAP, writing code to navigate a 7 dimension or higher cube isn't pretty. What is it you need to do that a low to mid tier package like Cognos won't do?
Holos, which Seagate recently bought, is very open and strong. They have a structured programming language that gives you as much support as I imagine you could use for back AND front end work. Their web interface is developing, but it's in early stages yet... hopefully by next calendar year they'll have a new release out that will pretty that part up. If you need real strong scalability, Unix support (sorry, no Linux to my knowledge), and fairly open control, it may be worth looking into.
Unfortunately, people that need the sophistication of OLAP haven't been the people that write OpenSource software, so I don't know of any truly open solutions. If anyone wants to write one, tho, I'm willing to help!
sas is a brilliant company for leveraging legacy apps. if you did a masters anywhere doing statistical research or clinical trials, you've tasted sas. lots of folks learn it in postgrad studies. but using sas in today's market is a little like mba's using their same hp calculators to run the company finances. it just doesn't make sense except for the fact that you are extremely comfortable with it. but what can you say to people who are in love?
hyperion as a company has a problem in reaching slashdot-type folks. that's probably because it came from the finapps space. not a lot of bsd in corporate finance. be that as it may, don't let that convince you to try and build olap from scratch. i've met with very bright guys from citibank who built their own. (believe me it was a seriously hostile, new york style meeting). but when i explained the internals of essbase to them, they came around. i mean the stuff is patented, and the inventor understands everything about sparse matrix math and all that eigenstuff.
there are a significant number of computational problems that solid olap technology addresses. the fact that ibm, acknowledged master of sql optimization, has opted to oem the hyperion technology rather than build their own is all the proof anyone should need.
fault-tolerant
The original poster asked for:
Open-source would be nice but not required. Money is not an issue. What's important to me is: it's UNIX native - not NT native and ported with one of those bloated NT-to-UNIX layers - and its data formats are *open*, *readable* and *programmable* - whether it sits in files or a database.
From what I've seen, Oracle products on UNIX fit the bill here. Note that he says that "Money is not an issue."
*open* doesn't mean Open Sourced. Oracle products are definitely programmable, and readable, if that means the data is conveniently manipulated.
As to Larry Ellison being a billionaire: Oracle sells in a competitive database market. Ellison is rich because their product is good. Is there some requirement to hate those who are successful?
Check out this link at Oracle's web site
I think that Erik Thomsen's book "OLAP Solutions" comes with an evaluation version of TM/1.
Firstly, the link in one of the other posts didn't seem to work, so try this.
Secondly, which SAS programming language don't you like? Base SAS, SCL, webAF or webEIS?
I agree that SCL syntax can be a bit annoying at times, but the next version of SAS, V8, to be released early next year, has much better syntax, more like C++ or Java. WebAF is just Java with a lot of extra classes added to it and an IDE, so you know what to expect there.
But the main reason for using a product like SAS is that you don't have to rewrite all the statistical and analytic back end procedures. However, if you don't like the front end, there is a standard Open OLAP server available from SAS, as well as several different web front ends.
For more info check
SAS OLAP
Cognos OLAP
Oracle OLAP
An aside: SAS is releasing htmSQL 2.0 for Linux as well as all its standard platforms on Tuesday. Does this mean that all of SAS is to be ported to Linux?
Grem
Some OLAP (On-Line Analytical Processing - i.e. reporting) stuff off the top of my head that I've had contact with:
Cognos
CrystalReports/Info
ActiveReports/ActiveCube by DataDynamics
Cyberprise
My company tried Cognos and it seems to be a heavy hitter to satisfy the PHBs. It's got data mining/drilling down, stored static cubes so you don't need to go back to the DB (makes it fast), and when you drill down till you are out of data, you can go into the DB off the cubes. You can run reports right off the cube data. Unfortunately I wasn't a part of that venture, but from what my co-worker (developer) says, it was pretty slick... YOU DO PAY through the nose for it.
Crystal Reports: Very slick & easy to use. Almost idiot proof to run them off the web BUT the web engine is single threaded and you can only run one at a time on the web server (useless!). If your DB is slow and 5 people ask for a report each at the same time and they all take 5 minutes, the last person will be waiting 25 minutes...you know that by then they've already clicked refresh 15 gazillion times (or the default install of IE has given up). The ActiveX and Java controls that come with Crystal 7 that allow you to view reports through the browser are sweeeeet. You can export reports to RTF and a couple other formats right from the browser. Oh ya, it's also VERY easy to design reports and the COM interface makes it easy to work with. I demo'd CrystalInfo but select boxes confuse our users enough that we didn't want to give them the ability to create reports on a whim if they don't understand the underlying tables. You can pay some company 10k U.S. for a multithreaded Crystal Print engine. Crystal 7 is reasonably priced.
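The 25-minute figure is just queueing arithmetic. A back-of-the-envelope Python sketch, assuming all five requests arrive at the same moment and are served first-come, first-served (`completion_times` is a made-up helper, not anything from Crystal):

```python
def completion_times(durations, workers=1):
    """Minutes until each request finishes when served first-come, first-served."""
    free_at = [0.0] * workers  # time at which each worker next becomes idle
    finished = []
    for d in durations:
        # Hand the request to whichever worker frees up first.
        i = min(range(workers), key=free_at.__getitem__)
        free_at[i] += d
        finished.append(free_at[i])
    return finished


# Five 5-minute reports on a single-threaded engine.
print(completion_times([5] * 5))
# The same load on a hypothetical engine with five workers.
print(completion_times([5] * 5, workers=5))
```

On one thread the waits stack up to 5, 10, 15, 20 and 25 minutes; with a worker per request everyone is done in 5, assuming the slow DB can actually serve them in parallel.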
On the same lines, ActiveReports (and ActiveCube) by DataDynamics is quite a bit more useful, although nowhere near as easy to use as Crystal, & doesn't come with the handy pre-built functions to manipulate/shape data (but I like doing stuff from scratch ;). It is an ActiveX Designer plugin for VB. You create all your reports in VB and then create a generic report object to wrap around those ActiveX designers. And the best thing of all was that you could run multiple reports simultaneously, which beats the pants off of Crystal. The export controls aren't as full featured though, and there aren't many export options, but there are enough that you can get data into pretty much any app. Besides, too many options confuse users ;) ActiveReports is reasonably priced.
Cyberprise was another thing my company tried, but I can't say much about it as our interest leaned more towards Cognos.
Anyway, currently we are using a crappy buggy 16-bit Helpdesk software by Applix (transitioning to a home-brew Oracle Forms app instead) and the reporting was buggy and useless..not to mention they save the user's password in a text file in the root of the system drive...but I digress.
This Oracle thing (started out on web...too flaky, going client/server unfortunately) needed web reporting, so all the Crystal Reports we had suited it perfectly to run via ASP (Active Server Pages). I designed a system that allowed me to put all sorts of dynamic web selection forms in front of the Crystal engine and pretty much run any report we had. I can add options to the selection form just by inserting into the DB and it pops up on the web page.
This allowed users to run pre-defined report templates against the system to extract the stuff they needed. All in all, it works great (except for the slow DB and single threaded Crystal report engine) and I'm in the midst of modifying it to be able to run Crystal and ActiveReports (so I can port everything to ActiveReports).
As was previously mentioned, you could use PHP script or PERL or C/C++ on Linux to do your stuff but that would require a lot of work.
Sorry I couldn't give you any Linux info. Perhaps these companies have something coming down the pipe.
--Clay
"Olap" is Finnish in origin. It is synonymous with "data whorehouse". I don't understand it; I just explain it.
Honesty. Loyalty. Kindness. Laughter. Generosity. Magic!
I happen to work as an intern at a Fortune 50 company right now-- and I've been asking around... They have every spare engineer and technician working on algorithms/programs/controls etc. for the automation of data analysis.
NASA and the Air Force started it years ago with a project they did to detect early failures on rotor shafts and bearings (and thereby save money on unscheduled repairs).
I find this stuff very boring and rudimentary. The *only* reason any company keeps this proprietary is because they don't want people to know how crappy their products are (I'm referring to in-house OLAP development, where release would mean exposure of their most sensitive failure data).
IBM apparently uses OLAP on their hard drives, because they don't quote their mean time between failures. I even called them and asked.
I used their SAS programming language earlier in the year, and I'll tell you-- it's NO walk in the park. The syntax is worse than umm.. well.. I guess it's just the worst.
The only reasonable solution to this lad's problem would be to develop his own system.. It's the cheapest, probably the most reliable and, would be, by far, the most customizable.
I feel for him for having to do this kind of work though because I was driven mad by it. I still have a bad taste in my mouth from it.
My suggestion to him : hire a bunch of interns and have *them* do it.