Visualizing Complex Data Sets?
markmcb writes "A year ago my company began using SAP as its ERP system, and there is still a great deal of focus on cleaning up the 'master data' that ultimately drives everything the system does. The issue we face is that the master data set is gigantic and not easy to wrap one's mind around. As powerful as SAP is, I find it does little to aid with useful visualization of data. I recently employed a custom solution using Ruby and Graphviz to help build graphs of master data flow from manual extracts, but I'm wondering what other people are doing to get similar results. Have you found good out-of-the-box solutions in things like data warehouses, or is this just one of those situations where customization has to fill a gap?"
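For readers who want to try the extract-to-graph approach described in the submission, here is a minimal sketch in Python (the submitter used Ruby; this is not his code). It assumes a manual extract in CSV form with hypothetical `source`, `target`, and `relation` columns, and emits Graphviz DOT text.

```python
import csv
import io

# Hypothetical extract: each row says one master-data object feeds another.
# The column names and object names are illustrative assumptions only.
SAMPLE_EXTRACT = """source,target,relation
Material,BOM,used_in
BOM,Routing,drives
Material,PurchasingInfoRecord,sourced_by
"""


def quote(name):
    """Wrap a name in double quotes for DOT, escaping embedded quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def extract_to_dot(csv_text, graph_name="master_data"):
    """Turn source/target/relation rows into a DOT digraph string."""
    lines = [f"digraph {quote(graph_name)} {{", "  rankdir=LR;"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append(
            f"  {quote(row['source'])} -> {quote(row['target'])} "
            f"[label={quote(row['relation'])}];"
        )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(extract_to_dot(SAMPLE_EXTRACT))
```

Saving the output as `flow.dot` and running `dot -Tsvg flow.dot -o flow.svg` should render it; the file names are placeholders.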
What ever happened to the simple good old Novell days?
Abstract: We propose a method for characterizing large complex networks by introducing a new matrix structure, unique for a given network, which encodes structural information; provides useful visualization, even for very large networks; and allows for rigorous statistical comparison between networks. Dynamic processes such as percolation can be visualized using animations. Applications to graph theory are discussed, as are generalizations to weighted networks, real-world network similarity testing, and applicability to the graph isomorphism problem.
Unfortunately, this means that it is much too utilitarian (and ultimately, why products like Peoplesoft are making headway).
If you find that you have developed a good product to help with operating SAP, you can sell it as a third-party add-on. Many of the popular add-ons were created out of a sense of frustration with the "mother product".
PtolemyPlot and Java.
http://visualcomplexity.com
Have fun!
I have experience rendering massive datasets via OpenGL, and this is where a lot of visualization still happens in government and big business.
These can be incorporated into other off-the-shelf visualization tools or just be used standalone on any major platform as long as the machine has the horsepower, including, not surprisingly, a powerful GPU.
The first computer I started doing visualization on was an SGI. Imagine that.
How are you supposed to handle the data if you do not understand it? Sure, there can be too much to see/think about at one time, but if you don't understand it, how can you visualize it usefully?
I am asking because I have a problem: Where I work, I understand the data and I make efforts to visualize it for others. The trouble starts when they don't understand the data and its sources and limitations, so what they see in my visualization is all they know of it, and they make assumptions about it. I've even had people worry that the network is down because there were holes in the collected data which then showed up in the visualizations.
If anyone has some good URLs for such thinking, I'd be grateful.
I simply do not understand how you can visualize data for people if you yourself do not understand it.
Support NYCountryLawyer RIAA vs People
There was a thread about the R language a couple of weeks ago. Look it up and read it....
An open source version of IBM's Data Explorer. The interface is a little clunky (openmotif-based) but it is _powerful_.
I wish someone would take it and throw a modern front end on it.
The infovis community has been dealing with these subjects for years. There are many different visualisation techniques around. Here's a list of the past conferences and the papers:
http://conferences.computer.org/Infovis/
Plenty of good products out there, but the one that I like most is from Tableau Software (http://www.tableausoftware.com/).
Life is complete only for brief intervals in between toys or projects -- John Dalton
I used Xgobi (http://www.research.att.com/areas/stat/xgobi/) for a lot of things back in the day. It gave me the ability to 'see' and understand high dimensional data sets quite easily when I was looking at computer vision research.
I drink to make other people interesting!
I work in biology, and we use Spotfire DecisionSite to visualize and analyze a lot of our massive genetic data. It's a very powerful program that I barely know how to use. It seems to have packages able to analyze pretty much anything you want, and you can even write your own scripts to help things along.
Wouldn't any everyday cube browser, along with any tool to detect base dimensions in a data warehouse schema, do the trick? You may have to add a few custom dimensions on your own depending on how shitty the master data is (I don't think that can be helped, no matter the solution; if a dimension is "these two fields multiplied together times a magic number appended to the value of another table", you need to know, no tool will guess), but aside from that?
That's usually what I do anyway. I dump my data in a data warehouse, use whatever built-in wizard can auto-generate dimensions, then play with them in a cube browser. Works for even pretty archaic home-made multi-thousand-tables-without-normalization ERP systems I had to work with in the past anyhow.
Not sure what you can use to create a visualization, but the information you need is in the IMG.
I don't have a need to develop a visualization of the whole of our SAP implementation, just my little FI-CO corner of it, and that's a big enough pain
Do not taunt Happy Fun Ball
Your ERP isn't supposed to directly analyze the data. You're supposed to use a Business Intelligence software package for that. This being SAP, I believe they'll try to sell you Hyperion.
I'm sure they all can be entirely useful. But it's good to keep in mind the unifying principle:
Real estate prices in America almost always increase over time; and if they fall, the decrease will be short-lived and most certainly will only affect a few scattered markets.
That's a trick I learned from studying the methods of top experts in the banking, insurance, and hedge fund industries.
I'll try to answer your question without the key info needed: "What is the data you're modeling?"
You're on the right track...
Either way, from experience I'd say your answer is "this is just one of those situations where customization has to fill a gap".
Be warned though, out of the box solutions do exactly what's on the box. Anything else is going to be modeled by you, or customized (usually at a high rate), by the vendor.
That being said, I've used Oracle's solution http://www.oracle.com/solutions/business_intelligence/index.html for financial data (10 TB of data), and used my own solution for my music recommendation database http://www.egusta.com/ (43 GB of data).
At the end of the day, I like using my own custom solution. And by the fact that you're familiar with Ruby (I checked out your site), I'd say you're on the right track already.
Just take the first 65k rows and dump them into excel and create a pivot table.
Can I suggest you look at Centruflow, which is an application designed to analyse dynamic data in a nice, user friendly way.
...which I just took offline for a quick database upgrade. Er, sorry, will be back online soon!
The Army reading list
I've worked with a couple of architects who used OneData (http://www.datafoundations.com/index.shtml) to do this sort of thing. Although I haven't used it myself, the idea is simple & cool. It's essentially a Software As A Service implementation of olap reporting. The demos indicate that you can theoretically get up and running rather quickly. Not sure if that's true as I haven't done it myself.
I'll bet Edward Tufte would have something to say on the topic... http://www.edwardtufte.com/ Has anyone been to one of the "Presenting Data and Information" seminars? Any feedback?
Take a look at Prefuse. I haven't used it myself (I considered it for a project), but it may have the right mix of a good Java API and flexibility/customizability that you're looking for. As a bonus, it's BSD licensed. YMMV. Good luck.
http://www.pureshare.com/
Basically turns any data into a widget, without taxing your data every time you want information.
Into a matrix screensaver.
I record my sleeptalking
I've had really good success using an information visualization tool called Starlight on a number of projects like this. Everything from process modeling to military intelligence. It's a commercial spin-out from the DOE PNL lab information visualization research in Washington State.
http://www.futurepointsystems.com/
take a look at OpenDX http://www.opendx.org/
I'm sure it's my mathematics background, but when I saw the headline I assumed the author would be discussing something involving the square root of negative one, to which my response was, "Silly author, you can't visualize four dimensions. (Sober.)"
Let us not become the evil that we deplore.
Use TeX/LaTeX/MetaPost for the drawing and layout engine(s). Use your favorite language as a front end to turn the input data into source files for these programs. Plus, the resulting output is PDF, which means you can avoid the crap-fest that is Word. http://tug.org/
Alternatively, you can use Asymptote, which is like a modern version of MetaPost. http://asymptote.sourceforge.net/
I have no idea how I stumbled across this, but it looks very pretty...
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
Both tools combined allow you to easily visualize large data sets and adjust the resolution of your data.
You could use Pentaho with one of the SAP plugins.
Large, highly complex data sets are best described on the back of four cocktail napkins or on a fixed white board in a shared conference room. ~
Invenio via vel creo
Take a look at Essbase http://en.wikipedia.org/wiki/Essbase. It is now owned by Oracle and is used by finance departments at most Fortune 100 companies.
As you have no doubt discovered, SAP is great for transaction-level detail, but kinda sucks at the big picture and doing "what ifs". Essbase's tight integration with MS Excel and very cool reporting tools make it much easier to analyze your data than looking at spending reports from SAP.
Mainly implemented by budgeting and finance groups, Essbase is not a favorite of IT departments though, as change management is a challenge and Essbase requires quite a bit of subject knowledge and is almost impossible to outsource to another continent.
Skip ------ See the latest from http://www.anArchyFortWorth.com
It all depends on what output you need?
DAD software has the ability to customize data types, multiple inheritance of objects, and to define different relationship types.
You can then trace along object relationships bringing back a dynamic graphic depending on what you want to show (and spit out to PDF).
SAP realized they were missing this kind of stuff so they splashed out a few billion on some BI tools.
Matlab or Spotfire can do it, assuming you have enough RAM
Have a look at Processing, and the book Visualising Data by Ben Fry.
Unexpect the expected!
What kind of data is it? What are you trying to figure out by looking at the data? What type of people will be looking at it? Depending on these answers, I may recommend one of the leading BI tools on the market: IBM Cognos, SAP Business Objects, or MicroStrategy. These COTS solutions are focused on visualizing masses of data, usually for some type of pattern discovery or decision making.
I just used NetworkX in Python: http://networkx.lanl.gov/
I used NetworkX to visualize graphs. It is pretty simple, but it might be very similar to the Ruby solution you described.
It can interface with many other libraries, as described here: http://networkx.lanl.gov/reference/index.html
Tableau Desktop is an interactive analysis and visualization product that connects to relational and cube data sources to help people see and understand their data. There was a webinar (slides - PDF) back in November 2008 covering Blastrac Global's success in using Tableau with their ERP system.
Disclaimer: I work at Tableau Software, so I encourage you to see for yourself with a free trial: http://www.tableausoftware.com/products/tour
ask - what do I need from it, and is the data accurate.
As you say, there is a lot of work on ensuring data correctness. Probably a lot of reports are printed for middle management to chase the working level people about each of their pieces to that data set and correcting it.
This will go on and on. Soon major decisions will be based on largely suspect data. Then the fun happens.
I've worked with ERPs and at Fortune 10 companies down to mom & pop places and do consulting on process improvement. Look long and hard at the data monster and then go out and pick up some good books on Lean Manufacturing and pull systems.
Doesn't matter how fancy the analysis is if the data is questionable. But then your plight is to make pretty pictures. Best of luck, but be sure and read up on Lean.
a company that competed with SAP. This is a problem that is industry-wide.
The solution you probably want is to make sure your SAP is set up to use a common relational database, then use another tool (Crystal Reports, Seagate, etc.) to visualize your data in ways that are not already built-in to your ERP system.
I first saw a video of Hans Rosling, who had some very unique ways of visualizing data that would otherwise be useless to a simple mind such as mine.
After I watched that, I found a piece of software called Tableau. I downloaded the trial version, and really liked how easy it made visualizing data for me. I can take the data I have, and Tableau will see how it's connected and allow you to generate visual reports of the data. I'm not saying that it'll work for everything, but it certainly does what I need it to extremely well, especially for my business intelligence initiatives.
depending upon the problem domain, a very useful (albeit expensive) set of tools is StarLight, written for the US Government: http://starlight.pnl.gov/
highly recommended if you've got tough visualization problems. this tends to get used for the *really* interesting visualization challenges.
I consider myself quite a geek, but I have to admit I have no idea what SAP and ERP are. I guess I'll google it since the summary just assumes the reader knows WTF these are.
Where I work I sometimes write code to import/export data from one system to another, and it usually goes quite well until a company decides to switch to SAP. Now 'gone to SAP' is synonymous with 'going pear-shaped'. It's amazing how the bad technology is the most expensive.
Exeros makes software for that kind of thing. I haven't tried it myself, but you might take a look and see if it does what you need.
http://www.exeros.com
I am pimping my own employer's product here, and I'm admittedly biased, but we've got a phenomenal web-based/SaaS solution to this exact problem. We've done work for clients with billions and billions of rows of data (like 50+GB) and we've got a unique database that can generate reports in seconds that could take upwards of fifteen minutes on a SQL-backed solution. You can take any report, drill down arbitrarily into the data below, flip through the datasets, arbitrarily flip axes, filter out unwanted data on the fly, all that.
It's not a FOSS solution, but it is very affordable -- the last time we had a company-wide meeting marketing/sales was going off about how it costs about the same as a daily latte. Being a web solution, we're platform independent.
It is pretty much ready out of its metaphorical box. The only thing you need to set up on your end is the data export. We'll accept almost any data format, usually tab-delimited or CSV files. After we have your data, all you have to do is create reports, and we've got a team of people that can help you with that.
I think that's about enough self-pimping. There's more on our website, http://pivotlink.com.
You haven't stated what you're needing this for, I assume it's not just for your own consumption. I work in Business Intelligence (Kimball Method Dimensional Modeling etc) and we use PeopleSoft ERP in our workplace. We have found that the best way of displaying/using this type of eclectic data is to model it in star schemas and put it in data cubes. This way the people who use the data can really use the data for analytical purposes... any other way just makes more work for us IT people, this is great for our pay packets but BAD for our work/life balance
The man in black fled across the desert, and the gunslinger followed (SK)
I had a similar situation to yours recently, except I was trying to detangle a horridly complex product substitution graph for a logistics company.
I used a bunch of Perl to crunch the raw databases into various abstract graph structures, but instead of graphviz or something created by/for developers, I found that the best software for graph visualisation is the stuff that the genetics and bio people use.
The standout for me was a program called Cytoscape which can import enormous graph datasets and then gives you literally dozens of different automated layout algorithms to play with (most of which I'd never heard of, but it's easy to just go through them one at a time till something works)
It's got lots of plugins for talking to genetics databases and such, but if you ignore all that and use Perl/Ruby/whatever for the data production part of the problem, it's a great way to visualise it.
Looks like SAP tricked another sucker.
A company I worked at several years ago migrated to SAP. It took several hundred million dollars and 6 years, AND the company's main branch was already using SAP. All to replace an MVS system that cost under $5M a year to run, did more, and was much faster.
SAP is NOT a business application. It's a programming environment where you get to build and customize your own. Then those German Wunderkids break your customizations every time there is an SAP change.
A "good" business software package allows you to customize "it" to match your business processes. Not the other way around as with SAP.
If all German engineering was this good, the Polish cavalry would have chased off the Nazi Blitzkrieg.
It's quite good, but expensive. You can do very complex things with very little code.
How about sql-fairy http://sqlfairy.sourceforge.net/ that's open source and very complete?
On y va, qui mal y pense!
The one thing that OpenDX still has, that for some reason the others have not, is an excellent interactive colourmap editor. I use OpenDX to get the colours correct and then export to use in other toolkits.
Of all the products out there, Business Objects strikes me as the best solution to quickly engage you and provide strictly the useful information you're looking for. They were also recently acquired by SAP, so I would recommend you ask someone at your company what the corporate availability is for their products. Maybe get in touch with the SAP account executive. If your company doesn't already have access to the product, you would probably qualify for some reduced-price incentive. The Business Objects GUI is very clean and user friendly: you basically just drag your various dimensions to create a request and qualify them accordingly. If you can't make sense of your data doing this, then you probably have a poor and heavily denormalized database design. The other issue you address is the poor data quality present in the master data management system, be it the data warehouse or something else. These issues plague many of the largest companies out there and can often take many years to sort out. Unfortunately, these issues typically only get fixed when someone like you decides to actually use the system for the very reason it was put there. So... ask some good questions and start scratching your head when the answers don't make sense. I can pretty much guarantee you that you will find a goldmine of things that don't make sense if you look hard enough. It's obviously a serious concern when there are data quality issues; however, a company almost never knows how bad they are. To put it another way, when it comes to understanding this data, your company doesn't know what it doesn't know yet. It's a continuous process of improvement that is largely driven by the users and their efforts to ask new and innovative questions.
The questions to consider would be very industry specific; however, try asking some basic queries: how many distinct US states are showing up in the system (if you get 70, that's a big problem); whether the same social security number is showing up in a lot of different stores all over the place (fraud detection, criminal activity); whether the cost of any products is showing up as 0 or next to nothing (your company is potentially over-ordering products that it thinks are creating huge return percentages, but which are actually creating huge losses and inventory costs). To create truly informative strategic reports you're going to need a good deal of historical data and a well-tuned database system to provide a decent level of service. I am a little concerned as to how you're accessing the data, though. You state that you're manually extracting the data to what I assume to be flat files. Even if you're very sparsely qualifying the data and preaggregating much of it, it won't be very useful unless you mean for it to answer a few particular and very similar queries. You need to be tying your front-end tool to the data warehouse or an aggregate cube that is being built off this central system. The cube will likely perform faster unless the data warehouse is well tuned for your requests. Without going into the complicated solutions for such tuning, a quick solution is to create a couple of aggregate join indexes or, as a last resort for performance, denormalize part of the relational system. The bottom line is that the greatest value of these multi-million-dollar database systems is found in the information that is uncovered through asking the right questions, which leads to making better corporate decisions. This may seem obvious; however, these systems are used in so many ways, from operationalizing complex triggered company processes to creating customer-targeted incentives and market basket analysis/product pricing.
When the data quality is bad this typically cascades through these processes and leads to poor decisions and inaccurate forecasting. One very cool solution I have seen is to use MS Excel to access the database. I'm assuming this was using ODBC drivers. You can create some very cool pivot tables and charts/graphs and voila, you save $100,000 and have yourself a pretty cool BI solution! Granted, this is not a BI pr
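The basic sanity queries suggested in the comment above (distinct states, one identifier appearing in many stores, near-zero product costs) can be sketched against any SQL source. The following uses Python's built-in `sqlite3` with an in-memory database; the table names, column names, sample rows, and thresholds are all hypothetical stand-ins, not anything from a real SAP or data warehouse schema.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory database with a made-up schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, state TEXT);
        CREATE TABLE products (sku TEXT PRIMARY KEY, cost REAL);
        CREATE TABLE purchases (customer_ref TEXT, store_id INTEGER);
        INSERT INTO customers (state)
            VALUES ('CA'), ('Ca.'), ('Calif'), ('NY'), ('NY');
        INSERT INTO products VALUES ('A-100', 12.50), ('B-200', 0.0), ('C-300', NULL);
        INSERT INTO purchases
            VALUES ('X1', 1), ('X1', 2), ('X1', 3), ('Y2', 1), ('Y2', 1);
    """)
    return conn


def run_sanity_checks(conn, min_stores=3, cost_floor=0.01):
    """Return a dict of simple data-quality indicators (thresholds are guesses)."""
    cur = conn.cursor()
    # How many different spellings of "state" exist? Far more than expected
    # usually means free-text entry or inconsistent source systems.
    distinct_states = cur.execute(
        "SELECT COUNT(DISTINCT state) FROM customers"
    ).fetchone()[0]
    # Products whose cost is missing or next to nothing.
    suspicious_costs = [row[0] for row in cur.execute(
        "SELECT sku FROM products WHERE cost IS NULL OR cost <= ? ORDER BY sku",
        (cost_floor,),
    )]
    # The same customer identifier showing up in many different stores.
    widespread_ids = [row[0] for row in cur.execute(
        "SELECT customer_ref FROM purchases GROUP BY customer_ref "
        "HAVING COUNT(DISTINCT store_id) >= ? ORDER BY customer_ref",
        (min_stores,),
    )]
    return {
        "distinct_states": distinct_states,
        "suspicious_cost_skus": suspicious_costs,
        "widespread_customer_refs": widespread_ids,
    }


if __name__ == "__main__":
    print(run_sanity_checks(demo_connection()))
```

Pointing the same queries at a real warehouse would mean swapping `demo_connection()` for a driver connection and adapting the SQL to the actual schema.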
Have you looked at data mining solutions? Someone mentioned Pentaho already, but there's also:
Rapid Miner
Orange Data Miner
all of which are packed with enterprisey features. But you may have to learn some stats. Once you get past what you can do with the pre-packaged stats methods, then head for R, or write a RapidMiner plugin in python.
Check out Stephen Few's blog: http://www.perceptualedge.com/. Good info there. For my money, Tableau is the way to go. It's cheap enough and easy to implement. And it reads practically everything. The visualizations are cool too.
MayaVI - http://code.enthought.com/projects/mayavi/ (originally http://mayavi.sourceforge.net/) was developed by a friend of mine (actually one of the founding members of the local Linux User Group). Although initially designed for CFD (Computational Fluid Dynamics), I think it can now handle pretty much anything....
It is a language (based on Java) made for this sort of thing. Made together with one of the guys who wrote "Visualizing Data" (Ben Fry). It is fun to play with too.
MagnaView has developed a product for visual analytics. We specialize in real-time processing of big (transactional) data sets, consisting of millions of records. These can be loaded from various data sources (ODBC, text, Excel) and presented in advanced visualizations based on treemaps. We allow in memory analysis, real-time zoom in and out, filtering, an excel like expression language and web based publishing. You can download a free trial at www.magnaview.com.
Roel Vliegen, CTO MagnaView
I've heard of this Finnish company developing a very promising graphical interface for ERP systems; if I recall correctly, SAP was among them.
Have you looked at Maltego at www.paterva.com/maltego ?
Maltego is a generic data visualization application that can be used to visualize almost any type of data, by writing your own little plugins called "transforms" you can import any data from any source.
At the moment you need to buy the full version to be able to use your own transforms, but apparently the community edition that supports local transforms will be released later this year :-)
There's a product called JMP. It's relatively inexpensive. It's great at visualizing (especially statistical) data. They've got a 30-day trial. Check it out.
__
disclosure: I have an association with JMP's parent company.
--
(sourceCode == freeSpeech)
It even uses the Boost Graph Library (BGL) underneath.
Ref:
http://www.sandia.gov/Titan/media/Information_Visualization_in_VTK.pdf
Might check this suite out...
http://www.jaspersoft.com/
Kap Lab is providing a free data visualization tool for Flex : Visualizer. It understands many data types like XML and CSV, and several layout possibilities to display the data. Link : http://lab.kapit.fr/display/visualizer/Visualizer
Since you've already bought licences for SAP ERP, you could get a bargain on the Master Data Mgt component. It also offers support to control the Master Data harmonisation process, which you probably need if you have such a large amount of data.
Look into VTK and ParaView. Both are open source packages (BSD license) from Kitware (vtk.org and paraview.org). Historically they have been a scientific visualization system, but in the last two years they have worked with Sandia Nat'l Labs to add an extensive information visualization subsystem. The major goal of the work is to make it scalable (using distributed parallel processing), something that's really important as data sets get large.
As the author of the webinar morton2002 included in his comments, I can confirm that the biggest challenge in any BI system rollout is the quality of the underlying granular data. Tableau is a wonderful enabling tool for visualizing data graphically or in tabular form; however, it cannot resolve bad data in the underlying source system. Accountants reconcile general ledger data, but generally don't bother with reconciling underlying subsystem information to the final general ledger closing information. Key to success at Blastrac was developing the weekly/monthly processes to resolve those issues found in the underlying source data. Resolving that issue provided the obvious benefit of having very granular data that matched what the accounting team reported each month. The less obvious benefits came from improving processes that were causing the bad data to begin with. That saved TIME and MONEY in addition to providing great (and very granular) information over time.
dgm885
If you're analyzing sales data and you have dimensions related to customer type, product classifications, and market channel, suppose a credit memo is issued that merely credits the customer account, with no product information in the line-item grain of the transaction. No data cube on earth can resolve that unless you have a very good ETL methodology to catch those kinds of errors.
dgm885
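The ETL-side check described in the comment above (catching line items that arrive without a product dimension) might look roughly like this in Python. The dimension names and record layout are assumptions made for illustration; a real ETL job would read from the source system rather than from an in-memory list.

```python
# Hypothetical dimension keys every line item must carry before it is loaded.
REQUIRED_DIMENSIONS = ("customer_type", "product_class", "market_channel")

# Made-up sample: one normal invoice line and one credit memo with no product.
SAMPLE_LINES = [
    {"doc": "INV-1", "amount": 250.0, "customer_type": "dealer",
     "product_class": "grinders", "market_channel": "direct"},
    {"doc": "CM-7", "amount": -40.0, "customer_type": "dealer",
     "product_class": None, "market_channel": "direct"},
]


def missing_dimensions(record):
    """List the required dimension keys that are absent or empty."""
    return [dim for dim in REQUIRED_DIMENSIONS if not record.get(dim)]


def split_for_load(records):
    """Separate loadable records from ones that need manual resolution."""
    clean, rejected = [], []
    for record in records:
        missing = missing_dimensions(record)
        if missing:
            rejected.append((record, missing))
        else:
            clean.append(record)
    return clean, rejected


if __name__ == "__main__":
    clean, rejected = split_for_load(SAMPLE_LINES)
    for record, missing in rejected:
        print(f"{record['doc']}: missing {', '.join(missing)}")
```

Routing the rejected rows into a weekly or monthly review queue, rather than silently dropping them, keeps the cube totals reconcilable with the ledger.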
Professor Hans Rosling used an impressive tool to visualize data from UN and other sources to debunk myths about the third world. I don't know what the tool is called, whether it is available, open source or what not. I got the impression from the presentation that it was specially written for the task. No idea if the data sets qualify as complex, but if I had to visualize data I'd certainly check out if this tool is available.
Sorry, forgot to link to the presentation I mentioned: Hans Rosling: Debunking Third World Myths
MayaVI [http://code.enthought.com/projects/mayavi/] is based on VTK and runs on Python. It was originally designed for CFD (Computational Fluid Dynamics). Nowadays, it can handle almost anything.
You can try GGobi http://www.ggobi.org/. It's very good at discovering correlations among variables by visualizing points plotted against two or three variables of your choice, quite easily manipulated by the click of the mouse. The number of variables may be very large, I think.
I would like to visualize the link structure to see which pages are linked the most.
Each page should be one node and every link to another page should be a line.
Do any of you know of some tool where I can do this with little effort?
Thank you in advance!
how IT is changing the world - http://max.zamorsky.name
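For the link-structure question above, the "which pages are linked the most" part needs nothing more than counting inbound links. A small Python sketch, assuming the links have already been crawled into a dictionary (the page names below are invented):

```python
from collections import Counter

# Hypothetical crawl result: page -> pages it links to.
SAMPLE_LINKS = {
    "Home": ["About", "Products"],
    "About": ["Home"],
    "Products": ["Home", "About", "Products"],
    "Orphan": ["Home"],
}


def inbound_link_counts(links):
    """Count, for each page, how many other pages link to it."""
    counts = Counter()
    for source, targets in links.items():
        counts[source] += 0  # make sure pages with no inbound links still appear
        for target in set(targets):  # count each linking page once per target
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


if __name__ == "__main__":
    for page, count in inbound_link_counts(SAMPLE_LINKS).most_common():
        print(f"{page}: {count}")
```

The same dictionary could be written out as a graph file for one of the layout tools mentioned elsewhere in this thread, with the counts used to size the nodes.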
Yes!!!! No.
General Dynamics offers a product called CoMotion that allows you to visually explore your data and find interesting patterns and trends.
http://www.gdc4s.com/content/detail.cfm?item=32341561-76f9-40f8-8ad5-0f0d66dd240e
CoMotion is a commercial fork of Visage, a collaborative visualization platform designed at Carnegie Mellon University and MAYA Design:
I have recently been using yEd for quick visualizations. It's a free (beer) java app, similar to graphviz in concept -- you just provide a logical graph and it figures out how best to display it. What's nice is that it's an interactive program, so it makes it easy to play around with different layout algorithms and tweak the settings until you find one that works for your graph.
It's Java and it launches via webstart, so it's very easy to get running on any system.
I'm a SAP consultant and have "cleaned-up" several data sets over the years. I'm lucky in that all of my customers are running it on Unix with DB2.
I wrote a series of PHP scripts that go through everything and present inside a somewhat simplified web interface. I also use Crystal Reports to provide "cleaner" copies.
But, at the end of the day, it's more of a brute-force exercise than anything. Providing a simpler interface than R/3 is the first step, but you have to have users that are willing to use it. What I've done in the past is to set up the interface and then send out a mass email telling everyone that they have to spend at least an hour every day going over the history for their department.
It usually takes about 6 months, but it's a really beautiful thing when it's done.
Suggest reading the book "The Visual Display of Quantitative Information" by Edward Tufte.
check out:
www.edwardtufte.com
and 'Napoleon's march'
Are you working with data that is in an OLAP Cube? We have both an OLAP grid and OLAP table for data slicing. (LogiXML)
If you have time for a custom solution, use Processing to pull off a visualization. It's easy to use/learn, you can get results quickly, and it integrates well with Java.
If you have a Java API for connecting to SAP (ala OLAP4J -- if not, maybe look at using Mondrian for the warehouse, and Kettle to clean the data + construct the warehouse), that solves the query/response end of it, leaving your hands free to work primarily on visualization.
I don't use their product but http://palantirtech.com/ makes a data visualization tool and has a good blog about it, with some interesting Java dev tips thrown in. It might be overkill for the data discussed in the article summary, but sounds pretty badass.
One really interesting blog article http://blog.palantirtech.com/2008/12/12/vizweek-2008-report/ talks about something called the "VAST Interactive Challenge", which as near as I can tell is a competition for data visualization tools to go head-to-head against each other. (Side note - wouldn't it be cool if more software frameworks/applications had shootouts like this?)
The blog article talks about how they had 30 minutes to train an analyst that had never used Palantir, and then 2 hours for the analyst to explore the data. It's an interesting read, and makes you realize how useful a really good tool like this could be for finding trends in raw data.
You drank my drink, you drunk!
Check this out: ATSV. It can do 2D and 3D plots, as well as add additional dimensions via color, glyph size, glyph orientation, glyph transparency, stuff like that. It's pretty slick.
Stop! Dremel time!
Just use elaborate Venn diagrams.
Check out OpenDX. Its visualization capabilities are way beyond Graphviz's and it provides a GUI. It's an open source version of IBM's famous Visualization Data Explorer (initially released in 1991), which IBM converted into an open source project a couple of years ago.
Quoting the site: "OpenDX is a uniquely powerful, full-featured software package for the visualization of scientific, engineering and analytical data: Its open system design is built on familiar standard interface environments. And its sophisticated data model provides users with great flexibility in creating visualizations."
For a short glimpse at its capabilities, visit the gallery here.
Coming from a data mining background, InforSense's platform allows you to not only manipulate your data, perform calculations (and data mining) but then visualize your results in an AJAX environment. You can call open source apps like R and Cytoscape directly from InforSense's workflow building application, so none of your pre-existing work gets discarded.
http://andreas.materns.com
Check out Structure101g
http://www.headwaysoftware.com/products/structure101/g/index.php
The Open Knowledge Foundation maintain a list of open source visualisation packages.
http://okfn.org/wiki/OpenVisualisation
There's also our open-visualisation mailing list.
I believe Gapminder runs on the Trendanalyzer software, which was acquired by Google:
http://googleblog.blogspot.com/2007/03/world-in-motion.html
Here's a shameless plug for the software product I work on. Although it has some custom features for geochemical data, it's flexible enough to be useful for generic datasets. You can get a trial version from: http://www.ioglobal.net/ioGas.aspx
"Through the ioGAS dynamic graphical environment, you can interact with the data in real time, making it effortless for you to detect patterns, anomalies, and relationships across your data. Combined with optimised workflows, you can produce high-quality interpretive outputs with the confidence that you have applied best practices developed by a team of geochemical experts. Most importantly these results can be produced in a fraction of the time taken by traditional tools."
I had been searching for a way to show a small graph of links between blogs on the web, and finally found the flexviz Flex library on Google Code: http://code.google.com/p/birdeye/ There is an example of the result here: http://www.len.ro/2008/11/harta-blogosferei-la-feminin/
A previous replier mentioned Tibco Spotfire. This software solution deserves, IMHO, a much stronger endorsement. I have no idea how much it costs and I suspect it's hella expensive, but know this: it will let you visualize the living daylights out of whatever data you can throw at it.
As another poster mentioned, there is no substitute for UNDERSTANDING your data. But sometimes 'understanding' must be arrived at by an initial hypothesis (an educated guess) and the data must support the hypothesis. Tibco is a tool that will let you visualize the data very easily to test these hypotheses, in a very dynamic and flexible way.
How does it work? Think about pivot tables in Excel... then think about feeding it the same juice that the Incredible Hulk took. It's pivot tables on lots and lots of steroids.
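For anyone who hasn't met a pivot table, here is a minimal sketch of the basic idea in plain Python: group records by one field for the rows, another for the columns, and sum a value in each cell. This is not how Spotfire works internally, and the record fields (`plant`, `material`, `qty`) and the `pivot` helper are made up for illustration.

```python
from collections import defaultdict


def pivot(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, like a basic pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for record in rows:
        table[record[row_key]][record[col_key]] += record[value_key]
    # Convert nested defaultdicts to plain dicts for easier printing/comparison.
    return {rk: dict(cols) for rk, cols in table.items()}


# Hypothetical records; the field names are assumptions, not real master data.
records = [
    {"plant": "A", "material": "steel", "qty": 10},
    {"plant": "A", "material": "steel", "qty": 5},
    {"plant": "A", "material": "copper", "qty": 2},
    {"plant": "B", "material": "steel", "qty": 7},
]

summary = pivot(records, "plant", "material", "qty")
for plant, cols in sorted(summary.items()):
    print(plant, cols)
```

Swapping the row and column keys, or the field being summed, gives a different slice of the same data, which is roughly what the interactive tools let you do by dragging things around.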
I was trying to offer a "simple" solution... although of course it has its drawbacks.
Our biggest customer complaint was that we did not use a "common" relational database (SQL Server, Oracle, etc.). Therefore, we had to provide all the data views the customer might desire which is not feasible.
The simplest solution to this dilemma was to modify our system to use a database that the customer could access themselves... read-only of course. Of course the relational databases were slower, and the modifications major.
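To make the "read-only of course" part concrete, here is a rough sketch using SQLite from Python's standard library as a stand-in. The original system used a different database, so treat the table name, the helper functions, and the choice of SQLite as assumptions; with SQL Server or Oracle you would normally do this with a read-only account instead.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path):
    """Create a tiny demo database (hypothetical schema) with one row."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO orders (status) VALUES ('open')")
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the database through a URI with mode=ro so writes are rejected."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def try_write(conn):
    """Return True if an INSERT succeeds, False if the connection refuses it."""
    try:
        conn.execute("INSERT INTO orders (status) VALUES ('closed')")
        return True
    except sqlite3.OperationalError:
        return False


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
make_demo_db(db_path)

ro = open_read_only(db_path)
rows = ro.execute("SELECT id, status FROM orders").fetchall()
write_ok = try_write(ro)
ro.close()

print(rows)
print("write allowed:", write_ok)
```

The customer can run whatever SELECTs they like to build their own views, and the application's data stays safe from accidental edits.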
have a look at www.spacetimeresearch.com
We use this at work to find correlations between facets of web page generation. For example, plot the total time it takes to render a page vs. the time to fetch something from storage. Or versus the number of objects fetched from cache. `vp` lets you select, crop, scale, etc. fairly smoothly by using your video card. It also works on Windows, Linux and Mac.
http://astrophysics.arc.nasa.gov/~pgazis/viewpoints.htm
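If you want a number to go with the scatter plot, a quick sketch like the following computes the Pearson correlation between two of those facets. This is separate from `vp` itself, and the timing values below are invented sample data, not real measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-request timings in milliseconds (made up for illustration).
total_render_ms = [120, 340, 95, 410, 230]
storage_fetch_ms = [40, 210, 30, 260, 110]

r = pearson(total_render_ms, storage_fetch_ms)
print(f"correlation between render time and storage fetch time: {r:.3f}")
```

A value near 1 suggests storage fetches dominate render time, but a single coefficient hides clusters and outliers, which is exactly why plotting the points is still worth doing.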