Generating Reports from Access and Excel Files?
casals asks: "I'm a computer engineer working at a non-IT company, and there's this thing bothering me: by the end of each job, we have to generate a huge report that's actually a composite of lots of minor reports, each one of them made using a different piece of software. Since the programs used don't interact at all, we have to input the same information five or six times - not too smart, I guess. The outputs are either Access databases or Excel spreadsheets (some of these reports are just Excel spreadsheets that must be filled with data); so, I was thinking about making an application that could aggregate all the input models and generate all the outputs I need, at once. Any suggestions?"
"Here's the thing: it cannot be a web-based application (connectivity is a luxury at the rig), it has to run on a laptop (each employee should have it installed, stand-alone) and it must be able to import images from Excel worksheets. Crystal Reports uses spreadsheets as data sources, but it's not Open Source; I was thinking about using BIRT or JasperReports + POI, but that looks to me like reinventing the wheel, so I decided to ask before digging into it."
Perl or Python would be best.
This is the sort of job Visual Basic (classic) can be good at.
.NET experience appeal? There's gotta be some VB devs hanging around here. Where have you lot started moving to since v6.0 'closed' its doors?
Interaction with MS objects is simple in this environment and there's plenty of help in the IDE.
It lives on a machine quite nicely and is certainly quick enough, since most operations will run at the speed of the apps: Excel, Word, or ADO for the data.
All very "enterprisey"(-4 years) and works on everything from 95 to now with minimal effort: just install Office and your app and everything is there.
Write a new extractor for each report required and let it grab data as required and push it into the outputs.
It's macro-macros.
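A rough sketch of the "one extractor per report" idea, in Python rather than VB purely for illustration. The report names, the field names and the `shared` record are all hypothetical; in a real version each extractor would drive Excel, Word or ADO instead of returning a dict.

```python
from typing import Callable, Dict

# Registry mapping a report name to the function that builds that report.
EXTRACTORS: Dict[str, Callable[[dict], dict]] = {}

def extractor(name: str):
    """Decorator that registers a function as the extractor for one report."""
    def register(func):
        EXTRACTORS[name] = func
        return func
    return register

@extractor("daily_summary")   # hypothetical report
def daily_summary(shared: dict) -> dict:
    return {"Job": shared["job_id"], "Date": shared["date"]}

@extractor("crew_sheet")      # hypothetical report
def crew_sheet(shared: dict) -> dict:
    return {"Job": shared["job_id"], "Crew": ", ".join(shared["crew"])}

def build_all(shared: dict) -> Dict[str, dict]:
    """Take the data entered once and run every registered extractor on it."""
    return {name: func(shared) for name, func in EXTRACTORS.items()}
```

Adding a new report then means adding one more decorated function; the data entry side does not change.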
ot: does the
liqbase
Those reports are based on data pulled from a database. Do the reporting on the data in the DB, not on the output of individual queries. Save yourself some headache and time.
From your question, I'm not exactly sure what these reports are and what they are ultimately used for. However, if they are used for financial reporting purposes, this is an area that IT and financial auditors are looking at ever more closely. Whether you implement a pre-written package (OSS or not) or write something yourself, make it easy for an auditor to come in and get comfortable with the reliability of the application and the data flowing through it.
If your outputs are all Access and Excel, you should normalize all the data to one Access database and generate the "master report" from there. You should use good ol' VBA (or .NET using the Office interop libraries), not Jasper reports or whatever.
A lot of people dismiss MS Access, but actually it has a lot of powerful functions for importing and exporting data of various formats. This is exactly the sort of job it was built for. You should really consider it.
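To make the "one database, one master report" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Access, which is my substitution and not what the poster proposes. The table names, column names and sample fields are hypothetical.

```python
import sqlite3

def master_report(well_rows, cost_rows):
    """Load two source tables into one database and join them into one report.

    well_rows: iterable of (job_id, well_name)     # hypothetical source A
    cost_rows: iterable of (job_id, item, amount)  # hypothetical source B
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE well (job_id TEXT PRIMARY KEY, well_name TEXT)")
    con.execute("CREATE TABLE cost (job_id TEXT, item TEXT, amount REAL)")
    con.executemany("INSERT INTO well VALUES (?, ?)", well_rows)
    con.executemany("INSERT INTO cost VALUES (?, ?, ?)", cost_rows)
    # The job key is stored once per table; the report is just a join on it.
    rows = con.execute(
        "SELECT w.job_id, w.well_name, SUM(c.amount) "
        "FROM well w JOIN cost c ON c.job_id = w.job_id "
        "GROUP BY w.job_id, w.well_name ORDER BY w.job_id"
    ).fetchall()
    con.close()
    return rows
```

The same shape carries over to Access: one table per source, a shared key, and queries that feed the master report.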
How much time do you got? ;)
.net VSTO.
.net.
MS
You can connect to pretty much any datasource with
The VSTO add-on allows you to create managed code (not VBA) in .NET and fully integrate with Excel.
I have done what you need to do many, many times, so I feel your pain.
I am available for consulting (not contracting) for a reasonable fee.
I don't have Access on this computer, so I can't test it, but it seems like Access should be able to import images from Excel files somehow. If not, I'm sure you can whip up a separate app to run the import and get the images in somehow.
After that, why not do all the reporting with Access? Attach a few VBScripts to some buttons and zoom.
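One way such a separate import app could get at the images, assuming the workbooks are saved in the newer .xlsx format (a zip package that keeps inserted pictures under xl/media/). This will not work on old binary .xls files, and charts are stored as drawing XML rather than as image files, so only inserted pictures come out. The function name is made up for this sketch.

```python
import zipfile
from pathlib import Path

def extract_images(xlsx_path, out_dir):
    """Copy every embedded picture out of an .xlsx workbook.

    Returns the list of files written. Assumes the .xlsx zip layout,
    where pictures are stored under 'xl/media/'.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(xlsx_path) as book:
        for name in book.namelist():
            if name.startswith("xl/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(book.read(name))
                saved.append(target)
    return saved
```

The extracted files can then be attached to records or linked by path from the database.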
Put it in a relational database and use a linked ODBC table in Access or MS Query in Excel to generate everything.
It's easy to do; just beware that the Jet engine sometimes makes mistakes on linked tables (a pass-through query may be better).
REALbasic is really shaping up to fill the niche left by the demise of Visual Basic 6. I haven't played with this feature extensively yet but it does have an office automation feature to handle Excel, Word, etc. Might be worth a look.
If you have a 2003 server, just flip on Windows SharePoint Services.
Each project gets their own site.
You can store all your data on the site and then muck with it as you like.
Create your own lists/web parts/whatever.
Have fun.
My current job involves a lot of work similar to what you have to do. Both Visual Basic and Perl can do anything within Office that Visual Basic for Applications can do. VB makes the connection by specifying the respective Office component's DLLs in VB's References dialog box. Perl can talk to Office via the Win32::OLE module; see http://search.cpan.org/~jdb/libwin32-0.26/OLE/lib/Win32/OLE.pm. This method produces code very similar to VBA macros but with all the advantages of Perl syntax and functionality. In your situation I would use Perl for its excellent Office interaction and ease of text processing. There are nice free tools out there for packaging Perl scripts as transparent binaries as well, so you don't have to worry about people not having Perl, etc. on their laptops.
I'll check up this thread later if anyone wants to pick my brain about this stuff.
This is actually how Intel (used to, maybe still does) gets all their tools to talk.
Intel's ia32 design shops are a bunch of specialty-made tools (iHDL vs VHDL, a fork of Synopsys vs real Synopsys, etc.). None of these tools is really maintained anymore. No one working in Design Automation really knows how these tools work anymore either. The guy who designed tool X left 2-3 maintainers ago. So, you dare not touch the tool's spaghetti code for fear of causing the multi-billion dollar project to break.
So, what Intel does is wrap everything in Perl scripts.
Want a hierarchical source control? Don't use CVS/Syn; wrap RCS in Perl to make HRCS (very good tool BTW). Want Synopsys to process iHDL code? Write a Perl script to convert compiled iHDL into something Synopsys will read.
Perl wrapper scripts on top of Perl wrapper scripts.
Lovely.
What is even better is tool ossification.
Say Synopsys 2 (I make up a number) didn't have a feature Intel wanted. So, Intel threw a bunch of money at them and got a forked version of Synopsys 2 w/ the feature. Now Intel has used this fork for 5 years and built a ton of Perl wrapper scripts around it. Synopsys has moved on to Synopsys 5 with the feature as part of the normal way of doing things, but slightly different because they fit the feature in using more thought and care than the Intel forked rush job. Intel can't move to Synopsys 5 because none of the wrapper scripts work with Synopsys 5.
Nor is Synopsys willing to port Synopsys 5 back into the Synopsys 2 fork. Synopsys has grown since Intel last dealt with them and they don't desperately need Intel's cash.
Even if you get Synopsys 5 into Intel, you'd have to write a wrapper script to make its command line interface look like the forked Synopsys 2.
On Windows, accessing Excel and Access files?
Ok I'm typing this using Firefox, Fedora with FreeRIDE running on another desktop and I think VBA is probably the tool for the job.
Deleted
I find it abhorrent to copy/paste something more than once. Retyping something is something I just don't do. Both Excel and Access can be (*cough*) accessed programmatically with just about any Windows tool worth mentioning (including each other). You've got yourself a stupid system, but it should be possible to work within that stupid system better than by copy/paste. ActiveState Perl can do this, but if you're using (complex) Excel and Access, presumably you have some local experience with VBA or, at worst, some domain-specific VBA to use as examples.
ODBC glued together with VB to extract the Excel images. Done.
Just use vlookup
DTS, which is included with MS SQL Server 2000, is a good option. I have done a lot of this type of thing, and it is quick and easy with an ETL (extract, transform, load) tool like DTS.
I wish I could find an Open Source replacement for this tool.
This book has been very helpful
http://www.microsoft.com/MSPress/books/6525.asp
Your solution to your problem is the domain of a System Analyst, someone who figured out what output was required and figured out a path to get there from the data available. You are not looking for a tool, you are looking for a method.
The mainframe world was lousy with system analysts, who told programmers what to do. Do they still exist?
The latest Slashdot meme.
You could check out Tabletop software. It's a visual frontend for Excel that makes working with data and graphs much more intuitive. Currently in beta, it is due out this summer. I would email those folks at Terc for more info/availability.
Plenty of languages can use Excel to manipulate spreadsheets (e.g. Perl, VB, .NET, etc.). This is okay for an application that runs on the user's computer.
You can also use Excel as a data source using the Jet OLEDB driver. I've had some problems with it: if you have a column that contains entries like "bob", "jane", and "23", it will choke on the 23 because it is expecting text instead of a number. Perhaps there is a workaround for that.
Access is easy to work with as a data source....
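On the mixed "bob", "jane", "23" column: a commonly cited workaround, as I understand it, is adding IMEX=1 to the Extended Properties of the Jet connection string so that mixed columns are read as text. Failing that, the column can be forced to text on the way in. A small sketch in Python (the function name is made up; `values` is whatever list of cells your reader returns):

```python
def column_as_text(values):
    """Return the column with every non-empty cell converted to a string.

    Whole-number floats are written without the trailing '.0', since
    spreadsheet readers often hand back 23 as 23.0.
    """
    out = []
    for v in values:
        if v is None:
            out.append(None)            # keep empty cells empty
        elif isinstance(v, float) and v.is_integer():
            out.append(str(int(v)))     # 23.0 -> "23"
        else:
            out.append(str(v))
    return out
```

With every cell a string, the column has a single type and there is nothing left for the driver to guess.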
Business Objects enterprise.
Even better, a pile of perl scripts.
Better, switch to a platform that cares about backward compatibility more than Microsoft does. MySQL & Perl would make sense. PostgreSQL & Ruby would make sense. Oracle & Java would make sense. But with Access and Excel, good luck catching up with the "please pay for upgrades and rewrite" treadmill.
With built-in SQL, macros and BASIC, it can nicely import most things into extra tables, call external programs and if all else fails, interact with them by sending keystrokes - which should allow you to extract through the clipboard anything they wouldn't readily disclose otherwise.
And with a bit of SQL again, generating reports from there is something it can do really well (despite not being FOSS just yet).
I've been doing scripts at work lately in Perl to pull data from Access and Excel files. There's ActivePerl for Windows, DBI to get at the Access data, and other modules for Excel. It all works quite well, and the text processing of Perl is handy for those reports (more so than, say, VBA). Not sure how you'd get at the images though. (Note: I am not a Perl fanboy in general, but if it works...)
This prolly isn't what you want to hear but....
Sounds like a situation I was in. I needed to come up with a long-term, robust solution for my company for the type of situation you are describing.
If it's worth doing then it's worth doing properly. Don't fart about with hacks here and there. You need to get everything centralised on a SQL/Oracle etc. server, getting rid of the shitty legacy Access databases etc. written as a temp bodge by an intern 5 years ago... stuff that has now become mission critical. Get the suits to contract out the work if need be.
Before you complain this isn't what you are pitching for, let's talk money (suits like the bottom line).
1) How much is it currently costing to type and process data six times (not taking into account the 'Chinese whispers' effect and errors creeping in)?
2) How can your auditors trace the current mess and find where the money is going?
3) How future-proof is your current setup? What will upgrading the current mess cost?
4) What accounting errors already exist in the current setup (no doubt written by non-professionals)?
5) What backup/recovery policy do you have for your existing mess (none?) and how much will it cost when (not if) Fred's hard drive dies?
Pitch the above points to your boss/suits and they will soon realise they need to do it properly and spend some money. Of course, your situation may vary.
FWIW, I completely moved our company away from the legacy ad-hoc crap and am processing everything with a centralised LAMP stack. But what else would you expect to hear on Slashdot :-)
Been there, done that, got the T-shirt and the blame :-)
Crystal Reports uses spreadsheets as data sources, but it's not Open Source
Neither are Excel and Access.
However, most of my colleagues would not be able to do this. They have lived in the mainframe world (think COBOL, DB2, OS/390, TSO, REXX, JCL etc) for years. They'd probably be able to make excellent suggestions along the lines of processing the data and reporting with SAS or COBOL.. but that's the mainframe.
So, to answer your question: yes, we do exist. Most of 'us', however, are barely able to use Excel, let alone something as 'complicated' as pulling data from several sources and creating a report (even by hand).
To answer your question from the mainframe side (go on, let's assume this problem was on the old iron): Sure. We've solved it a million times, know how to make it efficient (or not) and can chew through this kind of problem easily. We do it every day. Next!
(For those who are interested: in the mainframe world this data would be coming in from several sources: a large database (e.g. DB2), MQ, VSAM files, flat files, or routed in from an external source through a number of input paths (and most likely then stored in MQ. Hey, it's what it is for). Batch processing would then clean up and/or massage the data and store it in a form to be used (most likely in DB2, VSAM or a flat file), and then the data is processed wholesale by a program (COBOL, Delta, Assembler) or SAS. Furthermore, for those who care, the mainframe has a Linux partition running (alongside z/OS) and is capable of lots of useful functions.)
You have a sick, twisted mind. Please subscribe me to your newsletter.
I work in a company that uses Excel to track many aspects of the business. We face a similar problem trying to generate reports from multiple Excel documents. Recently, I've started moving each business owner to an RDBMS solution. Front ending to a web site, users can login to a simple interface to interact with the data. Using web servers, users can use Excel (remember, Excel can use XML streams as a data source) to manipulate data for any custom reports.
;)
There are many choices for a "free database". In addition to the traditional, free, Linux-based databases, Microsoft, Oracle, and IBM have made free versions of their commercial databases. In our case, I chose DB2 Express-C edition. Allowing 2 CPUs, 4 GB of RAM, and unlimited data files and data file sizes, it was the best option for our company. I just had to convince the others to abandon Microsoft SQL Server Express.
Woah, woah, woah! Any shop using an ad-hoc collection of Access DBs and Excel spreadsheets is probably a small business that can't afford Oracle. They're comfortable with their current inefficient system, and the guy proposing this is planning on doing it with no funding and probably little to no allocated work time. He needs a free solution because he has no budget.
Proposing a multi-thousand dollar system is going to go over like a lead balloon in a workplace like this.
If I were doing something like this, I'd be looking to build a good domain model of what I was doing and then use an ORM tool to map that to the data. I'd also be looking for a dynamically typed language to write in, one that is supported by .NET for its native access to the data artifacts that you have.
http://www.tableausoftware.com/
Sure, it's not open source and it costs money, but it does everything you're asking. For you to roll your own, your company is going to end up paying you a lot more than that costs and then what happens when something doesn't work or if you leave?
http://www.stanford.edu/class/ee380/Abstracts/060503.html
http://www.stanford.edu/class/ee380/schedule.html
> Woah, woah, woah! Any shop using an ad-hoc collection of Access DBs and Excel spreadsheets is probably a small business that can't afford Oracle.
.net, etc. Keep it extremely simple.
Not necessarily, since Oracle for a small database (4 GB of data, I think) is free now anyway. But *Oracle* doesn't matter: use of any database, even MySQL, would be a drastic improvement.
What's probably more important is:
1. there's no network for a centralized solution, they use client software instead
2. there may be no funding to do this right
3. management may be of the type that doesn't like to tackle big improvements that it doesn't understand well
Ok, so lots of unknowns. But here's a potential approach:
1. A centralized solution using a single database is the ideal approach. But perhaps the network connectivity simply cannot be overcome. Or at least not immediately - so first implement a small database on each laptop. This means something really tiny like MySQL. Perfectly fine to start with, and compatible with everything else - so you could convert to whatever later on once the network issue is resolved.
2. You are probably stuck with the Excel & Access files, since it sounds like they are the output of required applications. Fine, then you just need a way to import that data into MySQL. Some databases (like DB2) have built-in import tools for Excel, so you might get lucky. Otherwise, I'd shop around for the simplest utility to help with the task. I'd avoid anything that's too much of a distraction here -
3. I'd make the import/export process as simple as possible. Ideally a big green icon they punch.
4. You could use a light-weight http server along with php for the reporting. Again, very simple to implement.
Once the above is working fine on the laptops, then if the network problems can be overcome it wouldn't be too difficult to centralize everything. The same web reports that ran on the laptops can run on a server, along with the same database schema as well. Could theoretically even be mysql if the amount of writes is small enough. Uploading the files, or transferring data from the local copy of mysql would be the only new development required.
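The four steps above can be sketched end to end on a single laptop. To keep the example self-contained I have swapped in SQLite for MySQL and plain Python for PHP, so treat it as an outline of the flow rather than the stack proposed here. It also assumes the source data has been exported to CSV with hypothetical columns job_id, item and amount.

```python
import csv
import html
import sqlite3

def import_csv(con, csv_path):
    """Load one exported CSV (columns: job_id, item, amount) into the local DB."""
    con.execute("CREATE TABLE IF NOT EXISTS entry (job_id TEXT, item TEXT, amount REAL)")
    with open(csv_path, newline="") as fh:
        rows = [(r["job_id"], r["item"], float(r["amount"])) for r in csv.DictReader(fh)]
    con.executemany("INSERT INTO entry VALUES (?, ?, ?)", rows)
    con.commit()
    return len(rows)

def report_html(con):
    """Render per-job totals as a small HTML table."""
    lines = ["<table>", "<tr><th>Job</th><th>Total</th></tr>"]
    query = "SELECT job_id, SUM(amount) FROM entry GROUP BY job_id ORDER BY job_id"
    for job_id, total in con.execute(query):
        lines.append(f"<tr><td>{html.escape(job_id)}</td><td>{total:.2f}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

The "big green icon" would simply call `import_csv` for each file and then write the result of `report_html` to disk. Because the schema and the report query do not care where the database lives, moving to a central server later mostly means changing the connection.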
If you already have normalized data, why do you need other tools to generate reports? Excel comes with powerful report-generating facilities, which unfortunately not many people know how to use.
There are basically 2 features in Excel that you can use:
1. Pivot Table and Chart (Data Pilot in OpenOffice)
2. List
List is quite simple; it is good for filtering data (not really a report). Pivot Table, however, is very powerful. It is what makes Excel different from other spreadsheets! To generate a report, just drag and drop. You need a few tries to get the hang of the concept. Once you get used to it, you won't want to use anything else for your reporting needs. I came to know about pivot tables/charts when I attended a technology preview by Microsoft. The guy said that it's the reason they do not use paper reports at Microsoft. I don't know whether it's true or not, but it is possible. I would take a pivot table report over a paper report any day.
OpenOffice is catching up in this area, but Data Pilot does have some bugs. It's functional, nonetheless. It's nice to see a free alternative implementing this feature. I can confirm that Lotus 123 does not have this feature. Does anyone know whether Gnumeric and KOffice have this feature?
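For anyone who has not met pivot tables: the core operation is just "group by one field down the side, another across the top, and sum a third". A minimal Python sketch of that idea follows; it is not how Excel implements it, and the field names in the usage note are invented.

```python
from collections import defaultdict

def pivot(records, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, like a basic pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    # Convert the nested defaultdicts to plain dicts for easy printing.
    return {row: dict(cols) for row, cols in table.items()}
```

For example, `pivot(sales, "region", "quarter", "amount")` on a list of dicts gives one row per region with one total per quarter, which is what dragging those three fields into a Pivot Table does.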
This software has some pretty amazing capabilities and can connect to spreadsheets, Word files, multiple databases, HTML, XML, CSV, well, you get the picture, all in one contiguous stream of data for publishing. I would recommend checking it out for your needs. www.patternstream.com
http://jackcess.sourceforge.net/
http://jakarta.apache.org/poi/
I may have missed something, but if you are happy using Excel/Access why do you then complain about Crystal Reports not being OS?
VBA has all you need for this job. You are working from single laptops. I understand.
Definitely get all your data pulled into one Access db. Output reports from there.
But what about the charts (from Excel)? Use an automated process with VBA in your Excel file to output the charts. Use the Auto_Open sub to start the process once the Excel file is opened, or called from the Access db. The Excel file would open, do the data crunching (perhaps drawing the info from the Access db with MS Query) and chart building, then export the charts to be picked up by the Access db. Even better, be sure to use the UserDefined Charts; that way you get exactly the chart look you want. Be sure to look up the command for exporting the charts. It can be found in the VBA help file under ActiveChart.Export.
In a data center, long ago, I helped make simple Excel files which were launched hourly. Upon opening, those Excel files automatically pulled in server statistics (text files), ran calculations on that data, created nice charts, exported the charts as GIFs, and then closed down. The GIFs had static names, so they were overwritten on each run. The GIFs were included as part of an HTML file for viewing. Simple, effective.
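The hourly job described above has a simple shape: read the text file, calculate, overwrite a fixed-name output. Here is that shape sketched in Python, with an HTML table standing in for the exported GIF charts (my simplification, so no Excel is needed). The "name value" line format of the statistics file is an assumption.

```python
from pathlib import Path

def summarize(stats_path):
    """Read 'name value' lines and return {name: (min, max, mean)}."""
    samples = {}
    for line in Path(stats_path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and lines without exactly two fields
        name, value = parts
        samples.setdefault(name, []).append(float(value))
    return {n: (min(v), max(v), sum(v) / len(v)) for n, v in samples.items()}

def publish(summary, out_path):
    """Overwrite a fixed-name HTML file so the page always shows the latest run."""
    rows = "".join(
        f"<tr><td>{n}</td><td>{lo:g}</td><td>{hi:g}</td><td>{avg:.1f}</td></tr>"
        for n, (lo, hi, avg) in sorted(summary.items())
    )
    Path(out_path).write_text(f"<table>{rows}</table>")
```

Scheduling it hourly is left to whatever the machine already has (Task Scheduler, cron).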
http://monarch.datawatch.com/monarch-pro.asp
Lets you configure models to pull data from excel, access, text files of all sorts.
Scriptable with COM as well!
-Rob