Ask Slashdot: Changing Career From OLTP To OLAP Dev
First time accepted submitter xby2_arch writes "After spending over 12 years writing OLTP applications (Java EE/JDBC/ORMs), I decided to dabble in the OLAP world. I had decent DB skills, considering most of my previous projects had involved data modeling and coding using Stored Procs, etc. Yet I hadn't designed or implemented any dimensional databases. Luckily for me, I had enough relevant domain knowledge to land a developer job in a data warehousing project. The work was enjoyable enough that it motivated me to spend that extra time and effort I needed to cope with the different dynamics of coding in the OLAP realm. In my past life, data volumes weren't the primary concern (transaction volumes were); here, everything was about data. ETL/Integrations present another set of problems you generally skirt in a typical web/app-tier developer role. All in all, it turned out to be a non-trivial, yet worthwhile transition. I am certain that there are plenty of seasoned developers out there who plan to make a similar move (or have made already), who see data as the next chapter in their careers evolving toward becoming Enterprise Architects. I want to hear what's holding them back, or what helped them move forward. What should be considered a prerequisite to make this switch, and what are the risks, etc.?"
Please stand by while the trolls try to figure out how to respond to your question.
In the meantime, you should expect about 50 posts by PHP developers asking what a stored proc is, and suggesting you move to RoR if you want an ORM.
So, um, what the crap are OLTP and OLAP?
(And yes, this marks me as hopelessly undereducated, and obviously a fool who doesn't know anything about Real Programming. So sue me. Just tell us what they are, too, please?)
Dan Aris
Fun. Free. Online. RPG. BattleMaster.
FUCK. I can't tell you how many times I've tried to use software written by some ignorant fuck who thought that his knowledge of programming languages translated into knowledge of the industry he was in. Learn a language and an industry... and stick to it.
The first thing we did to strategize mission-critical web-readiness to expedite wireless users was leverage our dot-com ROI and aggregate robust e-markets. Being able to synthesize cutting-edge channels enabled us to unleash extensible users, which in-turn enabled us to orchestrate turn-key mindshare.
Utilizing our synergistic functionalities, we were then focused towards mesh visionary markets and envisioneering collaborative initiatives with our partners. The net result was the incubation of plug-and-play experiences to transform vertical vortals and utilize cutting-edge deliverables.
Plus, I have a large cock. That helped a lot.
Man, I am glad I work in the world of startups where we just Get Shit Done.
12 years doing "OLTP"??? Just fucking shoot me.
..post to a board that expands uncommon acronyms.
You know, somewhere professional.
So... you're saying you've already made the switch from OLTP to OLAP and you'd like to take this opportunity to gloat about it, but you'd still like to hear from other developers what they think the prerequisites are for making such a move and what has held them back from doing all the cool stuff you're doing? Or am I missing the question?
Breakfast served all day!
Moving from one specialized type of programming to a closely related type of specialized programming is pretty straightforward. Apply for such a position and you won't suffer compared with other candidates. Or, if your current employer needs something new done, do that new thing. You're not talking about a major career change here. Programming is programming. Even moving from something like standalone application GUI programming for Windows in C# to back-end web service programming in C++ on Unix isn't that big a deal. If you can program, you'll pick up the new tech/language/idioms as needed and notice the striking similarity in the work you actually end up doing.
I'm thinking of making the leap from C# 3.0 to C# 4.0. Does anyone have any advice?
For those of us living in the "real" world of actual consumer products and wondering what these IT acronyms may even mean, here is what google found
So... you're saying you've already made the switch from OLTP to OLAP and you'd like to take this opportunity to gloat about it, but you'd still like to hear from other developers what they think the prerequisites are for making such a move and what has held them back from doing all the cool stuff you're doing? Or am I missing the question?
You forgot to mention that he thinks that moving from being a code monkey to a data monkey is supposed to land him an architecture role. I would have thought his original job would see him better qualified. At best this is a step sideways but in reality it is probably a step backwards....and if he doesn't realise this it's probably just as well for all involved.
But then, even in Slashdot's heyday, Ask Slashdot was about clueless time wasters asking how to do their job or apply for one they weren't qualified for and had no idea about. Now that Slashdot is a shadow of its former self, why would we expect the quality of these submissions to improve?
These posts express my own personal views, not those of my employer
So you changed job from one thing to a highly related thing. Great, but why tell us about it?
You totally lost it on '...evolving toward becoming Enterprise Architects'. Seems you have been hanging around the buzzword management types for far too long.
Sounded like a speech from a pointy haired boss....
Reminds me of the cartoon of a dull-looking man talking to a woman at a party - text below was "You may not think it to look at me, but in my time I've been a bank clerk, bank teller and bank customer liaison officer".
"The greatest lesson in life is to know that even fools are right sometimes" - Winston Churchill
Is this a conversation from a Dilbert strip, but without the punchline?
... the next chapter in their careers evolving toward ...
What's a "career"? We don't have those around here in "IT related fields". I suppose if you live in Silicon Valley there is a chance of upward mobility, or maybe TLA .gov jobs, but for everyone else, it's just luck that got us in a good spot in a downsizing economy in a downsizing company and downsizing department where they haven't axed us yet.
"Career" in general would be a more entertaining "ask /." topic. Work in the above plus the "ha ha noobs don't realize that even the concept of my job didn't exist when I was their age" and plenty of ageism whining and funny stories about nepotism and stuff like that.
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
To be good at such projects, you should have a decent master's degree in applied math or computer science (with a focus on basic theoretical concepts, such as logic, knowledge and meta modelling, Petri nets, graph theory, etc.).
On a side note: next time you ask a question, don't gloat that much about how cool you are, and ask your question straight. And please try to use normal language. Thanks.
The first and most important prerequisite to making the switch is having an appropriate business case for OLAP. The second prerequisite is that you've tried doing analytics in a traditional RDBMS, perhaps jumped on to the NoSQL bandwagon, and you've failed at it (i.e. success for a little while, but then your data eventually brings your queries to their knees). Don't worry, failure isn't necessarily wrong; it's just that you and your team needed the experience before you could make the next leap.
The risk is the knowledge jump from a traditional SQL mindset to an OLAP mindset. Invest in your own and your fellow developers' knowledge. Push back on management and sales when they want more immediate results and let them know that it will take 3-5 months to replace your current system. Do your proper technology evaluations. Learn FoodMart and AdventureWorks and let them guide you down the path of good fact and dimension design. Don't turn up your nose at Microsoft, as they absorbed the company in the 80's that basically pioneered this stuff and made billions, but also don't take their stuff too literally, as there are several products out there and some that do things better.
Read The Data Warehouse Toolkit thoroughly and practice using Mondrian which is an open source Java OLAP engine that can sit on top of PostgreSQL, MySQL, and others. Find a good ETL tool rather than trying to write your own at first and don't be afraid to force your internal users to use this tool to create their facts. Don't worry if you don't get it the first time, but keep trying and keep discussing with your fellow developers as it takes a team to work out all the kinks. Later on you'll probably end up seeing how you did things wrong, but hopefully you can get most things right in the beginning.
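For anyone who has never seen "fact and dimension design" written down, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table and column names (dim_date, dim_product, fact_sales) and the sample rows are invented for illustration; they are not taken from FoodMart, AdventureWorks, or Mondrian.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        -- The fact table holds only keys into the dimensions plus numeric measures.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            units       INTEGER,
            revenue     REAL
        );
    """)
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(20120101, 2012, 1), (20120201, 2012, 2)])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Widget", "Hardware"),
                      (2, "Gadget", "Hardware"),
                      (3, "Manual", "Books")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(20120101, 1, 10, 100.0),
                      (20120101, 3, 2, 30.0),
                      (20120201, 2, 5, 75.0),
                      (20120201, 3, 1, 15.0)])


def revenue_by_category_and_month(conn):
    """A typical dimensional query: join facts to dimensions, then aggregate."""
    return conn.execute("""
        SELECT p.category, d.year, d.month, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_date d    ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.category, d.year, d.month
        ORDER BY p.category, d.year, d.month
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = revenue_by_category_and_month(conn)
```

An OLAP engine layered on top of a schema like this is, roughly speaking, generating joins and GROUP BYs of this shape for you.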
Your lack of understanding that real programmers do everything you're talking about in the course of a year, not a career, is what's holding you back. You're never going to get to "Enterprise Architect", so you should cut over to DBA now while you can and find some very large company where you can remain hidden until you retire.
Most Enterprise Architects I know started out as good programmers. They decided to become managers to boost their salary, then learned that they really sucked at managing. By then they were too high a level to return to programming so the company gave them a "lateral move" and the title of Architect. Now they go to meetings with the big shot managers and listen to vendor sales pitches; afterwards the VPs ask for their opinion in a way which leaves little doubt that their opinion had better support the decision the VPs already made based on cool buzzwords.
50 posts by PHP developers asking what a stored proc is
we feel offended. as if a thousand developers cried in agony because someone who is in a different field thinks that his field is more elite than ours. oh the humanity !
hear ye ! hear ye ! person in random field thinks those in another field are less important and knowledgeable - and their work too.
Read radical news here
Enterprise-class TLA-soup homework! Useful!
From my limited experience, the OLAP community is small and/or behind walled gardens, the tools are poor and closed source, and potential employers are only interested if you have experience in *their* BI tools (Pentaho, MicroStrategy, Cognos, etc). Microsoft appears to be the only one trying to establish a theoretical basis for BI, but their efforts are starting to show age despite there being so much more that can be done in the field. Finally, you will be misunderstood by the majority of Rails/PHP/Web developers: the same ones who think Key-Value stores and NoSQL are the height of modern technology.
That said, BI can be technically satisfying. If you get down to the SQL/MDX you will appreciate what a database can do, which allows questions to be phrased succinctly. I have seen too much code written in procedural languages (JavaScript being the worst of them) that is many lines long and runs atrociously slowly, when it could be restated simply in SQL (or MDX) and run 1000x faster. I love the fact that there are no loops!
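A small sketch of that contrast, using Python's built-in sqlite3 module. The orders table and its rows are made up for the example, and nothing here measures speed; it only shows the same question asked with a hand-written loop and with a single declarative query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 2.5), ("bob", 1.0), ("carol", 7.0)],
)


def totals_procedural(conn):
    """Pull every row into the application and accumulate by hand."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_sql(conn):
    """State the question once and let the database do the grouping."""
    return dict(conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
```

Both functions return the same totals; the second one leaves the iteration to the database engine.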
From a business perspective, you have much more exposure to management and other departments: You will have improved visibility in the company, and your worth will be inflated - as you will be the one that satisfies management's appetite for more information to help make decisions.
Your previous experience as a developer/DBA is largely irrelevant. Data analysis is a completely different discipline (and depending on tools, may even be a different language than SQL).
Basic cube building is something you could learn in 5 minutes. Advanced stats, how to look for patterns, what to look for, etc. is much more involved. Most of the people I know who are good data analysts have advanced degrees in mathematics.
BTW, why do you think the career path is dev -> data analyst -> enterprise architect? Completely irrelevant. Plenty of data analysts who couldn't write code. Plenty of EAs who couldn't construct an OLAP cube - they're focused on infrastructure, apps, etc. EA really has nothing to do with data analysis, other than designing systems to support it. In many companies, EA is not some priesthood at the top of the food chain - often they're a virtual team made of different disciplines.
Advice: on VPS providers
As for becoming an "architect", this is how I became one: I took all the ( little ) experience I had, I designed some stuff, made sure the code monkeys could and would actually code it ( me knowing their life as I had always been one ), got some $$$ incentives for them so they would build what I had thought up. I defended it in hard fighting with the management. Then the stuff went into production. It took balls, hard work and some luck - and, yes, some politics. "The rest is ashes and dust", as Russell Crowe has it in "Gladiator".
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
Actually i matured 5 years or so ago, after spending some 10 years in the delusion of thinking that the fields i was in were more 'elite' than the others. they were different fields than programming. but, the core concept of 'our work is tougher and more elite - other people are not as elite as us' nonsense was the same.
i grew over it.
I made a similar transition, and my furious pace and insistence on sub-300-millisecond response times for any query or web page load were foreign to those in the OLAP world. I would submit a query to our OLAP system (I was doing pricing optimization, cool enough) which took over 60 minutes to return a usable dataset. Most of the people I worked with were very nice and extremely bright, though they worked at a very relaxed pace, one I just couldn’t adjust to. I had such a tough time of it that I moved back to the OLTP world, where someone would listen to my insistence and enjoy my furious pace and manic persona, and I couldn’t be happier. Do give it a try though. It is very rewarding to be able to analyze all of a company’s data and write one query that will help generate more profit. I do wish you well, and just remember to slow down.
Read the book, Star Schema by Robinson. Also, you can do both OLTP and OLAP - really, they complement one another.
As a system administrator, I've supported OLAP processing for a good part of my career, although I'm not doing that work currently. What I've seen is that originally a lot of the data warehousing complex was used just to run the database with its billions of rows and many terabytes of data, running something like Red Brick or maybe Sybase IQ. Then we started to spend more time handling "staging" (ETL) workloads, with tools like Ab Initio, SyncSort and N-sort. Every year, the amount of data increased and the number of details increased. We were constantly fighting to get the reports done in our 24 hour window. If you get behind, you may be catching up for a while.
These days, it's all about using Hadoop to prepare the data for load into a database, either Oracle 11g RAC using ASM, or perhaps a Netezza box. I see some of the column stores like Vertica becoming important for some workloads in the near future.
As a result, I'd say that people wanting to move into data warehousing should learn Java, Hadoop, and Oracle RAC. Keep an eye on HBase, Hive, and the NoSQL solutions for the future.
One thing that I'll mention that I've found very useful is that throughout the time I was working on this stuff, Perl was very handy. I imagine other scripting languages would be useful, too, but in my case Perl provided the glue that made everything work.
OP asked a reasonable question and most responses were either wisecracks or otherwise simply ignorant and/or useless. Good thing I only come here for the most part to see what the tech headlines are; otherwise, I'd be disappointed when I read the usual half-baked, leftist, ignorant MS-bashing rants that pass for commentary on this site.
Once you go beyond data warehouses and ETL, the programming language skills required for OLTP and OLAP are different:
MDX for multidimensional cube queries - can get a little hairy
DMX for data mining models - a bit limited in terms of extensibility
Correct cube design is also an issue - use case specific dimensions & measures; storage price / perf ratio
I started my career with OLAP & BI, been at it for a decade or so now. Also doing OLTP like everyone else.
Not sure what to say, it's just bits and pieces, and you need to process those so that you can meet requirements and things like that.
This is one of the things described as "If you can see a difference, you are not qualified to work with either".
Face it. You have spent 12 years mucking around with one single application that happens to be easy to shoehorn into multiple systems. You do not understand underlying theory. You cannot solve a simple problem -- how to organize data to perform complex queries that can yield some analytical information. This is not something you can "learn" from a vendor manual to some expensive chunk of unreadable Java code. This is not something you can copy/paste from "tutorials". This is something you were supposed to know before you took your current^H^H^H^H^H^H^Hprevious job. You think you can fake your way up to the "Enterprise Architect" title?
I have one advice for you -- kill all your friends, then yourself.
Contrary to the popular belief, there indeed is no God.
You are a little vague about your new role as an OLAP developer. I'll try my best to give you some advice from what I have learned in my first 5 years out of college as an OLAP developer. This by no means is a comprehensive guide and I am still learning as I go. My experience might be a little different from the norm; I work at a company where we rolled our own OLAP system completely from scratch.
Subject matter and use cases are critical. You must work with your subject matter experts to understand what your users are trying to accomplish with your data. This stage is easy to gloss over because sometimes you might think you know what the users want and how they want it. I have been humbled many times by making this stupid assumption.
If you are writing your own reporting tool you will learn quickly how to write factories to generate queries. OLAP users like to stretch their reports in every dimension possible; it should be dynamic. Also if the queries are not at least aggregate aware you are wasting your time. Find good reporting interfaces such as tools that can draw tables and charts.
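As a rough illustration of a query factory that is "aggregate aware", here is a minimal sketch in Python with sqlite3. Everything in it is hypothetical: the sales table, the sales_by_region_year summary table, and the DIMENSIONS/MEASURES whitelists are invented names, and a real reporting tool would need far more (filters, hierarchies, non-additive measures).

```python
import sqlite3

# Whitelists map report-facing names to SQL fragments, so user input
# is never pasted into the query text directly.
DIMENSIONS = {"region": "region", "product": "product", "year": "year"}
MEASURES = {"revenue": "SUM(revenue)", "units": "SUM(units)"}

# Hypothetical summary tables and the dimensions each one still carries.
AGGREGATES = [("sales_by_region_year", {"region", "year"})]


def pick_table(dims):
    """Use a summary table if it covers every requested dimension."""
    for table, covered in AGGREGATES:
        if set(dims) <= covered:
            return table
    return "sales"


def build_query(dims, measures):
    """Build a GROUP BY query; unknown names raise KeyError."""
    dim_cols = [DIMENSIONS[d] for d in dims]
    select_list = dim_cols + [f"{MEASURES[m]} AS {m}" for m in measures]
    sql = f"SELECT {', '.join(select_list)} FROM {pick_table(dims)}"
    if dim_cols:
        cols = ", ".join(dim_cols)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales "
             "(region TEXT, product TEXT, year INTEGER, units INTEGER, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    ("east", "widget", 2011, 3, 30.0),
    ("east", "gadget", 2011, 1, 20.0),
    ("west", "widget", 2011, 2, 20.0),
    ("east", "widget", 2012, 4, 40.0),
])
# Pre-aggregate once; later queries on region and/or year read this smaller table.
conn.execute("CREATE TABLE sales_by_region_year AS "
             "SELECT region, year, SUM(units) AS units, SUM(revenue) AS revenue "
             "FROM sales GROUP BY region, year")

by_region = conn.execute(build_query(["region"], ["revenue"])).fetchall()
by_product = conn.execute(build_query(["product"], ["units"])).fetchall()
```

Re-summing a summary table only works for additive measures like these SUMs; averages and distinct counts would need different handling.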
Populating an OLAP database is more similar to network programming than anything else. You are simply loading data into structs, iterating over your data, and passing it along. Understand dimensional modeling. Carefully choose your dimensions and their scope; they will change.
Read Kimball's book. I'm ashamed to say I have not done this myself yet.
I'm not sure if my advice is coherent or even applicable to you. But if you have more details about your responsibilities I can try to help you out.
I mean the switch isn't so binary. He recently switched, he wants to hear other experiences to help him, and he generally thinks it would be a good Slashdot discussion.
Democracy Now! - your daily, uncensored, corporate-free
I love the fact that you made that joke, and people still needed to blast you for mentioning MS. And they actually claim Oracle-controlled, lawsuit hell Java is in a better position.
Development is development. There are differences, but it's still development. I am a DBA and a database developer. I have worked on very large OLAP and OLTP systems. Here are the main differences. My experience is primarily with Oracle.
1. You need to really learn SQL well to handle OLAP well. You often get very complex requirements. Processing large numbers of records almost always performs dramatically better using straight SQL if possible. You want to use as little procedural code as possible. Don't write loops in Java, shell, or the database's procedural language.
2. You need to dig into the DB's features, in particular the features for processing large numbers of records. For example, analytic functions are critical. They are somewhat confusing if you have not used them. They are part of the ANSI 1999 standard. Most people do not know them. I think most SQL databases have them. Each database will have an OLAP guide. It is not just about using the cool buzzword features. You have to dig into regular things too. Understanding table partitioning is useful.
3. You need to test this stuff out. The docs are not always 100% correct.
4. There are a lot of things you need to do to improve performance that would hurt performance in an OLTP system. For example, with Oracle, you may end up using parallel processing and a lot of create table as nologging. Neither of these scales with large numbers of users, but OLAP systems typically have fewer users and larger batch processes, so both are invaluable in OLAP processing. For example, if you have a very complex requirement, you may do a series of create table as statements to get to the report you want. This lets you break up the logic into simpler chunks so it's easier to follow, and it is also fast. You can't do this with OLTP, since it hurts performance with a lot of users. I often do create table as (and many of them) to chunk my steps. Then when I figure out how to answer the requirement, I combine and simplify what I did.
5. You have more freedom to use outside-the-box coding approaches. As stated above, since you have fewer users, you can do a lot of stuff that doesn't work in volume. Google around. A lot of these things are on the web. However, you often need to combine them to come up with better approaches.
6. ETL... this means dumping, loading, and changing around large volumes of data between databases. This is large batch processing, so it follows what I said above. Avoid using procedural logic as much as possible. Try to use as much straight SQL as possible. Running a loop that calls SQL for each row is the absolute slowest way to do this, by a lot. However, that is how most developers write their ETL because that is all they know how to do. You can look like a genius by taking this code and getting rid of the loops and sub-loops. Once you do it a few times it's easy, but you will look smart.
7. If you use an ETL tool, make sure you understand what it does underneath. Figure out how it is generating the SQL and what the code all looks like.
8. If you use the approach that Java developers use (they are basically the only ones who do this), where the database is a data dump and you talk in mumbo jumbo and do all the processing in Java, your code will suck. Someone like me will come in and do what I said to do in step 6. I'll look like a genius. You will look stupid. Done that lots of times... the hardest part is getting past the Java guys' mumbo jumbo. If you're a Python, C, .NET, or any other language developer you will have an easier time doing OLAP because you're not as caught up in the silly cliches I hear from Java developers.
9. Let me add one more time: this is a mistake virtually everyone makes. Loop-based procedural logic is the absolute slowest and absolute worst way to do OLAP and ETL. The SQL engine is written in C, so it is optimized for array processing under the covers. If you use loop-based logic, you eliminate that. Sometimes you have to, but avoid it as much as possible.
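A small sketch of points 2, 4, 6 and 9, using Python's built-in sqlite3 module rather than Oracle. The table names and rows are invented, the window function needs SQLite 3.25 or newer, and SQLite has no equivalent of Oracle's parallel or nologging options, so this only shows the shape of the SQL, not the performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_sales (sale_day TEXT, store TEXT, amount REAL);
    CREATE TABLE target_loop   (sale_day TEXT, store TEXT, amount REAL);
    CREATE TABLE target_set    (sale_day TEXT, store TEXT, amount REAL);
""")
conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", [
    ("2012-01-01", "A", 10.0),
    ("2012-01-01", "A", 5.0),
    ("2012-01-02", "A", 20.0),
    ("2012-01-01", "B", 7.0),
    ("2012-01-02", "B", -3.0),   # bad row that the load should reject
    ("2012-01-02", "B", 3.0),
])

# The slow pattern: fetch rows into the application, issue one INSERT per row.
for sale_day, store, amount in conn.execute(
        "SELECT sale_day, store, amount FROM staging_sales").fetchall():
    if amount > 0:
        conn.execute("INSERT INTO target_loop VALUES (?, ?, ?)",
                     (sale_day, store, amount))

# The set-based pattern: one statement, the filter lives in the WHERE clause.
conn.execute("""
    INSERT INTO target_set
    SELECT sale_day, store, amount FROM staging_sales WHERE amount > 0
""")

# "Create table as" to stage an intermediate step of a complex report.
conn.execute("""
    CREATE TABLE daily_totals AS
    SELECT sale_day, store, SUM(amount) AS total
    FROM target_set
    GROUP BY sale_day, store
""")

# An analytic (window) function: running total per store, no loop needed.
running = conn.execute("""
    SELECT sale_day, store, total,
           SUM(total) OVER (PARTITION BY store ORDER BY sale_day) AS running_total
    FROM daily_totals
    ORDER BY store, sale_day
""").fetchall()

loop_rows = sorted(conn.execute("SELECT * FROM target_loop").fetchall())
set_rows = sorted(conn.execute("SELECT * FROM target_set").fetchall())
```

Both load paths produce the same rows; the difference the post is describing only shows up at volume, where the per-row round trips dominate.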
First, read An Introduction to Database Systems:
http://www.amazon.com/Introduction-Database-Systems-8th/dp/0321197844
Second, read Temporal Data & the Relational Model:
http://www.amazon.com/Temporal-Relational-Kaufmann-Management-ebook/dp/B005UY0W0E/ref=sr_1_fkmr0_3?s=books&ie=UTF8&qid=1326737114&sr=1-3-fkmr0
Then, read any book you want on data warehousing. If you like, read Kimball and/or Inmon. If you read anything by Kimball or about his methodology and think it's the right thing to do for a data warehouse, go back to steps one and two. However, you can use a variant of the Kimball methodology when you get to "Cube Land" (OLAP). But for a data warehouse, you really want to understand and implement temporal data in your data store of choice.
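For a taste of what "temporal data" can mean in practice, here is a minimal sketch in Python with sqlite3. It uses a simple valid_from/valid_to pair per row with half-open intervals, which is a common simplification and not the interval-based model developed in the book above; the table name, the rows, and the 9999-12-31 sentinel are all assumptions made for the example.

```python
import sqlite3

OPEN_END = "9999-12-31"  # sentinel for "still current"; an assumption of this sketch

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_city_history (
        customer_id INTEGER,
        city        TEXT,
        valid_from  TEXT,   -- inclusive, ISO date
        valid_to    TEXT    -- exclusive, ISO date
    )
""")
conn.executemany("INSERT INTO customer_city_history VALUES (?, ?, ?, ?)", [
    (1, "Boston", "2010-01-01", "2011-06-01"),
    (1, "Denver", "2011-06-01", OPEN_END),
])


def city_as_of(conn, customer_id, day):
    """Return the city recorded as valid on the given ISO date, or None."""
    row = conn.execute(
        "SELECT city FROM customer_city_history "
        "WHERE customer_id = ? AND valid_from <= ? AND ? < valid_to",
        (customer_id, day, day),
    ).fetchone()
    return row[0] if row else None
```

Because nothing is overwritten, a report for January 2011 still sees Boston even after the customer has moved.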