Stored Procedures - Good or Bad?
superid asks: "I'd like to get opinions and real world experiences that people have had with database centric applications that rely extensively on stored procedures. I believe that most enterprise class databases such as Oracle, MS-SQL, PostgreSQL, DB2 and others implement stored procedures. MySQL has been criticized for not supporting stored procedures and will be adding them in MySQL 5. The ANSI-92 SQL Standard also requires implementing some form of stored procedure (section 4.17). So, I'm asking Slashdot readers: if you were architecting a highly data-centric web based application today from a clean slate, how much (if at all) would/should stored procedures factor into your design? Where are they indispensable and where do they get in the way?"
"The arguments for stored procedures are pretty straightforward: 1) Centralized code; 2) Compiled SQL is faster; 3) Enhanced security (as our application is over 15 years old, and consists of much legacy code, reimplementation and feature creep that now includes over 3000 stored procedures). At one time we had a client/server architecture so those three advantages were relevant. However, in the past 4 years we have moved everything to web front ends and I have argued that this is no longer true. Does it really matter if my business rules are centralized in stored procedures or in a set of php/asp scripts (ie, in the web tier)? Is it really important to shave compilation time when connection and execution times dominate? (and overall response is ok anyway?) Since the focal point is the webserver, shouldn't security be done there, rather than the DB?
In addition, you either have to have a dedicated T-SQL or PL/SQL coder who then is the weak link in your coding chain, or your pool of developers must become fluent in both your scripting language of choice as well as the SP language. I have experienced both of these approaches and found this to cause bottlenecks when 'the database guy' is unavailable and learning curve problems (bugs) with new coders getting familiar with the db language.
Finally, after staying with our DB engine choice for all these years we are acknowledging that they may not be around forever. Management has asked us to look into migrating our data and business logic to another DB choice. We'd sure love to just be able to point the web tier at a new data source but that is unattainable due to a convoluted tangle of db specific code."
In addition, you either have to have a dedicated T-SQL or PL/SQL coder who then is the weak link in your coding chain, or your pool of developers must become fluent in both your scripting language of choice as well as the SP language. I have experienced both of these approaches and found this to cause bottlenecks when 'the database guy' is unavailable and learning curve problems (bugs) with new coders getting familiar with the db language.
Finally, after staying with our DB engine choice for all these years we are acknowledging that they may not be around forever. Management has asked us to look into migrating our data and business logic to another DB choice. We'd sure love to just be able to point the web tier at a new data source but that is unattainable due to a convoluted tangle of db specific code."
I like to keep business logic in one place as much as possible. You are almost assured to have some in your app, so I try to keep all logic there.
Stored Procs and triggers make can make the code simpler and more efficient, but spread out the workings of the application and unless properly documented, more difficult to understand.
Just my $0.02 CDN
Bad, of course.
I only put triggers or constraints or whatever in the database for one reason: to make sure only valid data enters the database.
For instance if ColumnA must be between 5 and ColumnB+34, that should go in the database. The database itself should guarantee the data is clean. "Data logic" you could call it.
Business logic and everything else should go in the higher layers. There is some ambiguity about what is "data logic" and what is "Business logic" but it's usually pretty clear.
Why? Maintanence. The stored procedures tend to rust in place over time. If you're an "agile" developer you'll go nuts not being able to refactor business logic or have unit tests check your database procedures. If you're a "BDUF" (big design up front) shop, you might like it, but thankfully many are moving away from that.
Particularly for an application where you are returning large amounts of data, stored procedures hold a distinct advantage over dynamic SQL queries in that, if the SP is designed correctly, the database has pre-optimized the query plan at compile-time and runtime execution is therefore much faster. It also allows for underlying table structures to change without impacting your application logic.
Also, when it comes to long-term database maintainability, putting your database logic in stored procedures allows the db admins to get an accurate overview of what objects/tables are in use and which are no longer needed. At my company, where we have over 20 databases, this is an absolute must.
Generally speaking, I use dynamic SQL during initial development and move to stored procs for QA and production.
I actually miss them in my current job with mySQL. I used to like the way you could run a stored proc every X period to copy "live" data to "public" areas. Or, archive "public" data after "x" period of time. But then again, I am a micro$oft developer at heart, and all this Perl, CGI, Java, (even COBOL) on old RS6000 systems gives me a headache sometimes.
why are you posting if you don't have an answer?
you are only giving me a headache.
One of my favorite thing about postgres is it's support of plpythonu (python stored procedures) and the recently added java support.
Just define the function (in java or python) and SELECT it with whatever arguments you've designed it for. I don't know what the overhead involved is, but we used it more for convenience than anything else.
I was told basically, "the fewer the better" and "keep them confined to the innermost loops."
If I were a bit more of a tinfoil hat wearing man, I'd be Slashdot makes some of these "Ask Slashdot" topics up because the ensuing flamewar will cause more page hits than usual.
A bit trollish, but I'll respond: perhaps he's actually smart enough to seek outside opinions, even though he thinks he knows the answer.
That, or he's preparing some kind of presentation/paper to justify the use of stored procs to a boss who doesn't believe in them (or vice versa), and is seeking real-world examples to bolster his point.
Just a couple of possibilities.
What about version management of stored procedures? Yes I know it's possible, but it creates pain. Everyone must have their own copy of the DB otherwise an error by one developer modifying a SP breaks it for everyone, even if they have their own copy of the app. Scripts must be written to ensure that the latest SPs can be applied at the same time as code updates.
I don't like using SPs, but I think version management for me is the nail in the coffin.
Mark
The idea of a database is to put the whole data-relation logic in the database, if only to insure atomicity of operations.
Because as soon as you rely on an external process to maintain data integrity, you're bound to fall prey to some sloppy programmer who does not understand the data relationships and will not properly maintain the data integrity.
At least, when you use stored procedures, you can concentrate the data integrity logic in only one place, which is easier to control and manage.
Most of the development I do at my job is Coldfusion+Fusebox with SQL Server on the backend (I don't care if you hate MS, don't bother knocking SQL Server) and stored procedures just make life easier. They're also handy in the instance that you may have multiple front-ends written in multiple languages accessing the same database in many cases. Making a change to the way data is returned is far easier to do in one stored procedure than in X number of front-ends. One of the main reasons we don't use mysql is because the stable versions don't have them.
If the applications are written in one of the various scripting languages, then this argument doesn't apply:
One major problem with enterprise applications is that when a problem is found in an adhoc query (poorly written, a bug with the DBMS, performance related, etc) then the application would normally have to be recompiled and pushed out to the entire enterprise (could be tens of thousands of computers to push to). This isn't desirable.
Moving the queries into stored procedures (where possible) allows you to correct the stored procedure at a central location and roll it back to the 'old' stored procedure if necessary with minimal effort.
A good rule of thumb: use stored procedures for compiled applications
Jason L. Froebe
No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil
If it can magically make 503 errors go away - yes if it does anything else - no
If you have to ask, you'll never know.
Having the queries in one place is also generally an advantage, and if the vast majority (or entirety) of your queries are in those stored procedures then migrating from one DB to another means NO messing with DB specific code (and every query ends up being DB specific if it does much of anything at all) except for the query developers.
The shop I work in has two main applications which access the same database. One is a web environment written entirely in Perl where all of the DB logic has been pushed into stored procedures, and then further abstracted into modules. Now migrating from one DB engine to another simply means rewriting the stored procedures from PL/SQL to TransactSQL (for example), and some minor modifications to the underlying modules. Then if we want to change the business rules for that data we can change the modules with only minor changes to the app logic.
Contrast this with a mega app written in C which has tons of queries directly in it, a minimum of stored procedures, and a constant stream of bugs because of the morass this has created. The app moves slowly, ponderously, and half the time wrongly.
Anytime I'd be called upon to architect something, I'd be pushing stored procs as much as possible. They're a logical extension of good, modular design.
1) I'd use both and I wouldn't use security as an argument to use stored procedures. Mere "mortals" should not have access to the database server at all. Just beware of SQL injection attacks (Google it if you don't know what that is).
2) Stored procedures aren't always the fastest because you can't do array inserts with stored procedures, for instance.
3) Queries are cached. So the second time a query is executed it won't be compiled. Just make sure that your queries are parameterized. Don't put your values in the query with string concatenation. Use parameters. Otherwise queries can't be cached. You will also be vulnerable to SQL injection attacks.
4) Use stored procedures when you will gain a clear performance advantage. You may want to try to implement it in your data tier first, and if that isn't fast enough, move it to a stored procedure.
5) Buy or make a code generator that will generate data tier code for you (and possibly other code).
6) As for database compatibility, if you implement it as stored procedures, you are screwed for sure. If you use normal SQL you are probably screwed anyway. Check out this chart this chart for compatibility. And that only points out the parts of these databases that follow the standard. They do have plenty of non standard features as well. If you want to try your queries for standards compliance you can go here.
I have plenty more where that came from, but the wife needs the computer. Good luck though.
The Internet is full. Go Away!!!
Prepared statements and vendor-neutral SQL are the way to go for portability and controllability of the development process. Use SPs judiciously, if at all, and only when there's a highly compelling need to do so(e.g., order of magnitude speedup, etc).
Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
In addition, depending on your situation, it shouldn't be too hard for the developers to learn how to write stored procedures for the database. Once you know one language, learning another isn't that hard. The developers might write some inefficient code at first, but they'll get better very quickly. Plus, it will give them a better idea of how the database really works and performs, improving their overall designs.
Of course, IANADBA (I am not a DBA), so take it with a grain of salt.
A year ago, I converted an open source project I write in my spare time from MySQL to PostgreSQL. The primary driving factor in the conversion was to make use of the more robust features of PostgreSQL in order to maintain data integrity.
This involved of course the creation of a schema that made use of referential integrity and stored procedures for certain key operations that would enforce data integrity that the code required but fell outside of relational databases proper.
As I was completing the source code conversion I noticed that a lot of the data checks that had to be done under MySQL 3.x disappeared as PostgreSQL enforced it for me.
The creation of users, and other entities became much simpler as did their removal. Cleaning up the code and making it easier for me to make modifications to the scripts, without having to second guess another script having adverse effects.
The scripts themselves still handle logic, albeit at a higher level. The process of using stored procedures to handle data integrity and enforcing certain rules simply allowed me to concentrate on the bigger picture when dealing with scripts.
Of course, the trade off was in speed. MySQL to this day, still seemed to be capable of handling more loads since the site is dominant on SELECT statements. However that is more of an issue between PostgreSQL and MySQL proper.
Besides the optimization the DB might do on SP's as opposed to dynamically created SQL statements, SP's are nice from a security point of view.
:).
You have to be extra careful with dynamic SQL due to SQL injection bugs that we all know about. This isn't really an issue when you're dealing with stored procedures that take defined data types as opposed to creating SQL on the fly based upon your data (which could have injected SQL).
Controlling which DB accounts can use what stored procedures is also handy mechanism for determining permissions. Stored procedures represent what all your application might do, so picking which DB connections (which have credentials) can access these is a nice place to control those permissions.
Granted, you can still do lots of stupid things to mess up security
Also, there are places where SP's are not really possible. Severely dynamic reports are a good example (assuming you allow that functionality in your application). There's definitely times when you HAVE to generate SQL on the fly. In any event, try to create a "data access layer" in your code and if you have to dynamically generate SQL, run all sorts of checks on it with regexp's etc to check for injection.
One case where I find that SP's can be useful for storing business logic is when your system will have different front ends using different technologies. For example, if you have a web frontend with PHP or Java but also have a rich client written in .NET or VB.
Of course you can also solve this using an additional tier (like an app server and use web services) and it could be easier to maintain, but if performance is too much of an issue, then you could go for SP's for some of the logic.
I don't think it has to be an all-or-nothing decision, though. You usually end up with some logic in the app code and sometimes some logic in SP's.
Go hug some trees.
An RDMBS (as opposed to just a database) is actually for manipulating data, not just storing it. Otherwise you'd just use flatfile for everything and implement all the relational logic in your app code. The database can execute stored procs far, far faster than your app code can perform the same functions. Using database side stored procs gives you the exact same advantages as a class library with additional performance and security options. There's no loss. Why not use them?
I'm an Oracle DBA, and I like creating packages of stored procedures and functions (and especially table functions) that represent an API of sorts to the database. This means that the application code doesn't care how the data is stored, and the DBA is free to rearrange the data for tuning purposes, without requiring any app changes (assuming the API remains constant).
In the past I've supported keeping more of the business logic in the database, but I no longer believe this is an optimal design. Now I keep business logic out of the database as much as possible and limit the stored code to enforcing data integrity and making the database look like a "black box" to the apps.
I've found that, say, writing an app with a lot of code in Oracle PL/SQL, using cursors, etc, means your app will only and forever support Oracle, without a whole lot of re-write and likely re-design.
So unless you like vendor tie-in... stay away from db-specific stored procedures.
MORTAR COMBAT!
triggers that keep history tables
triggers that keep referencial integrity that a foreign key simply can't do
stored procedures that loop through large amounts of data that only return a small amount back
A stored procedure can be an evil, evil thing:
implementing complicated algorithms that can't be debugged
creating HTML
I've seen both sides. It's like asking if javascript is a good thing for web browsers.
John, who gets a 503 when trying to log on
Stored procedures are a performance optimization, consider the following scenario:
Retrieve the 20th page using a page size of 50 records for all the SKUs under a catalog (potentially millions total) for a specific user which could or not have visibility permissions for each SKU. Assume the security provided by the database is too coarse to fulfill the business requirements, therefore some set of rules must be evaluated to determine SKU visibility for a particular user.
That query would normally be very fast if implemented as a stored procedure because:
1) Only one round-trip is needed.
2) You don't have to move all the data to a middle-tier and then filter out information.
3) RDBMSes are usually faster at filtering data out (by using indexes, denormalization, etc.) that what a developer could code in a middle-tier to filter out information.
4) Most RDBMSes are very good at scheduling tasks, caching, managing memory, etc. The more you move logic away from it, the more you would have to implement it yourself.
You could send all the SQL statements to the database and achieve the same effect, but it might make debugging harder and you still have all the SQL logic in some place, only a different one.
On the flip side:
1) It's harder to write stored procedures than it is to write code in a managed language like Java or C# (thirty-line SELECT statements are not very intuitive).
2) Generally speaking, the compiler of a managed language does a better job at catching errors than a compiler for stored procedures, where a lot of the errors will be caught at runtime.
3) Stored procedures are not portable.
My advice is, if you are only using the RDBMS as a persistence device and your data size is not huge, avoid stored procedures and create some sort of middle-tier object model. Only when performance is a impediment, use stored procedures. You might as well use a hybrid approach, try to model as much as you can in the middle-tier and implement stored procedures for those tasks which are performance intensive.
I work with people that worship UML and patterns as well with RDBMS Gods that can plow through pages and pages of stored procedures without blinking. As much as I love ULM and patterns there are some tasks that must be done in the RDBMS for performance reasons, and tasks that are simply more maintainable when done in the middle-tier. Both approaches have advantages and disadvantages, the trick is to use the best approach according to the situation.
As a programmer, I find that making a change to a query or table can cause me rewriting code in every application I've developed.
With stored procedures, I just refence the stored procedure name and leave the query tinkering to a DBA.
The only thing that I have to make changes for is when the DBA changes a column name in a table or a parameter for the stored procedure. Also when a stored procedure is in use, and it needs to be changed, I have to make the program use a second procedure name and switch procedure names each change, because if the procedure is changed as my program is running, it will break if a parameter is added or removed.
I had to work on a docket calendar program for a law firm and we used stored procedures with the reports. The managers and lawyers were always adding things to the reports which needed changes to the stored procedures. We eventually maxed out the max number of tables allowed, and each stored procedure was five pages long with if else statements because of all the things that the managers and lawyers wanted.
Using regular queries would not work because of the flexability that T-SQL had to meet the law firm's demand. MySQL would not have cut it. The reports were in Crystal Reports.
Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.
I'm asking for it, aren't I?
Personally, I think way too much stuifdf is stored in RDBMSs. I work as a Java programmer in a non-IT industry, and everyone is happy-go-lucky about making every object map to a table. But its a huge impeadance mismatch. We have layers of DTO, DAO, VO, etc. in the way.
I think the world would be a better place if most of the typical day-to-day was stored in an object-oriented, transparent database, and the relational database was left for storing things where an RBMS really shines (arbitrary relational queries, etc.).
Once you've gone the way of an OOBMS, you have objects, so naturally all of your logic stays in your objects. The fact that your objects happen to be persisted for you is irrelavent. All you car eis that you have your objects.
Especially for a web shop.
I run web for a medium size telecom, and recently hired two people. I questioned all the applicants extensively about web security and just about none of them had anything remotely resembling a clue. Most of them listed web sites they'd worked on on their resumes, and more than half of these sites were vulnerable to both Form and URL SQL injection attacks, which are largely (in our case, completely) defeated with stored procedures.
(Even more of a good thing when the PHBs insist on being an MS shop...)
Personally, I find Stored Procedures to be a very difficult thing to manage in the long term of software development.
If you are designing a web application, then I find it much more maintainable to utilize DAO interfaces & impls since this allows you to make changes that might be necessary should you experience an unexpected change in your environment.
Need to move from MySQL to Oracle? Simply override any db-specific code from your ANSI Impl, and go.
Although if there is no chance of an environment change, stored procedures become much more attractive.
On the other hand, you should never, ever put actual application logic in a stored procedure. The reasons are several. The most important is that stored procedure languages are all, to a greater or lesser extent, crap. This comment will cause me to be flamed to death by those who only know PL/SQL etc, but the fact is it's true. They are not general-purpose programming languages.
Sure, you might not RIGHT NOW want to fork off sendmail from your application, but some day you might. Or, horror of horrors, maybe you'd like to write directly to a system file? Or use a neat SNMP library you found? Although there are twisted, hacker-like ways to do these things in most DBMs they are hardly the model of reliability or professionalism. [1]
Secondly, they tie you in at a fundamental level to a particular database vendor. Database software is generally neither Free nor free. They want you to put your business logic in their stored procedure language because it will only run in their database products. Lock in is bad. OK, you'll be locked in whatever you do, but I'd rather be locked into Java or Python than PL/SQL.
Thirdly, you are losing control of your application's performance. You have very little control over how the code will be optimised or run.
Fourthly, you are breaking abstraction. It is very, very hard to write stored procedures which aren't entirely dominated by the structure of the underlying database.
Finally, assuming you probably will have to have a middle layer between the client and the database anyway, it's a bad idea from a maintainability point of view to bits of the same functionality among your layers.
[1] have you ever written a cron job to run a query to dump a table to a file to be parsed by a Perl script to send an email? You might be an Oracle Portal user.
In my experience, this is a difficult question to answer. There are many factors that you should consider in making a decision to use all stored procedures or not.
First, every business has different needs. Every software development group is also different in what they can or cannot provide. There are camps on both sides- many people in the database discipline will say "put everything in the database" while hard code developers will sometimes opt for queries in code.
Some considerations:
1) Consider the needs of your application. Is there a good chance your application will need to talk to another database platform or backend at some point in the future? This could be an argument for not using stored procedures. AS far as centralizing business logic goes, that can be done in just about any tier.
2) Where is your current bottleneck? How possible will it be to scale out your database server? If you are in a web farm scenario, your database server may be under significant load. Putting more logic on the database server can be a lot more expensive- it is typcially a lot cheaper to sacrifice performance on the backend for scalability. In other words, if you can keep your database server relatively load-free you can always add more web servers. I currently support a site that has over 2000 concurrent users at any given time, and currently our DB is the biggest bottleneck. It is a lot cheaper to cluster web servers than DB servers, since the DB is centralized and web servers can be duplicated easily.
3) Consider the experience of your staff and the culture of your IT department. If you have a lot of developers/dba's that are used to programming with stored procedures, and management is used to that paradigm, it may be difficult to change architectures without a compelling reason. "If it ain't broke don't fix it".
I'm sure there are other considerations, but those are probably the most three important ones I can think of right now.
DBA: "I think we should use more stored procedures"
Project Manager: "That would be consistent with our n-tiered strategy"
Programmer: "But all of our business rules are in object libraries"
Division Manager: "Could I get the ranch dressing? Thanks."
DBA: "The right way is to put business logic in stored procedures. It's faster"
Project Manager: "We're standardizing on faster programs across the enterprise"
Programmer: "But we'll have to re-write the entire system!"
Division Manager: "Pass the pepper. Thanks."
DBA: "Oh, that's alright. We were going to upgrade the servers too. We can finish both jobs at the same time"
Programmer: "We're upgrading the hardware too?! So basically we're building the entire company from nothing again, right?"
Project Manager: "You're not being much of a team player"
DBA: "The new system will be more robust and have new features"
Division Manager: "Lets get the system built as soon as possible. We're laying off the entire division in a few weeks"
Programmer: "WHAT?! I just signed a new car lease!!"
Division Manager: "Pass the croutons please"
Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
A good discussion on this topic can be found at this blog entry: Don't use stored procedures yet? Must be suffering from NIHS (Not Invented Here Syndrome). The content of the blog post (by Rob Howard, a former Microsoft employee who most definitely knows his stuff) is definitely a good read, but the real gems are in the comments, which there are plenty. There's an equally interesting thread on this discussion by another blogger with his entry called Stored Procedures are Bad, M'Kay?. Both worth reading, if nothing else for the comments.
I could not justify my existence if I were a turkey farmer. Would I terminate myself? Undoubtably, yes.
I know there are exceptptions but in most cases....
1) Stored procedures are not written in an object oriented language and are almost always not written in an object oriented way.
2) Stored procedures are not checked into a version control engine.
3) There is no sane way to organize them beyond manimg tricks. No breaking up your stuff using directories for example.
4) No global compilation. No way to check ahead of time whether you just broke another SP by passing a string instead of a number in as a parameter. You won't know that till it runs.
5) No unit testing frameworks.
6) No cohesive way to examine code flow. What you end up with is a mountain of code snippets scattered all across your database. Cross your fingers and hope each step gets excuted properly.
7) No real debugger. No stepping through the code, no breakpoints, no watches.
8) Most commercial databases charge you per CPU. This means your CPU cycles are best used to keep data integrity, process queries and return recordsets. Most middle tiers are not licensed on a per CPU basis so you can afford to throw a lot of CPU cycles into executing code.
9) Last but not least you can couple your middle tier using a high speed interlink so there is no real need to use SPs.
Feel free to add to the list. SPs are not good.
evil is as evil does
having your apps portable across DB's is one of those things people always think is important but in my experience it's a total crock for an app that is going to be installed in a single location, i.e. a commerce website or an enterprise specific app.
.Net.
You are just as likely to want to change your middle tier as your DB. More likely, in my opinion. If your business logic is in sprocs, then it's as easy to call from Java as it is from
Thank you, just to add a bit:
Would you want to send 5 million rows to the application just to check a few fields in each of them, and how they relate to records before and after them? Hell no, sending 5 million rows uses a ton of bandwidth, even for small row sizes. Also, SQL is a language for set operations, while most (99%, Lisp/etc are other 1%) application languages are designed around a single value, not a set of values. For example:
To check if array a[50000] is a subset of array b[2000000], in most languages, you must somehow iterate through both arrays (sorting would help, if possible) and see if each value in a (all 50000 of them) also exists somewhere in the 2 million values for b.
In SQL, something like SELECT COUNT(*) FROM a WHERE a.key NOT IN (SELECT key FROM b) would do the same thing, only much faster than downloading 2,050,000 rows and comparing them. Each vendor handles internal set operations differently, but all of them optimize their internal data structures and abilities to do exactly these types of operations.
--That's the point of being root, you can do anything you want, even if it's stupid.
This is a hot-button topic for me, so I'll try to keep it simple.
When I came to my current job, my opinion was to keep as much of the logic in the application as possible, stored procedures should help make the developer's life easier, and make the database more robust (triggers mostly do this part).
When I came here, however, they forced EVERY SINGLE QUERY into a stored procedure, and like you said, it brought development speed to a halt, because every project was assigned "a database guy" to do the work, but there weren't enough database guys for the projects.
It was pathetic, and when our big layoff occurred (we got bought out), we found alot of really really bad stored procedures that the so-called experts made for us.
Worse, it makes code releases a nightmare, because not only does the codebase have to be updated, but the database too, every single time. Unlike code bases where you can checkout your latest release, with databases you have to a) find all the altered database scripts, b) package them up for the release, c) halt the database for the release and d) run the script at go-live and hope you didn't miss one more thing.
Mind you I still use stored procedures, views and triggers where they make sense (when it's obvious that it just belongs in the database), but mostly I try to put as much logic as possible in the app.
Now you want to make me get into NULL values and candidate keys issues too! Bah, no NULLs anywhere, #1, and #2, don't make up integer primary keys when an existing field will work. Don't let your database administrators talk you into it, I beg you!
I work with several in-house database applications implemented using PowerBuilder. They run against a Microsoft SQL server.
We decided to use stored procedures instead of inline SQL so that we could make modifications on the server instead of having to change the code in the application, recompile it, repackage it and deploy it to over 100 PCs.
The drawback is that it's not always easy to know what's going on because the application is broken into various parts. That is there are business rules being enforced in the PowerBuilder code and in the stored procedures on the server. I guess once you get to know the application really well it doesn't matter but it can make it a real pain in the ass for a new programmer.
The race isn't always to the swift... but that's the way to bet!
1) The DBAs
2) Junior Microsoft developers (typically VB)
3) J2EE developers (N-tier developers)
My experience has been that DBAs are usually proponents of stored procedures. There are two simple reasons for this.. familiarity of database engine and job security.
The Junior developers who typically develop two tier apps prefer stored procedures for performance reasons, because there is usually a big-a*s backend database server which can lift a lot of the data processing load.
Then there are the N-tier developers who are seasoned enough to pick the right architecture. These people prefer non-stored procedural approach. They understand that the database is a persistent store and business logic should be in the middle tier. It gives them flexibility to swap database engines and business logic modifications are localized to the code modules they are working with.
When was the last time you saw the CPU sustain a high load on your database server? Database machines are constrained by I/O (disk and network) and memory, not by CPU in most cases. Therefore, your 8-way SQL box is sitting at 10% utilization. Why not get a bit more use out of it?
Portability for the sake of portability is a waste of time. More importantly, given that every vendor has non-standard extensions, and the SQL definitions (SQL-92, SQL-99) don't go far enough, you'll find that you almost always need to use some vendor-specific features, whether you're writing ad-hoc queries or stored procedures. At least with stored procedures, you only have to change it in one place when you migrate, rather than changing it all over your code. Tell me, how do you add an AUTOINCREMENT/IDENTITY/auto-numbering column to a table in vendor-neutral SQL?
Says you. In my team, all of our stored procedures are stored in the same code tree as everything else, controlled by the same source control. We take it even a step further, and check in all of our database object generation code (tables, keys, indexes, triggers, etc). So, just because you don't track your stored procedures in a source control system doesn't mean it can't be done.
You're assuming that the speed benefit from stored procedures typically comes from the fact that they're compiled. While that's true, it's also not much of a benefit (you might win 50-100ms per query, yay!). No, what's more important is that your queries are centralized, so that you can optimize them more easily. What's easier to optimize: A single query to <insert operation here> in a stored procedure, with the procedure called by 100 different methods; or 100 different queries to <insert operation here> across all of your code? Yeah, sure, you can centralize that as well, but what's stopping a developer from writing his own query rather than using the optimized one you've already provided for him? With stored procedures, you can deny access to individual tables and force that person to use your stored procedure.
I agree with the point about including all business rules in a single location.
My tendency with data intensive applications is to put all of the business logic in Oracle stored procedures. I then have a variety of front end applications accessing the stored procedures. When the integrity of the database is the main concern of the application, I might write all of the business logic in a Java, PHP or C++ layer, hoping that no-one dinks with the data.
The big advantage of putting all of the business logic in the PL/SQL layer is that it helps make a very clean separation between the different tiers of the application.
Procedural code and SQL code are two separate programming language processes. The first directs the computer from a singular point of view. For instance "do this, then that, then go here and check this. etc.". And the other deals with groups of items. such as "everyone wearing blue shirts go to room 103", or "we don't need these anymore".
As far as intermixing these code bases, your procedural business logic and data business logic should be split when it makes sense. The database is optimized for merging and managing sets of data, and procedural code is good for binding this to a functional form. The business logic should be split into these two zones and implemented appropriately. It would be inappropriate to return a set from a database then loop through that set searching for some name or value. And at the same time it would be unwise to return two sets and join them in your code. With experience it seams cleaner to maintain these two zones of code. This doesn't mean that you need to use stored procedures though.
As far as stored procedures, they are a convenient way of separating these two types of languages, another way is to in place the Sql code into your procedural code, but it seams advisable to centralize this type of code in one place for visibility, and manageability. If stored procedures are not available or undesirable, then using classes or function that are located in some central, or locatable place, is recommended.
As far as for speed, implementing the data and logic in the appropriate place will speed your application, but stored procedures will not in there own right speed anything up. At least in MsSql server, stored procedures are not precompiled. They exist as plain text, just like issued queries. They do however get their own query cache, separate for the issued query cache, which could be of a little assistance.
Anyways. I am over talking about this. Take it, as u wants it.
-- Grimace1975
It's a slippery slope...
I used to work for a large web hosting company that wrote there billing software frontend in Delphi, and kept all the business logic in stored procedures on MS-SQL. It got so bad, they even ended up having a stored procedure that generated HTML/Text invoices for customers! Ever tried doing text layout in a stored procedure? It was absolutely nuts, but once they had started putting all the business logic (and much more) in stored procedures, its hard to stop without "splitting the code-base".
They were also scared to upgrade from MS-SQL 6.0 to anything newer for fear of it breaking their stored procedures. They tried at least once and failed miserably. As far as I know, to this day, they are still running MS-SQL 6.0.
This whole issue basically put a strangle hold on the company, it took forever to "innovate" and they eventually got bought out. The new parent company has spent over a year trying to migrate away from this "stored procedure" mess.
I think stored procedures are good, but only in very specific circumstances. If you design and code your application properly, there is usually very little need to start down the slippery slope of stored procedures.
Open Source Time and Attendance, Job Costing a
- If the database schema changes, you don't need to rewrite the application
- Your data are secure and the model is always in stable state
- The operations on the data model are well defined and well optimized. Nobody is reinventing a wheel.
Of course, any business logic has to go to your applications.If programs would be read like poetry, most programmers would be Vogons.
I am have been an Oracle DBA/Developer for more than 10 years. I also got into Java and more hardcore OO programming around 1996ish. In my opinion, from a design and implementation standpoint, the use of stored procedures is extremeley useful to provide an abstraction layer for data access. This is valuable because the stored proc can handle the renormalization of tabular data to best fit whatever object model is used in the application. Simplifies the application and protects it from necessary database changes (normalization/denormalization) in terms of how the object is implemented into tables. The great benefit of this is that it lets dbas and other database developers implement the data model independently of the application! This forces the relational data model design to resources that can program it most efficiently. Obviously there will be a small number of cases where this is not a viable approach to follow, but in general, it is a huge time saver, risk reducer, and productivity enhancer.
i'm quite unsure of myself about this, though. at the moment i'm working on a budgeting application, and both performance and productivity are becoming an issue.
example: to aggregate budgets over a time period, i retrieve the budget objects for each budget period individually, and accumulate the aggregate data in code. this takes quite a bit of extra coding, and execution time is quite slow, however; doing aggregation queries in the db would certainly give better performance, and it would be a lot easier to slap together some queries instead of writing all that code.
so, the way i look at it, it comes down to a question of science vs. engineering. the scientific impulse is to adhere to the theories of keeping logic in one place, and respecting the objects' ownership of their data. the engineering impulse is to use the technique which is faster and easier to implement.
i guess at heart i'm more of a scientist than an engineer.
i'd love to hear others' takes on this question, btw.
pr0n - keeping monitor glass spotless since 1981.
Spoken from the mouth of someone who doesnt have a clue..
WHY isnt it as easy to add more database servers then it is to add more web servers?
It's becouse you're not looking at the applications from the side of someone who manages data and databases.
It can work both ways. And in reality, it should work both ways..
Ever seen a database having rows locked by 100 different web servers becouse they DIDN'T use any form of stored procedures?
-- I'm the root of all that's evil, but you can call me cookie..
What offended me was his strong implication that the two -- PHP/MySQL programming and real DB development -- are mutually exclusive. It's an attitude that I see a lot on Slashdot, and it pisses me off.
:-)
Just:
*) Ensure that your company blows inordinate amounts of money on software with expensive support to cover your ass. It doesn't matter whether it works better, but you should be able to point at someone else to blame if something goes wrong. This also means that you don't actually have to understand the product, and can trust your DBMS vendor's salesmen for all your information, which will chew it up into bite-size pieces and spoon-feed it to you.
*) Act very self-important, talk down your nose to anyone that *doesn't* waste as much money as you, and treat your money-wasting as a mark of pride ("I have a $N million dollar budget this year!")
*) Place "real" before every class of software you work with, and vaguely insinuate that there are technical (especially reliability) failings in any cheaper product than the one you use. It worked marvelously for "real UNIX admins" for years.
*) Mention "Fortune X" frequently. For example "Fortune 500 DBAs with $5M budgets like me use *real* database software -- *we* can't afford downtime".
*) Avoid, if at all possible, knowing anything about software or computers other than database interfaces and possibly a scripting language or two. It's also good to choose a single OS, learn only that, and then slip occasional comments about how other OSes are unreliable.
*) Slowly begin slipping corporate acronyms into your speech. Use "ROI" whenever it's appropriate, and even when it isn't, to make it sound like other people are just slipshod and guessing numbers. "Enterprise class" is also good, since it implies, with only very vague factual requirements, that you work at a larger and more stodgy company that the person that you're trying to put down.
You'll be well on your way to doing "real" database work in no time.
May we never see th
First, I think that stored procs are not only often good but lead to better, more powerful, secure, and flexible applications that would be feasible without them. But on the other hand, they lead to hard to maintain applications which are explicitly tied to one database. So they are necessary but often misused.
Triggers are more important and usually use stored proceedures to ensure that the information in the database is always meaningful or that some other automated activity happens on an insert or update.
Unless I absolutely have to, I try to avoid having my application call stored proceedures directly. A relational database manager *should* be able to hide the stored procedures behind a view allowing a nice standard interface for your data. This means that if you have to move to another RDBMS later, porting is much more simple and mostly confined to the backend.
BTW, I agree with the points about having your business logic in one place. Stored procedures allow you to move business logic which is inherent to the database into the database, thus making it available from all clients regardless of the language they are written in. For a single app/db pair this is not an issue but if you have a large DB with many different clients, it is a larger issue. Maintaining your application in one location is a LOT less work than reimplimenting it all in every one of your apps.
Triggers, BTW, as I mentioned before are very powerful mechanisms. They are not called by the app directly but are run every time a row is inserted, updated, or deleted (some RDBMS's also allow select triggers, though some have alternate ways of implimenting this). They can be used to impliment additional security restrictions, enforce referential integrity, or more advanced stuff such as send email when a new order is added to the database. Again, this is done regardless of whether the order is filed using a web app or a Windows app for the rep in the call center. Since the logic is unknown to the app, it isn't even aware of what happens after the order is placed. Talk about clean interfaces..... This requires stored procedures.
So, these are the great things about stored procedures. But when they are used badly, you end up with the stored procedures reducing the agility of the application because they tie it in too closely to the database. What do you do when your app is tied to Sybase and your customers want MS SQL Server? What if *all* your logic is in that database? Do you rewrite *all* of it for MS SQL Server? Probably Not. You are stuck and out of your investment.
In my opinion, it is important to maintain a clear line between what is the database's job and what is the application's job. If this line is blurred, bad things can result. Stored procedures are a very easy way to blur this line.
I design my apps with the following layers to them:
UI
Application-specific logic
Database Access Layer
Information Presentation (SQL API using views)
Information Management and Data Logic (usually stored procedures)
Information Storage (highly normalized, usually).
This allows the database to be a server app in its own right, and the client logic to run in the apps themselves. HERMES (see my sig) is mostly built this way, but there are a few things I need to change before I am happy with the interfaces. This is one reason it no longer supports MySQL.
LedgerSMB: Open source Accounting/ERP
It's just you. I use stored procedures to:
1)Provide a consistant interface to applications regardless of changes to the underlying database structure.
2)Enforce row level security.
3)Enforce data integrity.
Trees in relational tables without recursion is a problem that has been solved for years. http://www.sqlteam.com/item.asp?ItemID=8866 It is trivial to implement nested sets with a few lines of SQL and after the tree is constructed, a simple SELECT will give you children, descendants, or ancestors.
Since the focal point is the webserver, shouldn't security be done there, rather than the DB?
Security should be done in every layer of the system. If you only did security on the web server, if crackers are able to compromise it, getting into the database is simple at that point. But if they can only run stored procedures, they then need to find a hole in the database which halves the likelihood of getting in.
Of course there are all the costs of maintaining the extra level of security. So you have to consider the cost/benefit of the security you put in place.
I understand some people use SPs for everything, as a security measure. I'm not that paranoid.
Stored procs are great, though, for processing lots of data on the server, without wasting resources (network bandwidth, client memory and CPU) sending the data to a client for processing and then posting results back to the database. Even if you have to do record-by-record processing instead of set operations, it's much more efficient to use a cursor on the server than a loop through a resultset on the client.
Stored procs are good at encapsulating a series of operations done periodically, such as end-of-month processing. In some databases you can define an event on the server that will automatically fire the SP. With such an event on the server rather than on a client machine, you don't need to worry about whether the correct client is running and connected when the event is supposed to fire.
Complex reports often require more data crunching that just a single query. I use an SP to process the data and leave the results in a temporary table. My client program runs the SP, then reports with a simple SELECT on the newly generated result table.
"[snip]At one time we had a client/server architecture so those three advantages were relevant. However, in the past 4 years we have moved everything to web front ends and I have argued that this is no longer true. Does it really matter if my business rules are centralized in stored procedures or in a set of php/asp scripts (ie, in the web tier)?[snip]"
You've argued this because you've only been around for a we fraction of time and have never watched technology change from one generation to the next. If you stick the logic in the database, what is the cost of developing a new platform or changing web technologies? What's the cost of change if you don't put your logic in the database? In five years, are you going to assume your PHP code is a tomb of business process logic that can be trusted because its logic has been segmented and acts independently from data presentation hacks or frufy one-off hack jobs?
"[snip]Since the focal point is the webserver, shouldn't security be done there, rather than the DB?[/snip]"
Web software is a moving target. For every platform, web program, service, etc., you have to enforce the same set of rules. Stored proceedures are to databases what functions are to programming languages. Use them or your business will suffer from inconsistencies ranging from data management to more serious problems such as data access and theft. I bet companies that process credit cards or people who work on medical records probably have strong and justified opinions on the topic of data security... think the cost of implementing consistent database proceedures is bigger or smaller than the cost from a screw-up that lets the data walk out the front door?
You mention MySQL in your post... use a real database such as PostgreSQL for two months and let me know how your opinion on databases has changed at the end of your trial run. Compare and contrast this experience with your experience stemming from application development with MySQL. If you think MySQL in its current state is still a valid core for an organization at the end of that duration, you obviously suffer from some form of brain damage.
I regularly freelance for projects where we have one template programmer, one middleware programmer (me), and one DBA. With the same team, we've tried projects with stored procedures and without. In general we've moved away from sprocs.
The first time we tried sprocs, we basically treated them as functions. I would pass in a bunch of arguments (db column contents) to sprocs that would insert a new Activity, for instance. This got old very fast. It was much faster for me to write the business object that would insert itself from the middleware layer, than it was to wait for the DBA to create the sproc, after which I would have to create a middleware layer to the sproc anyway. It also didn't make financial sense for the client, because DBA's usually charge a higher hourly rate than middleware programmers.
The other sprocs were ones where I would supply several search criteria to a sproc (basically portions of a where clause), where he would assemble it into a sql query and then return the result set back to me. That was a bit more useful, but ended up kind of silly too, because it wasn't efficient to involve the DBA in actual application logic - we kept on having to go back to him whenever someone wanted to add a new dropdown to the search form.
If you're been an intermediate programmer, you've painted yourself in the hellacious corner of trying to dynamically generate a sql query that may or may not join across multiple tables. It ended up being a lot easier for the DBA to simply create a view for each family of search queries. Then I'd assemble the sql query on the middleware layer. Easy then, because I'd never have to worry about dynamic joins - the view would already have the joins, and I'd only be querying against the single view. And if there was a query change, I wouldn't have to involve him unless is actually required adding new columns to the view.
Right now the complexity of our projects don't require the remaining cases where a sproc would really make a huge positive difference. One such case would be a multi-step atomic transaction where we were worried about performance. A sproc would be perfect for that. But in general you can do just fine with inserting into tables and selecting from views without having to deal with the cost of having a significant amount of your project in sprocs.
Finally, an important tiebreaker between having logic in sprocs and in middleware is a pragmatic one - system resources. If you're making a lot of changes, you're going to be dealing with source code management - trunks, branches, and multiple staging installations. It's much easier to do this with your code repository than your database. With ost companies I've worked with, it's a lot easier to set up a new vhost for a code installation than it is to set up a third oracle installation. If you have a lot of quick changes to make, it's easier to make them in the codebase.
Beyond that though, it really depends on the team. If we had a full-time DBA rather than our 10-hour/week guy, and a less competent middleware programmer than myself, and a project with more fixed requirements, then we might defer more to sprocs. But our DBA is swamped, our projects tend to have ever-changing scopes, and I'm quite comfortable with MVC and keeping the control layer thin, to be able to respond quickly to the scope changes without having to majorly rework business objects on the model layer. It works well for us - and these are for large scale bank intranets, not simple little webapp one-offs.
Many people that know too many buzzwords think that "the business object layer" by definition MEANS the database and sprocs. It doesn't have to. It can just as easily mean the Model layer of an MVC middleware layer. With my work style, it's faster to leave it there and then use the database for storage and data-level calculations that can be embedded in the queries themselves.
skkkoooonnnggggkkk ptui
Stored procedures solve a problem that predates languages like Java and C#. Once upon a time, you built client server apps using a client built in C, C++, or (god help you) PowerBuilder or VB. Database code either had to go in the user interface or the database. There was no other choice. So, to centralize business logic, stored procedures came about and people began placing the logic there.
In a three-tier or web architecture, stored procedures have no place. Centralize your business logic in business objects on the server. This makes your application independent of any underlying table structure or persistence mechanism. You can get the speed of stored procs by using prepared SQL in your database mapping code. These days, you can use mapping tools like JDO to avoid any database mapping code.
Where stored procs still have a place is inadministrative functionality, such as background batch processing. That's it.
I agree. However it should be noted that just changing the prodcedure language doesn't avoid proprietary lock-in. Writing a stored procedure in any language requires writing to a specific API with specific structures in a specific context for accessing, manipulating, and returning data. The fact it's written in C or Python or PHP or PL/SQL may make the stored procedure easier for someone to understand (if they're more familiar with those languages-- and I agree with a previous poster that PL/SQL beyond the most basic is difficult to understand, and don't get me started on T-SQL in MSSQL!), but most definitely doesn't make it portable. You can't really pull the procedures out and use them elsewhere--another DB or in some middle layer of software. Creating any significant DB functionality with stored procedures will result in a reduction in the portability of the application. You're locked in, even if it's a lock in to PostgreSQL.
Now don't get me wrong, I don't think this is a bad thing, and I'm in favor of using stored procedures, and have written quite a few of them, for Oracle and for PostgreSQL and MSSQL. They keep key functionality centralized, and allow less experienced coders to do useful work without having to understand all the complex data relationships, integrity rules, security policy, and such that are a part of any non-trivial DB application. I've also noticed that PostgreSQL's "native" procedure language is getting more and more compatible with Oracle's PL/SQL, making Oracle ironically perhaps the least subject to proprietary lock-in.
Everything I've ever learned the hard way was based on a statistically invalid sample.