Data Locking In a Web Application?
An anonymous reader writes "We recently developed a multi-user application and deployed it to our users. This is a web-based application that used to be a Windows application which was written in Delphi using Paradox databases for the client database. In the Windows application, we used the ability in Paradox to lock records which would prevent users from editing the same data. However, in the web application we did not add in a locking facility for the data due to its disconnected nature (at least that's how I was shot down). Now our users are asking to have the locking back, as they are stepping on each others' edits from time to time. I have been assigned to look at best practices for web application locking of data, and figured I would post the question here to see what others have done or to get some pointers to locations for best practices on doing locking with in a web application. I have an idea of how to do this, but don't want to taint the responses so I'll leave it off for the time being."
Lots and lots of Duct Tape.
Same as bugzilla does. Just use a timestamp or counter on the records so you can tell when an edit occurred while you were editing
Then you can review the edit.
If you want, you can use XHR (maybe with a slow load response for performance depending on the number of users) to notify that an edit happened.
-- perl -e'print pack"H*","6e656d6f406d38792e6f7267"'
on Google Books. You are welcome
So that's what the song "Tainted Love" is really about! Who knew.
You make sure that edits are handled in a form on a web page with a submit button. The user gets to fiddle all the bits they want on the web page, then they hit the submit button. At that point the web app goes and locks the stuff it needs to do to update the database to reflect the user's changes. It then applies those changes, then commits them, thereby releasing all the locks.
If two users might potentially be editing the same records, keep an SHA-256 hash of the original data around as a hidden form field. Then when the update proceeds, check the data to make sure the SHA-256 hash matches the data you fetched when you generated the form page (helpfully put into a hidden form field). If the hash doesn't match, tell the person who did the submit that some fields may have changed and somehow present them with what those changes might be.
Need a Python, C++, Unix, Linux develop
is a great book about this. Even if you don't use Hibernate or Java. If you hate Java just take the parts of the book dealing with Java and burn them. The rest of the book has awesome discussion of database design for the web... and those parts are worth the purchase price by themselves.
"Hibernate in Action" covers "Optimistic Locking" which is a simple technique. Just put a versionNumber column in every table and never let anyone insert any version number less than the one in the database... http://en.wikipedia.org/wiki/Optimistic_locking ... if you have another scheme... even if it is smarter and better than this in every conceivable way DO NOT DO IT without FIRST getting Optimistic Locking working.
[signature]
Slashdot is hardly the right venue to get a good answer to this question (how the hell did it end up in the Hardware category?), but I've dealt with this a zillion times, so I'll give a pointer to what is very likely the correct answer: optimistic locking.
Hard locks are probably not what you want in a stateless web app. (E.g. What happens if someone locks a record and then is hit by a bus?) Instead, here's how it works:
This is also known in the vernacular as "second save loses". It may sound too harsh, but it is much better than "first save loses and user isn't notified", which is what you get if you have no currency checking at all. And it's also much more web friendly that your old desktop app (which uses an approach that is technically called "pessimistic locking").
The most rabid believers in American Exceptionalism are the exact same people whose policies are destroying it.
Look at Confluence by Atlassian. When you edit a page they track the edit action. When another user goes to edit the page they are warned that "John Doe is currently editing this page, last edit at date/time". They also do polling via AJAX so if you're working on a page and another user starts actually editing it you see a message on the page "Jane Doe started editing this page". They also save page drafts scoped to the user to help people resolve edit conflicts. It seems to balance things well with not explicitly forcing locks but actively letting users know when they are heading for a conflict.
Check out CouchDB. It is built around the concepts of distributed (and even offline) databases and handles conflict resolution. It employs optimistic locking.
Instead do optimistic locking... Assume there are no conflicting edits (or that they are at least rare). Then version each row (with a monotonically increasing number for example). At the beginning of the transaction also retrieve the version, and upon save verify that the version did not change - if it has changed there was a conflicting edit in the meanwhile and the current save should be prevented (you could then get fancy and retrieve the current version of the row from the database and show it to the user, etc).
One can actually show that if the rate of collisions is low optimistic locking even performs better, whereas in scenarios where the contention is high (a significant fraction of transaction result in a conflict) pessimistic database locks performs better.
You could use an incrementing hash. Then you'd both be happy.
Do you even lift?
These aren't the 'roids you're looking for.
Actually, I don't blame them. The first instinct of people coming from a client-server background is to introduce to some form of record locking. Since this isn't "in the box" with web app frameworks, it makes sense to push back on the feature until you have user feedback or other analysis that it's actually required. Otherwise you are spending valuable time coding/debugging a feature that will rarely ever be used.
Business. Numbers. Money. People. Computer World.
I just made a console a month ago that handled this problem as follows:
Records that need to be processed are in 'pending' (unconfirmed actually) status, once any user clicks to select the record, it is timestamped and the user is 'locking' it. Actually the user is assigned to it and all other users see that this record is 'locked' by the first user who selected it.
However, now anyone can open the details of the record and do the following: they will see a button 'Take the lock away from $user$', so they can take the lock away! But there is history of who took whose lock, so the problem will be solved outside of the applicaction if they take each other locks away.
If the user locked a record he has a choice of 'save', 'save and release lock', 'release lock' buttons on the record details.
The users are allowed taking the lock away from each other so the lock resolution is pushed into the real world and out of the app.
You can't handle the truth.
Although I don't know what your implementation, or even what server-side language or database you use... I'll comment.
- For the sake of simplicity, add 2 columns to the table you want to be able to lock. Call them `lk` (lock) and `lkts` (lock timestamp).
- When a user is currently editing the row/document/whatever it is, have an icon of a lock or something to display in the list if someone is currently editing it.
- When generating the list of 'documents'(/whatever it is), check `lkts` and compare it to the current time. If it is stale (5 mins old), clear it and allow people to edit it. Always allow users to view the data.
- When a user clicks on the 'Edit' button, change the `lk` column to 1 and `lkts` to the current timestamp (UNIX_TIMESTAMP under sane DB's)
- Use RPC or XML-RPC to save the document periodically (every 60 seconds or so). Every save, update the `lkts` with a fresh timestamp.
- When the user clicks "Save and Exit" or "Save and Continue", have it submit the form the old fashioned way, save the data, and set `lk` and `lkts` to 0.
- Use Javascript to detect how long of a period of time passes for of no-activity. If it goes on past, say, 10 minutes, submit the form (thereby clearing `lk` and `lkts` and allowing other users in to edit)
This is bottom-of-the-barrel designed for simplicity. No security or anything in mind, but simply something that will work even after a browser crash or someone leaves the computer with a 'document' open and walks away.
Disclaimer: Just worked 14 hours. Very tired, don't want to go into any more detail. If this makes absolutely no sense to anybody, please discard this message.
Optimistic Concurrency
Both the curse and the blessing of web applications. Most of the work is offloaded to the browser, thus not bogging down the database servers with keeping a ton of row level locks in memory, or even worse, page level locks.
For the programmers POV you use some back end language, php, java, ruby, python, it matters not, write a program, it launchs, connects to a database, ( no matter how much middle-ware you slap in ) sends it a query, gets the data, returns it for presentation, consideration and subsequent modification ( or not! ) by the user and then the program ends. You are no longer connected to the database, heck your browser is no longer connected to the server!
Some have mentioned AJAX <sigh...> AJAX is nothing but bundling together a few different bits of tech to do ONE thing, make a call to the server without refreshing the page. No matter how you slice it and dice it, thats all it does, it makes a call through the web server, to launch a program written in one of the afore mentioned languages and it follows the same set of steps, through either the post method or the get method and nothing has changed!
So you need a scheme to know if you can write to a record without overwriting someone else changes.
The only real choice is to use a timestamp value, all databases support them, usually down to the millisecond of accuracy. It is a simple process which you can make more complicated as you desire. As many have mentioned, you read the record making sure you get the timestamp of the last update. That timestamp gets sent to the browser along with the data. When the user clicks save the stored procedure that does the actual update then compares the timestamp you are sending with the one on the current record as in "select for update ...." and if the one you are sending along does not match the one on the current record, then your update loses and the stored procedure reports that back and then you deal with the user feedback in any way you see fit. Typically this is done by sending back the record in is new state and telling the user, "sorry, but you have to star over.".
Now having said that there is nothing to say that you cannot be imaginative with a bit of javascript or something like that, or even with the php array_diff() function or an equivalent in some other language then insert some fields above or below the the data that was previously changed to at least have the conflicting data shown in both forms eg: what it is NOW and what they wanted it to BE.
Hey KID! Yeah you, get the fuck off my lawn!
We ran into the same problem.
What we finally did is lock the editing page, so that if someone else had it opened you were not allowed to update it until they removed the lock on that page.
Or the user could over ride the page lock if they felt pretty sure that the other user was not using it for editing ( Maybe they just had it open on their desktop).
In a table we put the page, user identification, and timestamp when the lock was created.
So whenever the page was opened, it checked the table to see if it was locked. If it was locked, then it displayed header showing who had it locked and how long they have had it.
We generally only have 3 to 4 users that may open a page for editing and they soon learn that if you are going to edit something after it has been sitting for some time to update the page.
We should probably update this with ajax so that at least the header of the page tells the user someone else has taken the lock.
But currently happens though is that the page won't update if it doesn't have a lock and the user has to go back if and start over if someone stole the lock. So far I haven't heard of it happening, because they usually open or update right before they start editing so they know they have the lock.
But handling it in this manner has greatly reduced our problems.
Yeah, it's amazing how if you think it could happen it will. And most of our problems, I think, were caused by users opening the same page on multiple computers and then instead of closing the page, they were updating the page with the old information.
I think we'd need way more information to come up wiht a good solution - this is an overall application architecture problem, not just a locking problem.
What are the use cases? what kind of app is it? what is it that you are trying to lock, exactly?
When you are ordering tickets through TicketMaster.com, they hold the seat assignment for you for 10 minutes. If you don't complete the transaction within that time frame, the tickets become public again.
In your database setting, the user Alice wants to edit the customer Carol's record. The application gives Alice a lock on Carol's record for five minutes. If user Bob tries to edit Carol's record within the five minute window, he gets a message telling him to wait for 3:42 while Alice finishes her edit. When Alice is finished, the lock is released and if she doesn't finish in five minutes, the lock is released anyway and her edits are lost.
You could also add the ability for the user to set the lock time, within a reasonable window, say 5-15 minutes. Also, consider adding the ability for the user to renew the lock.
BTW - Paradox is still around? I haven't used it since 1993 or so. Wow.
This is more a question of requirements than implementation. If your users want wikipedia style optimistic locking, just do that. If your users want hard locking with a timeout, do that. Just like your online bank does.
If users ask for hard locks without timeout, ask them what their real requirements are.
don't cut it off www.mgmbill.org
Locking is a solved problem in most Database Management Systems. I think you are worried about the wrong layer of your application. Web and Application code is most often agnostic to how records are retrieved, updated, and locked for concurrency. For reference, look up the ACID properties of a typical RDBMS.
Wikipedia gets by without locking because it keeps multiple versions. If you really want to do locking, just throw a column in your database called "locked_by" and lock a record with "update foo set locked_by=CURRENT_USER where foo.id=WHATEVER and locked_by is null" and make all your updates conditional like "update foo set bar=baz, ..., locked_by=null where foo.id=WHATEVER and (locked_by is null or locked_by = CURRENT_USER)". Databases have atomic transactions for a reason...
I have a separate locks-table with lock_id, id of the other record and timestamp. When a user brings up the record it is read only. He have to click Edit to begin editing and thus requiring a lock. This is where the lock table is checked. If there are no locks a lock entry is inserted and its lock_id is propagated to the edit form. Upon saving one checks that it is the same lock id, saves and removes lock. If a user tries to edit a locked record one can give him the option of either wait, or override if the users credentials allow it. He then get a new lock id. The other user either get an error message when saving stating that another one override his edit or some AJAX controller periodically checks the lock and takes action if it disappears. If the user just closes the edit form, forgets about it one can expire the lock by deleting old locks.
i don't argee. while he shouldn't be leading the project, he can still work on it. did you know everything about every application you've ever worked on operated??? this is how real world experience is gained, you don't start out the expert.
If you mod me down, I will become more powerful than you can imagine....
This is how we did it in an in-house AJAX app for a big corporation.
There would be a temp-table for every editable data-table in the db, that has the same structure plus a ID_User field. When a users starts editing a data-set, the data is copied into that temp-table. Other users trying to edit that data will get a view-only version of the page and info on who is editing the data. On "submit" the data is copied back to the "real" table, the lock thereby released.
Should a user decide to abandon his session, crash his browser or anything like that, he will have his unsaved edits back on next login. He will even be redirected to the edit page automatically after login, if there are pending edits.
A user can "steal" his own lock (actually session, that is) when he tries to log in with a fresh session (e.g. from another workstation at a colleague's). The old session will bail out automatically saying "session hijacked" (on any next ajax action) and the fresh session will get loaded with the old one's data.
Other users can only steal locks after a timeout (to avoid problems with people going on vacation with locked data on them)
----
This eactually worked pretty well from a user's perspective, because as opposed to with optimistic locking there would never be a situation where a user would have done some edits in vain or have extra work rechecking it.
From a developer's perspective it was a pain in the ass (maintain 2 almost identical tables, maintain checkout/checkin-code everytime).
From support's point of view, it wasn't too bad. Seldomly people would call and demand release of a lock because someone just went to have coffee and he had to edit the data real quick or managment would kill him on the spot because the data would be frozen in like 7 minutes to generate reports.
OK. Here's what to do.