Data Locking In a Web Application?

← Back to Stories (view on slashdot.org)

Data Locking In a Web Application?

Posted by timothy on Thursday September 24, 2009 @02:46PM from the get-offa-my-cloud dept.

An anonymous reader writes "We recently developed a multi-user application and deployed it to our users. This is a web-based application that used to be a Windows application which was written in Delphi using Paradox databases for the client database. In the Windows application, we used the ability in Paradox to lock records which would prevent users from editing the same data. However, in the web application we did not add in a locking facility for the data due to its disconnected nature (at least that's how I was shot down). Now our users are asking to have the locking back, as they are stepping on each others' edits from time to time. I have been assigned to look at best practices for web application locking of data, and figured I would post the question here to see what others have done or to get some pointers to locations for best practices on doing locking with in a web application. I have an idea of how to do this, but don't want to taint the responses so I'll leave it off for the time being."

19 of 283 comments (clear)

Duct Tape by Anonymous Coward · 2009-09-24 14:49 · Score: 5, Funny

Lots and lots of Duct Tape.
Same as bugzilla? by Derek+Pomery · 2009-09-24 14:51 · Score: 5, Informative

Same as bugzilla does. Just use a timestamp or counter on the records so you can tell when an edit occurred while you were editing
Then you can review the edit.
If you want, you can use XHR (maybe with a slow load response for performance depending on the number of users) to notify that an edit happened.

--
-- perl -e'print pack"H*","6e656d6f406d38792e6f7267"' /. ate my old sig. Bastards.
1. Re:Same as bugzilla? by Anonymous Coward · 2009-09-24 14:58 · Score: 4, Informative
  
  Exactly, this is how I have done it in every web application I have developed. If someone updates the data while someone else is editing then they will get a message saying someone did an edit. They then get a chance to review the new data and modify their edit if needed.
  NOTE: It is critical that the user not lose their edit. Save that data even if you don't actually do the update. There is nothing more annoying than spending 15 minutes carefully putting in a bunch of data just to have it lost due to someone else editing the same record. Let the user review what happened and then modify (or not) their own data they were putting in.
2. Re:Same as bugzilla? by Zarf · 2009-09-24 15:20 · Score: 4, Informative
  
  For the record this is called: http://en.wikipedia.org/wiki/Optimistic_locking
  
  --
  [signature]
3. Re:Same as bugzilla? by Dexx · 2009-09-24 16:01 · Score: 5, Insightful
  
  To ensure the edit isn't lost, we handle this by kicking the user back to the form with a message. You could go one step further and get the modified record from the DB, then highlight the field in question and give the user the option to keep/override. You could make it more intelligent by detecting the collision, analysing the difference, then committing if no fields conflict. Depends on the business logic, I guess.
  
  --
  Feel the fear and do it anyway.
4. Re:Same as bugzilla? by nahdude812 · 2009-09-24 23:59 · Score: 5, Informative
  
  When a client wanted to know while they were working on a record that someone else had it open (they truly wanted the record locked while one user had it up on the screen), we used a LOCKED_BY and LOCKED_UNTIL field on each relevant record. While editing, records are read-only if LOCKED_UNTIL is in the future and LOCKED_BY is not the current user.
  On the edit page, an AJAX call is made on a 10 second interval which updates LOCKED_UNTIL to be +30 seconds (this way even if there are network issues of some sort, three consecutive status updates need to fail in a row). If the browser is closed or the computer blue screens, etc, after 30 seconds the record unlocks itself. When you save the record, LOCKED_BY is nulled, and LOCKED_UNTIL is set to the epoch.
  We also employed a version ID so that if all else fails and your client for some reason stops keeping the record locked (eg you suspend your laptop and come back to it later), when you submit your edits; if anyone else had made edits while your client was unable to keep the record locked, you're still given an indication that another user updated the record. The interval update checks the version ID too (a single SQL statement with PostgreSQL's excellent UPDATE RETURNING syntax) and warns the client if somehow someone else updated the version without this client having been able to maintain the lock - as soon as the next update interval succeeds the user gets notice.
  The ajax call was basically something like:
  UPDATE tbl_something
  SET locked_by = (current_user), locked_until = (time+30)
  WHERE record_id = (record_id)
  AND locked_by = (current_user)
  RETURNING
  locked_by, version_id
  Double check that locked_by is still the current user and version_id is still the known version of this record.
  
  --
  Slay a dragon... over lunch!
Re:The euphemism treadmill by Anonymous Coward · 2009-09-24 14:54 · Score: 4, Funny

So that's what the song "Tainted Love" is really about! Who knew.
The way this is generally handled... by Omnifarious · 2009-09-24 14:55 · Score: 4, Informative

You make sure that edits are handled in a form on a web page with a submit button. The user gets to fiddle all the bits they want on the web page, then they hit the submit button. At that point the web app goes and locks the stuff it needs to do to update the database to reflect the user's changes. It then applies those changes, then commits them, thereby releasing all the locks.
If two users might potentially be editing the same records, keep an SHA-256 hash of the original data around as a hidden form field. Then when the update proceeds, check the data to make sure the SHA-256 hash matches the data you fetched when you generated the form page (helpfully put into a hidden form field). If the hash doesn't match, tell the person who did the submit that some fields may have changed and somehow present them with what those changes might be.

--
Need a Python, C++, Unix, Linux develop
1. Re:The way this is generally handled... by palegray.net · 2009-09-24 15:01 · Score: 5, Interesting
  
  My company has an internal app that approaches locking in a different manner. When you start updating a record, it uses an AJAX routine to set a lock on the record being updated. As long as you're still on that page, you "have the lock" and other users are notified of this if they attempt to edit the record. Once your changes are submitted, the lock is released automatically. It's possible to "steal" a lock in our model; this may not work for everyone. If you didn't want to allow this, you could incorporate a timeout for locks, whereby the original user would be notified that the lock had expired due to inactivity.
  
  --
  512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
Optimistic concurrency by Shimmer · 2009-09-24 15:12 · Score: 5, Informative
Slashdot is hardly the right venue to get a good answer to this question (how the hell did it end up in the Hardware category?), but I've dealt with this a zillion times, so I'll give a pointer to what is very likely the correct answer: optimistic locking.
Hard locks are probably not what you want in a stateless web app. (E.g. What happens if someone locks a record and then is hit by a bus?) Instead, here's how it works:
1. User X fetches version 1 of Record A.
2. User Y fetches version 1 of Record A.
3. User X modifies her copy of Record A and attempts to save the change.
4. System checks whether incoming version (1) matches database version (1). It does, so the save proceeds and the version number on the record is updated to 2.
5. User Y modifies his copy of Record A and attempts to save the change.
6. System checks whether incoming version (1) matches database version (2). It does not, so User Y is notified that he cannot save his changes.
7. User Y fetches version 2 of Record A and tries again.
This is also known in the vernacular as "second save loses". It may sound too harsh, but it is much better than "first save loses and user isn't notified", which is what you get if you have no currency checking at all. And it's also much more web friendly that your old desktop app (which uses an approach that is technically called "pessimistic locking").
--
The most rabid believers in American Exceptionalism are the exact same people whose policies are destroying it.
1. Re:Optimistic concurrency by gr7 · 2009-09-24 15:26 · Score: 4, Informative
  
  What shimmer says is exactly what you should do with 2 possible additions. Often people leave themselves in a web page for an hour and then start to make edits. So when the user makes the first edit, use ajax to see if there was already an edit done in the meantime so they know before they make lots of changes.
  Also you should consider using sequences instead of checking if the data changed. Both are good ideas in certain situations. For example with a table that is only edited once every few months, I use a sequence on the whole table. For a table that is changed 100 times per day by 3 different users, either do row based sequences or check to see if the 'from' part of the changes match the database.
2. Re:Optimistic concurrency by hibiki_r · 2009-09-24 15:47 · Score: 4, Insightful
  
  +5 Is not enough for the value of the parent post. Optimistic Locking is the right answer in 99% of the cases. The issue then becomes how you want to deal with re-submitted of changes. If the entities to be saved are small and very atomic, asking the user to retype, making sure their changes are still sensible on the modified record makes sense. If your records are very large and/or very complex, then you might consider using some business knowledge to see if changes to the record can be grouped logically, and maybe even committed individually: If someone changed data for X shipment of a purchase order, while someone else changed Y, then the changes don't really have to conflict.
  But whatever you do, build it around optimistic locking: Don't try to lock a record because somebody just has it open somewhere on a remote location. That path leads to madness.
3. Re:Optimistic concurrency by argodk · 2009-09-24 16:29 · Score: 4, Informative
  
  Absolutely correct, but that just means that there has to be server-side locks for the commitment phase (4-6), it doesn't impact the client-side. This has an implication for performance of the commitment phase, but luckily, database vendors have been struggling with efficient implementation of commit for years, so using the transaction features of whatever database is used for storage should resolve most of those problems (i.e. check and update the version number in the database in a single transaction).
4. Re:Optimistic concurrency by cerberusss · 2009-09-24 19:42 · Score: 5, Funny
  
  Hard locks are probably not what you want in a stateless web app. (E.g. What happens if someone locks a record and then is hit by a bus?)
  There's a Firefox extension for that.
  In our company, users' pulses are tethered to the USB bus. The Firefox extension can then use this information. People spend hours in our time accounting system, which has a pessimistic locking scheme. The Firefox extension sends an 'unlock' when the user's pulse stops for whatever reason. We've had buses driving users over, we've had rabid squirrels, a janitor going postal, exploding Sony laptops and a manager doing the 'Godfather-baseball-bat-routine' on an unsuspecting employee. Our time accounting system runs great, we've never had a stray lock.
  
  --
  8 of 13 people found this answer helpful. Did you?
Confluence by goofy183 · 2009-09-24 15:28 · Score: 4, Informative

Look at Confluence by Atlassian. When you edit a page they track the edit action. When another user goes to edit the page they are warned that "John Doe is currently editing this page, last edit at date/time". They also do polling via AJAX so if you're working on a page and another user starts actually editing it you see a message on the page "Jane Doe started editing this page". They also save page drafts scoped to the user to help people resolve edit conflicts. It seems to balance things well with not explicitly forcing locks but actively letting users know when they are heading for a conflict.
CouchDB by deweller · 2009-09-24 15:35 · Score: 4, Informative

Check out CouchDB. It is built around the concepts of distributed (and even offline) databases and handles conflict resolution. It employs optimistic locking.
Use Optimistic Locking by linuxhansl · 2009-09-24 15:36 · Score: 4, Informative

Don't take out a database lock (also referred to as pessimistic lock sometimes). Web transactions tend to be long lived and there's typically no easy way to know when the user just abandoned the edit (and hence you would not really know when it save to release the lock, unless it is by timeout or explicit release by the user).
Instead do optimistic locking... Assume there are no conflicting edits (or that they are at least rare). Then version each row (with a monotonically increasing number for example). At the beginning of the transaction also retrieve the version, and upon save verify that the version did not change - if it has changed there was a conflicting edit in the meanwhile and the current save should be prevented (you could then get fancy and retrieve the current version of the row from the database and show it to the user, etc).
One can actually show that if the rate of collisions is low optimistic locking even performs better, whereas in scenarios where the contention is high (a significant fraction of transaction result in a conflict) pessimistic database locks performs better.
The Only Choice is... by FlyingGuy · 2009-09-24 16:15 · Score: 4, Informative

Optimistic Concurrency
Both the curse and the blessing of web applications. Most of the work is offloaded to the browser, thus not bogging down the database servers with keeping a ton of row level locks in memory, or even worse, page level locks.
For the programmers POV you use some back end language, php, java, ruby, python, it matters not, write a program, it launchs, connects to a database, ( no matter how much middle-ware you slap in ) sends it a query, gets the data, returns it for presentation, consideration and subsequent modification ( or not! ) by the user and then the program ends. You are no longer connected to the database, heck your browser is no longer connected to the server!
Some have mentioned AJAX <sigh...> AJAX is nothing but bundling together a few different bits of tech to do ONE thing, make a call to the server without refreshing the page. No matter how you slice it and dice it, thats all it does, it makes a call through the web server, to launch a program written in one of the afore mentioned languages and it follows the same set of steps, through either the post method or the get method and nothing has changed!
So you need a scheme to know if you can write to a record without overwriting someone else changes.
The only real choice is to use a timestamp value, all databases support them, usually down to the millisecond of accuracy. It is a simple process which you can make more complicated as you desire. As many have mentioned, you read the record making sure you get the timestamp of the last update. That timestamp gets sent to the browser along with the data. When the user clicks save the stored procedure that does the actual update then compares the timestamp you are sending with the one on the current record as in "select for update ...." and if the one you are sending along does not match the one on the current record, then your update loses and the stored procedure reports that back and then you deal with the user feedback in any way you see fit. Typically this is done by sending back the record in is new state and telling the user, "sorry, but you have to star over.".
Now having said that there is nothing to say that you cannot be imaginative with a bit of javascript or something like that, or even with the php array_diff() function or an equivalent in some other language then insert some fields above or below the the data that was previously changed to at least have the conflicting data shown in both forms eg: what it is NOW and what they wanted it to BE.

--
Hey KID! Yeah you, get the fuck off my lawn!
This is user requirements, not implementation by viking80 · 2009-09-24 16:36 · Score: 4, Insightful

This is more a question of requirements than implementation. If your users want wikipedia style optimistic locking, just do that. If your users want hard locking with a timeout, do that. Just like your online bank does.
If users ask for hard locks without timeout, ask them what their real requirements are.

--
don't cut it off www.mgmbill.org