Tuesday, February 14, 2012

Concurrency Control

Hi! I'm building a web application with ASP.NET, and using MS SQL 2000 for my
database server.
What should I do to guarantee the integrity of the data in spite of concurrent
access? Meaning... how can I make sure that more than one user can update a
table at the same time without any errors occurring? Do I need to add some code
to my aspx file? Or do I need to do something to my database? Or do I not have
to worry about it?
Thank you.
The easiest way is to use a rowversion (timestamp) type column. Look it up in
Books Online for most of the details, but simply put, it is an automatically
maintained value that will change every time a row in your database changes.
Then, as part of every update and delete of rows you have cached, you can
compare these values in the WHERE clause, and if they don't match you know
someone else has modified the row.
Then you just fetch the row back to the client and let them know what
changed.
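A minimal sketch of what this looks like in T-SQL (the table, column, and
parameter names here are hypothetical, and the RAISERROR is just one way to
surface the conflict):

CREATE TABLE dbo.Customers
(
    CustomerID int NOT NULL PRIMARY KEY,
    Name       varchar(100) NOT NULL,
    RowVer     timestamp NOT NULL   -- maintained automatically by SQL Server
)

-- Optimistic update: only succeeds if the row has not changed since it was read.
UPDATE dbo.Customers
SET Name = @NewName
WHERE CustomerID = @CustomerID
  AND RowVer = @OriginalRowVer      -- value captured when the row was fetched

IF @@ROWCOUNT = 0
    RAISERROR('The row was modified or deleted by another user.', 16, 1)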
----
Louis Davidson - drsql@.hotmail.com
SQL Server MVP
Compass Technology Management - www.compass.net
Pro SQL Server 2000 Database Design -
http://www.apress.com/book/bookDisplay.html?bID=266
Blog - http://spaces.msn.com/members/drsql/
Note: Please reply to the newsgroups only unless you are interested in
consulting services. All other replies may be ignored :)
"wrytat" <wrytat@.discussions.microsoft.com> wrote in message
news:12D9DA82-8D0B-451E-881F-6AAEE852DC14@.microsoft.com...
> Hi! I'm building a web application with ASP.NET, and using MS SQL 2000 for
> my
> database server.
> How should I do to guarantee the integrity of the data in spite of the
> concurrent access? Meaning... how can I make sure that more than 1 user
> can
> update 1 table at the same time, while no error will occur? Do I need to
> add
> some codes at my aspx file? Or do I need to do something to my database?
> Or
> do I not have to worry about it?
|||That all depends on what is actually needed. You have two viable choices in
my opinion:
1. Last write wins
2. First write wins
In the first scenario, the data from the person that makes the last write
call is the data that sticks. This is by far the easiest, but it has the
obvious downside of not informing the first user who just wrote their data
that their write was superseded.
In the second scenario, the data from the first person that writes to a row
sticks and an error is thrown for all other users. This is far more
complicated as you have to decide how to handle concurrency errors. Do you
let users choose whether to overwrite the data? Do you simply overwrite it?
Do you throw an error and force the user to re-enter their information? If, on
a multi-row write, only a subset of the writes fails, do you roll back all of
the writes?
As you can see, the second scenario is far more complicated to handle.
However, if you want to use the second scenario, then you need a way of
determining whether the row has changed since you last read the data. That
can be done in a couple of ways:
1. Timestamp. You add a timestamp column for every table that participates
in the First Write Wins scenario. For every write, you check whether the
existing timestamp on the record is the same as the one you had when you read
the data. If they are different, you have a concurrency error; otherwise you
write the data.
2. Other row version data type. Same solution as #1, except you use
something other than an actual Timestamp field. I have used Guids in the
past successfully.
3. You compare all fields you are trying to update with the values you had
on initial read. In order to really do this, you need to use dynamic SQL. It
is simply too cumbersome to try to handle with stored procs. The .NET
SqlDataAdapter does a passable job at this. There are also Object Relational
Mappers that will do this work for you, or you can roll your own. Beyond not
being able to use stored procs, the other downside to this approach is that the
SQL statements can get very long very fast, especially if some of the fields
are Text or Image.
4. Checksum. SQL has a CHECKSUM function that XORs the data in all the columns
to produce a quasi-unique number representing the contents of the row (see the
sketch just after this list). The catch is that SQL's checksum function does
not account for Text or Image columns.
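As a rough illustration of option 4, the checksum captured at read time can be
compared again in the WHERE clause of the update (the table, columns, and
parameter names below are hypothetical):

-- Read: capture the row together with a checksum over its comparable columns.
SELECT OrderID, Quantity, Price,
       CHECKSUM(Quantity, Price) AS RowCheck
FROM dbo.Orders
WHERE OrderID = @OrderID

-- Write: only succeeds if those columns still produce the same checksum.
UPDATE dbo.Orders
SET Quantity = @NewQuantity, Price = @NewPrice
WHERE OrderID = @OrderID
  AND CHECKSUM(Quantity, Price) = @RowCheck   -- value captured at read time

IF @@ROWCOUNT = 0
    RAISERROR('Concurrency error: the row was changed by another user.', 16, 1)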
IMO, you really need a very strong justification to go with First Write
Wins. The additional complexity added is significant and must be justified
by a high expected rate of collision and the absolute need for
user-consistency on all writes.
HTH,
Thomas
"wrytat" <wrytat@.discussions.microsoft.com> wrote in message
news:12D9DA82-8D0B-451E-881F-6AAEE852DC14@.microsoft.com...
> Hi! I'm building a web application with ASP.NET, and using MS SQL 2000 for
> my
> database server.
> How should I do to guarantee the integrity of the data in spite of the
> concurrent access? Meaning... how can I make sure that more than 1 user
> can
> update 1 table at the same time, while no error will occur? Do I need to
> add
> some codes at my aspx file? Or do I need to do something to my database?
> Or
> do I not have to worry about it?
|||In ADO.NET, the concept of pessimistic locking is a bit difficult.
Basically, ADO.NET only directly supports optimistic locking. You'll need to
read up on that in a good ADO.NET book.
SQL Server has the concept of isolation levels, which you will likely want to
read up on. You can modify the isolation level from ADO.NET via the
SqlConnection object.
Not to be taken lightly: high concurrency yields low consistency, and high
consistency yields low concurrency.
You need to know what your application is doing and how tolerant it can be with
regard to dirty reads, etc.
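As a rough sketch of the SQL side (in ADO.NET the equivalent is passing an
IsolationLevel value to SqlConnection.BeginTransaction; the table and parameter
names here are hypothetical):

-- Hold read locks until the transaction ends so the data just read
-- cannot change underneath you.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRANSACTION
    SELECT Name FROM dbo.Customers WHERE CustomerID = @CustomerID
    -- ... work with the data ...
    UPDATE dbo.Customers SET Name = @NewName WHERE CustomerID = @CustomerID
COMMIT TRANSACTION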
Study up. Also, load test and performance test early and often.
Greg Jackson
PDX, Oregon
|||Thank you, all of you. I've also read this article
(http://msdn.microsoft.com/library/d...ncychecking.asp) and learned a lot from it.
Then, say, if by default I select the Optimistic Concurrency option in the
DataAdapter Configuration Wizard of Visual Studio to handle concurrency
control, do I still need to add a datetimestamp column to the database table
and compare the value of that column every time I update the table?
And must I also include the comparison of this column for the insert and delete
commands as well? I guess the answer is no?
Anyway, someone told me that SQL 2000 already provides concurrency control by
default, such that even without ADO.NET, the amount of research on concurrency
control in SQL 2000 guarantees data integrity. I don't really know how true it
is.
"pdxJaxon" wrote:

> in ado.net the concept of pessimistic locking is a bit difficult.
> basically, ado.net only directly supports optimistic locking.
> you'll need to read up on that in a good ado.net book.
> sql server has concept of isolation level that you will likely want to rea
d
> up on. You can modify the Isolation level via ADO.NET via the sqlConnectio
n
> object.
> Not to be taken lightly as High concurrency will yield low Consistency and
> High consistency will yield low concurrency.
> need to know what your application is doing and how tolerant it can be in
> regards to dirty reads, etc.
> study up. Also Load Test and Performance test Early and Often.
>
> Greg Jackson
> PDX, Oregon
>
|||> Do I still need to add a datetimestamp column to the database table
> and compare the value of that column every time I update the table?
No, ADO.NET caches the original values of *all* columns and compares the values
against *all* of them. As far as I can understand, you can customize this
behavior.
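For illustration, the wizard-generated update for a hypothetical two-column
table looks roughly like this (the real generated command is more verbose and
also handles NULL comparisons):

UPDATE dbo.Customers
SET Name = @Name, City = @City
WHERE CustomerID = @Original_CustomerID
  AND Name = @Original_Name   -- each column is compared with the value
  AND City = @Original_City   -- originally read into the DataSet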

> And must I also include the comparison of this column for the insert and
> delete commands as well? I guess the answer is no?
No.
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"wrytat" <wrytat@.discussions.microsoft.com> wrote in message
news:00E5B1E6-97E5-4997-8F7A-BF17E1641332@.microsoft.com...
> Thank you all of you. I've also read this article
> (http://msdn.microsoft.com/library/d...kin
g.asp)
> and learned a lot from it.
> Then say if by default, I select the Optimistic Concurrency option in the
> DataAdapter Configuration Wizard of the Visual Studio to handle Concurrenc
y
> Control. Do I still need to add a datetimestamp column to the database tab
le
> and compare the value of that column every time I update the table?
> And must I also include the comparison of this columns for insert and dele
te
> command as well? I guess the answer is no?
> Anyway, someone told me that using SQL 2000 already provides the concurren
cy
> control by default, such that even without ADO.NET, the amount of research
on
> concurrency control in SQL 2000 gurantees data integrity. I don't really k
now
> how true it is.
> "pdxJaxon" wrote:
|||On Wed, 6 Apr 2005 20:23:15 -0700, Thomas wrote:
(snip)
>2. Other row version data type. Same solution as #1, except you use
>something other than an actual Timestamp field. I have used Guids in the
>past successfully.
Hi Thomas,
Why on earth would you do that?
A timestamp (aka rowversion) column takes 8 bytes, a uniqueidentifier
takes 16 bytes.
A timestamp column is changed automatically each time a row is inserted or
updated; a uniqueidentifier column is changed only when you explicitly assign
it a new value, so if you forget to, you'll miss the change.
The timestamp datatype is designed explicitly for this goal; the
uniqueidentifier datatype is designed to be used as a surrogate key.
What is the advantage of using uniqueidentifier for concurrency control?
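To make the difference concrete, here is a small sketch (hypothetical table)
showing that the timestamp value changes on its own, while the uniqueidentifier
keeps its old value unless you assign a new one:

CREATE TABLE dbo.Demo
(
    ID      int NOT NULL PRIMARY KEY,
    Name    varchar(50) NOT NULL,
    RowVer  timestamp NOT NULL,
    RowGuid uniqueidentifier NOT NULL DEFAULT NEWID()
)

INSERT INTO dbo.Demo (ID, Name) VALUES (1, 'before')
UPDATE dbo.Demo SET Name = 'after' WHERE ID = 1

-- RowVer now holds a new value; RowGuid is unchanged because nothing assigned it.
SELECT RowVer, RowGuid FROM dbo.Demo WHERE ID = 1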

>4. Checksum. SQL has a checksum function that XOR the data in all the
>columns to produce a quasi-unique number representing the contents of the
>row. The catch is that SQL's checksum function does not account for Text or
>Image columns.
And another catch is that not all changes in the data result in a
different checksum. A different checksum guarantees changes, but an
unchanged checksum doesn't guarantee that the data is unchanged.
SELECT CHECKSUM('a'), CHECKSUM ('aaaaaaaaaaaaaaaaa')

>IMO, you really need a very strong justification to go with First Write
>Wins. The additional complexity added is significant and must be justified
>by a high expected rate of collision and the absolute need for
>user-consistency on all writes.
I couldn't disagree more. The risk of updates being overwritten without
anyone ever noticing is enough justification to take whatever steps are
necessary to prevent this in any serious business.
If you don't want the extra complexity, you should consider pessimistic
locking: get an exclusive rowlock when reading the data, keep the lock
as long as the data is displayed on the user's screen and only release
the lock after either the update is made or the user has decided to quit
the screen without saving changes. This will impact concurrency, but if
you expect very few collisions, it won't be a big deal. If you expect
lots of collisions, go for optimistic locking.
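A sketch of that pessimistic approach in T-SQL, using locking hints (the table
and parameter names are hypothetical, and the transaction must stay open for as
long as the screen is displayed):

BEGIN TRANSACTION

SELECT Name
FROM dbo.Customers WITH (UPDLOCK, ROWLOCK, HOLDLOCK)
WHERE CustomerID = @CustomerID   -- other writers now block on this row

-- ... the user edits the data on screen ...

UPDATE dbo.Customers SET Name = @NewName WHERE CustomerID = @CustomerID

COMMIT TRANSACTION               -- or ROLLBACK if the user cancels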
Never use Last Write Wins in any serious multiuser business application.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||On Wed, 6 Apr 2005 23:01:01 -0700, wrytat wrote:

>And must I also include the comparison of this column for the insert and
>delete commands as well? I guess the answer is no?
Hi wrytat,
I have to disagree with Tibor's answer. You don't need it for inserts
(it's new data, so it can't be already locked), but you probably do need
it for deletes.
Example scenario: I start the application and show customer #123 on my
screen. You start the application and show the same customer. You see
that the information is outdated, find some new information in your
files and you make all changes to reflect the new status. Then you save
your data. Meanwhile, I see that this customer is outdated and I decide
that this old information might just as well be deleted. I hit delete.
The program will then happily delete the new information you just
entered for customer #123.
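A sketch of a delete that honors the version check (hypothetical names; if
someone saved new data after the row was read, the delete silently affects
nothing and the application can warn the user instead):

DELETE FROM dbo.Customers
WHERE CustomerID = @CustomerID
  AND RowVer = @OriginalRowVer   -- version value read when the screen was loaded

IF @@ROWCOUNT = 0
    RAISERROR('Customer was changed or already deleted by another user.', 16, 1)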
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||> (snip)
<snip>
> What is the advantage of using uniqueidentifier for
> concurrency control?
The primary reason is that the calling code can generate the
new Id. It doesn't require an extra round trip to get the
updated Timestamp. The second reason is that mickeysoft has
already admitted that they are changing the Timestamp column
to be more inline with ANSI standards. Translation: It's
changing and I haven't the foggiest idea how compatitbility
will work. Lastly, and this was far more true in the past,
various development technologies did not always play nice
with Timestamps.

> And another catch is that not all changes in the data
> result in a
> different checksum. A different checksum guarantees
> changes, but an
> unchanged checksum doesn't guarantee that the data is
> unchanged.
> SELECT CHECKSUM('a'), CHECKSUM ('aaaaaaaaaaaaaaaaa')
Yep. It's far more problematic. You can of course roll your
own checksum to get around these problems, but it is an
option, depending on how accurate the concurrency must be.

> I couldn't disagree more. The risk of updates being
> overwritten without
> anyone ever noticing is enough justification to take
> whatever steps are
> necessary to prevent this in any serious business.
It's a cost-benefit analysis. Does the substantial
additional development time for all development projects
justify the benefit of handling collisions? If there are few
collisions and the data involved is not that critical, then
it is not justified IMO. Finding other means of preventing
the collisions from happening in the first place is a far
better choice, if available.
Don't overlook the cost of handling first-write-wins
concurrency. Every system that touches the database has to
know how to deal with it. Every system has to potentially
present users with options about what to do in the case of a
collision. Situations like multiple row deletes (e.g. Delete
From Table Where ForeignKey = Value and thus multiple rows
would be deleted) are much more difficult if not impossible
because someone could have changed a row during the middle
of a delete. Isolation levels become much more of an issue
as some writes must be serialized, which adversely affects
scalability. The list goes on. It basically becomes a
problem that has to be dealt with at all tiers of the
application.
It is a massive cost that, IMO, must be justified. Sometimes
it is. Most times it isn't.

> If you don't want the extra complexity, you should
> consider pessimistic
> locking: get an exclusive rowlock when reading the data,
> keep the lock
> as long as the data is displayed on the users' screen and
> only release
> the lock after either the update is made or the user has
> decided to quit
> the screen without saving changes. This will impact
> concurrency, but if
> you expect very few collisions, it won't be a big deal. If
> you expect
> lots of collisions, go for optimistic locking.
Pessimistic locking is even worse and should require an even
*more* extensive justification. There are places where it
makes sense, but they are extraordinarily rare IMO (SCMs are
a good example). Its biggest killer is scalability. Someone
locks a record and walks away from their desk. Now you have
to code timeouts and leases. It requires much more work and
hurts scalability.

> Never use Last Write Wins in any serious multiuser
> business application.
Nonsense. I have seen, and built myself, numerous
successful, large multiuser systems built with
Last-Write -Wins. They were designed such that collisions
could not happen or if, in the oft-chance they did, the
impact was minimal. The benefit of significantly shorter
development time more than justified the cost of an
occasional oddity due to collision.
Claiming that Last-Write-Wins should never be used is
simply ignorant. It shows a blindness to other possibilities
that make First-Write-Wins unnecessary.
Thomas
|||On Thu, 7 Apr 2005 13:20:45 -0700, Thomas wrote:
(snip major part of the post)
Hi Thomas,
Thanks for your post! I won't go into a debate. I think it's good that
both cases have now been defended. The OP can read both our posts,
weigh the arguments and counter-arguments, decide how they apply in his
situation and then decide what works best for him. That's what I tried
to accomplish with my post.

>Pessimistic locking is even worse and should require an even
>*more* extensive justification. There are places where it
>makes sense, but they are extraordinarily rare IMO (SCMs are
>a good example). Its biggest killer is scalability. Someone
>locks a records and walks away from their desk. Now you have
>to code timeouts and leases. It requires much more work and
>hurts scalability.
I agree that pessimistic locking is worse than optimistic locking. I
would only consider using it for tables with extremely low update
frequency.
But it won't come as a surprise that I still think that pessimistic
locking is better (or rather: less bad) than Last-Write-Wins :-P
(snip)
>Claiming that Last-Writes-Wins should never be used is
>simply ignorant. It shows a blindness to other possibilities
>that make First-Write-Wins unnecessary.
Okay, I admit that this statement was too bold. I probably should have
written something along the lines of:
IMO, you really need a very strong justification to go with Last Write
Wins. The risk of losing relevant changes without any warning *and* the
consequences of such data loss should be weighed against the benefit of
reduced complexity and development costs. Because of the possible impact
of unnoticed data loss, this should ideally be a management decision.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
