What is a “distributed transaction”?

Question

The Wikipedia article for Distributed transaction isn't very helpful.

Can you give a high-level description of what a distributed transaction is?

Also, can you give an example of why an application or database should perform a transaction that updates data on two or more networked computers? I understand the classic bank example; I care more about distributed transactions in Web-scale databases like Dynamo, Bigtable, HBase, or Cassandra.

@Matt Ball: Yessir! This question is about distributed transactions. — Zombie, Nov 18 '10 at 16:49

Aaron McIver · Answer 1 · 2010-11-18 16:44:49Z

Distributed transactions span multiple physical systems, whereas standard transactions do not. The systems therefore have to synchronize with each other, a requirement that does not arise in a standard transaction.

From your Wikipedia reference...

...a distributed transaction can be seen as a database transaction that must be synchronized (or provide ACID properties) among multiple participating databases which are distributed among different physical locations...

+1 for the quote. To me ACID is quite a definition by itself. — Dunaril, Feb 23 '11 at 14:54

Heinzi · Answer 2 · 2010-11-18 16:46:41Z

Usually, transactions occur on one database server:

BEGIN TRANSACTION;
SELECT something FROM myTable;
UPDATE myTable SET something = 'new value';
COMMIT;

A distributed transaction involves multiple servers:

BEGIN TRANSACTION;
UPDATE bankAccounts SET amount = amount - 100 WHERE accountNr = 1;
UPDATE someRemoteDatabaseAtSomeOtherBank.bankAccounts SET amount = amount + 100 WHERE accountNr = 2;
COMMIT;

The difficulty comes from the fact that the servers must communicate to ensure that transactional properties such as atomicity are satisfied on both servers: If the transaction succeeds, the values must be updated on both servers. If the transaction fails, the transaction must be rolled back on both servers. It must never happen that the values are updated on one server but not on the other.
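One common way to get this all-or-nothing behaviour is a two-phase commit: a coordinator first asks every server whether it can commit, and only tells them to commit if all of them say yes. The following is a minimal in-memory sketch of that idea in Python, not a real database protocol. The `Participant` class, the `transfer` function, and the `can_commit` flag are hypothetical names made up for illustration, and the sketch ignores crashes, timeouts, and durable logging, which are the hard parts in practice.

```python
class Participant:
    """Hypothetical stand-in for one database server in the transaction."""

    def __init__(self, name, balance, can_commit=True):
        self.name = name
        self.balance = balance        # committed value
        self.pending = None           # value staged during the prepare phase
        self.can_commit = can_commit  # False simulates a server that votes "no"

    def prepare(self, delta):
        # Phase 1: stage the change and vote on whether it can be committed.
        if not self.can_commit or self.balance + delta < 0:
            return False
        self.pending = self.balance + delta
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged value the committed value.
        self.balance = self.pending
        self.pending = None

    def rollback(self):
        # Phase 2 (someone voted no): discard the staged value.
        self.pending = None


def transfer(source, target, amount):
    """Coordinator: apply the transfer on both participants or on neither."""
    work = [(source, -amount), (target, +amount)]
    votes = [participant.prepare(delta) for participant, delta in work]
    if all(votes):
        for participant, _ in work:
            participant.commit()
        return True
    for participant, _ in work:
        participant.rollback()
    return False
```

Calling `transfer(a, b, 100)` on two healthy participants changes both balances. If either one votes "no" in the prepare phase, neither balance changes.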

icyrock.com · Answer 3 · 2010-11-18 16:48:09Z

Check this article:

http://starlet.deltatelecom.ru/rdb$doc/oraclerdb/distrans7/twopc_pro_2pcintro.html

It's for Java and for databases, but should be a good resource.

Jerry Coffin · Answer 4 · 2010-11-18 16:48:13Z

A distributed transaction is a transaction on a distributed database (i.e., one where the data is stored on a number of physically separate systems). It's noteworthy because there's a fair amount of complexity involved (especially in the communications) to assure that all the machines remain in agreement, so either the whole transaction succeeds, or else it appears that nothing happened at all.

Klaus Byskov Pedersen · Answer 5 · 2010-11-18 16:46:20Z

A distributed transaction is a transaction that works across several computers. Say you start a transaction in some method in a program on computer A. You then make some changes to data in the method on computer A, and afterwards the method calls a web service on computer B. The web service method on computer B fails and rolls the transaction back. Since the transaction is distributed, this means that any changes made on computer A also need to be rolled back. The combination of the Distributed Transaction Coordinator on Windows and the .NET Framework facilitates this functionality.
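To make the computer A / computer B scenario concrete, here is a rough Python sketch that does by hand what a transaction coordinator would otherwise do for you. It is only an illustration under stated assumptions: computer A is an in-memory SQLite database, the web service on computer B is a plain Python callable, and the `orders` table, `place_order`, and `RemoteServiceError` are hypothetical names. It is not how the Windows coordinator or .NET implement this.

```python
import sqlite3


class RemoteServiceError(Exception):
    """Raised by the hypothetical service on computer B when its step fails."""


def open_computer_a():
    # isolation_level=None leaves transaction control to the explicit
    # BEGIN / COMMIT / ROLLBACK statements issued below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 'new')")
    return conn


def place_order(conn_a, call_computer_b):
    conn_a.execute("BEGIN")
    # Change data locally on computer A.
    conn_a.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
    try:
        call_computer_b()  # stands in for the web service call to computer B
    except RemoteServiceError:
        conn_a.execute("ROLLBACK")  # B failed, so undo A's change as well
        return False
    conn_a.execute("COMMIT")
    return True


def failing_service():
    # Simulates computer B failing and rolling back its part of the work.
    raise RemoteServiceError("computer B rolled back")


def status(conn):
    return conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
```

With `failing_service` as the remote call, `place_order` returns `False` and the order on computer A keeps its original status. Note the gap this simple version leaves open: if B succeeds but A's own commit then fails, the two machines disagree, which is the case a real coordinator exists to handle.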

Kumar's Blog

Monday, November 25, 2013
