Tuesday 8 July 2014

Transaction Management and DAL errors

Introduction
This newsletter is intended to make sure you are aware of good transaction management related to error handling.


Situation A
Suppose a database transaction is being done on the item master table tcibd001. For all items of type ‘Purchased’ the field tcibd001.fldx is changed to ‘value_x’. There are 50 items of type ‘Purchased’ in table tcibd001. What is the best way to update these 50 records?

Solution 1 (not so good)
The following code is written to do the update as mentioned in the situation A description:
long ret.val
db.retry.point()
select tcibd001.*
from tcibd001 for update
where tcibd001._index6 = {tckitm.purchase}
selectdo
ret.val = dal.change.object("tcibd001")
dal.set.field("tcibd001.fldx", value_x)
 ret.val = dal.save.object("tcibd001")
if ret.val <> 0 then
     print.error.to.report(...)
endif
endselect
commit.transaction()

As you see, the return value ret.val is used to print an error to the report. What happens with the records for which the dal.save.object(…) hook returns a DALHOOKERROR? Obviously, these items will be printed to the report. But, the update to the database may still have been performed! For example if the DALHOOKERROR was returned from the after.save.object() hook. Also updates done to dependent objects may have been performed!

Important Note: a DALHOOKERROR only means a signal that something is not correct. But, it does not mean the insert/update/delete action on the table or dependent tables has been skipped.

Solution 2 (updating a small range)
There is only one way to skip the update of a record to the database: use abort.transaction(). The following code is written to do the update as mentioned in the situation A description:
long ret.val
db.retry.point()
select tcibd001.*
from tcibd001 for update
where tcibd001._index6 = {tckitm.purchase}
order by tcibd001._index6 with retry
selectdo
ret.val = dal.change.object("tcibd001")
if ret.val = 0 then
 dal.set.field("tcibd001.fldx", value_x)
 ret.val = dal.save.object("tcibd001")
endif
if ret.val <> 0 then
     abort.transaction()
     print.error.to.report(...)
else
     commit.transaction()
endif
endselect

This solution is much better than the previous one. All return values are checked and the records are only saved if no errors occur. The ‘order by with retry’ is added to the select statement to make sure that successfully updated records are not updated again in case of a db.retry.

Important Note: Make logical database transactions. A logical transaction is a sequence of related database actions that are treated as a unit. The database actions that make up a logical transaction are processed in their entirety, or not at all. A logical transaction is a transaction that mostly applies to one parent record and its dependent objects. This means that the complete transaction must be committed or, in case of errors, aborted. Neither a commit nor an abort will lead to inconsistent data in the database.




Is there any disadvantage of this solution? Unfortunately, yes. For 50 records the solution is ok. But what about the following situation?

Situation B
Suppose a database transaction is being done on the item master table tcibd001. For all items of type ‘Purchased’ field tcibd001.fldx is changed to ‘value_x’. There are one million items of type ‘Purchased’ in table tcibd001. What is the best way to update all these records?

Solution 3 (updating a large range)
The code from solution 2 would result into one million separate transactions. This will lead very likely to performance issues. In quite some cases, the usage of a so-called counted commit is used to reduce the number of transactions. The code is then mostly as follows:
#define COMMIT.SIZE 1000
long ret.val
long commit.counter
db.retry.point()
commit.counter = 0
select tcibd001.* 
from tcibd001 for update 
where tcibd001._index6 = {tckitm.purchase}
order by tcibd001._index6 with retry 
selectdo
ret.val = dal.change.object("tcibd001")
if ret.val = 0 then
     dal.set.field("tcibd001.fldx", value_x)
     ret.val = dal.save.object("tcibd001")
endif
if ret.val <> 0 then
     print.error.to.report(...)
endif
if commit.counter < COMMIT.SIZE then
     commit.counter = commit.counter + 1
else
     commit.transaction()
     commit.counter = 0
endif
selecteos
if commit.counter > 0 then
     commit.transaction()
endif
endselect

In this solution the performance has obviously been improved. The commits are only done after 1000 records have been updated. Unfortunately, we are again saving the records for which a DALHOOKERROR occurred. And it is quite difficult to solve this.

Important Note: It is not easy to perform proper error handling if you do not use logical database transactions because of performance or any other reasons.

Some possible solutions if record 1234 gives an error:
1. Save the key values of this record. Do an abort and perform the transaction again for records 1001- 1233. Skip record 1234 and continue.
2. Set the commit.counter to 1 and do an abort. Restart from record 1001. A commit per record is now performed. As soon as a record (very likely record 1234) gives an error, an abort is done. After that set the commit.counter to COMMIT.SIZE and continue with record 1235.

It is clear that each solution requires quite some (technical) code. Moreover, we do not have a good method to abort a transaction and force the system to re-start from the last retry point. With the current tool set it is recommended to use a counted commit only if the transactions will very likely not result in a DALHOOKERROR.


Conclusion
After reading this newsletter you understand that:
1. A DALHOOKERROR only means a signal that something is not correct. But, it does not mean the insert/update/delete action on the table or dependent tables has been skipped.
2. Database transactions should be logically organized.
3. It is not easy to implement error handling properly if you do not use logical database transactions. 

No comments:

Post a Comment