Skip to content

New two phase transaction in thin mode has different default timeout than thick mode #376

Open
@CaselIT

Description

@CaselIT

Hi,

With the 2.3 release I've added support for oracledb two phase commit in sqlalchemy. sqlalchemy/sqlalchemy@a9c0487
In the ci we noticed that a test was unreliable and decided to skip it in the ci. The test is this one https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/test/engine/test_transaction.py#L476-L527 when commit=True. The same code using only oracledb is reported below.

Initially the idea was to ask guidance since it seemed there was some kind of timeout, but trying the standalone example with different timeout and with and without the oracle client, seem to have surfaced a bug in the thin mode implementation.

  1. What versions are you using?

2.3.0

  1. Is it an error or a hang or a crash?

error

  1. What error(s) or behavior you are seeing?
oracledb.exceptions.DatabaseError: ORA-24756: transaction with identifier gti: 0x000c371f.666F6F bid: 0x626172
 does not exist
  1. Does your application call init_oracle_client()?

no, calling it fixes it

  1. Include a runnable Python script that shows the problem.
import time
import oracledb
import random
import sys
import platform

print("platform.platform:", platform.platform())
print("sys.maxsize > 2**32:", sys.maxsize > 2**32)
print("platform.python_version:", platform.python_version())
print("oracledb.__version__:", oracledb.__version__)

# oracledb.init_oracle_client() # calling this fixes it
url = dict(user="scott", password="tiger", host="127.0.0.1", port=1521, service_name="FREEPDB1")

def setup():
    with oracledb.connect(**url) as connection, connection.cursor() as c:
        c.execute("CREATE TABLE tbl(id integer, name varchar2(42))")
        connection.commit()

def teardown():
    with oracledb.connect(**url) as connection, connection.cursor() as c:
        c.execute("drop table tbl")
        connection.commit()
        for x in connection.tpc_recover():
            try:
                connection.tpc_forget(x)
            except:
                pass

def using_oracledb():
    connection = oracledb.connect(**url)
    xid_tuple = (base + i, b"foo", b"bar")
    xid = connection.xid(*xid_tuple)
    connection.tpc_begin(xid)
    with connection.cursor() as c:
        c.execute(
            "insert into tbl(id, name) values (:id, :name)",
            dict(id=1, name="user1"),
        )
    assert connection.tpc_prepare()
    connection.close()

    with oracledb.connect(**url) as connection2, connection2.cursor() as c:
        assert c.execute("select id from tbl").fetchall() == []

    with oracledb.connect(**url) as connection3:
        recoverables = connection3.tpc_recover()
        assert xid in recoverables
        # time.sleep(2) # adding a timeout makes it error in very few tries (1 or 2)
        connection3.tpc_commit(xid)

        with connection3.cursor() as c:
            assert c.execute("select id from tbl").fetchall() == [(1,)]
        print("ok", xid)

    with oracledb.connect(**url) as connection4, connection4.cursor() as c:
        c.execute("delete from tbl")
        connection4.commit()

# teardown()
setup()
try:
    base = random.randint(0, 1_000_000)
    i = 0
    while True:
        print("try", i + 1, '\n')
        using_oracledb()
        i += 1
        if i % 20 == 0 and input(f"Run {i}. Exit? (y)") == "y":
            break
finally:
    teardown()

The above script errors usually after 10 to 30 runs:

platform.platform: Windows-11-10.0.22631-SP0
sys.maxsize > 2**32: True
platform.python_version: 3.12.4
oracledb.__version__: 2.3.0
try 1

ok Xid(format_id=375793, global_transaction_id=b'foo', branch_qualifier=b'bar')
try 2

ok Xid(format_id=375794, global_transaction_id=b'foo', branch_qualifier=b'bar')
...
try 15

Traceback (most recent call last):
  File "path\to\test4.py", line 77, in <module>
    using_oracledb()
  File "path\to\test4.py", line 59, in using_oracledb
    connection3.tpc_commit(xid)
  File "path\to\oracledb\connection.py", line 1057, in tpc_commit
    self._impl.tpc_commit(xid, one_phase)
  File "src\\oracledb\\impl/thin/connection.pyx", line 470, in oracledb.thin_impl.ThinConnImpl.tpc_commit
  File "src\\oracledb\\impl/thin/protocol.pyx", line 434, in oracledb.thin_impl.Protocol._process_single_message
  File "src\\oracledb\\impl/thin/protocol.pyx", line 435, in oracledb.thin_impl.Protocol._process_single_message
  File "src\\oracledb\\impl/thin/protocol.pyx", line 428, in oracledb.thin_impl.Protocol._process_message
  File "src\\oracledb\\impl/thin/messages.pyx", line 72, in oracledb.thin_impl.Message._check_and_raise_exception
oracledb.exceptions.DatabaseError: ORA-24756: transaction with identifier gti: 0x0005bbff.666F6F bid: 0x626172
 does not exist
Help: https://docs.oracle.com/error-help/db/ora-24756/

uncommenting the timeout makes the fail even quicker:

platform.platform: Windows-11-10.0.22631-SP0
sys.maxsize > 2**32: True
platform.python_version: 3.12.4
oracledb.__version__: 2.3.0
try 1

Traceback (most recent call last):
  File "path\to\test4.py", line 77, in <module>
    using_oracledb()
  File "path\to\test4.py", line 59, in using_oracledb
    connection3.tpc_commit(xid)
  File "path\to\oracledb\connection.py", line 1057, in tpc_commit
    self._impl.tpc_commit(xid, one_phase)
  File "src\\oracledb\\impl/thin/connection.pyx", line 470, in oracledb.thin_impl.ThinConnImpl.tpc_commit
  File "src\\oracledb\\impl/thin/protocol.pyx", line 434, in oracledb.thin_impl.Protocol._process_single_message
  File "src\\oracledb\\impl/thin/protocol.pyx", line 435, in oracledb.thin_impl.Protocol._process_single_message
  File "src\\oracledb\\impl/thin/protocol.pyx", line 428, in oracledb.thin_impl.Protocol._process_message
  File "src\\oracledb\\impl/thin/messages.pyx", line 72, in oracledb.thin_impl.Message._check_and_raise_exception
oracledb.exceptions.DatabaseError: ORA-24756: transaction with identifier gti: 0x00064460.666F6F bid: 0x626172
 does not exist
Help: https://docs.oracle.com/error-help/db/ora-24756/

With or without timeout using the oracle client results in no error (run about 100 try without timeout and 60 with). I've tried only on windows, but @zzzeek can reproduce locally on linux (fedora I think) using a different oracle version and config

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions