Python `autocommit` Off By Default

REFERENCE
8 min read

The snake doesn't like to commit. In Python autocommit is off by default in the Database API. This is different than other standard database connectors like ODBC and JDBC. Most common databases have autocommit on by default including SQL Server, MySQL, and PostgreSQL.

autocommit on by default seems to be the standard. Python deviates from the standard. This confuses people. This blog explains why and what this means for your Python code.

Python autocommit off

What is autocommit?

From wikipedia:

autocommit is a mode of operation of a database connection. Each individual database interaction (ie. each SQL statement) submitted through the database connection in autocommit mode will be executed in its own transaction that is implicitly committed.

With autocommit on, you don't need to worry about START TRANSACTION, COMMIT, and ROLLBACK, the database handles transaction complexity for you. Reading and writing to a database "just works" even with multiple connections.

With autocommit off, the developer needs to worry about transactions. If you want to read your writes, you must COMMIT them and START TRANSACTION on other connections. autocommit off is more complicated but gives the developer more control. It's especially useful in high concurrency environments where the developer is expecting conflicting concurrent writes to a row.

Why is autocommit off by default in Python?

autocommit has been off by default in Python since PEP-249, released in 1999. PEP stands for Python Enhancement Proposal. In the proposal there are no stated reasons, just this note:

Note that if the database supports an auto-commit feature, this must be initially off. An interface method may be provided to turn it back on.

However, with a bit more internet sleuthing it seems the topic was debated extensively on the Python developer mailing list. This comment:

Auto commit is a bad thing and a pretty evil invention of ODBC. While it does make writing ODBC drivers simpler (ones which don't support transactions that is), it is potentially dangerous at times, e.g. take a crashing program: there is no way to recover from errors because the database has no way of knowing which data is valid and which is not. No commercial application handling "mission critical" (I love that term ;-) data would ever want to run in auto-commit mode.

and this gem (love the strategic use of capitalization) by M.A Lemburg, author of the Python database API:

ANY serious application MUST manage its own transactions, as otherwise you can't ever hope to control failure modes.

seem to point to a firm disdain for autocommit that influenced the Python standard.

Moreover, in 1999 autocommit on had not yet become the standard. Databases like MySQL and Postgres were in their infancy so this decision was less contentious. In 2023, autocommit off by default seems like a blatant disregard for the standard. The most popular database connectors, ODBC and JDBC, have autocommit on by default. MySQL and Postgres became the most popular Relational Database Management Systems (RDBMS) and they both ship with autocommit on by default.

What does autocommit off mean?

With autocommit off you have to explicitly handle START TRANSACTION, COMMIT, and ROLLBACK. This is most noticeable in a multiple connection scenario.

autocommit on Example

Let's start with an autocommit on example using raw SQL. You have a database autocommit_example with a single empty table t with two columns, id and words. Connect to it with two clients.

On connection one:

MySQL [autocommit_example]> insert into t values (0, "Every other connection can read this");
Query OK, 1 row affected (0.021 sec)

MySQL [autocommit_example]> select * from t;
+------+--------------------------------------+
| id   | words                                |
+------+--------------------------------------+
|    0 | Every other connection can read this |
+------+--------------------------------------+
1 row in set (0.007 sec)

Now, on connection two:

MySQL [autocommit_example]> select * from t;
+------+--------------------------------------+
| id   | words                                |
+------+--------------------------------------+
|    0 | Every other connection can read this |
+------+--------------------------------------+
1 row in set (0.003 sec)

autocommit on is simple. Other connections can read your writes.

autocommit off Example

Now for autocommit off with the same example. We have a database autocommit_example with a single empty table t with two columns, id and words. We connect to it with two clients.

Note, I have to turn autocommit off because it is on by default in most databases. I'm using Dolt, a MySQL compatible database with Git-style version control, for this example.

On connection one:

MySQL [autocommit_example]> set autocommit=0;
Query OK, 1 row affected (0.004 sec)

MySQL [autocommit_example]> insert into t values (0, "Only my connection can read this until I commit");
Query OK, 1 row affected (0.003 sec)

MySQL [autocommit_example]> select * from t;
+------+-------------------------------------------------+
| id   | words                                           |
+------+-------------------------------------------------+
|    0 | Only my connection can read this until I commit |
+------+-------------------------------------------------+
1 row in set (0.005 sec)

Now, on connection two:

MySQL [autocommit_example]> set autocommit=0;
Query OK, 1 row affected (0.009 sec)

MySQL [autocommit_example]> select * from t;
Empty set (0.007 sec)

I can't see the write on connection one because the transaction is not committed. So, I commit on connection one:

MySQL [autocommit_example]> commit;
Query OK, 0 rows affected (0.015 sec)

But I still can't read it on connection two because I'm still in the transaction I started when I connected:

MySQL [autocommit_example]> select * from t;
Empty set (0.004 sec)

To read on connection two I must start a new transaction.

MySQL [autocommit_example]> start transaction;
Query OK, 0 rows affected (0.008 sec)

MySQL [autocommit_example]> select * from t;
+------+-------------------------------------------------+
| id   | words                                           |
+------+-------------------------------------------------+
|    0 | Only my connection can read this until I commit |
+------+-------------------------------------------------+
1 row in set (0.003 sec)

Confusing, right? This is the default behavior in Python.

In Python

Now let's show you the same flow in Python. Again, we have a MySQL-compatible database database named autocommit_example running locally on port 3306. The database has a single empty table t with two columns, id and words. These scripts can be found at this GitHub repository.

autocommit on Example

Let's start with autocommit on. We do this by setting autocommit to True in the connection arguments.

This code is pretty simple. We connect to the database with two connections. On one connection, we insert a row. On the other connection we read the row.

import mysql.connector

def main():
    conn_args = {
        "user": "root",
        "host": "127.0.0.1",
        "port": 3306,
        "database": "autocommit_example",
        "autocommit": True
    }

    # Clean up so I can run again                                               
    conn = mysql.connector.connect(**conn_args)
    other_conn = mysql.connector.connect(**conn_args)

    cursor = conn.cursor()
    cursor.execute("delete from t")

    # Insert a value                                                            
    cursor.execute("insert into t values (0, 'Every other connection can read this')")
    # Read the value on same connection                                         
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print(results)

    # Read the value on the other connection  
    cursor = other_conn.cursor()                                  
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print(results)

    conn.close()
    other_conn.close()

main()

We name the script autocommit.pyand run it in a shell to produce the following output.

$ python autocommit.py              
[(0, 'Every other connection can read this')]
[(0, 'Every other connection can read this')]

Everything works as expected.

autocommit off Example

With autocommit off, things are a bit more complicated. I won't explain the script in detail. The code comments walk you through the process. Notice how with the default connection setting autocommit is off.

import mysql.connector

def main():
    conn_args = {
        "user": "root",
        "host": "127.0.0.1",
        "port": 3306,
        "database": "autocommit_example",
    }

    # Clean up so I can run again                                               
    conn = mysql.connector.connect(**conn_args)

    cursor = conn.cursor()
    cursor.execute("delete from t")
    cursor.execute("commit")

    # Only establish the connection once I'm sure I've cleaned up               
    other_conn = mysql.connector.connect(**conn_args)

    # Insert a value                                                            
    cursor.execute("insert into t values (0, 'Only this connection can read this')")
    
    # Read the value on same connection                                         
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print(results)

    # Read the value on the other connection
    other_cursor = other_conn.cursor()                                   
    other_cursor.execute("select * from t")
    results = other_cursor.fetchall()
    print("Will be empty")
    print(results)

    # Run commit on the original connection
    cursor = conn.cursor()
    cursor.execute("commit")

    # I still can't read the write because I have not started 
    # a new transaction on this connection
    other_cursor.execute("select * from t")
    results = other_cursor.fetchall()
    print("Still empty")
    print(results)

    # To read writes on the other connection I must start a 
    # new transaction
    other_cursor.execute("start transaction")
    other_cursor.execute("select * from t")
    results = other_cursor.fetchall()
    print("Finally I can see the row")
    print(results)

    conn.close()
    other_conn.close()

main()

We name the script no_autocommit.pyand run it in a shell to produce the following output.

$ python no_autocommit.py
[(0, 'Only this connection can read this')]
Will be empty
[]
Still empty
[]
Finally I can see the row
[(0, 'Only this connection can read this')]

As you can see, with autocommit off, you have to take a lot more care in handling multiple connections to your database.

autocommit and Connection Pools

Further complicating matters are connection pools. The Python MySQL connector ships with a built in connection pool utility. If you use a connection pool with autocommit off you have to take the same care as above for managing state in your connections. Connection pools add a another confounding factor: randomness. You never know what connection your program is going to grab from the pool. So, you have to be extra diligent with your COMMITs and START TRANSACTIONs.

import mysql.connector

def main():
    # Clean up so I can run again                                               
    # Can't use a pool here because a transaction begins                        
    # when the pool is created. If I get unlucky, I'll get duplicate           
    # primary key errors.                                                       
    no_pool_conn = mysql.connector.connect(user = "root",
                                           host = "127.0.0.1",
                                           port = 3306,
                                           database = "autocommit_example")

    cursor = no_pool_conn.cursor()
    cursor.execute("delete from t")
    cursor.execute("commit")
    cursor.close()
    no_pool_conn.close()

    # No transactions                                                           
    conn1 = mysql.connector.connect(user = "root",
                                    host = "127.0.0.1",
                                    port = 3306,
                                    database = "autocommit_example",
                                    pool_name = "mypool",
                                    pool_size=3)
    cursor = conn1.cursor()
    cursor.execute("insert into t values (0,'An uncommitted transaction')")
    cursor.close()

    conn2 = mysql.connector.connect(pool_name = "mypool")
    cursor = conn2.cursor()
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print("Reading an uncommitted transaction. Will show empty.")
    print(results)
    cursor.close()
    conn2.close()

    # Rollback on conn1                                                         
    conn1.cursor().execute("rollback")

    # Transactions                                                              
    conn = mysql.connector.connect(pool_name = "mypool")
    cursor = conn.cursor()
    cursor.execute("insert into t values (0,'A committed transaction.')")
    cursor.execute("commit")
    cursor.close()
    conn.close()

    # Leaving this here to show you it's reading off an                         
    # old commit (potentially)                                                  
    conn = mysql.connector.connect(pool_name = "mypool")
    cursor = conn.cursor()
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print("Potentially Reading off transaction begun when pool was made. Will likely show empty.")
    print(results)
    cursor.close()
    conn.close()

    # Start a new transaction to read committed transaction    
    conn = mysql.connector.connect(pool_name = "mypool")
    cursor = conn.cursor()
    cursor.execute("start transaction")
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print("Will show new row.")
    print(results)
    cursor.close()
    conn.close()

    # autocommit on works as you'd expect                                         
    auto_conn = mysql.connector.connect(user = "root",
                                        host = "127.0.0.1",
                                        port = 3306,
                                        database = "autocommit_example",
                                        pool_name = "autopool",
                                        pool_size=3,
                                        autocommit=True)

    auto_conn.cursor().execute("insert into t values (1,'Every connection can read this')")
    cursor = auto_conn.cursor()
    cursor.execute("select * from t")
    results = cursor.fetchall()
    print("Will show two rows.")
    print(results)

main()

We name the script pool.pyand run it in a shell to produce the following output.

$ python pool.py
Reading an uncommitted transaction. Will show empty.
[]
Potentially Reading off transaction begun when pool was made. Will likely show empty.
[]
Will show new row.
[(0, 'A committed transaction.')]
Will show two rows.
[(0, 'A committed transaction.'), (1, 'Every connection can read this')]

As you can see, you need to be very careful when dealing with a connection pool and autocommit is off.

Extra Confusing in Dolt

How did I come to be such an expert in Python autocommit behavior? Here at DoltHub, we built Dolt, the world's first and only version controlled database. Think Git and MySQL had a baby.

Both Git and SQL have commits. Dolt must support both so Dolt overloads the term commit. This causes extra confusion especially with autocommit off where you must make explicit SQL commits and explicit Dolt commits. A typical conversation goes like:

User: "I called commit and I don't see a commit in my Dolt log."
Dolts: "Did you SQL commit of Dolt commit?"
User: "Huh?"

Thus, here at DoltHub, we've become experts in the SQL commit behavior of various tools and libraries in all manner of languages, including Python.

Conclusion

Python's choice of default autocommit off is a major source of confusion. This post explains what autocommit is, why it is off by default in Python, and how to handle that using example code. Interested in talking Python, databases, or version control? Come by our Discord and join our growing community.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.