State of Dolt


In June 2021, I took to this blog and announced Dolt is a database. Though it wasn't called "State of Dolt", that was the first blog where I presented an overview of Dolt stability, correctness, performance, and features.

On May 1, 2023, we launched Dolt 1.0, signalling to the world that Dolt was ready for production workloads. In that blog, the next "State of Dolt", we suggested that Dolt had achieved (1) forward storage compatibility, (2) production performance, (3) MySQL compatibility, and (4) a stable version control interface.

What's the current "State of Dolt"? This blog will review Dolt stability, correctness, performance, and version control features as of April 2024.

Stability

Dolt is stable enough for your production data.

The new Dolt storage format, incrementally released in the year leading up to Dolt 1.0, is performant and stable. All customers we know of have migrated to the new format. We reiterate our pledge for forward storage compatibility made with Dolt 1.0.

Multiple customers run Dolt in production across a number of use cases. Hosted Dolt is the preferred way for customers to deploy Dolt. With Hosted Dolt, you get 24-hour on-call support, automated backups, replication, and a built-in SQL workbench, complete with Pull Requests.

Of course, you can also self-host Dolt. Dolt is proudly free and open source.

Gaps

We release a new version of Dolt approximately twice per week. Our version scheme is MAJOR.MINOR.PATCH. PATCH releases are fully backwards compatible. MINOR versions signal a backwards incompatibility and we note the breaking change at the top of the release notes. MAJOR versions are reserved for releases that require a migration.
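To make the version scheme concrete, here is a minimal sketch of how a deployment script might classify an upgrade between two Dolt versions using the rules above. The function names and the version numbers in the usage lines are hypothetical, chosen only for illustration; this is not part of Dolt itself.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string (optionally prefixed with 'v') into integers."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def upgrade_impact(current: str, target: str) -> str:
    """Classify an upgrade according to the MAJOR.MINOR.PATCH rules described above."""
    cur, new = parse_version(current), parse_version(target)
    if new[0] != cur[0]:
        # MAJOR versions are reserved for releases that require a migration.
        return "migration required"
    if new[1] != cur[1]:
        # MINOR versions signal a backwards incompatibility.
        return "breaking change: read the release notes"
    if new[2] != cur[2]:
        # PATCH releases are fully backwards compatible.
        return "backwards compatible"
    return "same version"


# Hypothetical version numbers, for illustration only.
print(upgrade_impact("1.35.4", "1.35.8"))
print(upgrade_impact("1.35.4", "1.36.0"))
```

A check like this is only a reminder to read the release notes before a MINOR bump; the notes themselves describe what actually changed.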

On average, we've released a backwards-incompatible version of Dolt every 1.5 weeks: 35 since May 1, 2023. We would like to slow this pace over the next year to one every 2 to 3 weeks. Backwards-incompatible changes can be disruptive to customers. Dolt stability requires fewer of them.

Correctness

A Dolt engineer will fix your correctness bug in 24 hours or less.

A big recent Dolt milestone was 100% correctness on sqllogictest, a test suite of 6 million SQL queries. We started measuring Dolt correctness against the sqllogictest benchmark in 2019 and quickly got to 90% correct. We slowly worked our way up adding 9s and fixed the last 600 or so tests in the last quarter.

Dolt is 100% Correct

We're so confident in Dolt's correctness that we pledge to fix correctness bugs in 24 hours or less. This pledge is a function of the low frequency of bug reports, the ease of most correctness fixes, and Dolt's outstanding test coverage to prevent regressions.

Moreover, the latest SQL features released were fulltext indexes, scheduled events, and table statistics for faster queries. As you can see, Dolt supports almost all of the MySQL feature set.

Gaps

Now that we're at 100% correctness on the sqllogictest suite, what's next? We are targeting MySQL function coverage and our own home grown suite of SQL engine tests. We are currently at 69% coverage of MySQL functions and 99.6% passing on our engine tests.

On SQL features, the serializable transaction isolation level, i.e. SELECT FOR UPDATE, is the next big feature. Today, Dolt offers repeatable read. On the bright side, this topic led me to create this image, some of the best work of my career.

Lost Update
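For readers who haven't run into the lost update anomaly, here is a small, generic simulation of it. This is a textbook illustration in plain Python, not a description of how Dolt or MySQL handle concurrent writes; the row, the amounts, and the function names are all made up for the example.

```python
def lost_update(start: int, add_a: int, add_b: int) -> int:
    """Two transactions read the same value, then both write: A's update is lost."""
    row = {"balance": start}
    seen_by_a = row["balance"]           # A: SELECT balance
    seen_by_b = row["balance"]           # B: SELECT balance (same value A saw)
    row["balance"] = seen_by_a + add_a   # A: UPDATE, COMMIT
    row["balance"] = seen_by_b + add_b   # B: UPDATE, COMMIT (overwrites A's write)
    return row["balance"]


def locked_update(start: int, add_a: int, add_b: int) -> int:
    """With a locking read, B cannot read the row until A has committed."""
    row = {"balance": start}
    seen_by_a = row["balance"]           # A: SELECT ... FOR UPDATE (takes the row lock)
    row["balance"] = seen_by_a + add_a   # A: UPDATE, COMMIT (releases the lock)
    seen_by_b = row["balance"]           # B: SELECT ... FOR UPDATE (now sees A's write)
    row["balance"] = seen_by_b + add_b   # B: UPDATE, COMMIT
    return row["balance"]


print(lost_update(100, 10, 20))    # 120: A's +10 has vanished
print(locked_update(100, 10, 20))  # 130: both updates survive
```

The locking read is what forces the two read-modify-write cycles to run one after the other instead of interleaving.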

Lastly, I'd be remiss not to mention Doltgres, the Postgres-flavored Dolt. Here we are 86% correct, about where Dolt was at the end of 2019. We can close that gap more quickly, but some of our correctness investment will be in Doltgres.

Performance

The gap between MySQL and Dolt query performance is unnoticeable in most applications.

We are 1.8X slower than MySQL on a standard suite of sysbench tests. We monitor this performance actively for regressions and have made a ton of progress over time, though not much since Dolt 1.0.

Relative Dolt Performance over time

sysbench performance carries over to production performance. Anecdotally, we see approximately 2X performance difference between Dolt and MySQL across a number of customer queries and schemas. In other words, sysbench tests are representative of the query performance most users see in production. 2X slower is barely noticeable at the application layer because, in modern SQL databases, many queries return in under a millisecond.
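A quick back-of-the-envelope calculation shows why a 2X query slowdown can disappear at the application layer. The numbers below are assumptions picked for illustration (a 0.5 ms query and 5 ms of network and application time per request); they are not measurements of Dolt or MySQL.

```python
# All values are illustrative assumptions, not benchmark results.
mysql_query_ms = 0.5   # assumed sub-millisecond query time on MySQL
slowdown = 2.0         # approximate Dolt-to-MySQL multiple quoted above
overhead_ms = 5.0      # assumed network round trip plus application work per request

dolt_query_ms = mysql_query_ms * slowdown

mysql_total_ms = mysql_query_ms + overhead_ms
dolt_total_ms = dolt_query_ms + overhead_ms

# End-to-end ratio as seen by the application, not by the database.
end_to_end_ratio = dolt_total_ms / mysql_total_ms

print(f"MySQL request: {mysql_total_ms:.1f} ms")
print(f"Dolt request:  {dolt_total_ms:.1f} ms")
print(f"End-to-end slowdown: {end_to_end_ratio:.2f}X")
```

Under these assumptions the request takes 6.0 ms instead of 5.5 ms, roughly 9% longer. The larger the fixed overhead relative to query time, the less the 2X shows up; for long-running queries the opposite holds.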

Gaps

Dolt is 4.5X slower than MySQL on TPC-C. TPC-C is a standard transactional benchmark for measuring database throughput and scalability. It tests multi-table read/write throughput. Dolt compares less favorably to MySQL on this benchmark than on sysbench. The goal is to bring Dolt under 3X MySQL on TPC-C in the next six months.

Version Control

Dolt has all the version control functionality you know and love from Git.

From launch, core Git functionality like log, branch, diff and merge was supported in Dolt. Git-style version control on SQL tables instead of files has always been Dolt's core value proposition.

This year we added a number of additional version control features and previously unsupported Git functionality. As we've gotten SQL performance and correctness under control, we had more resources to improve version control functionality. Recently, we've added the following version control features:

As you can see, we continue to grow our functionality lead in the version controlled database category.

Gaps

What version control features are we considering next?

The biggest version control gap is history compression. Potential users worry about the disk space required to store the entire history of their database online. We are actively working on this problem. The first step here is delta compression similar to how Git stores file chunks. In our initial testing, delta compression of Dolt chunks shrinks the size of Dolt databases by 60-80%.
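To give a feel for why delta compression helps with versioned data, here is a toy sketch. It uses zlib's preset dictionary feature as a stand-in for a real delta encoder: the new chunk is compressed with the previous version of the chunk as the dictionary, so unchanged bytes are encoded as references to the base. This is not Dolt's or Git's actual storage format, and the chunk contents and function names are invented for the example.

```python
import random
import zlib


def compress_standalone(chunk: bytes) -> bytes:
    """Compress a chunk on its own, with no knowledge of earlier versions."""
    return zlib.compress(chunk)


def compress_delta(chunk: bytes, base: bytes) -> bytes:
    """Compress a chunk using an earlier version of it as a preset dictionary."""
    compressor = zlib.compressobj(zdict=base)
    return compressor.compress(chunk) + compressor.flush()


def decompress_delta(blob: bytes, base: bytes) -> bytes:
    """Rebuild the chunk; the same base must be supplied to decode it."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(blob) + decompressor.flush()


# A deterministic, hard-to-compress base chunk and a lightly edited copy of it.
rng = random.Random(0)
base_chunk = bytes(rng.getrandbits(8) for _ in range(2000))
new_chunk = base_chunk[:1000] + b"edited row" + base_chunk[1000:]

standalone = compress_standalone(new_chunk)
delta = compress_delta(new_chunk, base_chunk)

assert decompress_delta(delta, base_chunk) == new_chunk
print(f"standalone: {len(standalone)} bytes, delta against base: {len(delta)} bytes")
```

The savings in this toy are exaggerated because the two chunks are nearly identical and the base is pseudo-random, so they say nothing about the 60-80% figure above. The trade-off it does show is real: a delta-encoded chunk can only be read back if its base is available.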

After delta compression, we will allow for offline history archiving via a remote. We have already implemented shallow clone, and history archiving is very similar. Imagine storing only the last N commits or the last 30 days of commits on your production Dolt instance, with the rest of your history archived on a remote like DoltHub. If you need deeper history access, clone the database on a large-disk host, do your digging through history, and delete the clone when you're finished.

Additionally, a few users have asked to stage a partial working set, also known as dolt add -p. We're still working out a table-based user interface for this, but we expect to get to this feature in the next couple of quarters.

Conclusion

Dolt has the stability, correctness, performance, and features to support your production application. Sound like something that could help you at your company? Join our Discord and let's discuss your use case.
