Structural Sharing with Schema Changes
Dolt is a MySQL-compatible database that supports Git-like version control features, including commit, diff, branch, merge, clone, push and pull. In order to make these operations efficient, it stores databases in a Merkle DAG, within which table data is stored in a history-independent block-based search tree called a Prolly Tree.
Back in 2020, I wrote a blog post that included some information about how much structural sharing Dolt achieves for particular types of data mutations — for example, point inserts into a table, appends at the end of a lexicographically stored primary key, etc. Since that blog post was written, Dolt got a new storage format, still based on Prolly Trees but serializing the tuples of the table and the tree metadata a bit differently. That blog post also didn't go into specifics of what kinds of schema changes would cause a table rewrite, and which would simply change the schema associated with the table and allow for shared history with the prior versions of the table going forward.
This blog post is a short reference for types of schema changes currently force Dolt to rewrite a table's storage, and consequently lose all structural sharing with previous versions of the table, versus which types do not cause any changes to the existing tuple storage, and consequently maintain all structural sharing with previous versions of the data going forward.
Structural Sharing and its Benefits
Before we dig into what operations cause table rewrites, it's helpful to have in mind why we like structural sharing and the Prolly Tree in general. Dolt treats Dolt Commits on a branch as snapshots of the database, and by default it keeps all reachable versions of the database around forever. By rewriting a table across a commit boundary, Dolt will be storing both copies of the table into perpetuity.
Additionally, Dolt's diff and merge algorithms are based on finding differences
between different Prolly Trees by only considering the portions of the trees
which might be different. They do this by using the Merkle tree structure of
the Prolly Tree to quickly walk to the places in the trees where the storage is
actually different. Across a table rewrite, nothing in the Merkle Tree is
shared between the two versions, and so the diff will become O(n)
in the size
of the table, instead of O(m * lg n)
where m
is the size of the actual
differences between the tables.
On the other hand, Dolt is still called upon to be a performant OLTP SQL database where the vast majority of operations happen against the HEAD of branches. And so Dolt's binary storage format makes tradeoffs to be effective for all of its use cases.
Schema Changes and Table Rewrites
Dolt stores table data as serialized tuples, typically in clustered indexes on
the primary key of a table. The tuples themselves are stored in a relatively
flexible general purpose format. The value of every non-NULL
field is stored
sequentially as serialized bytes; let's call this the value bytes. Then a list
of fixed-width offsets is stored, each offset indicating the start of a new
value in the tuple's value bytes; let's call this the offset bytes. Then the
count of the number of offsets is stored. Within the value bytes, fixed width
fields in the schema are stored fixed width and variable-width inline fields
are stored inline — their size is derived from the offsets stored in the
offset bytes.
For the given tuple encoding, a NULL
value is stored in one of two ways:
-
If the
NULL
value appears at the end of a tuple, then no representation of the field is stored in the tuple. The count will not reflect the field and the offset bytes will not include offsets for the field. -
If the
NULL
value is not at the end of the tuple, that is, a non-NULL
value comes in schema-order after it, then theNULL
value is represented by its offset being the same as the offset for the field which comes after it. In this case, the range of bytes making up its value is empty, and so there is no value for this field; i.e., it isNULL
.
Some values in Dolt are stored outside of the tuple itself. Instead, an address
to the value is stored within the tuple. This includes TEXT
, JSON
, and
GEOMETRY
types.
All in all, each tuple stored on disk looks a like:
+---------+---------+-----+---------+----------+----------+-----+----------+-------+
| Value 0 | Value 1 | ... | Value K | Offset 1 | Offset 2 | ... | Offset K | Count |
+---------+---------+-----+---------+----------+----------+-----+----------+-------+
All storage in Dolt is intended to be history independent — Dolt will arrive at the same content-addressed Merkle DAG whenever it constructs the same database values, regardless of the order of operations that were taken to construct them. As a consequence, schema changes always make the storage for the table look exactly like it would have if the target schema was in place and all the rows of the table were then inserted.
With all of that being said, here are some guidelines for when a schema change results in a change to the tuples stored on disk for a Dolt table.
-
The names of columns in a schema do not affect tuple storage at all. They can be changed without effecting tuple storage on disk.
-
Adding
NULL
-able columns withNULL
values to the end of a table does not change the tuples stored on disk. Each stored tuple only stores the prefix of non-NULL
values and so this never rewrites table storage. -
It is sometimes the case that changing the type of a fixed-width type to another type which has the same width will not change the tuples stored on disk. For example, changing from
SMALLINT
toSMALLINT UNSIGNED
will not rewrite the table. However, it does require every existing value to fit within the domain ofSMALLINT UNSIGNED
. The following types all have distinct widths or encodings, and changing between any of them will rewrite storage:TINYINT
,SMALLINT
,INT
,BIGINT
,FLOAT
, andDOUBLE
. -
Dolt treats
CHAR(...)
values the same asVARCHAR(...)
values for the purpose of tuple storage. You can switch betweenCHAR
andVARCHAR
without rewriting table storage. -
Dolt stores string data in tuples as UTF-8 internally. As a result, changing the character set of a database, table or column will not affect the tuple storage. However, changing the collation on a column can effect the order of the tuples in indexes, and can consequently cause storage changes if the collation is changed on an indexed column.
-
A
NULL
value is always encoded the same way in a tuple, regardless of its type. As a consequence, changing the type of a column where the value isNULL
will not change the stored tuple. Knowledge of this can potentially come in handy in some cases. For example, if you have a very sparse column which is not at the end of the columns in a table, and you are removing it and adding a new very sparse column of a different type, it can be better for structural sharing to reuse the same column, changing its type and its name, instead of dropping the old column and adding a new column at the end of the table. -
Default values are always materialized in the stored tuples. As a consequence, you can change the default value of a given column without changing any of the currently stored tuples. But if you update the stored tuples to take on the new default value, that will of course rewrite the tuples in storage.
-
An
ENUM
is stored as aSMALLINT
which refers to the ordinal of the chosen value. So you can add new options to the end of an existingENUM
without effecting existing storage. -
DECIMAL
types are stored as a 4-byte twos-complement exponent, a sign byte, and a variable-width zero-extended big-endian unsigned coefficient, taking up however many bytes are necessary to represent the value. As a consequence, it never causes a storage rewrite to increase the precision of aDECIMAL
type. If all the existing values fit exactly in a new smaller precision, that can also be also be done without changing tuple storage.
In general, almost all other schema changes will cause a table rewrite. This includes:
-
Changing the primary key of a table.
-
Dropping columns from a table (unless those columns are at the end of the table and mostly or entirely
NULL
). -
Adding a non-
NULL
column to a table. -
Adding a column with a default value to a table.
-
Changing the type of an existing column, except for the cases described above.
-
Adding a
NULL
able column to the middle of a table. -
Changing the order of the columns in a table.
So these types of changes will lose structural sharing with prior versions of the table and diffs and merges across the rewrite boundary will become less efficient.
An Alternative Approach to Schema Migrations
Depending on your use case, it may be more more appropriate to migrate the entire history of the database to the new schema. This can be true for OLTP applications that work with the historical snapshots of the data, for example. As the schema changes over time, it can be burdensome to maintain application logic to work against all previous versions of the schema.
If your use case is less concerned with exact historical snapshots of the
database, and is more interested in accessing variant versions of the database
as it would look today, then be sure to check out dolt
filter-branch
which can apply a schema migration to every commit reachable from a given
branch HEAD, for example. In this case, an archival use case could still be
served via remotes or backups, but by rewriting all the historical data for the
online use case we can retain structural sharing and efficient diff and merge
across all versions of the history.
Conclusion
It can be useful to know some details about how Dolt's table storage works and the ways different operations can impact storage overhead and the efficiency of things like diff and merge for historical versions of the data in the database. Today we looked at some details of Dolt's tuple storage and when versions of a table in Dolt can share storage with prior versions even across schema changes. We learned saw that most common schema changes cause a full table rewrite, and consequently no sharing in the storage of the tables, but specific types of schema changes do retain sharing.
If you have any questions about the behavior here or want to chat about possible use cases or improvements, reach out on our Discord.