Entity Framework Core with Dolt

December 4, 2023

14 min read

DoltDB is the world's first fully-versioned relational database. You can branch, merge, diff, push, and pull your relational data in the same ways that Git allows you to work with your source code files. Dolt is built from the ground up on open source software, and while we target drop-in compatibility with MySQL, Dolt is a complete rewrite from the ground up, in Golang, and doesn't actually share any code with MySQL. In addition to the MySQL compatible product, we're also building Doltgres – a version of Dolt that supports the PostgreSQL SQL dialect, the PostgreSQL wire protocol, and works with PostgreSQL tooling.

MySQL Compatibility

Over the years, we've dramatically improved Dolt's compatibility with MySQL. We added support for information_schema, MySQL binlog replication, MySQL's event scheduler, and a LOT more over the past few years. If there's a behavior difference or feature gap between Dolt and MySQL, we're happy to track it down and fix it! Please don't be shy to report compatibility issues to us on GitHub or just swing by our Discord server and chat with our team. We love being able to respond quickly to customer feedback.

We've shown several examples of how Dolt is a drop-in replacement for MySQL in various applications or tools that use a MySQL database. In particular, we've shown several Object-Relational Mappers (ORMs), and how they work with Dolt as a drop-in replacement for MySQL. Most recently, we showed that Hibernate works with Dolt, KnexJS works with Dolt, and SQLAlchemy for Python works with Dolt. But today... we're gunna give .NET developers some attention and talk about how .NET Entity Framework Core works with Dolt! 🤖

Entity Framework Core

.NET Entity Framework Core, or just EF Core for short, is Microsoft's open-source, cross-platform Object-Relational Mapping (ORM) framework that is widely used for database access from .NET applications. EF Core was first released in 2016 (it was originally called "Entity Framework 7" at the time), but its predecessor, the original Entity Framework, has been around a lot longer – since 2008.

There's a lot to like about EF Core. Working with data sources is easy and generally involves:

Installing EF Core – This means adding the framework libraries to your project, but also includes additional tooling, such as extensions to the dotnet CLI for working with EF Core as well as any EF Core provider for your specific database engine.
Defining Models – Your C# model classes represent the data entities in your application. A major part of using EF Core is correctly modeling your entities and the relationships between them. In the sample application below, we'll see a many-to-many relationship between two entities stored in a Dolt database.
Creating a DbContext – The DbContext is the central point in an EF Core application for configuring the connection to your database. For Dolt, we'll use the popular Pomelo EF Core library for connecting to Dolt via the MySQL protocol, since it currently has better support for the just released version 8 of EF Core.
Querying a Data Source – Last but not least, we want to store and load data in our Models from our database of course! EF Core lets us use Microsoft's Language Integrated Query (LINQ) language to access heterogeneous data sources, but there are several ways to access data including storing and loading model classes, or running SQL queries directly, as we'll see below when we look at some sample code.

We'll show each of these in a sample application below, but there's a LOT more that Entity Framework can do. For example, Entity Framework Core has great support for database migrations – you can plan your migrations as your database schema changes from changes to your models and then apply those changes in several ways. EF Core also has nice support for async operations, dependency injection, concurrency control, shadow properties, and a lot more!

Dolt + .NET Entity Framework Core

Let's walk through a sample application and see how EF Core works with Dolt! We're going to cover the basic steps in working with a Dolt database through EF Core, so if we don't cover something you'd like to see, swing by our Discord server and let us know. We'll be happy to write about it in a future post.

You can find the complete source for this sample project on GitHub as well as instructions on running it yourself. At the time of writing this, .NET Entity Framework Core 8 was just released a few weeks ago, so we're going to be using the latest and greatest EF Core version.

Let's walk through each of the major areas in our sample application where we interact with Dolt through EF Core...

Connecting to a Dolt Database with EF Core

In order to connect to a MySQL, or MySQL compatible, database with Entity Framework Core, we need to install a provider. The MySQL .NET Connector has Entity Framework support, but we had some issues using it with the latest release of Entity Framework. Instead, we found the Pomelo.EntityFrameworkCore.MySql package, which is a popular MySQL provider for EF Core. At the time of writing this, the latest version was 8.0.0-beta.2 (although the official 8.0.0 release for the new Entity Framework Core version 8 release is expected to be released very soon).

Here's the exact command I ran to add this dependency to the sample project:

dotnet add package Pomelo.EntityFrameworkCore.MySql --version 8.0.0-beta.2

Once we have the MySQL EF Core provider installed in our project, we can use it to define the connection to our database. This is defined in the OnConfiguring method of the DbContext implementation in your EF Core application. Let's take a look at the OnConfiguring method in our sample project's DbContext:

protected override void OnConfiguring(DbContextOptionsBuilder options) {
    // NOTE: To control which branch of the database we connect to, we can specify an existing branch 
    //       after the database name, separated by a forward slash.
    //       More info: https://docs.dolthub.com/sql-reference/version-control/branches#specify-a-database-revision-in-the-connection-string
    var dbAndBranch = "dolt/" + this.branchName;
    var connectionString = "server=localhost;user=root;password=;database=" + dbAndBranch + ";port=11229";
    var serverVersion = new MySqlServerVersion(new Version(8, 0, 34));
    options.UseMySql(connectionString, serverVersion);
}

There are a few interesting things happening in the OnConfiguring method. The first thing to notice is the call to options.UseMySql. This method is an extension provided by the Pomelo.EntityFrameworkCore.MySql package we installed earlier and it's what configures our connection to our MySQL compatible database. The connection string we pass in is pretty standard – we provide the network location of the server, how to login, and the name of the database we want to connect to.

Notice in this code that we're also specifying a branch name by using Dolt's branch revision database naming scheme: <database>/<branch>. This is a key point for working with Dolt database branches in tools like ORMs that manage a connection pool for your application code. We'll come back and discuss this in more depth later, as we walk through the sample code that works with branches.

Modeling Dolt Data with EF Core

Entity Framework Core supports a TON of ways to model your entities. There's rich support for all sorts of relationships between entities, and lots of ways to customize the behavior. The model in our sample application is very simple: we have a Product entity and a Customer entity, and there is a many-to-many relationship between them (i.e. each product can be used by many customers, and each customer can use many products).

One important point here is that we've avoided the use of auto_increment columns and instead used GUIDs for unique IDs. You can read a lot more about why we think you should prefer GUIDs for unique IDs over auto_increment columns, but the short version is that GUIDs allow for people to insert records in forks of Dolt database, or on other branches, and then cleanly merge those changes back into a main branch. auto_increment columns just don't work very well when you need to merge data from multiple sources since you'll have conflicts with the sequentially generated IDs.

Here's one of our model classes where we've used a Guid for the unique ID:

public class Product {
    public Guid ProductId { get; set; }
    public string Name { get; set; }
    public DateOnly? ReleaseDate { get; set; }
    public List<Customer> Customers { get; set; } = new();
}

Note that we're declaring the type as Guid, not as string. This gives EF Core a signal that we want the unique identifier to be generated by EF Core if it isn't explicitly specified in the application code. When you're using an ORM like EF Core, it's important that the ORM generates this unique ID. Initially, I had EF Core configured to specify UUID() as the default value for this field in the database schema, which makes the database generate the unique ID for you. This works great for some applications, but it turns out this isn't compatible with EF Core, and likely any other ORM. The issue is that to track a new Product entity, EF Core needs to know the value of the primary key for that new entity. For auto_increment columns, ORMs are able to look up the generated value using the LAST_INSERT_ID() function, but that workaround won't work for columns where the database generates a unique UUID value. Since we already ruled out using auto_increment columns, the best thing to do here is let our ORM generate the unique ID for us before the new entity is inserted into the database.

Querying Dolt Data with EF Core

Now that we've seen how to configure a connection to our Dolt database and how to model our entities, let's walk through the main application code in our sample and see how we're using EF Core to make calls to a Dolt database.

The first bit of code shows how to execute an arbitrary SQL command, without going through our modeled entities. We're running Dolt's active_branch() function to see what branch we're currently connected to. Because this function returns a scalar result, it's very easy to call it using EF Core, even though we aren't technically using any of our modeled entities. We'll see a more complex example later where the result is not a scalar value and we have to drop down to some lower-level APIs to run the query.

// Dolt functions and stored procedures can be accessed by executing SQL queries directly.
// Here, we're using active_branch() to see what branch we're working with.
// https://docs.dolthub.com/sql-reference/version-control/dolt-sql-functions#active_branch
using (var db = new DoltDbContext()) {
    var activeBranch = db.Database.SqlQuery<string>($"select active_branch();").ToList();
    Console.WriteLine($"Current branch: {activeBranch.First()} \n");
}

The next block of code in our sample's main program, is pretty standard EF Core code – we create some instances of our entities, add them to the database, and then save our changes. To make the sample data slightly more fun, we've pulled in the cryptic-wizard/random-word-generator library to create random names for our products and customers. Notice at the very end of our using block that we make another direct SQL query against the database, to call the dolt_commit() stored procedure. The SQL transaction has already been committed to our database as part of the call to db.SaveChanges(), but until we call dolt_commit(), that change still lives in the working set and hasn't been committed to the Dolt commit graph.

// Create and Query some of our modeled data. Note that we're using the default DoltDbContext, which 
// uses the main branch, like we saw above. 
using (var db = new DoltDbContext()) {
    Console.WriteLine("Inserting new products and customers..");
    WordGenerator nameGenerator = new WordGenerator();
    List<PartOfSpeech> pattern = [PartOfSpeech.adv, PartOfSpeech.noun];
    string productName = CultureInfo.InvariantCulture.TextInfo.ToTitleCase(nameGenerator.GetPattern(pattern, ' '));
    string customerName = CultureInfo.InvariantCulture.TextInfo.ToTitleCase(nameGenerator.GetPattern(pattern, ' '));
    var product1 = new Product {Name = productName};
    var customer1 = new Customer { Name = customerName, Products = new List<Product>{product1}};
    db.Add(product1);
    db.Add(customer1);
    db.SaveChanges();

    // Query some data
    Console.WriteLine($"Querying for customers using product ID {product1.ProductId}:");
    var customer = db.Customers
        .Where(c => c.Products.Contains(product1))
        .OrderBy(c => c.Name)
        .First();
    Console.WriteLine($"  - {customer.Name} (ID: {customer.CustomerId})");

    // Create a Dolt commit
    db.Database.ExecuteSql($"call dolt_commit('-Am', 'inserting test data');");
}

Working with Dolt Branches through EF Core

Moving along in our sample EF Core application code, the next section of Program.cs shows some code that works with Dolt branches. The first section of this code calls the dolt_branch() stored procedure to create a new branch for us to work with.

// Now let's create a new branch and then write some data to it. We do this by calling the
// dolt_branch() stored procedure. (We use a Guid to ensure the branch name is unique so 
// that we can run this program multiple times without having to clean up data.)
// https://docs.dolthub.com/sql-reference/version-control/dolt-sql-procedures#dolt_branch
string branchName;
using (var db = new DoltDbContext()) {
    var id = Guid.NewGuid();
    branchName = "branch-" + id;
    Console.WriteLine($"Creating new branch '{branchName}'\n");
    db.Database.ExecuteSql($"call dolt_branch({branchName});");
}

Now that we've got a branch created, how do we use that branch? Before we can answer that, we have to take a quick detour and talk about why it's important to understand how your ORM's connection management works, particularly when you're working with Dolt branches... In a simple application where your application owns the connection to the database, it can be safe to simply call dolt_checkout() to switch your session over to use a different branch. However, with an ORM, things are slightly more complicated. ORMs typically manage a pool of database connections for your application, so that they can be reused across multiple requests to improve performance, since establishing a database connection is an expensive operation. With an ORM, where your application code doesn't directly own or manage the database connections, it can be dangerous to call dolt_checkout(), because that branch will still be the checked out branch when the ORM hands that same connection to a different part of your application to reuse. Your application probably won't be expecting that a different branch has been checked out, and it may lead to unexpected behavior!

So... how can we safely use Dolt branches with an ORM like EF Core that manages and reuses database connections? Before we look at the next section of code in Program.cs, let's go back and take a closer look at DoltDbContext. Notice DoltDbContext provides a default constructor as well a constructor that takes a Dolt branch name. In the examples above, we've only been using the default constructor, which defaults the checked out branch on our connections to be the main branch. However, we can use the alternate constructor to create a DbContext that targets a specific Dolt branch.

private string branchName;

public DoltDbContext() {
    this.branchName = "main";
}

public DoltDbContext(string branchName) {
    this.branchName = branchName;
}

This works because the OnConfiguring method in our DbContext implementation uses a feature of Dolt that lets you specify the branch name in the connection string. This "branch revision database" is formed by adding a forward slash after the database name and then adding the branch name. When we specify a database name in that format (<databaseName>/<branchName>) Dolt will automatically check out that branch for us. Additionally, EF Core pools connections by the database connection string, so because the connection string is different, this triggers EF Core to create a new, separate connection pool for this branch, and it prevents our application from getting a connection from the pool with the wrong branch checked out.

The code below shows how we form the connection string using the branch revision database notation:

var dbAndBranch = "dolt/" + this.branchName;
var connectionString = "server=localhost;user=root;password=;database=" + dbAndBranch + ";port=11229";

Okay, now we have all the pieces to safely use Dolt branches from EF Core, so let's see how we do it. The biggest difference in the sample code below is that when we create our DoltDbContext, we use the alternate constructor that allows us to specify the branch to connect to. Everything else is exactly the same as the previous examples we looked at!

// Now that we've created a new branch, we use a new DoltDbContext and specify the branch name.
// This causes the DbContext to use a different connection string, and forces our connection to
// use a specific branch of the database. 
// 
// Note that because our connections are being pooled and managed by the ORM, we don't want to
// use dolt_checkout() to change our branch, because that will change the branch for the shared
// connection, and when that connection is reused, the code that is using it probably won't 
// expect that it has been pointed to a different branch. 
using (var db = new DoltDbContext(branchName)) {
    var activeBranch = db.Database.SqlQuery<string>($"select active_branch();").ToList();
    Console.WriteLine($"Switched to branch: {activeBranch.First()}");

    // Add a new customer on this branch
    db.Add(new Customer { Name = $"Customer from {branchName}" });
    db.SaveChanges();

    // And create a Dolt commit on this branch
    db.Database.ExecuteSql($"call dolt_commit('-Am', 'inserting test data');");
}

Using Dolt Functionality with EF Core

We've already seen how you can access Dolt stored procedures such as dolt_commit() and dolt_branch(), and we also saw an example of calling Dolt's active_branch() function that returns a scalar value. There's LOTS of Dolt specific functionality that you access from your application, such as looking at the versioned history of your data or calculating diffs between different revisions of your data. In the example code below, we show how to use the dolt_diff_summary() table function to calculate a summary of the differences between our two branches. It shows us which tables have changed between the two branches and whether the changes were data changes, schema changes, or both.

// Using the main branch of our database again, we can query a summary of the differences between
// main and our new branch using the dolt_diff_summary() table function.
// https://docs.dolthub.com/sql-reference/version-control/dolt-sql-functions#dolt_diff_summary
using (var db = new DoltDbContext()) {
    var conn = db.Database.GetDbConnection();

    Console.WriteLine($"Changed tables between main and {branchName}:");
    using (var cmd = conn.CreateCommand()) {
        cmd.CommandText = $"select * from dolt_diff_summary('main', '{branchName}');";
        conn.Open();
        using (var reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
                string tableName  = reader.GetString(reader.GetOrdinal("from_table_name"));
                string diffType   = reader.GetString(reader.GetOrdinal("diff_type"));
                bool dataChange   = reader.GetBoolean(reader.GetOrdinal("data_change"));
                bool schemaChange = reader.GetBoolean(reader.GetOrdinal("schema_change"));

                string changeType = "";
                if (dataChange && schemaChange) {
                    changeType = "data and schema change";
                } else if (dataChange) {
                    changeType = "data change";
                } else if (schemaChange) {
                    changeType = "schema change";
                }
                Console.WriteLine($"  - {tableName} ({diffType} - {changeType})");
            }
        }
    }
}

This example is a little more interesting because we have relational data coming back from the table function, instead of just a scalar. If you wanted to tidy this code up for a real application, you could create a modeled entity for the returned schema from dolt_diff_summary() and you could use the EF Core API to easily map the result set back into that model. For this quick sample, we're dropping down to a slightly lower level SQL API to run the query and read the result set directly. This makes our code a little more verbose, but it's a good technique to know about.

Wrap Up

Microsoft's Entity Framework Core is a popular and powerful Object-Relational Mapping framework for .NET applications with a TON of great features. EF Core is a great way to work with relational databases from .NET applications, and you can easily work with Dolt databases through the MySQL support in EF Core. We've only scratched the surface of what you can do with EF Core and Dolt, but hopefully this post helps you get started.

If you hit any problems using Dolt with EF Core, we want to know about it so we can help! If you have ideas for other good example cases to show, or any feature suggestions for Dolt, I hope you'll hit us up on GitHub with an issue or swing by our Discord server and chat with us. Our dev team hangs out on Discord every day while we're working and we're always happy to talk about databases, versioning, and programming frameworks! 🤓

Blog