Dolt and Ecto/Elixir
Dolt is a SQL database with Git-style versioning. A couple of months ago, our team was introduced to an engineering team that wanted to use Elixir with Dolt. Elixir is a "dynamic, functional language", based off the Erlang VM. We thought it was pretty cool to use Elixir with Dolt, so we decided to blog about it!
Elixir specifically uses the Ecto package to interact with databases. Ecto is a "database wrapper and query generator for Elixir." Ecto is primarily used with Postgres and MySQL databases. Today we'll walk through the process of connecting Dolt, which is MySQL compatible, with Ecto. Specifically, we will create Dolt primitives of branching, committing, and diffing for Ecto applications. This makes it easier for Elixir developers to bring version control properties to their applications.
Getting Started
Let's start by creating a new project with mix. Or if you want to jump ahead you can check out the final code here
mix new ecto_blog --module Decto
This creates a directory named ecto_blog
with our main module Repo being named Decto. In the dependencies portion of
mix.exs
be sure to add ecto_sql
and myxql
. For example, my dependencies look like this
defp deps do
[
{:ecto_sql, "~> 3.0"},
{:myxql, "~> 0.5.0"}
]
end
Now let's configure our Ecto Database connection. In the same directory of your mix project run the following command
mkdir decto && cd decto && dolt init
This will initialize a dolt database in a directory titled decto
. We can use the following configuration file (stored
in the decto directory) to work correctly with dolt.
log_level: warning
behavior:
read_only: false
user:
name: "root"
password: ""
listener:
host: localhost
port: 3306
max_connections: 128
read_timeout_millis: 28800000
write_timeout_millis: 28800000
Now start your server in another terminal with dolt sql-server --config=config.yaml
. Let's take a step back to see what
exactly we did so far. We just set up and started a MySQL compatible database that can scale to 100s of GBs in one single
command.
Configuring Ecto
To set up Ecto, create the file config/config.exs
in your mix project directory. Add the following lines of code
import Config
config :ecto_blog, Decto.Repo,
database: "decto",
username: "root",
password: "",
hostname: "localhost",
port: 3306
config :ecto_blog, ecto_repos: [Decto.Repo]
Then in lib/decto.ex
add the following module
defmodule Decto.Repo do
use Ecto.Repo,
otp_app: :ecto_blog,
adapter: Ecto.Adapters.MyXQL
end
The first module represents our database configuration. The second sets up an Ecto Repo with the MyXQL adapter.
Let's run the following steps to get quickly started:
mix compile
iex -S mix
- Now in the iex terminal run
ecto.Repo.start_link()
you should see something like this
Interactive Elixir (1o.12.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Decto.Repo.start_link()
{:ok, #PID<0.1297.0>}
Note: You should not run mix ecto.create
as dolt init
already creates the database for you. ecto.create
will just
create an inmemory database instead
A Simple Migration
Let's create a simple migration file that models a Person schema.
Now let's walk through the process of using an Ecto Schema for working with Dolt specific functions. Specifically let's
create a table of people
. Run the command mix ecto.gen.migration create_people
. In the new migrations file fill in the
following change function.
def change do
create table(:people) do
add :first_name, :string
add :last_name, :string
add :age, :integer
end
end
Now run mix ecto.migrate
. You should see the table created
Creating Dolt Schemas
Let's add the following modules to our lib/decto.ex
file.
defmodule Decto.Branch do
use Ecto.Schema
alias Decto.{Repo}
@primary_key {:name, :string, autogenerate: :false}
schema "dolt_branches" do
field :hash, :string
field :latest_committer, :string
field :latest_committer_email, :string
field :latest_commit_date, :utc_datetime
field :latest_commit_message, :string
end
def active_branch() do
Repo.query!("SELECT active_branch();")
def checkout_new(branch) do
Repo.query!("SELECT DOLT_CHECKOUT('-b', '#{branch}');")
end
def checkout_existing(branch) do
Repo.query!("SELECT DOLT_CHECKOUT('#{branch}');")
end
def merge(branch) do
Repo.query!("SELECT DOLT_MERGE('#{branch}');")
end
end
defmodule Decto.Commit do
use Ecto.Schema
alias Decto.{Repo}
@primary_key {:commit_hash, :string, autogenerate: :false}
schema "dolt_log" do
field :committer, :string
field :email, :string
field :date, :utc_datetime
field :message, :string
end
def commit(message) do
Repo.query!("SELECT DOLT_COMMIT('-a', '-m', '#{message}');")
end
end
defmodule Decto.Diff do
alias Decto.{Repo}
def diff_table_between_branches(table, branch1, branch2) do
Repo.query!("SELECT * FROM dolt_commit_diff_#{table} where from_commit=HASHOF('#{branch1}') and to_commit = HASHOF('#{branch2}')")
end
end
defmodule Decto.App do
alias Decto.{Repo, Person}
def insertPerson(first_name, last_name, age) do
person = %Person{first_name: first_name, last_name: last_name, age: age}
Repo.insert(person)
end
end
The first module is our Branches module which models the schema of the dolt_branches
system table. It includes functions
for returning the current branch, checking out a new branch. checking out an existing branch and merging. The second
module is the Commit module which models the dolt_log
system table. It has additional functions for writing a commit
to the database. Finally, we have a small Diff module that includes a function for reading branch level diffs.
Let's write some code to see why this is useful.
Example 1 - Querying all Commits
Let's fire up our iex terminal and write a query to see why defining these schemas are useful.
ecto_blog iex -S mix
Erlang/OTP 24 [erts-12.0.2] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit] [dtrace]
Interactive Elixir (1.12.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Decto.Repo.start_link()
{:ok, #PID<0.221.0>}
iex(2)> Decto.Commit |> Decto.Repo.all
16:45:26.866 [debug] QUERY OK source="dolt_log" db=1.6ms decode=0.9ms queue=3.7ms idle=968.5ms
SELECT d0.`commit_hash`, d0.`committer`, d0.`email`, d0.`date`, d0.`message` FROM `dolt_log` AS d0 []
[
%Decto.Commit{
__meta__: #Ecto.Schema.Metadata<:loaded, "dolt_log">,
commit_hash: "illa3139h980fpd889n3g9chp160db7s",
committer: "vinai",
date: ~U[2021-07-16 00:00:16Z],
email: "vinai@dolthub.com",
message: "Initialize data repository"
}
]
In one line we were able to list every single commit we had in our database.
Example 2 - Branching and diffing
Let's go through a more complex example where we create some branches and show a diff.
- Start the connection to our db
iex(1)> Decto.Repo.start_link()
{:ok, #PID<0.221.0>}
- Insert a new Person
iex(2)> Decto.App.insertPerson("Vinai", "Rachakonda", 22)
17:08:02.704 [debug] QUERY OK db=4.0ms decode=1.3ms queue=3.3ms idle=1433.9ms
INSERT INTO `people` (`age`,`first_name`,`last_name`) VALUES (?,?,?) [22, "Vinai", "Rachakonda"]
{:ok,
%Decto.Person{
__meta__: #Ecto.Schema.Metadata<:loaded, "people">,
age: 22,
first_name: "Vinai",
id: 1,
last_name: "Rachakonda"
}}
- Insert a commit into master
iex(3)> Decto.Commit.commit("Insert the person Vinai")
17:08:24.970 [debug] QUERY OK db=14.5ms queue=0.6ms idle=1703.9ms
SELECT DOLT_COMMIT('-a', '-m', 'Insert the person Vinai'); []
%MyXQL.Result{
columns: ["DOLT_COMMIT('-a', '-m', 'Insert the person Vinai')"],
connection_id: 36,
last_insert_id: nil,
num_rows: 1,
num_warnings: 0,
rows: [["n2qph817064pg97c1i44br9qgdariefu"]]
}
- Checkout to a new branch
iex(4)> Decto.Branch.checkout_new("test-branch")
17:08:48.653 [debug] QUERY OK db=5.5ms queue=0.8ms idle=1395.1ms
SELECT DOLT_CHECKOUT('-b', 'test-branch'); []
%MyXQL.Result{
columns: ["DOLT_CHECKOUT('-b', 'test-branch')"],
connection_id: 37,
last_insert_id: nil,
num_rows: 1,
num_warnings: 0,
rows: [[0]]
}
- Insert into our new branch
iex(5)> Decto.App.insertPerson("John", "Doe", 35)
17:09:16.752 [debug] QUERY OK db=3.8ms queue=1.2ms idle=1495.4ms
INSERT INTO `people` (`age`,`first_name`,`last_name`) VALUES (?,?,?) [35, "John", "Doe"]
{:ok,
%Decto.Person{
__meta__: #Ecto.Schema.Metadata<:loaded, "people">,
age: 35,
first_name: "John",
id: 2,
last_name: "Doe"
}}
- Commit on our new branch
iex(6)> Decto.Commit.commit("Insert John Doe")
17:09:33.321 [debug] QUERY OK db=13.4ms queue=1.1ms idle=1055.6ms
SELECT DOLT_COMMIT('-a', '-m', 'Insert John Doe'); []
%MyXQL.Result{
columns: ["DOLT_COMMIT('-a', '-m', 'Insert John Doe')"],
connection_id: 33,
last_insert_id: nil,
num_rows: 1,
num_warnings: 0,
rows: [["ils75lrdaumoc13n10lfsg5ol884ov40"]]
}
- Query a diff between our two branches
iex(7)> Decto.Diff.diff_table_between_branches("people", "master", "test-branch")
17:10:53.620 [debug] QUERY OK db=3.5ms queue=2.3ms idle=1362.3ms
SELECT * FROM dolt_commit_diff_people where from_commit=HASHOF('master') and to_commit = HASHOF('test-branch') []
%MyXQL.Result{
columns: ["to_last_name", "to_age", "to_first_name", "to_id", "to_commit",
"to_commit_date", "from_last_name", "from_age", "from_first_name", "from_id",
"from_commit", "from_commit_date", "diff_type"],
connection_id: 36,
last_insert_id: nil,
num_rows: 1,
num_warnings: 0,
rows: [
[nil, nil, nil, nil, "n2qph817064pg97c1i44br9qgdariefu",
~N[2021-07-16 00:08:24.956000], "Doe", 35, "John", 2,
"ils75lrdaumoc13n10lfsg5ol884ov40", ~N[2021-07-16 00:09:33.308000],
"removed"]
]
}
Conclusion
In this blog post we went through the basic steps of integrating Ecto and Dolt. You can now bring the benefits of a version controlled database to any one of your favorite Elixir applications.
The most important part was defining the core primitives of Dolt (Commit, Diff, Branching) with the expressive nature of Ecto. By modeling these primitives it becomes much easier to bring Git-behavior to your application. There's still plenty more work to be done to make Dolt better with Ecto, specifically with multi-tenancy. If you're interested in using Dolt please join our Discord here.