Getting Started with Dolt and Knex.js

INTEGRATION
20 min read

Dolt is the world's first version controlled SQL database. Dolt is MySQL-compatible so all the tools you use with MySQL work the same way with Dolt, including Knex.js (which is surprisingly pronounced /k'nex/ instead of /nex/ - Tim would like credit for being right for once).

Knex.js is a "batteries included" SQL query builder for many SQL databases, including MySQL. It features both traditional node style callbacks as well as a promise interface for cleaner async flow control, a stream interface, full-featured query and schema builders, transaction support (with savepoints), connection pooling and standardized responses between different query clients and dialects.

We have a few customers that use Knex.js in their applications and Dolt has received a few Knex-related GitHub issues over the years. This blog is inspired by Tim's Getting Started with SQLAlchemy and Dolt blog and will also walk through the Getting Started example from the Dolt README, except in Javascript using Knex.js and Node.

Dolt + Knex.js

TL;DR

If you don't want to run through this whole tutorial and just want the Javascript code, it is available in this GitHub repository. You can come with any Dolt SQL server and specify the connection information in a .env file in the repository root.

You must have Git and Node installed to run the code. This example uses a connection to a Hosted Dolt database, but the code will also work if you run a local dolt sql-server. If you are using a local dolt sql-server instead of Hosted Dolt you must also install Dolt.

You can find directions for running the code in the repository README. But essentially all you need to do is:

% npm install
% node index.js

This script will reset the database to its original state every time the script runs. The code shows off table creation, Dolt commits, reading Dolt system tables, rollback using Dolt reset, branching, and merging all using Knex.js.

Create a Hosted Dolt deployment

Unlike the SQLAlchemy example, this example will connect to a Dolt database that is cloud-hosted using our Hosted Dolt product.

First, go to the Create Deployment page. Choose whatever instance you want (we're using our $50 trial instance) and click Create Deployment. We're adding one read replica for a later example that utilizes read replicas.

Create deployment

Once the deployment has started, you'll see the connectivity information in the Database tab.

Knex example connectivity information

We then add this information to a .env file in the root of the example code repository.

DB_HOST="dolthub-knex-example.dbs.hosted.doltdb.com"
DB_PORT=3306
DB_USER="14i5lj63h3u8l4kv"
DB_PASSWORD="xxxxxxxxxxxxxxxxxxx"
DB_NAME="knextest"
DB_SSL_PATH="/hosted_ca.pem"

You will also need to download the Certificate and add it to the root of the repository. The file name should match the DB_SSL_PATH env.

As you can see, this configuration also includes a database named knextest. We can create this database from the Hosted Dolt UI by checking Enable workbench writes from the Workbench tab and clicking Create database.

Create database

Connect to the Hosted database

Now time to get into the Javascript. You can follow along with the code here. In the following sections we'll walk through the Javascript code and explain what it does and why.

The knex module itself is a function which takes a configuration object for Knex, accepting a few parameters. The client parameter is required and determines which client adapter will be used with the library. Dolt is MySQL-compatible so we can use either the "mysql" or "mysql2" client.

We use dotenv to load the environment variables we set earlier in the .env file. We will use these to create a connection pool to our Hosted database:

require("dotenv").config();
const knex = require("knex");
const fs = require("fs");

const poolConfig = { min: 0, max: 7 };

const config = {
  host: process.env.DB_HOST,
  port: process.env.DB_PORT,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,

  ssl: process.env.DB_SSL_PATH
    ? { ca: fs.readFileSync(__dirname + process.env.DB_SSL_PATH) }
    : false,
};

const db = knex({
  client: "mysql2",
  connection: config,
  pool: poolConfig,
});

Once we've created our Knex instance, we can start running queries. We'll start with a function that wraps the Dolt function active_branch().

printActiveBranch();

async function printActiveBranch() {
  const branch = await db.raw(`SELECT ACTIVE_BRANCH()`);
  console.log("Active branch:", branch[0][0]["ACTIVE_BRANCH()"]);
}

This prints:

Active branch: main

Notice that this function uses async/await. Promises are the preferred way of dealing with queries in Knex, as they allow you to return values from a fulfillment handler, which in turn become the value of the promise. The main benefit of promises are the ability to catch thrown errors without crashing the Node app, making your code behave like a .try / .catch / .finally in synchronous code.

Create tables

Now let's create some tables. In this example our database will have three tables: employees, teams, and employees_teams. We can utilize Knex's schema builder method createTable, which creates a new table on the database with a callback function to modify the table's structure.

setupDatabase();

async function setupDatabase() {
  await db.schema.createTable("employees", (table) => {
    table.integer("id").primary();
    table.string("last_name");
    table.string("first_name");
  });
  await db.schema.createTable("teams", (table) => {
    table.integer("id").primary();
    table.string("name");
  });
  await db.schema.createTable("employees_teams", (table) => {
    table.integer("employee_id").references("id").inTable("employees");
    table.integer("team_id").references("id").inTable("teams");
    table.primary(["employee_id", "team_id"]);
  });
}

Dolt supports foreign keys, secondary indexes, triggers, check constraints, and stored procedures. It's a modern, feature-rich SQL database.

Then to examine our changes, we run a SHOW TABLES query and print the results.

printTables();

async function printTables() {
  const res = await db.raw("SHOW TABLES");
  const tables = res[0]
    .map((table) => table[`Tables_in_${database}`])
    .join(", ");
  console.log("Tables in database:", tables);
}

This outputs:

Tables in database: employees, employees_teams, teams

The employees, teams, and employees_teams tables have been created.

One of the many benefits of using Hosted Dolt is that it comes with a SQL Workbench where you can view your data and changes across revisions. When you click on Launch SQL Workbench from the top right of your deployment page, you'll see the tables we just created, as well as an ER diagram that illustrates table relations.

Hosted ER Diagram

Make a Dolt commit

Next, we'll Dolt commit our new tables. Both Git and SQL have the concept of commits and since Dolt is a combination of Git and SQL it must support both. This can be confusing. A Dolt commit makes an entry in the commit log for versioning purposes. A SQL transaction commit is required to persist your database writes to disk so other connections can see them.

In order to make a Dolt commit, we use the DOLT_COMMIT() procedure. Dolt exposes version control write operations as procedures. The naming of these procedures follows the Git command line standard. git add on the Git command line becomes dolt add on the Dolt command line becomes DOLT_ADD() as a Dolt SQL procedure. Arguments mimic Git as well. If you know Git, you already know how to use Dolt.

Similar to SQLAlchemy, Knex.js doesn't have a query builder for procedures. We have to use a raw SQL statement using Knex's raw() method. For this method, you can specify a commit author and message, which are passed as arguments into the Dolt procedure. Like Git, Dolt has a staging area, so we include a -A option to add all tables before making a commit. The resulting code looks like this:

doltCommit("Taylor <taylor@dolthub.com>", "Created tables");

async function doltCommit(author, msg) {
  const res = await db.raw(`CALL DOLT_COMMIT('--author', ?, '-Am', ?)`, [
    author,
    msg,
  ]);
  console.log("Created commit:", res[0][0].hash);
}

And running it results in the output:

Created commit: 80ks3k1ook712vauvavdnnd6t3q86e3d

Examine the commit log

Let's examine the Dolt commit log. Dolt version control read operations are exposed in SQL as custom system tables or functions. The commit log can be read using the dolt_log system table, named after the git log and dolt log command line equivalents.

We can use Knex's query builder select method to select and order the log elements we want to print. The resulting code looks like this:

printCommitLog();

async function printCommitLog() {
  const res = await db
    .select("commit_hash", "committer", "message")
    .from("dolt_log")
    .orderBy("date", "desc");
  console.log("Commit log:");
  res.forEach((log) =>
    console.log(`  ${log.commit_hash}: ${log.message} by ${log.committer}`)
  );
}

And it outputs:

Commit log:
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account

You can also check your commits in the Hosted Workbench:

Hosted Commit Log

Insert data

Now we're going to populate the tables with some data. We can use the insert method and optionally specify conflict behavior using onConflict().merge() or onConflict().ignore().

insertData();

async function insertData() {
  await db("employees").insert([
    { id: 0, last_name: "Sehn", first_name: "Tim" },
    { id: 1, last_name: "Hendriks", first_name: "Brian" },
    { id: 2, last_name: "Son", first_name: "Aaron" },
    { id: 3, last_name: "Fitzgerald", first_name: "Brian" },
  ]);

  await db("teams").insert([
    { id: 0, name: "Engineering" },
    { id: 1, name: "Sales" },
  ]);

  await db("employees_teams").insert([
    { employee_id: 0, team_id: 0 },
    { employee_id: 1, team_id: 0 },
    { employee_id: 2, team_id: 0 },
    { employee_id: 0, team_id: 1 },
    { employee_id: 3, team_id: 1 },
  ]);
}

We can make sure our inserts worked by displaying a summary table. Knex.js comes with a query builder that supports many types of complex SQL queries. In this example, we'll construct a three table join. Later in this example we'll change the schema, so we'll account for that in this function.

printSummaryTable();

async function printSummaryTable() {
  // Get all employees columns because we change the schema
  const colInfo = await db("employees").columnInfo();
  const employeeCols = Object.keys(colInfo)
    .filter((col) => col !== "id")
    .map((col) => `employees.${col}`);

  // Dolt supports up to 12 table joins. Here we do a 3 table join.
  const res = await db
    .select("teams.name", ...employeeCols)
    .from("employees")
    .join("employees_teams", "employees.id", "employees_teams.employee_id")
    .join("teams", "teams.id", "employees_teams.team_id")
    .orderBy("teams.name", "asc");

  console.log("Summary:");
  res.forEach((row) => {
    let startDate = "";
    if ("start_date" in row) {
      if (row.start_date === null) {
        startDate = "None";
      } else {
        const d = new Date(row.start_date);
        startDate = d.toDateString();
      }
    }
    console.log(
      `  ${row.name}: ${row.first_name} ${row.last_name} ${startDate}`
    );
  });
}

Which results in the following output:

Summary:
  Engineering: Tim Sehn
  Engineering: Brian Hendriks
  Engineering: Aaron Son
  Sales: Tim Sehn
  Sales: Brian Fitzgerald

Examine the status and diff

You can use the dolt_status system table to see what tables changed.

printStatus();

async function printStatus() {
  const res = await db.select("*").from("dolt_status");
  console.log("Status:");
  if (res.length === 0) {
    console.log("  No tables modified");
  } else {
    res.forEach((row) => {
      console.log(`  ${row.table_name}: ${row.status}`);
    });
  }
}

The resulting output looks like:

Status:
  employees_teams: modified
  employees: modified
  teams: modified

Now, that I see which tables changed and how, I want to see what rows changed in a particular table. Dolt is built from the ground up to provide fast diffs between table versions even for very large tables. In Dolt, there are a few ways to view diffs: a dolt_diff_<table> system table for each user defined table and a dolt_diff() table function. We filter the diff table down to only WORKING changes so we only see changes that aren't staged or committed.

printDiff("employees");

async function printDiff(table) {
  const res = await db
    .select("*")
    .from(`dolt_diff_${table}`)
    .where("to_commit", "WORKING");
  console.log(`Diff for ${table}:`);
  console.log(res);
}

The resulting output looks like:

Diff for employees:
[
  {
    to_id: 0,
    to_last_name: 'Sehn',
    to_first_name: 'Tim',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: null,
    from_last_name: null,
    from_first_name: null,
    from_commit: '9iuopee4d7qdkbo8m8vn69j5qq9ee871',
    from_commit_date: 2023-09-20T05:59:42.672Z,
    diff_type: 'added'
  },
  {
    to_id: 1,
    to_last_name: 'Hendriks',
    to_first_name: 'Brian',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: null,
    from_last_name: null,
    from_first_name: null,
    from_commit: '9iuopee4d7qdkbo8m8vn69j5qq9ee871',
    from_commit_date: 2023-09-20T05:59:42.672Z,
    diff_type: 'added'
  },
  {
    to_id: 2,
    to_last_name: 'Son',
    to_first_name: 'Aaron',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: null,
    from_last_name: null,
    from_first_name: null,
    from_commit: '9iuopee4d7qdkbo8m8vn69j5qq9ee871',
    from_commit_date: 2023-09-20T05:59:42.672Z,
    diff_type: 'added'
  },
  {
    to_id: 3,
    to_last_name: 'Fitzgerald',
    to_first_name: 'Brian',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: null,
    from_last_name: null,
    from_first_name: null,
    from_commit: '9iuopee4d7qdkbo8m8vn69j5qq9ee871',
    from_commit_date: 2023-09-20T05:59:42.672Z,
    diff_type: 'added'
  }
]

Before we go onto the next section let's Dolt commit our changes.

await doltCommit("Tim <tim@dolthub.com>", "Inserted data into tables");
await printCommitLog();

And you'll see our new commit in the log:

Created commit: flfiptf13lc7chlvbkikqtss3r02k2se
Commit log:
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account

Rolling back a mistake

Dolt has powerful rollback capabilities. Let's imagine I accidentally drop a table. The foreign keys will prevent me from dropping employees or teams, but employees_teams is not safe from an accident.

dropTable("employees_teams");

async function dropTable(table) {
  await db.schema.dropTable(table);
}

As we can see from status and SHOW TABLES it is gone.

await printStatus();
await printTables();
Status:
  employees_teams: deleted
Tables in database: employees, teams

In a traditional database, this could be disastrous. In Dolt, we can get it back with a simple call dolt_reset('hard'). This function takes an optional commit. If no commit is specified it resets to the HEAD commit.

await doltResetHard();
await printStatus();
await printTables();

async function doltResetHard(commit) {
  if (commit) {
    await db.raw(`CALL DOLT_RESET('--hard', ?)`, [commit]);
    console.log("Resetting to commit:", commit);
  } else {
    await db.raw(`CALL DOLT_RESET('--hard')`);
    console.log("Resetting to HEAD");
  }
}
Resetting to HEAD
Status:
  No tables modified
Tables in database: employees, employees_teams, teams

Dolt makes operating databases less error-prone. You can always back out changes you have in progress or rewind to a known good state.

Change data on a branch

Dolt is the only SQL database with branches and merges. We will create and switch branches, and then make some changes and commit them to this new branch. Later, we'll merge all the changes together. Think of a branch as a really long SQL transaction.

First, you need to create a branch. Creating a branch is a write so you do it with a procedure, dolt_branch(). In the Javascript code, we also consult the dolt_branches system table to make sure the branch does not already exist. Then we use dolt_checkout() to switch branches.

await createBranch("modify_data");
await checkoutBranch("modify_data");

async function getBranch(branch) {
  return db.select("name").from("dolt_branches").where("name", branch);
}

async function createBranch(branch) {
  const res = await getBranch(branch);
  if (res.length > 0) {
    console.log("Branch exists:", branch);
  } else {
    await db.raw(`CALL DOLT_BRANCH(?)`, [branch]);
    console.log("Created branch:", branch);
  }
}

async function checkoutBranch(branch) {
  await db.raw(`CALL DOLT_CHECKOUT(?)`, [branch]);
  console.log("Using branch:", branch);
}

Now that we're on a new branch, it's safe to make changes and the main branch will remain unchanged. We are going to use a transaction to insert, update, and delete using the Knex query builder. All queries within a transaction are executed on the same database connection, and run the entire set of queries as a single unit of work. Any failure will mean the database will rollback any queries executed on that connection to the pre-transaction state.

modifyData();

async function modifyData() {
  try {
    await db.transaction(async (trx) => {
      await trx("employees")
        .where("first_name", "Tim")
        .update("first_name", "Timothy");

      await trx("employees").insert({
        id: 4,
        last_name: "Bantle",
        first_name: "Taylor",
      });

      await trx("employees_teams").insert({
        employee_id: 4,
        team_id: 0,
      });

      await trx("employees_teams")
        .where("employee_id", 0)
        .where("employee_id", 1)
        .del();
    });
  } catch (err) {
    // Rolls back transaction
    console.error(err);
  }
}

Let's inspect what we've done to make sure it looks right.

await printStatus();
await printDiff("employees");
await printDiff("employees_teams");
await printSummaryTable();
Status:
  employees: modified
  employees_teams: modified
Diff for employees:
[
  {
    to_id: 0,
    to_last_name: 'Sehn',
    to_first_name: 'Timothy',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: 0,
    from_last_name: 'Sehn',
    from_first_name: 'Tim',
    from_commit: 'bu68epe1k4el46l9cfsdishuf5665q7m',
    from_commit_date: 2023-09-20T06:20:43.351Z,
    diff_type: 'modified'
  },
  {
    to_id: 4,
    to_last_name: 'Bantle',
    to_first_name: 'Taylor',
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: null,
    from_last_name: null,
    from_first_name: null,
    from_commit: 'bu68epe1k4el46l9cfsdishuf5665q7m',
    from_commit_date: 2023-09-20T06:20:43.351Z,
    diff_type: 'added'
  }
]
Diff for employees_teams:
[
  {
    to_employee_id: 4,
    to_team_id: 0,
    to_commit: 'WORKING',
    to_commit_date: null,
    from_employee_id: null,
    from_team_id: null,
    from_commit: 'bu68epe1k4el46l9cfsdishuf5665q7m',
    from_commit_date: 2023-09-20T06:20:43.351Z,
    diff_type: 'added'
  }
]
Summary:
  Engineering: Brian Hendriks
  Engineering: Aaron Son
  Engineering: Taylor Bantle
  Sales: Timothy Sehn
  Sales: Brian Fitzgerald

I am added to the engineering team on the modify_data branch. Tim is no longer on the Sales team.

Finally, let's commit these changes so we can make different changes on another branch.

await doltCommit("Brian <brian@dolthub.com>", "Modified data on branch");
await printCommitLog();
Created commit: j9taft0oie0kg1rhss4glvvusv0tdao9
Commit log:
  j9taft0oie0kg1rhss4glvvusv0tdao9: Modified data on branch by Brian
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account

Change schema on another branch

We're going to make a schema change on another branch and make some data modifications. Below, you'll see we check out the main branch so the new branch has the correct base branch. Then, we create a new branch called modify_schema and run the modify_schema() function, which uses a transaction to add a start date column and populate it. We finally use status and diff to show off what changed.

await checkoutBranch("main");
await createBranch("modify_schema");
await checkoutBranch("modify_schema");
await printActiveBranch();
await modifySchema();
await printStatus();
await printDiff("employees");
await printSummaryTable();

async function modifySchema() {
  try {
    await db.transaction(async (trx) => {
      await trx.schema.alterTable("employees", (table) => {
        table.date("start_date");
      });

      await trx("employees").where("id", 0).update("start_date", "2018-08-06");
      await trx("employees").where("id", 1).update("start_date", "2018-08-06");
      await trx("employees").where("id", 2).update("start_date", "2018-08-06");
      await trx("employees").where("id", 3).update("start_date", "2021-04-19");
    });
  } catch (err) {
    // Rolls back transaction
    console.error(err);
  }
}

This outputs the following:

Using branch: main
Created branch: modify_schema
Using branch: modify_schema
Active branch: modify_schema
Status:
  employees: modified
Diff for employees:
[
  {
    to_id: 0,
    to_last_name: 'Sehn',
    to_first_name: 'Tim',
    to_start_date: 2018-08-06T07:00:00.000Z,
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: 0,
    from_last_name: 'Sehn',
    from_first_name: 'Tim',
    from_start_date: null,
    from_commit: '3li6nho3o9891ersocjh4q3l371uj880',
    from_commit_date: 2023-09-20T06:28:42.459Z,
    diff_type: 'modified'
  },
  {
    to_id: 1,
    to_last_name: 'Hendriks',
    to_first_name: 'Brian',
    to_start_date: 2018-08-06T07:00:00.000Z,
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: 1,
    from_last_name: 'Hendriks',
    from_first_name: 'Brian',
    from_start_date: null,
    from_commit: '3li6nho3o9891ersocjh4q3l371uj880',
    from_commit_date: 2023-09-20T06:28:42.459Z,
    diff_type: 'modified'
  },
  {
    to_id: 2,
    to_last_name: 'Son',
    to_first_name: 'Aaron',
    to_start_date: 2018-08-06T07:00:00.000Z,
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: 2,
    from_last_name: 'Son',
    from_first_name: 'Aaron',
    from_start_date: null,
    from_commit: '3li6nho3o9891ersocjh4q3l371uj880',
    from_commit_date: 2023-09-20T06:28:42.459Z,
    diff_type: 'modified'
  },
  {
    to_id: 3,
    to_last_name: 'Fitzgerald',
    to_first_name: 'Brian',
    to_start_date: 2021-04-19T07:00:00.000Z,
    to_commit: 'WORKING',
    to_commit_date: null,
    from_id: 3,
    from_last_name: 'Fitzgerald',
    from_first_name: 'Brian',
    from_start_date: null,
    from_commit: '3li6nho3o9891ersocjh4q3l371uj880',
    from_commit_date: 2023-09-20T06:28:42.459Z,
    diff_type: 'modified'
  }
]
Summary:
  Engineering: Brian Hendriks Mon Aug 06 2018
  Engineering: Aaron Son Mon Aug 06 2018
  Sales: Tim Sehn Mon Aug 06 2018
  Sales: Brian Fitzgerald Mon Apr 19 2021

As you can see our defensive coding in the Insert data section paid off and employee start dates are displayed. This looks good so we'll commit it.

await doltCommit("Taylor <taylor@dolthub.com>", "Modified schema on branch");
await printCommitLog();
Created commit: 4j10oomkf2vp1309eoo01bq4ha4t6iq2
Commit log:
  4j10oomkf2vp1309eoo01bq4ha4t6iq2: Modified schema on branch by Taylor
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account

Merge it all together

Now we will merge all the branches together and show the resulting summary table. To merge, use the procedure dolt_merge().

await checkoutBranch("main");
await printActiveBranch();
await printCommitLog();
await printSummaryTable();
await doltMerge("modify_data");
await printSummaryTable();
await printCommitLog();
await doltMerge("modify_schema");
await printSummaryTable();
await printCommitLog();

async function doltMerge(branch) {
  const res = await db.raw(`CALL DOLT_MERGE(?)`, [branch]);
  console.log("Merge complete for ", branch);
  console.log(`  Commit: ${res[0][0].hash}`);
  console.log(`  Fast forward: ${res[0][0].fast_forward}`);
  console.log(`  Conflicts: ${res[0][0].conflicts}`);
}

This outputs the following. You can see the data and schema evolving as we merge.

Using branch: main
Active branch: main
Commit log:
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account
Summary:
  Engineering: Brian Hendriks
  Engineering: Aaron Son
  Sales: Tim Sehn
  Sales: Brian Fitzgerald
Merge complete for  modify_data
  Commit: 9he3pav122o3qmh7vlm58muv0ie2600j
  Fast forward: 1
  Conflicts: 0
Summary:
  Engineering: Brian Hendriks
  Engineering: Aaron Son
  Engineering: Taylor Bantle
  Sales: Timothy Sehn
  Sales: Brian Fitzgerald
Commit log:
  j9taft0oie0kg1rhss4glvvusv0tdao9: Modified data on branch by Brian
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account
Merge complete for  modify_schema
  Commit: fjfuj3h9nuap933qtt0aa4fske2h8r6b
  Fast forward: 0
  Conflicts: 0
Summary:
  Engineering: Brian Hendriks Mon Aug 06 2018
  Engineering: Aaron Son Mon Aug 06 2018
  Engineering: Taylor Bantle None
  Sales: Timothy Sehn Mon Aug 06 2018
  Sales: Brian Fitzgerald Mon Apr 19 2021
Commit log:
  fjfuj3h9nuap933qtt0aa4fske2h8r6b: Merge branch 'modify_schema' into main by 14i5lj63h3u8l4kv
  4j10oomkf2vp1309eoo01bq4ha4t6iq2: Modified schema on branch by Taylor
  j9taft0oie0kg1rhss4glvvusv0tdao9: Modified data on branch by Brian
  flfiptf13lc7chlvbkikqtss3r02k2se: Inserted data into tables by Tim
  80ks3k1ook712vauvavdnnd6t3q86e3d: Created tables by Taylor
  dtn5h4oquovaes0n719b3sagbecbkp1h: Initialize data repository by Dolt System Account

You'll also see the final results in the Hosted Workbench:

Workbench employees table

Notice the first merge was a fast-forward merge just like in Git. Dolt will detect schema and data conflicts if you make them.

Bonus Section: Using a read replica

Hosted Dolt now supports read replicas, a read-only server you can run queries against. Every write to the primary server is replicated to the read replicas. This allows you to scale your read traffic horizontally.

Some SQL query builders and ORMs (such as Sequelize) support specifying one or more read replicas and will automatically route read traffic to the replicas. Knex.js does not have this feature yet. To accomplish something similar, we create two Knex instances - one for the primary and one for the read replica - and explicitly send read queries to the read replica. You can follow along on this branch in our example repository.

First, you'll need to specify the read replica host in the .env file.

DB_READ_REPLICA_HOST="r01-dolthub-knex-example.dbs.hosted.doltdb.com"

And then create an additional Knex instance using that host.

const db = knex({
  client: "mysql2",
  connection: { ...config, host: process.env.DB_HOST },
  pool: poolConfig,
});
const readDb = knex({
  client: "mysql2",
  connection: { ...config, host: process.env.DB_READ_REPLICA_HOST },
  pool: poolConfig,
});

Every time we write to the database using the primary instance, we will see it reflected in the read replica instance. For example, after we create our three tables in the setupDatabase function, we can use the read replica instance to show the tables.

await setupDatabase(); // Uses `db`
await printTables(); // Uses `readDb`

async function printTables() {
  const res = await readDb.raw("SHOW TABLES");
  const tables = res[0]
    .map((table) => table[`Tables_in_${database}`])
    .join(", ");
  console.log("Tables in database:", tables);
}

And it would output the tables we created:

Tables in database: employees, employees_teams, teams

Managing branches in your read replica

Unlike other SQL databases, Dolt has branches. The main challenge with the code we wrote without replicas is making sure the primary and read replica instances are pointing at the same branch when we run read and write queries. For example, if we create a new branch modify-data and check it out on the primary instance, printActiveBranch using the read replica will still show the main branch.

The code:

await checkoutBranch("main"); // Uses db
await printActiveBranch(); // Uses readDb
await checkoutBranch("modify-data");
await printActiveBranch();

Outputs:

Using branch: main
Active branch: main
Using branch: modify-data
Active branch: main

But we want it to output:

Using branch: main
Active branch: main
Using branch: modify-data
Active branch: modify-data

There are a few ways we can keep our read replica active branch in sync with our primary active branch, which are documented here.

The simplest option (which we use in the example code) is to run CALL DOLT_CHECKOUT() on the read replica every time we run it on the primary.

async function checkoutBranch(branch) {
  await db.raw(`CALL DOLT_CHECKOUT(?)`, [branch]);
  await readDb.raw(`CALL DOLT_CHECKOUT(?)`, [branch]);
  console.log("Using branch:", branch);
}

This would also work with a USE statement.

async function checkoutBranch(branch) {
  await db.raw(`USE \`${database}/${branch}\``);
  await readDb.raw(`USE \`${database}/${branch}\``);
  console.log("Using branch:", branch);
}

However, in some cases you may not want to change the active branch on the read replica and instead quickly compare changes between revisions. We can use fully-qualified references with database revisions or AS OF clauses to do so.

For example, if we accidentally wrote some code that drops a table on branch oops, we may want to check that the mistake wasn't committed to that branch and that main wasn't also affected.

await checkoutBranch("oops");
await printActiveBranch();
await dropTable("employees_teams");
await printTables("WORKING");
await printTables("oops");
await printTables("main");
await printActiveBranch();

async function printTables(activeBranch) {
  const res = await readDb.raw("SHOW TABLES AS OF ?", activeBranch);
  const tables = res[0]
    .map((table) => table[`Tables_in_${database}`])
    .join(", ");
  console.log("Tables in database:", tables);
}

Which will output:

Active branch: oops
Tables in database: employees, teams
Tables in database: employees, employees_teams, teams
Tables in database: employees, employees_teams, teams
Active branch: oops

Only our working set was affected and we can easily CALL DOLT_CHECKOUT('employees_teams') to reverse the mistake.

Conclusion

Congratulations on making it this far! You can now start building your own Dolt application using Node and Knex.js. If you have any questions or need more help getting starting, stop by our Discord.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.