Dolt + Metabase
A customer wanted to use Metabase to visualize Dolt databases. Dolt provides powerful versioning features to Metabase charts and dashboards. Dolt "just works" with Metabase using the MySQL connector so it's really easy to get started.
This blog will show you:
- How to set up Metabase and Dolt to work together.
- How to use the version control features of Dolt with Metabase.
Why Dolt + Metabase?
Dolt adds a versioning layer to your data or metrics. As a data scientist or analyst, you inspect a dashboard or table and something looks fishy. Was this number always 10? I thought I saw 8 the last time I ran this report? When did that change? Dolt allows your data scientists and data analysts to answer audit questions on your data and metrics by providing an audit log of every cell in the database: who changed what, when? Data versioning will improve the efficiency of your data team.
In this scenario, Dolt would version the tables output by your data warehouse jobs, think nightly metrics runs. This is a natural place to hook versioned Dolt databases up to your dashboarding infrastructure, like Metabase.
In addition to auditing metrics, Dolt makes your metrics layer safe for writes. Want to preserve a forecasting model but not have it affect production data? Do it on a branch. Branches can greatly improve data team collaboration. Data analyses that currently happen in spreadsheets or notebooks can be shared and collaborated on more easily.
Installing Metabase
You can install Metabase a bunch of different ways. I went with the jar file because it's "the simplest and most basic way of running Metabase".
I downloaded the Metabase jar. I copied it into ~/dolthub/metabase/
and then ran:
$ java -jar metabase.jar
My Java is installed using Homebrew and I initially got this error message:
$ java -jar metabase.jar
The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.
After some quick Googling, the internet told me to make this symlink:
sudo ln -sfn /opt/homebrew/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
Then it worked! Now I have a clean Metabase on localhost:3000
.
Connecting to Dolt
Because Metabase is a dashboarding tool, I made a simple_example database with some synthetic Daily Active User and Monthly Active user metrics. I'll use it when showing off branches later in this article.
I also grabbed one larger database from DoltHub using dolt clone
to see how Metabase does with larger databases: museum-collections. I placed both these databases in the same directory on my laptop. Dolt makes it really easy to copy and sync databases.
In that directory, I run dolt sql-server
to start a MySQL compatible Dolt server on port 3306. Connecting to this server will allow me to access those two databases. See our Version Controlled Database Getting Started if you need help with this.
$ cd ~
$ cd dolthub
$ mkdir metabase-dbs
$ cd metabase-dbs
$ dolt clone timsehn/simple_example
cloning https://doltremoteapi.dolthub.com/timsehn/simple_example
$ dolt clone dolthub/museum-collections
cloning https://doltremoteapi.dolthub.com/dolthub/museum-collections
...
...
$ dolt sql-server
Starting server with Config HP="localhost:3306"|T="28800000"|R="false"|L="info"|S="/tmp/mysql.sock"
Now, I head over to Metabase to set up my database connections. First I need to register. I click the "Let's Get Started" button and follow the steps. After creating my user, I'm asked to set up a database:
I click MySQL and now I need to enter connection information. I am going to start by connecting to the museum-collections
database. I enter localhost
as my host, leave the port as 3306, and use the root
user with no password.
Because I'm nice, I let Metabase collect usage events. I click "Finish" and then Metabase wants to send me email. I politely skip that step and say "Take Me to Metabase". Voila!
Metabase seems to be working. Let's see what they can do with this database by clicking "⚡️ A look at Objects". I then select the "Museum Collections" database and see this page:
Metabase detected this table had geographical data in it and created a map. Very cool!
Creating a Visualization
Now we get to some of the cool visualization and dashboarding features of Metabase. Let's try and recreate the Museum Objects by Country visualization Metabase made for us automagically.
To do this I click the "+ New" button in the top right corner. Then, I select the "* Question" option. I then select the "Museum Collections" database and the "Objects" table. The following page is rendered.
Now, I want the count of objects by country rendered on a map. In the Summarize section, I click "Pick the metric you want to see" and select the "Count of rows" option. Then I click "Pick a column to group by" and select "Institution Country". Finally, I click "Visualize".
Pretty easy and intuitive and works just fine with Dolt via the MySQL connector.
Viewing historical data using AS OF queries
Now on to the Dolt features. It was pretty easy to get Dolt features working in Metabase. Metabase allows you to run raw SQL queries so doing a call dolt_checkout()
on a branch or select * from table as of '<commit>'
worked fine.
To build the museum-collections database, we paid bounty hunters to scrape collection data from museum websites around the world and put the results in this database. Let's look at that same map about half way through the bounty.
First let's use SQL to inspect the dolt_log
. Click the "+ New" button in the top right corner again and this time select ">_ SQL Query". Then I enter select * from dolt_log
in the query box and click the play button to execute the query.
Then we'll scroll down and pick the commit when Pull Request (ie. PR) 100 was merged, 1j6f408eh8rvbcap9e6k9hj88fq0r5le
. That's right DoltHub has Pull Requests on data!
Then, we'll select from the objects table as of
that commit to time travel. We use this query select count(*), institution_country from objects as of '1j6f408eh8rvbcap9e6k9hj88fq0r5le';
, then click the Visualization button and make a world map of our results.
It looks like at that point in time, the Museum Collections database only had objects from the United Kingdom! Dolt makes it easy to time travel through the history of your database using as of
queries.
Using Branches
Now, an example of how you might use branches in Metabase. Let's go back to our simple_example
database that has fake daily active user and monthly active user metrics. Let's imagine your dashboard looks like this.
We've all been there. The metrics pipeline was down for the first four days of March and the data pipeline team has been working tirelessly to recover the missing data. Until the data is recovered, it's really distracting to have that hole in the data. Temporarily, we want to show estimated DAU and MAU metrics for those four days. We want to have an audit log of what we did so no one questions the integrity of our business metrics.
With Dolt, we can either put the broken data on a branch or make the dashboard reference an estimated branch. For this example, we'll switch the dashboard to an estimated branch. Let's assume I fixed the four data points on the estimated branch.
To do that I'll change the source of this chart to reference the estimated branch instead of the main branch. To do that I make a new SQL query and run a select * from user_counts as of 'estimated'
. I click Visualization and select a line chart and now the holes in the graph are fixed!
Notice the filled in values for March 1-4.
Once the data is fixed on the main branch I can go back to that chart. Because of Dolt, I have an audit log of the whole event, ensuring the integrity of our business metrics.
Conclusion
Dolt provides powerful versioning features to Metabase charts and dashboards. Dolt "just works" with Metabase using the MySQL connector so it's really easy to get started. The power of branches and history queries improve the efficiency of your data teams and integrity of your data. Curious to learn more? Head over to our Discord and let's chat.