Dolt + Excel

REFERENCE
7 min read

Dolt, DoltHub, and Hosted Dolt can be used for Spreadsheet Version Control. Many of our customers migrated to Dolt from a really complicated, heavily edited spreadsheet. Earlier this year, I wrote a blog about how to use Dolt with Google Sheets.

I've been waiting for a solid chunk of time I could work from home on my Windows machine to produce a companion Excel article because as we all know Windows Excel is the most popular Excel. Well, it being Thanksgiving week here in the US, I was able to work from home and produce this gem. Enjoy!

I first tested Dolt against Excel in fall of 2020 working from home during the pandemic. Dolt has changed a lot since then. Integration with Excel has never been better.

Dolt + Excel

Why Dolt and Excel

There is history in Excel using the "Show Changes" functionality. There is collaborative editing in Excel when combined with Sharepoint or when using the online version. But as I explain in my Spreadsheet Version Control article (with a slight edit), history and collaborative editing is not version control.

Version control allows for branch and merge, the ability to asynchronously collaborate on multiple versions in an organized way. Instead of everyone editing the same Google Sheet Excel Workbook, you can create you own copy, make edits, see the differences (ie. diffs) between your copy and another copy, and then merge two copies together. That workflow is true version control. No spreadsheet has those features.

Moreover, by adding DoltHub or Hosted Dolt to the mix, you add a Pull Request workflow. Pull Requests allow for human review of edits. Human review helps ensure data quality of changes to your spreadsheets.

Goals

To show off how Dolt can work with Excel, the goals are enumerated below.

  1. Read a Dolt table from the main branch into an Excel sheet
  2. Write to Dolt on a branch named branch using VBA
  3. Review differences in an Excel Sheet
  4. Create a pull request on Hosted Dolt from branch to main
  5. Perform a human review of the pull request
  6. Merge the changes into the main branch
  7. Read the new merged Dolt table into the original Excel Sheet

Setup

Dolt

For this demo, we created a very simple Dolt database named excel. The database has one table named test. The test table has two columns, id and words. id is a varchar(36) with a UUID default and the primary key. As described in this article, this is the best way to define a auto-generating primary key. words is a string of max size 255 characters.

Additionally, we create two branches main and branch and seeded some test data on both branches.

Hosted Dolt

Hosted Dolt

We create the database on Hosted Dolt because this is the easiest way to show off Excel connectivity. You can find the database named excel here. You will need to be granted permissions to see it. Either email me at tim@dolthub.com or come by our Discord and I'll give you access. Screenshots of the experience can be found below.

Hosted Dolt is a running Online Transaction Protocol (OLTP) database. You connect to it over the internet using any MySQL client. Hosted Dolt comes with a built in SQL Workbench which can be used to complement Excel for users more comfortable in a GUI environment.

Read using ODBC

The way to connect Excel to SQL databases is through the appropriate Open Database Connectivity (ODBC) Connector. Dolt is a MySQL-compatible database so you use the MySQL ODBC Connector.

Install the MySQL ODBC Connector on your computer. Downloading the appropriate installer and running it opens a standard Windows Installer. Once complete, navigate to Control Panel > Administrative Tools > ODBC Data Sources or just search for ODBC Data Sources. You'll be greeted by this window:

ODBC Data Source

Click Add... and you should see MySQL ODBC 8.0 ANSI Driver and MySQL ODBC 8.0 Unicode Driver, select the MySQL ODBC 8.0 ANSI Driver and click Finish. Now you need to enter your connection information. We call the data source "Dolt Excel", leave the server on TCP and enter timsehn-excel.dbs.hosted.doltdb.com as the server name, enter the user c6mc26245p2irqzn, enter the password as DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA, and the database as excel.

Enter Connection Information

After entering that information, clicking Test should result in a Connection Successful dialog. Click OK and the new Dolt Test ODBC connector will appear in the list.

Connection Added

Now it's time to read the data into Excel over that connection. Open a new Excel Workbook and navigate to Data > Get Data > From Other Sources > From ODBC. Pick Dolt Excel from the drop down and click OK. Excel requires you enter the username and password again so enter c6mc26245p2irqzn as the username and DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA as the password. This will bring up the following preview screen where you can navigate to and pick the table you want to import.

Excel Data Importer

I selected the excel database and test table. I am greeted with this nice data preview.

Excel Data Preview

From here we click "Load" to import the test table into a new sheet named test. I rename the sheet main.

Excel Sheet

Now, we want to read the test table off of a branch. To do this we need to use an as of query. If you want to return the results of a query instead of a whole table, navigate to Data > Get Data > From Other Sources > From ODBC. Pick Dolt Excel from the drop down and click Advanced instead of OK. Enter the SQL query select * from test as of 'branch' in the text box.

SQL Query

This brings up a data preview and you click "Load" to load the data into a new sheet. I rename that sheet "branch".

Excel Sheet

Finally, we want to show the diff between the two branches on a final sheet. To do this we follow the same steps as above to execute a custom SQL query. Dolt exposes version control read functionality as custom functions and procedures. To produce a diff between two branches we are going to use the dolt_diff() function. This function takes the two branches and table name as arguments. Enter select * from dolt_diff('main', 'branch', 'test') into the SQL query box, view the preview and click "Load". I called this sheet "diff".

Excel Sheet Diff

As you can see, there is a new row added on the branch branch.

Interacting with a running Hosted Dolt with Excel is easy on the read path. Everything is accessible via an manageable Graphical User Interface (GUI). Use SQL queries to make custom sheets and use the "Queries and Connections" menu to refresh the data whenever you would like.

Write Using VBA

Things get a little tricker when you want to write to Hosted Dolt from an Excel sheet. There is no built-in Graphical User Interface to do this. You must use Visual Basic for Applications (VBA).

To use VBA in Excel, you first need to enable the Developer Tab. This can be done on the File tab by going to Options > Customize Ribbon and select the Developer checkbox. Once you've enabled it, click on it and then click Visual Basic. You'll be greeted with a new Window that looks like a code editor.

Empty VBA

Next you must enable the "Microsoft ActiveX Data Objects (ADO) Library". In the VBA window, click on Tools > References. There's a bunch of them. The tutorial I worked with used 2.8 so I chose that one.

ActiveX Data Objects Selection

Now, it's time to write some code. I'm going to start simple because I'm not super familiar with VBA. Let's connect to Hosted Dolt and write a row on the branch branch. I enter the database as excel/branch so I connect to the correct Dolt branch to make the write. I also make sure to create a Dolt commit after I perform the insert. This is because my branch sheet only gets committed changes because of my use of as of.

Dim oConn As ADODB.Connection
Dim rs As ADODB.Recordset

Private Sub writeToDolt()

    Set oConn = New ADODB.Connection
    oConn.Open "DRIVER={MySQL ODBC 8.0 ANSI Driver};" & _
        "SERVER=timsehn-excel.dbs.hosted.doltdb.com;" & _
        "DATABASE=excel/branch;" & _
        "USER=c6mc26245p2irqzn;" & _
        "PASSWORD=DQ0GSINxBtpNdbJlM3CDMoB1UwITXPHA;" & _
        "Option=3"

    Set rs = New ADODB.Recordset

    strSQL = "insert into test(words) values ('VBA can make words')"
    
    rs.Open strSQL, oConn, adOpenDynamic, adLockOptimistic
    
    strSQL = "call dolt_commit('-am', 'VBA first commit')"
    
    rs.Open strSQL, oConn, adOpenDynamic, adLockOptimistic
  
End Sub

When I click the play button for the writeToDolt() function, go back to my branch sheet, and refresh using queries and connections, voila the data has been inserted!

VBA Insert Results

We can also see the change of the diff sheet if we refresh. This could be used to review changes locally.

VBA Diff Sheet

You can imagine making this code arbitrarily complicated, taking edits from the branch sheet itself and, inserting or updating the appropriate rows in Dolt. I have to be honest, I tried to do this for an afternoon but my VBA hacking skills were lacking. I did not want to work on Thanksgiving so I'll leave that as an exercise for the Reader. Major cop out, I know.

Hosted Dolt Pull Requests

I've written a new row and already had a branch row inserted. Let's go to Hosted Dolt, make a Pull Request and have those changes reviewed. I select main as my base branch and branch as my from branch.

New Pull Request

I click "Create pull request" and now I can have my changes reviewed. I send a link to this page to a colleague.

Pull Request

My colleague can view the diff on Hosted Dolt to review what I actually changed.

Excel Pull Request Diff

Everything looks fine so I click merge. The merge succeeds. Now, I can go back to Excel and make sure my changes showed up on the main branch.

Excel Merged Changes

As expected, my reviewed changes have been merged.

Conclusion

Dolt's integration with Excel has never been better. Hosted Dolt can be used to add a Pull Request Workflow to your Excel Spreadsheet. Interested in adding version control to your complicated spreadsheet, come by our Discord and we'll help you get started.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.