Correctness Update

SQL
4 min read

About a month ago, we provided an update to Dolt's correctness. Since then, we've continued to make improvements; here's what we've been up to.

Bug Fixes

As usual, we're always trying to make a dent in our bug queue. Here are some notable bug fixes that we've made this month:

Generated Foreign Key Names

A couple weeks ago, a customer was using Dolt as a replica for their MySQL database. A replica essentially mirrors queries run on their MySQL database to Dolt, providing a backup. However, when dropping a foreign key, they realized that Dolt was generating foreign key names differently than MySQL.

In MySQL, when you create a foreign key without specifying a name, MySQL generates a name for you.

mysql> create table parent (i int primary key);
mysql> create table child (j int primary key, foreign key (j) references parent (i));
mysql> show create table child;
+-------+-------------------------------------------------------------------------+
| Table | Create Table                                                            |
+-------+-------------------------------------------------------------------------+
| child | CREATE TABLE `child` (                                                  |
|       |   `j` int NOT NULL,                                                     |
|       |   PRIMARY KEY (`j`),                                                    |
|       |   CONSTRAINT `child_ibfk_1` FOREIGN KEY (`j`) REFERENCES `parent` (`i`) |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin        |
+-------+-------------------------------------------------------------------------+

Dolt also generates a name for you, but the name is different.

dolt> create table parent (i int primary key);
dolt> create table child (j int primary key, foreign key (j) references parent (i));
dolt> show create table child;
+-------+-------------------------------------------------------------------------+
| Table | Create Table                                                            |
+-------+-------------------------------------------------------------------------+
| child | CREATE TABLE `child` (                                                  |
|       |   `j` int NOT NULL,                                                     |
|       |   PRIMARY KEY (`j`),                                                    |
|       |   CONSTRAINT `r4cf97vs` FOREIGN KEY (`j`) REFERENCES `parent` (`i`)     |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin        |
+-------+-------------------------------------------------------------------------+

MySQL seems to use a pattern of <table_name>_ibfk_<number>, while Dolt actually used a hash generated from the table and columns involved. We strayed from MySQL's pattern because Dolt has merge capabilities, and we wanted to avoid avoidable conflicts when merging tables with foreign keys. For example, it is possible to have a branch create an unnamed FOREIGN KEY over a column t.a, while other another branch creates an unnamed FOREIGN KEY over a column t.b. Following MySQL's pattern these FOREIGN KEYs would have the same name, leading to a conflict when merging the branches. We realized this is not the best approach as Dolt strives to be a drop-in replacement for MySQL. So we fixed this issue, and Dolt now generates foreign key names in the same way MySQL does.

As for dealing with merge conflicts, one of the branches has to change the name of the foreign key to avoid the conflict. In MySQL, the only way to change the name of a foreign key is to drop and recreate it, which might be costly. Consequently, we improved upon MySQL's syntax, adding a RENAME CONSTRAINT clause to the ALTER TABLE statement.

dolt> alter table child rename constraint foreign key child_ibfk_1 to child_ibfk_2;
dolt> show create table child;
+-------+-------------------------------------------------------------------------+
| Table | Create Table                                                            |
+-------+-------------------------------------------------------------------------+
| child | CREATE TABLE `child` (                                                  |
|       |   `j` int NOT NULL,                                                     |
|       |   PRIMARY KEY (`j`),                                                    |
|       |   CONSTRAINT `child_ibfk_2` FOREIGN KEY (`j`) REFERENCES `parent` (`i`) |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin        |
+-------+-------------------------------------------------------------------------+

Fix DECLARE in BEGIN block

Recently, a customer had created a TRIGGER that contained a DECLARE statement within a BEGIN block, and attempting to trigger the TRIGGER resulted in a panic.

It turns out that we had only properly supported DECLARE statements in STORED PROCEDURES, but still parsed them in TRIGGERS. The fix involved modifying some analyzer rules to fill out these DECLARE definitions.

Now, dolt can run statements like this:

tmp/main*> create table t (i int primary key, j int);
tmp/main*> delimiter //
tmp/main*> create trigger trig before insert on t for each row begin
        -> declare x int default 0;
        -> set x = new.i * 10;
        -> set new.j = x;
        -> end;
        -> //
tmp/main*> delimiter ;
tmp/main*> insert into t(i) values (1);
Query OK, 1 row affected (0.00 sec)
tmp/main*> insert into t(i) values (2);
Query OK, 1 row affected (0.00 sec)
tmp/main*> insert into t(i) values (3);
Query OK, 1 row affected (0.00 sec)
tmp/main*> select * from t;
+---+----+
| i | j  |
+---+----+
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
+---+----+
3 rows in set (0.00 sec)

New Functions

This time around, our new functions all revolve around time, specifically dealing with days.

TO_DAYS() and FROM_DAYS()

TO_DAYS() is a utility function in MySQL that tell you how many days have passed since the "zero" date, which is 0000-00-00.

select to_days('1999-11-05');
+-----------------------+
| to_days('1999-11-05') |
+-----------------------+
|                730428 |
+-----------------------+
1 row in set (0.0745 sec)

FROM_DAYS() is the inverse of TO_DAYS(). It takes a number of days and returns the date that many days after the "zero" date.

select from_days(730428);
+-------------------+
| from_days(730428) |
+-------------------+
| 1999-11-05        |
+-------------------+
1 row in set (0.0067 sec)

The trickiest part about implementing these two were accounting for leap days. Leap years happen every 4 years, except for years that are divisible by 100, unless they are also divisible by 400, so

func isLeapYear(year int64) bool {
	return year != 0 && ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0)
}

Also, year 0000 is not a leap year.

LAST_DAY()

LAST_DAY() is a function that returns the last day of the month for a given date (also accounting for leap days).

select last_day('2000-02-02');
+------------------------+
| last_day('2000-02-02') |
+------------------------+
| 2000-02-29             |
+------------------------+
1 row in set (0.0021 sec)

Interestingly, there's a neat "knuckle" trick for remembering which months have 31 days. Knuckle Trick

Metrics Summary

We had 166 skipped tests out of 42021 total enginetests. Now, we have 195 skipped tests out of 42347 total enginetests. While the number of skipped tests has increased, the total number of enginetests has also increased, resulting in a 99.54 percent pass rate.

Although the actual pass rate has decreased slightly, a huge part of correctness is knowing what's broken.

Since last month, we've added a few new functions to our supported functions list. This brings our total supported functions from 304 to 308, out of a total 438 functions, resulting in a 70 percent coverage.

Conclusion

Be prepared to see more correctness updates in the future. Have a function or an issue you want us to prioritize? Make an issue on our Github Issues page. Alternatively, we're always available on our Discord.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.