Release notes generation for GitHub repos
Introduction
Today we're excited to announce that we're open sourcing a tool that automatically generates Markdown-formatted release notes for GitHub repositories. Dolt is using this tool to generate our release notes going forward, and we've also used it to backfill our older releases.
Generated release notes contain summaries for every pull request merged and every issue closed since the previous release, ready to copy and paste into your release notes on GitHub. The tool also supports summarizing changes in dependencies for Go projects. Try it out today!
Our release notes were really bad
A customer called them "spartan." He was being generous. I would have used "embarrassing" or possibly "grossly negligent." Maybe even "execrable." Or "derelict."
As our CEO remarked: it looked like we hadn't done any work for three months.
The reason for this sorry state of affairs is simple: release notes are hard to assemble. We have 12 engineers at DoltHub now, and at every release everybody was expected to send the release manager a summary of what they had contributed since the last one. When even was that? I'm busy, Oscar.
It didn't work well. And eventually our release manager kind of gave up even trying.
If you visit our releases page today, you'll be treated to a veritable feast for the senses: a full summary of merged PRs and closed issues for every release we've ever put out.
This is an example where automation doesn't just make a process faster and less labor intensive. It makes it actually better. It leads to a completely different end result.
There has to be a better way
Frustrated with the sorry state of affairs, I went looking for a solution. Everybody has this problem, right? Somebody must have solved it.
Turns out lots of people have. The first solution I found was Ted the Releaser, which I installed and found didn't work at all. Gren is the semi-official solution, which I'm embarrassed to admit I somehow... just didn't find until now. If I had found it in my original search, I probably would have used it instead of writing my own. Was I missing a keyword in Google or something?
But Gren doesn't handle changes in dependencies, which was important to me since so much of the development of the Dolt SQL query engine takes place in another package, go-mysql-server. I really wanted the release notes generated to automatically include all changes in this dependency, and I didn't want to have to keep the two packages' releases in sync to achieve this. So I'm going to pretend that I wrote my own release notes generator because I wanted this vital feature (which is true) and not because I'm apparently terrible at Google searches.
While I'm on my soapbox, it would be really great if release note generation were just a built-in feature of the GitHub web UI and not a custom module I needed to bring to the table.
Assembling release notes
Like all the best dev ops tools, this one is cobbled together in Perl to glue together a bunch of system utilities into a unified whole. Specifically, it shells out to curl to make GitHub RPCs, interrogates the state of the local repo with git, and uses grep to determine dependency versions in a go.mod file.
For example: here's how we fetch the set of PRs for a release, after having determined the time range to consider. curl does all the HTTP stuff, and a standard JSON parser library turns the response file into a Perl data object for us.
sub get_prs {
    my $base_url = shift;
    my $from_time = shift;
    my $to_time = shift;

    print STDERR "Looking for merged PRs between $from_time and $to_time\n";
    $base_url .= '?state=closed&sort=created&direction=desc&per_page=100';

    my $page = 1;
    my $more = 0;
    my @merged_prs;
    do {
        my $pulls_url = "$base_url&page=$page";
        # $token and $curl_file are defined outside this function
        my $curl_pulls = curl_cmd($pulls_url, $token);
        print STDERR "$curl_pulls\n";
        # system() returns the exit status, so report $? rather than $!
        system($curl_pulls) and die "curl failed with status $?";

        $more = 0;
        my $pulls_json = json_file_to_perl($curl_file);
        die "JSON file does not contain a list response" unless ref($pulls_json) eq 'ARRAY';

        foreach my $pull (@$pulls_json) {
            $more = 1; # a non-empty page means there may be another one
            next unless $pull->{merged_at}; # closed without being merged
            # Results are sorted newest first, so stop at the first PR created before the range
            return \@merged_prs if $pull->{created_at} lt $from_time;
            my %pr = (
                'url' => $pull->{html_url},
                'number' => $pull->{number},
                'title' => $pull->{title},
                'body' => $pull->{body},
            );
            push (@merged_prs, \%pr) if !$to_time || $pull->{merged_at} le $to_time;
        }

        $page++;
    } while $more;

    return \@merged_prs;
}
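The filtering rules inside get_prs can be tried out without any network calls. The sketch below is written in Python rather than the tool's Perl, purely for illustration; select_merged_prs and the sample records are hypothetical and not part of the tool. It assumes, as the Perl code does with lt and le, that the API's ISO 8601 UTC timestamps compare correctly as plain strings.

```python
def select_merged_prs(pulls, from_time, to_time=None):
    """Pick merged PRs out of a newest-first list of closed PR records.

    Hypothetical helper mirroring the loop body of get_prs above.
    """
    merged = []
    for pull in pulls:
        if not pull.get("merged_at"):
            continue  # closed without being merged
        if pull["created_at"] < from_time:
            break  # list is sorted newest first, so everything after is older
        if to_time is None or pull["merged_at"] <= to_time:
            merged.append({"number": pull["number"], "title": pull["title"]})
    return merged


# Made-up sample data, sorted by creation time, newest first.
sample_pulls = [
    {"number": 5, "title": "too new", "created_at": "2021-02-02T10:00:00Z", "merged_at": "2021-02-03T10:00:00Z"},
    {"number": 4, "title": "in range", "created_at": "2021-01-20T10:00:00Z", "merged_at": "2021-01-21T10:00:00Z"},
    {"number": 3, "title": "never merged", "created_at": "2021-01-10T10:00:00Z", "merged_at": None},
    {"number": 2, "title": "also in range", "created_at": "2021-01-05T10:00:00Z", "merged_at": "2021-01-06T10:00:00Z"},
    {"number": 1, "title": "too old", "created_at": "2020-12-20T10:00:00Z", "merged_at": "2020-12-22T10:00:00Z"},
]

selected = select_merged_prs(sample_pulls, "2021-01-01T00:00:00Z", "2021-01-31T00:00:00Z")
print([pr["number"] for pr in selected])
```

Note that the lower bound is checked against created_at while the upper bound is checked against merged_at, exactly as in the Perl version.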
And here's how we determine the version of a dependency at the start and end of a release range by using grep.
# Returns the SHA version of the dependency named at the repository SHA given.
sub get_dependency_version {
    my $dependency = shift;
    my $hash = shift;

    # Read go.mod as it existed at the given commit and find the dependency's line
    my $cmd = "git show $hash:go/go.mod | grep $dependency";
    print STDERR "$cmd\n";
    my $line = `$cmd`;

    # TODO: this only works for commit versions, not actual releases like most software uses
    # github.com/dolthub/go-mysql-server v0.6.1-0.20210107193823-566f0ba75abc
    if ($line =~ m/\S+\s+.*-([0-9a-f]{12})/) {
        return $1;
    }

    die "Couldn't determine dependency version in $line";
}
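The TODO above notes that only commit-style versions are handled. As a rough illustration of how both cases might be told apart, here is a Python sketch (not the tool's Perl, and not how the tool works today). The dependency_ref helper, the sample go.mod text, and the github.com/example/tagged module are all hypothetical. It assumes the Go pseudo-version layout of a 14-digit timestamp followed by a 12-character commit hash, and falls back to returning the tag itself when the version doesn't match that shape.

```python
import re

# A Go pseudo-version ends in <14-digit UTC timestamp>-<12 hex chars of the commit>.
PSEUDO_VERSION = re.compile(r"\d{14}-([0-9a-f]{12})(?:\+incompatible)?$")


def dependency_ref(go_mod_text, dependency):
    """Return the commit hash for a pseudo-version, or the tag for a tagged release."""
    for line in go_mod_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "require":
            fields = fields[1:]  # single-line form: require <module> <version>
        if len(fields) >= 2 and fields[0] == dependency:
            version = fields[1]
            match = PSEUDO_VERSION.search(version)
            return match.group(1) if match else version
    raise ValueError(f"Couldn't determine version of {dependency}")


# Illustrative go.mod contents; the second requirement is invented for the example.
sample_go_mod = """\
module github.com/dolthub/dolt/go

require (
	github.com/dolthub/go-mysql-server v0.6.1-0.20210107193823-566f0ba75abc
	github.com/example/tagged v1.2.3
)
"""

print(dependency_ref(sample_go_mod, "github.com/dolthub/go-mysql-server"))
print(dependency_ref(sample_go_mod, "github.com/example/tagged"))
```

Matching on the exact module path in the first field, rather than grepping for a substring, also avoids picking up a different module whose name merely contains the dependency's name.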
Perl (get off my lawn) is uniquely great at cobbling together small utilities like this into a working solution. It's so natural to transition from command line experimentation directly to a full-blown product without even consciously deciding to.
GitHub RPCs work well and are fast, but like any API you spend most of your development time understanding the data model and behavior through a combination of reading documentation (tedious, sometimes apocryphal) and reverse engineering (virtuous, thrilling). The details are mostly boring, and it's easy to read the script code to see the RPCs being made. We fetch pull requests, issues, and releases and then iterate over the results. There were some surprises in there, like the fact that the issues RPC also returns pull requests in the same namespace. But mostly making the RPCs was the easy part of this project, and GitHub deserves credit for a well-built API.
Example release notes
Merged PRs
- 1170: Updating to latest go-mysql-server
- 1169: go/libraries/doltcore/sqle: Keyless tables don't have PK index -- fix describe panic
- 1168: /.github/{scripts,workflows}: fix, pod to job, handle pod errors
- 1167: C# test for alternate MySQL connector library, upgraded existing to u… …se dotnet 5 (up from 3)
- 1165: /.github/workflows/ci-performance-benchmarks.yaml: fix id
- 1163: Db/ci performance
- 1162: unrolled decode varint decode loop 30% faster on the benchmark in this PR.
BenchmarkUnrolledDecodeUVarint/binary.UVarint-8 1000000000 0.0372 ns/op
BenchmarkUnrolledDecodeUVarint/unrolled-8 1000000000 0.0258 ns/op
- 1147: Fixed indentiation in YAML syntax for Discord notifications
- 1143: First cut of Discrod notifications
This implements the following policy:
- notify on cancellation or failure of any job
- notify on release, including success
GITHUB_TOKEN. We will devise a workaround.
- 256: added describe queries for keyless tables
- 255: This function implement an Naryfunction type. Allows you to define sqle functions that have multiple children.
- 254: Fixed UNHEX/HEX roundtrip
Simple fix but I ended up completely reevaluating our binary type implementation. Fixed a bug found in the cast package we were using to convert strings, and also changed UNHEX to return the proper SQL type.
- 252: Added hash functions
- 249: Alias bug fixes
Fixes a number of buggy behaviors involving column indexes and table name resolution.
- 248: additional tests add a table with multiple keys an index that has a subset of those keys in a different order a couple queries
Closed Issues
- 1161: Primary keyless tables seem to break DESCRIBE
- 1153: p.StopWithErr(err) is hanging on large imports
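Turning the collected records into output like the listing above is the simplest step of all. The Python sketch below shows one way it might look; render_section, the heading level, and the link layout are assumptions made for illustration, not the tool's actual output format.

```python
def render_section(heading, items):
    """Render one release-notes section as a Markdown heading plus a bullet list."""
    lines = [f"# {heading}", ""]
    for item in items:
        lines.append(f"- [{item['number']}]({item['url']}): {item['title']}")
    return "\n".join(lines)


closed_issues = [
    {"number": 1161, "url": "https://github.com/dolthub/dolt/issues/1161",
     "title": "Primary keyless tables seem to break DESCRIBE"},
    {"number": 1153, "url": "https://github.com/dolthub/dolt/issues/1153",
     "title": "p.StopWithErr(err) is hanging on large imports"},
]

notes = render_section("Closed Issues", closed_issues)
print(notes)
```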
Conclusion
Hopefully you find the tool useful for your own release note generation. If you're in the same boat as us, managing multiple golang repositories, then the unique capability to include changes from dependencies might make it worthwhile to stray off the more well-traveled path and give it a chance.
Have something to tell us about this, or about Dolt? Come chat with us on Discord. We're always happy to hear from new customers!