
First commits


Inspired by a recent post on Reddit about Git's first commit, I checked out the first commits of a variety of projects. Git's first commit contains a concise, fantastic description of what Git is and how it works. How do other projects compare?

Beginnings are always interesting because well-known projects do not yet have the status they'll achieve many years later. The lack of attention means there is no pressure to be elegant, or even concise.

The initial commit is quite often a code dump of an existing, working project. Still, in some cases the history goes back as far as 1990, which is rather impressive.

Go

The first commit in Go has nothing to do with Go itself, other than illustrating its heritage: it's hello-world written in B, the predecessor of C. Even the commit timestamp matches (1972).

In fact the first 4 commits are a homage to Go's heritage. The first substantial commit is a preliminary version of the Go spec that already looks a lot like current Go with a few big omissions (e.g. channels aren't specced).

PostgreSQL

PostgreSQL's first commit is from 1996. PostgreSQL's first release was in mid-1995 but, as I understand it, had a less liberal license at that time. The first commit is an import of the whole, working project with the liberal PostgreSQL license.

Git

Git's first commit has already been discussed in depth in many places. I have fond memories of it because the description got me started on Git fairly early. It was so nice that it convinced me instantly, despite being a real brain twister at the time.

SQLite

The first commit from May 2000 is empty and says "initial empty check-in". The second commit is a code-dump. The copyright notices say 1999, 2000 so I assume that the project was started in 1999.

Interestingly, SQLite is in the public domain, but the initial version was licensed under the GPLv2. I did not discover at what point the license changed.

Python

Python has several first commits. It looks like they import an already functioning project between them.
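A repository can have more than one "first commit" because Git allows several root commits, i.e. commits without a parent. As a rough sketch of how one could spot them, the helper below scans text in the format printed by `git rev-list --parents HEAD` (one commit hash per line, followed by its parent hashes). The function name and the sample history are made up for illustration.

```python
def find_root_commits(rev_list_output: str) -> list[str]:
    """Return the hashes of all commits that list no parents.

    Expects text in the format of `git rev-list --parents HEAD`:
    each line is a commit hash followed by the hashes of its parents.
    """
    roots = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        if len(fields) == 1:  # only the commit's own hash: no parents
            roots.append(fields[0])
    return roots


# Made-up sample: two unrelated histories (a1 and b2) joined by merge e5.
sample = """\
e5 d4 c3
d4 b2
c3 a1
b2
a1
"""

print(find_root_commits(sample))
```

On a real checkout, `git rev-list --max-parents=0 HEAD` gives the same answer directly.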

People familiar with the project will recognize several components that have not changed in 25 years such as the Lib directory. See e.g. this commit.

Linux kernel

The full Linux history was re-assembled from three sources by Yoann Padioleau and can be found on archive.org. There isn't really a first commit as such because Linus Torvalds released versioned tarballs on an FTP server that's still around.

Kernel Newbies have written a nice walk-through of the first release so I'm going to stop here.

Docker

Docker's first commit is from 2013. It never migrated from one version control system to another, and not using version control in 2013 would expose your project to ridicule. So I was expecting a fairly clean start.

The code started as an internal project within dotCloud but was quickly moved to GitHub. The initial commit is a code dump tagged as 0.0.3.

Xen

Xen is famous for running Amazon's EC2 infrastructure. The first commit is from 1970, which is of course a lie and likely an artifact of moving from BitKeeper to Mercurial.
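A 1970 date is what a Unix timestamp of zero looks like when printed. The sketch below assumes (my guess, not something the repository confirms) that the conversion stored a missing commit time as 0 seconds since the epoch. The function name is only for illustration.

```python
from datetime import datetime, timezone


def render_commit_time(unix_seconds: int) -> str:
    """Format a Unix timestamp (seconds since the epoch) as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).isoformat()


# Assumption: a lost or unset commit time ends up stored as zero.
print(render_commit_time(0))  # 1970-01-01T00:00:00+00:00
```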

The first substantial commit is a code dump by someone with an academic email address, hinting at the origins of the project.

The commit also contains a custom Linux called xenolinux which was developed for running efficiently on a Xen hypervisor.

Apache

Apache's first commit hints at a migration from CVS to SVN at some point.

Interestingly, all of the early commits add documentation and no code. It makes no sense to document a non-existing server, so the code must have existed somewhere at this point. Maybe it got lost during the repository migrations?

Nginx

Nginx's first commit dates from August 2002, but as the commit message says, "The first draft ideas are dated back to 23.10.2001." Igor Sysoev started with Subversion and switched to Mercurial in April 2013.

The project doesn't contain a Makefile and is not complete. However, the high-level ideas are already there.

The first version of Nginx was only meant to serve static files over HTTP. It looks for an index.html document if it sees a directory, but doesn't support any other forms of URL rewriting.
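The behaviour described above can be sketched in a few lines. This is not Nginx's code, only an illustration of the idea: map a URL path onto a document root, and fall back to index.html when the path is a directory. The function name `resolve_static_path` and the escape check are my own additions.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_static_path(doc_root: Path, url_path: str) -> Optional[Path]:
    """Map a URL path to a file under doc_root, or return None if there is none."""
    root = doc_root.resolve()
    candidate = (root / url_path.lstrip("/")).resolve()
    # Refuse paths such as "/../secret" that would leave the document root.
    if candidate != root and root not in candidate.parents:
        return None
    # A directory is served through its index.html, if it has one.
    if candidate.is_dir():
        candidate = candidate / "index.html"
    return candidate if candidate.is_file() else None


# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "docs").mkdir()
    (site / "docs" / "index.html").write_text("<h1>docs</h1>")
    print(resolve_static_path(site, "/docs/"))    # path ending in docs/index.html
    print(resolve_static_path(site, "/missing"))  # None
```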

A custom pool allocator is used to keep the memory footprint low. To this day Nginx is still very light on memory usage.
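For readers who haven't met the technique, here is a toy pool allocator. It shows the general idea only and is not modelled on Nginx's actual implementation: grab one buffer up front, hand out consecutive slices of it, and release everything at once when the work is done. The class and method names are invented for this sketch.

```python
class Pool:
    """Toy pool allocator: hands out consecutive slices of one preallocated buffer."""

    def __init__(self, size: int) -> None:
        self._buffer = bytearray(size)
        self._offset = 0  # index of the first unused byte

    def alloc(self, nbytes: int) -> memoryview:
        """Reserve nbytes and return a writable view onto them."""
        if nbytes < 0 or self._offset + nbytes > len(self._buffer):
            raise MemoryError("pool exhausted")
        start = self._offset
        self._offset += nbytes
        return memoryview(self._buffer)[start:self._offset]

    def reset(self) -> None:
        """Release every allocation at once by rewinding the offset."""
        self._offset = 0

    @property
    def used(self) -> int:
        return self._offset


pool = Pool(64)
method = pool.alloc(16)
method[:3] = b"GET"   # write into the pooled memory
body = pool.alloc(32)
print(pool.used)      # 48
pool.reset()          # e.g. once a request has been handled
print(pool.used)      # 0
```

There is no per-allocation free: individual blocks are never returned, which is what keeps the bookkeeping so small.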

The early code already supports FreeBSD, Linux and Win32 with their respective event queue implementations (kqueue, select, aio). Maybe the plugin system comes from having to support these multiple event queue implementations from the beginning?

This SlideShare presentation from Igor Sysoev contains more information on the origin of Nginx.

Django

Django's first commit from 2005 is empty. Checking the second and third commit shows that this is an import from a private repository. This makes sense because the project started as the internal tool for the Lawrence Journal-World newspaper.

The order of commits is mildly interesting: the first substantial commit is documentation. Documentation is a focus and a strength of the Django project to this day.

Conclusion

Most first commits are dumps of existing projects, including the fairly recent Docker. In some cases the projects started as private code, in other cases it looks like some data was lost when migrating to a new revision control system or just because revision control wasn't really a thing in 1990.