Source code archeology

The Heartbleed OpenSSL bug has been in the news a lot and, as with many security stories, a few conspiracy theories have been floating around. Since OpenSSL is open source software, anyone can view the history of the project and see how the bug came about. But it does require understanding some tools.

In this post I hope to help explain them.

Step 1: Find The OpenSSL Source

A good way to do this is to search for the project name plus git, svn or hg. In this case “openssl git” found a few repos. I chose https://github.com/openssl/openssl and then checked the list of commits to make sure it was current.

GitHub is a code hosting site that provides a rather useful front-end for browsing the history of a project. Checking the OpenSSL commit log shows that the tree is clearly kept up to date here. The changes are listed from most recent to the oldest.

Step 2: Find The Fix.

Looking through the commits, you’ll find these four:

OpenSSL commits 

The one on April 8th hints that a release happened. Updating the NEWS file is a common step when doing a release of a C-based open source project. Another file that is commonly updated is ChangeLog.

Continuing down you’ll see the second commit on April 7th - it specifically mentions the bounds check. Clicking on the hex number on the right (it’s a shortened version of the “commit id”) will bring you to that commit. If you scroll down that page you’ll get to the change in the file ssl/t1_lib.c:

ssl/t1_lib.c @ 731f431497 

The lines that are red and prefixed with a - were removed; the ones that are green and prefixed with a + were added. Knowing C is helpful here, but not fully required. The comments and the variable names give some hints as to what’s changed. Specifically lines 3982 and 3983 (in the new version of the file - the columns on the left indicate line numbers for the old version and then the new version):

if (1 + 2 + payload + 16 > s->s3->rrec.length)
  return 0; /* silently discard per RFC 6520 sec. 4 */

The if indicates a conditional, and the expression it tests follows in parentheses. On one side is a sum that includes the variable payload; on the other is a variable whose name includes the word length. The Heartbleed bug is a disagreement between how long the client claims the payload is and how long it actually is - so this check likely relates to that (and it does). The next line says to discard the packet, per RFC 6520.
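To make the comparison concrete, here is a minimal standalone C sketch of the same kind of check. It is not OpenSSL's code: the function names, the constant names and the flat byte buffer are all assumptions made for illustration. Reading the 1, 2 and 16 as the type byte, the two-byte length field and the minimum padding follows the heartbeat message layout in RFC 6520. The extra short-record test is this sketch's own precaution, so that it never reads the length field from a buffer that is too small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the pieces of the sum 1 + 2 + payload + 16. */
static const size_t HB_TYPE_LEN = 1;     /* message type byte */
static const size_t HB_LENGTH_LEN = 2;   /* claimed payload length field */
static const size_t HB_MIN_PADDING = 16; /* minimum padding after the payload */

/* Read a 2-byte big-endian value. OpenSSL's n2s macro does this too,
   and additionally advances the pointer; this helper does not. */
static unsigned int read_u16_be(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | (unsigned int)p[1];
}

/* Returns 1 if the claimed payload length fits inside the record that
   actually arrived, 0 if the record should be silently discarded. */
static int heartbeat_length_ok(const unsigned char *rec, size_t rec_len)
{
    assert(rec != NULL);

    /* Sketch-only precaution: too short to even hold the header. */
    if (rec_len < HB_TYPE_LEN + HB_LENGTH_LEN + HB_MIN_PADDING)
        return 0;

    /* The length the sender claims, taken from the record itself. */
    size_t payload = read_u16_be(rec + HB_TYPE_LEN);

    /* Same shape as the patched condition: claimed size vs. real size. */
    if (HB_TYPE_LEN + HB_LENGTH_LEN + payload + HB_MIN_PADDING > rec_len)
        return 0;

    return 1;
}
```

With this sketch, a 23-byte record that claims a 4-byte payload passes, while a 23-byte record that claims a 65535-byte payload is rejected. Without such a check, the claimed length would be trusted as-is.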

Step 3: Go Back In Time

All that’s fine, we see the fix, but how did the bad code (the stuff in red) get there? Let’s continue with ssl/t1_lib.c (other files were changed, but you can apply what I describe here to them too). First note the line numbers of the deleted lines, and the text of one of those lines - something easy to search on. For this file, it’s lines 3972 through 3976, and the text is n2s(p, payload);

Across from the filename you’ll see a button marked “View”. Click that and then note the buttons in the upper right of the file listing: Edit, Raw, Blame, History and Delete. This is the fixed version of the file, so let’s go back - click the “History” button.

ssl/t1_lib.c history 

We’re currently on the April 7th version, so click on the “Browse code” link beneath the hex number to the right of the April 5th version. Open that version of the file in the “Blame” view, search for the line of code you recorded earlier in this step, and you’ll see who added the code and when. Check that the line numbers are roughly correct and, sure enough, it’s the code that was deleted:

ssl/t1_lib.c blame @ cd6bd5ffda 

The commit number for that change, from two years ago, is the hex number in the upper left - the 4817504d in blue. Clicking on the commit we learn from the commit comment (the blue box at the top of the page) that the code was written by Robin and reviewed/committed by Steve. And the comment seems to indicate that this is when the heartbeat feature (and the bug) was introduced.

So when people ask if the NSA did it, it’s not all that hard to follow up and actually ask the people who let the code in. Did a security researcher in the UK and a student(?) at a German technical university plant a bug in OpenSSL for the NSA? Seems unlikely. But if you’re a journalist, you can follow up. You can follow the code changes, see who made them and get actual expert views on how it happened and why.