Brain Phrye



git filter-branch


I’m working on converting two large svn repos to git. Both have large binary files stored in them, so I’m using git lfs to make them more manageable. However, the repos are large enough that the normal tools don’t work, so I’m using the BFG repo cleaner to do the lfs migration. Both BFG and the subgit migration tool leave loads of .gitattributes files lying around.

They’re easy to remove, but I need .gitattributes files in there for lfs to function correctly. So how can I add a file to every commit?

The solution is to use git filter-branch. The docs already show how to remove a file, and they explain the benefits of --index-filter over --tree-filter: the latter checks out the entire tree for each commit, while the former just makes an index for each commit. Far, far faster. So the way to remove the old .gitattributes files is like so:

git filter-branch \
  --index-filter \
    'git rm --cached --ignore-unmatch filename' \
  -- --all
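For comparison, here is a minimal sketch of the slower --tree-filter form, run on a throwaway repo. The temp repo, the file names (data, filename) and the demo identity are placeholders made up for illustration. FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer git versions print before filter-branch starts.

```shell
# Throwaway demo repo; every name here is a placeholder, not from the real migration.
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
  echo "$i" > data
  echo "junk $i" > filename
  git add data filename
  git commit -q -m "commit $i"
done

# --tree-filter checks out each commit into a scratch directory and runs
# the command there, so a plain rm is enough (and is why it is slow).
git filter-branch --tree-filter 'rm -f filename' -- --all

# Count rewritten commits that still contain the file.
remaining=0
for c in $(git rev-list HEAD); do
  if git cat-file -e "$c:filename" 2>/dev/null; then
    remaining=$((remaining + 1))
  fi
done
echo "commits still containing filename: $remaining"
```

On three commits the difference is invisible, but on a repo with large binaries every checkout costs real time, which is why --index-filter is the one to reach for.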

But how to add one? First it’s best to have a repo to test on:

git init thing
cd thing
for i in $(seq 100); do
  ps > ps; git add ps; git commit -m "ps $i"
done

Now let’s pretend we’re tracking large compressed files:

git lfs track '*.bz2'
git add .
git commit -m "Start tracking big files."

Now we have a .gitattributes file but I want that in every past commit. Will this work?

git filter-branch \
  --index-filter \
    'git update-index --add .gitattributes' \
  -- --all

Nope. It fails, saying it can’t find .gitattributes. Hm. What directory is --index-filter running in? Running this shows me it’s in .git-rewrite/t:

git filter-branch \
  --index-filter \
    'pwd;git update-index --add .gitattributes' \
  -- --all

So this is what I have to do instead:

git filter-branch \
  --index-filter \
    'cd ../..;git update-index --add .gitattributes;cd -' \
  -- --all

Note the cd - at the end to go back to the directory --index-filter is running in. I covered this earlier.
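Another option, sketched here as an alternative rather than what I ran on the real repos, avoids the cd dance entirely: store the file contents as a blob once with git hash-object -w, then have the filter point each commit's index at that blob with git update-index --cacheinfo. The temp repo and demo identity are placeholders, and the .gitattributes line is written by hand to match what git lfs track '*.bz2' generates, so the sketch does not need git lfs installed.

```shell
# Throwaway demo repo; names and identity are placeholders for illustration.
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
  echo "$i" > data
  git add data
  git commit -q -m "commit $i"
done

# Write the desired .gitattributes contents into the object database
# and keep the resulting blob id. No file on disk is needed afterwards.
blob=$(printf '*.bz2 filter=lfs diff=lfs merge=lfs -text\n' \
  | git hash-object -w --stdin)

# Register that blob at the path .gitattributes in every commit's index.
# Double quotes so $blob is expanded before filter-branch sees the command.
git filter-branch \
  --index-filter \
    "git update-index --add --cacheinfo 100644,$blob,.gitattributes" \
  -- --all

# Count rewritten commits that lack the file.
missing=0
for c in $(git rev-list HEAD); do
  git cat-file -e "$c:.gitattributes" 2>/dev/null || missing=$((missing + 1))
done
echo "commits missing .gitattributes: $missing"
```

Because the filter never touches the working directory, it does not matter where --index-filter happens to run.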

To check that this worked, run git log --name-only and you’ll see that the .gitattributes file was added all the way back in the first commit. The git lfs commit is now an empty commit (git allows those).
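If you want to find empty commits like that one, a rough sketch is to compare each commit's tree with its parent's tree. The helper name list_empty_commits and the demo repo below are made up for illustration; the empty commit is created with --allow-empty to stand in for the rewritten git lfs commit.

```shell
# Hypothetical helper: print non-merge commits whose tree equals their parent's tree.
list_empty_commits() {
  git rev-list --no-merges HEAD | while read -r c; do
    # The root commit has no parent; skip it.
    parent=$(git rev-parse --verify -q "$c^") || continue
    if [ "$(git rev-parse "$c^{tree}")" = "$(git rev-parse "$parent^{tree}")" ]; then
      echo "$c"
    fi
  done
}

# Throwaway demo repo with one deliberately empty commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 1 > data; git add data; git commit -q -m "first"
echo 2 > data; git add data; git commit -q -m "second"
git commit -q --allow-empty -m "nothing changed"

# Show the subject of each empty commit found.
list_empty_commits | while read -r c; do
  git log -1 --format='%h %s' "$c"
done
```

If the empty commits bother you, git filter-branch also has a --prune-empty option that drops them during the rewrite.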