Migrating from CVS to Git

Hello,

When I started a new job I was confronted with a CVS repository. Needless to say, I wanted to migrate it to Git. There are several ways to do this, some bad, some better. I first used cvs2svn to convert to Subversion and then ‘git svn clone’ for the second step. It turns out that while the first step takes approximately 1.5 hours, the second step can take between 4 and 10 days! I started on my Ubuntu workstation and then switched to some nicer hardware, which however runs CentOS 5. Unfortunately, the standard cvs2svn RPM in the CentOS repo is rather old.

Finally I found the best way and it goes like this:

Get the latest cvs2git source from its project page (http://cvs2svn.tigris.org/); cvs2git ships as part of cvs2svn. I used version 2.3. It is written in Python. Next, study the script below, copy it and tune it. It will make you happy (unless you are into S&M, in which case I would recommend doing it the hard way with the native cvs2svn RPM on a CentOS 5 system).

This script migrates a 2 GB CVS repo to a Git repo in approximately 1.5 hours. Good luck and happy hacking.
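
The one repository-specific step in the script below is the tag rename that works around the CollateSymbolsPass error. As a hedged sketch of a slightly stricter variant, the function below only touches lines that look like entries in the RCS "symbols" section, so a stray "csource:" inside a log message or file content is left alone. The function name rename_cvs_tag is hypothetical (it is not part of cvs2svn), and the sketch assumes GNU grep, xargs and sed, and that CVS wrote each symbol on its own indented line as "name:revision".

```shell
#!/bin/bash
# Hypothetical helper, not part of cvs2svn: rename a tag in the "symbols"
# section of every RCS file below a directory.
# Assumption: each symbol sits on its own whitespace-indented line as
# "name:revision", with a trailing ";" on the last entry.
rename_cvs_tag() {
    local root=$1 old=$2 new=$3
    # -Z and -0 keep file names with spaces intact (GNU grep and xargs);
    # -r skips sed entirely when grep finds no matching file.
    grep -lRZ -- "${old}:" "$root" |
        xargs -0 -r sed -i -e "s/^\([[:space:]]\{1,\}\)${old}:\([0-9.]\{1,\};\{0,1\}\)\$/\1${new}:\2/"
}
```

It would be called as rename_cvs_tag cvsroot csource csource-org, ideally on a copy of the repository first, since the edit is made in place.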

#!/bin/bash
export PYTHONPATH=/home/intrazis/jeroen/cvs-to-git-conversion/cvs2svn-bin/usr/lib/python2.4/site-packages/
export PATH=/home/intrazis/jeroen/cvs-to-git-conversion/cvs2svn-bin/usr/bin:$PATH
p=`pwd`
> cvs2git.log
date >> cvs2git.log 2>&1
echo "Resetting stuff"  >> cvs2git.log 2>&1
rm -rf cvs2git-tmp
mkdir cvs2git-tmp
#rm -rf cvsroot
rm -rf git-repo
mkdir  git-repo
# Copy the CVS repo you have to this server
echo "copy repo"
scp -r  user@server:/var/cvsroot .
echo "Ready scp"  >> cvs2git.log 2>&1
date >> cvs2git.log 2>&1

##############################################################################################
#+ cvs2svn --pass=3 --retain-conflicting-attic-files --encoding=ascii --encoding=utf8 --encoding=utf16 --fallback-encoding=utf8 --dumpfile=svndump --write-symbol-info=symbol-info.txt cvsroot
#----- pass 3 (CollateSymbolsPass) -----
#Checking for forced tags with commits...
#The following paths are not disjoint:
#    Path tags/csource contains the following other paths: tags/csource/BasicHTML, tags/csource/IzInit, tags/csource/Login, tags/csource/include,
#Please fix the above errors and restart CollateSymbolsPass
#
#hacking the cvsroot files to boldly go and remove the tag 'csource' .
#  grep -R --exclude=*.gif,v "csource:" cvsroot/*
echo "hacking the cvsroot files to boldly go and remove the tag 'csource' ."  >> cvs2git.log 2>&1
# Read file names line by line so paths containing spaces survive
grep -lR "csource:" cvsroot/* | while IFS= read -r file
do
  sed -i -e 's/csource:/csource-org:/' "$file"
done
echo "======================================================="  >> cvs2git.log 2>&1
date >> cvs2git.log 2>&1
##############################################################################################
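# Put all options into the cvs2git.options file. An example is available in the cvs2git python source.
# The options file has to write git-blob.dat and git-dump.dat into cvs2git-tmp/, where the import below reads them.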
cvs2git --options=cvs2git.options >> cvs2git.log 2>&1
date >> cvs2git.log 2>&1
# git-repo was already created above; keep logging to the log file in the start directory ($p)
cd git-repo || exit 1
date >> "$p/cvs2git.log" 2>&1
git init >> "$p/cvs2git.log" 2>&1
date >> "$p/cvs2git.log" 2>&1
# Load the dump files into the new git repository using git fast-import:
#
# git fast-import --export-marks=../cvs2git-tmp/git-marks.dat < ../cvs2git-tmp/git-blob.dat
# git fast-import --import-marks=../cvs2git-tmp/git-marks.dat < ../cvs2git-tmp/git-dump.dat
#
# This can, of course, be shortened to:
echo "Start fast-import of dump and blob files" >> "$p/cvs2git.log" 2>&1
cat ../cvs2git-tmp/git-blob.dat ../cvs2git-tmp/git-dump.dat | git fast-import
echo "ready" >> "$p/cvs2git.log" 2>&1
date >> "$p/cvs2git.log" 2>&1
cd ..
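
Once the import has finished, a quick sanity check is worth the few seconds it takes. The sketch below is one possible way to do it, not part of cvs2git: the function name summarize_import is hypothetical, and it assumes a git new enough to support the -C option. It runs git fsck and then counts the imported branches, tags and commits, which can be compared with what you expect from the CVS side.

```shell
#!/bin/bash
# Hypothetical helper, not a git or cvs2git command: check a freshly
# imported repository and print how many branches, tags and commits it has.
summarize_import() {
    local repo=$1 branches tags commits
    # Verify the object database; give up if git reports a problem.
    git -C "$repo" fsck --no-progress >/dev/null 2>&1 || {
        echo "fsck failed" >&2
        return 1
    }
    branches=$(git -C "$repo" for-each-ref refs/heads | wc -l)
    tags=$(git -C "$repo" for-each-ref refs/tags | wc -l)
    commits=$(git -C "$repo" rev-list --all --count)
    # $(( )) strips the padding some wc implementations add.
    echo "branches=$((branches)) tags=$((tags)) commits=$((commits))"
}
```

Run it as summarize_import git-repo from the directory that holds the script. Note that git fast-import only writes objects and refs, so in a non-bare repository the working tree stays empty until you run something like git checkout master inside git-repo (assuming the trunk was imported as master).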