CVS to git migration
Introduction
After using three different distributed revision control systems for the DSS project, the team settled on git. After the project had ended, the SDD decided to migrate all of the CVS repositories to git. This wiki page describes that process.
First I will say a bit about the tools for converting a repository from CVS to git. This migration can be tricky because of the fundamental differences between CVS and git. For example, CVS records commits per file, while git commits are change sets that can span multiple files. The migration tool must therefore decide how to group per-file CVS commits into change sets. Also, git branches are very different from CVS branches, and not all tools migrate them in the same way. Finally, unlike CVS, git is a distributed revision control system that does not require a central repository. Surprisingly, however, this does not come into play much during a migration.
The two tools that were considered for this migration were git-cvsimport and cvs2git. git-cvsimport is distributed with git. At first it seemed to work quite well; however, after reading the man page we found that there are well-known issues related to branches. For example, the man page carries an explicit warning at the top that developers should never modify the branches migrated over using git-cvsimport. The following issues are listed at the end of the man page:
ISSUES
Problems related to timestamps:
- If timestamps of commits in the CVS repository are not stable enough to be used for ordering commits changes may show up in the wrong order.
- If any files were ever "cvs import"ed more than once (e.g., import of more than one vendor release) the HEAD contains the wrong content.
- If the timestamp order of different files cross the revision order within the commit matching time window the order of commits may be wrong.
Problems related to branches:
- Branches on which no commits have been made are not imported.
- All files from the branching point are added to a branch even if never added in CVS.
- This applies to files added to the source branch after a daughter branch was created: if previously no commit was made on the daughter branch they will erroneously be added to the daughter branch in git.
Problems related to tags:
- Multiple tags on the same revision are not imported.
If you suspect that any of these issues may apply to the repository you want to import consider using these alternative tools which proved to be more stable in practice:
- cvs2git (part of cvs2svn), http://cvs2svn.tigris.org
- parsecvs, http://cgit.freedesktop.org/~keithp/parsecvs
We chose to follow the advice of the git-cvsimport man page and use cvs2git instead. So far this seems to have been a good choice.
How to migrate a CVS repository using cvs2git
cvs2git
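The blob and dump files loaded in the next step come from running cvs2git against the CVS repository. A minimal sketch of that invocation for sparrow follows. The `--blobfile`, `--dumpfile`, and `--username` options are standard cvs2git options, and the output file names match the ones used on this page. The value passed to `--username` and the helper name `cvs2git_command` are assumptions for illustration. The helper only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: the helper name and the --username value are assumptions.
cvs_root=/home/gbt2/repository    # where the CVS repositories live
work=/home/sandboxes/mmccarty/git_migration/cvs2git
repo=sparrow

# Print the cvs2git command for one repository.
#   $1 = repository name, $2 = output directory, $3 = CVS root
cvs2git_command() {
    printf 'cvs2git --blobfile=%s --dumpfile=%s --username=cvs2git %s\n' \
        "$2/git-$1-blob.dat" "$2/git-$1-dump.dat" "$3/$1"
}

cvs2git_command "$repo" "$work" "$cvs_root"
```

Piping the printed line to `sh` runs the conversion once the paths look right.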
Load dump and blob files into a new git repository
Now that the dump and blob files have been created, cd into a sandbox to initialize a brand new shiny git repo! Follow the example below using the sparrow repository. Note, here I'm showing the full paths for clarity.
- cd /home/sandboxes/mmccarty/git_migration/cvs2git
- mkdir sparrow
- cd sparrow
- git init
- git fast-import --export-marks=/home/sandboxes/mmccarty/git_migration/cvs2git/git-sparrow-marks.dat < /home/sandboxes/mmccarty/git_migration/cvs2git/git-sparrow-blob.dat
- git fast-import --import-marks=/home/sandboxes/mmccarty/git_migration/cvs2git/git-sparrow-marks.dat < /home/sandboxes/mmccarty/git_migration/cvs2git/git-sparrow-dump.dat
- git checkout master
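Since many repositories need the same treatment, the steps above can be wrapped in a small shell function. This is only a sketch: `load_cvs2git_output` is a hypothetical helper name, and it assumes the blob and dump files follow the `git-<repo>-blob.dat` / `git-<repo>-dump.dat` naming used for sparrow and that the imported history lands on `master`.

```shell
#!/bin/sh
# Sketch: load cvs2git output for one repository into a new git repository.
# load_cvs2git_output is a hypothetical helper name.
load_cvs2git_output() {
    work=$1    # directory holding the blob and dump files
    repo=$2    # repository name, e.g. sparrow
    marks="$work/git-$repo-marks.dat"

    mkdir -p "$work/$repo" || return 1
    (
        cd "$work/$repo" || exit 1
        git init -q || exit 1
        # Blobs first: export the mark assigned to every file revision.
        git fast-import --quiet --export-marks="$marks" \
            < "$work/git-$repo-blob.dat" || exit 1
        # Then the commits, which refer to the blobs through those marks.
        git fast-import --quiet --import-marks="$marks" \
            < "$work/git-$repo-dump.dat" || exit 1
        # Populate the working directory from the imported history.
        git checkout -q master
    )
}
```

For the sparrow example this would be called as `load_cvs2git_output /home/sandboxes/mmccarty/git_migration/cvs2git sparrow`. The subshell keeps the `cd` from changing the caller's directory.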
Create a bare shared repository
Now you have a new git repository with revisions imported from the CVS repository. Next, we will create a bare shared repository. For this, I have set up an area in /home/gbt1/git/integration-bare. You will also want to make sure your selected group ID is correct, monctrl in this case.
- newgrp monctrl
- cd /home/gbt1/git/integration-bare
- mkdir sparrow; cd sparrow
- git --bare init --shared
- git --bare fetch /home/sandboxes/mmccarty/git_migration/cvs2git/sparrow '*:*'
- This command will fetch all the branches and tags from the git repository you just created from the CVS repository. You use fetch instead of pull because pull will fetch and merge. Here we do not want to merge because a bare repository has no working directory, so there is no place to merge.
- At this point you are done. It is recommended that you now go back to your sandbox and clone a new repository from the bare shared repository for development.
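The final clone can be sketched as follows. `clone_for_development` is a hypothetical helper name, and the destination directory is whatever you choose in your sandbox. After cloning, it lists the remote-tracking branches and the tags so you can check that everything fetched into the bare repository is visible from the new clone.

```shell
#!/bin/sh
# Sketch: clone a development copy from the bare shared repository and list
# what came across. clone_for_development is a hypothetical helper name.
clone_for_development() {
    bare=$1    # e.g. /home/gbt1/git/integration-bare/sparrow
    dest=$2    # new working copy in your sandbox

    git clone -q "$bare" "$dest" || return 1
    # Branches show up as remote-tracking branches (origin/...).
    git -C "$dest" branch -r
    # Tags are copied into the clone as well.
    git -C "$dest" tag -l
}
```

For example, `clone_for_development /home/gbt1/git/integration-bare/sparrow sparrow-dev` would create a working copy in `sparrow-dev` (a made-up directory name).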
Repositories to Migrate (and Status)
These are the repositories in /home/gbt2/repository:
Repository | Status |
43m | |
45ft.old | |
bignell | |
casper | |
dss | Deprecated |
EMS | |
firmware | |
gbi | |
GBTdocumentation | |
GBTObsTool.deleteme | |
IRCv6 | |
mclark | Deprecated |
mjd | |
oapi | |
oui | |
parParser | |
poof | Migrated |
reservations | Deprecated |
rxlab | |
Starlink | |
themis | |
ygor | Migrated |
45ft | |
atlas | Migrated |
bos | Deprecated |
contrib | |
doc | |
DSS | Deprecated |
fg_util | |
gb | Migrated |
gbt | Migrated |
gbtidl | Migrated |
GBTServo | |
MarkIII | |
metrology | |
modules | |
oofholography | |
PennArray | |
pro | |
rfi | |
sparrow | Migrated |
temp | |
VibMonitor | |