17 Jul 08

There has been a lot of fuss about which version control system is best: CVS, Subversion, Git, SVK, Perforce, AccuRev, and the list goes on. I am going to limit myself to Git and Subversion, which are frequently compared in the SCM arena. These two version control systems fall into two different categories:

  • Distributed version control systems like Git and Code Co-Op
  • Centralized version control systems like CVS and SVN

Distributed Version Control System

  • In a distributed version control system, every developer has a private copy of the complete source code on his or her own machine.
  • Developers can make changes in their modules and sync those changes with other developers.
  • If the machine of the developer you usually sync with is unavailable, you can sync with someone else instead. There is also no bottleneck, since you can sync with anyone on your project.
  • Such a project is generally structured as a pyramid: developers at a given level deliver changes to other developers at their own level or the level above, and at the top of the pyramid is the person who decides what actually goes into the main line of development.
  • Developers can work on the source code in their own way without affecting anyone else, and can still collaborate with fellow developers by syncing changes with each other.
  • Each developer can decide which changes to accept or reject from fellow developers.
  • The advantage is that the code is distributed, so there is no single point of failure.
  • Distributed systems enable a lot of private work, in a way that is bad for development.
  • Maintenance tends to be sloppier on distributed systems.
  • Git falls into this category. This type of development is not very common; perhaps the biggest example is Linux kernel development.
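
To make the sync model above concrete, here is a minimal sketch in Python. It is a toy model, not how Git works internally: the `Repo` class, the commit ids, and the `accept` filter are all hypothetical names invented for illustration, and real systems track ordered history rather than a flat set of changes.

```python
class Repo:
    """Toy model of one developer's private copy (hypothetical, for illustration)."""

    def __init__(self, owner):
        self.owner = owner
        self.commits = set()  # ids of the changes this copy knows about

    def commit(self, commit_id):
        # A commit is local; nobody else is affected until someone syncs.
        self.commits.add(commit_id)

    def pull_from(self, other, accept=lambda commit_id: True):
        # Each developer decides which incoming changes to accept or reject.
        incoming = other.commits - self.commits
        accepted = {c for c in incoming if accept(c)}
        self.commits |= accepted
        return accepted


alice, bob, carol = Repo("alice"), Repo("bob"), Repo("carol")
alice.commit("a1")
bob.commit("b1")
bob.commit("b2-experimental")

# Carol syncs with Bob but rejects his experimental change.
carol.pull_from(bob, accept=lambda c: "experimental" not in c)

# Bob's machine is unreachable, so Alice syncs with Carol instead.
alice.pull_from(carol)

print(sorted(alice.commits))  # ['a1', 'b1']
```

Alice ends up with Bob's accepted change without ever talking to Bob, which is the "no single point of failure" property. The rejected change stays private on Bob's machine.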

Centralized Version Control System

  • In a centralized system there is a repository on a central server where all the source code goes.
  • Every developer checks out a working copy of the same code, makes changes in that working copy, and commits them to the main line of development.
  • Anybody can change anyone else’s module unless folder-level access control is enforced.
  • In such a system everyone gets the changes of other developers, whether they want them or not.
  • A centralized system is much more likely to be backed up and to have its hardware kept up to date.
  • Subversion falls into this category. This type of development is very common. SourceForge.net uses this type of version control.

Distributed systems work without a dedicated server (apart from rsync or an ordinary web server). In practice, though, the server-less setup they require is more complicated than a single well-maintained server.

The official claim is that a centralized server cannot handle a distributed style of development. I disagree. Imagine a centralized version control system with excellent branching capabilities. No developer works directly off the trunk; instead, each developer has a branch created from the trunk. Developers can merge their branches into the trunk or into other branches, just as in a distributed system. For this to work, branching and merging must be smooth. Even the pyramid structure of a distributed system can be implemented by controlling which branches may be merged into the trunk.
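
The branch-per-developer idea can be sketched as a small Python model. This is only an illustration under stated assumptions: `CentralRepo`, the `gatekeepers` set, and the branch and change names are hypothetical, and the sketch does not use Subversion's actual commands or permission mechanism.

```python
class CentralRepo:
    """Toy central server: one trunk plus one branch per developer (hypothetical)."""

    def __init__(self, gatekeepers):
        self.branches = {"trunk": []}
        self.gatekeepers = set(gatekeepers)  # users allowed to merge into trunk

    def create_branch(self, name, source="trunk"):
        # A new branch starts as a copy of its source branch.
        self.branches[name] = list(self.branches[source])

    def commit(self, branch, change):
        # No developer works directly off the trunk.
        if branch == "trunk":
            raise PermissionError("commit to a branch, not to trunk")
        self.branches[branch].append(change)

    def merge(self, source, target, user):
        # The top of the pyramid: only gatekeepers may merge into trunk.
        if target == "trunk" and user not in self.gatekeepers:
            raise PermissionError(f"{user} may not merge into trunk")
        new = [c for c in self.branches[source] if c not in self.branches[target]]
        self.branches[target].extend(new)
        return new


repo = CentralRepo(gatekeepers={"lead"})
repo.create_branch("alice")
repo.create_branch("bob")
repo.commit("alice", "fix-parser")
repo.commit("bob", "add-logging")

repo.merge("alice", "bob", user="bob")   # branch-to-branch sync between peers
repo.merge("bob", "trunk", user="lead")  # the lead promotes the result to trunk

print(repo.branches["trunk"])  # ['add-logging', 'fix-parser']
```

Everything lives on one server, yet peers still exchange changes with each other, and only the designated person decides what reaches the main line.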

Subversion could be better, but its shortcomings have nothing to do with centralization. Along with the benefits of a centralized system, Subversion gives you the flexibility to work in a distributed style if you take the right approach.

This is my personal opinion based upon my experiences working in the software engineering field. Feel free to comment on it and point out errors in judgment, if any.