

MichaelGoodman edited this page Oct 11, 2011 · 5 revisions

Version Control for Grammar Development

This page describes how to use several version control systems (VCSs) with grammars. Version control (also called "revision control") is essentially a more sophisticated way of backing up files. Users "check out" files from a repository, modify them, and "commit" their changes back to the repository. These systems usually provide functionality for merging changes from multiple users, which facilitates collaboration, and users can retrieve previous versions of files from the history, for example to revert damaging changes. For safety, the repository is usually stored somewhere other than the development machines. More information is available at http://en.wikipedia.org/wiki/Revision_control.

Centralized vs. Distributed

Traditionally, version control has been done with centralized systems like CVS or Subversion (SVN), but distributed systems, such as Git, Bazaar, or Mercurial, have become popular due to a number of benefits they provide.

Centralized systems, true to their name, need a central host for the repository. Users work in an "instance" of the repository (called a "working copy" in Subversion). The instance may be moved around in a file system, but the repository must stay in the same location. All users check out from and commit to the same central repository, so their changes become available to everyone else as soon as they are committed.
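The sharing behaviour described above can be sketched with Subversion as the centralized system. This is a minimal illustration, not a recommended setup: the scratch directory, the two instance names (alice and bob), and the placeholder file contents are all hypothetical, and the repository lives on the local machine only to keep the example self-contained.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area holding the central repository for this demo.
work=$(mktemp -d)
svnadmin create "$work/repo"
url="file://$work/repo"

# Two users each check out their own instance of the same repository.
svn checkout -q "$url" "$work/alice"
svn checkout -q "$url" "$work/bob"

# The first user adds a file and commits it to the central repository.
printf ';;; placeholder grammar file\n' > "$work/alice/finnish.tdl"
svn add -q "$work/alice/finnish.tdl"
svn commit -q -m "Add placeholder grammar file." "$work/alice"

# The second user receives the change on the next update.
svn update -q "$work/bob"
ls "$work/bob"
```

Nothing is exchanged directly between the two instances; the commit and the update both go through the single repository.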

Distributed systems are "headless" in that they do not need a central host. There are no "instances" of the repository, as the repository (including all the history) is built into the directories of the files it manages. This allows one to move the repository around and still be able to commit and view changes. One backs up the repository by simply copying the directories somewhere else. There is usually an official branch located on a publicly accessible server, and users commit changes by merging their own branch with the official version (if they have permission), or by making their version public so the maintainer of the official branch can merge the changes.
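As a rough illustration of the repository living inside the managed directory, the sketch below uses Git, which is only one of the distributed systems named above. The scratch paths, the placeholder file, and the user name and email are hypothetical stand-ins.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch locations; substitute your own grammar directory.
work=$(mktemp -d)
grammar="$work/fin"
backup="$work/fin-backup"

mkdir -p "$grammar"
printf ';;; placeholder grammar file\n' > "$grammar/finnish.tdl"

# Turn the grammar directory itself into a repository (history lives in .git/).
git -C "$grammar" init -q
git -C "$grammar" add finnish.tdl
git -C "$grammar" -c user.name="user" -c user.email="user@host" \
    commit -q -m "Initial commit."

# Backing up is a plain copy; the copy carries the full history.
cp -R "$grammar" "$backup"
git -C "$backup" log --oneline
```

The copied directory is a complete repository in its own right, so the log command works there without any contact with the original.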

Other than the infrastructural differences, interactions with both centralized and distributed systems are largely the same.

Subversion

Subversion has traditionally been the VCS of choice for grammars, as there is usually an official version of the grammar hosted by a university. The following example puts a Finnish grammar (located at ~/grammars/fin) into an SVN repository.

Note: The example below initializes the repository in /tmp on the same machine as development. It is suggested you initialize it on a separate machine.

Initializing the repository:

user@host:~$ # svnadmin create REPOS_PATH
user@host:~$ svnadmin create /tmp/finnish

Importing the grammar directory. The -m option supplies a commit message; if you omit it, you will be prompted for a message in a text editor. If PATH is not specified, the current directory is used. Importing does not make the current directory an instance of the repository, so we check out an instance somewhere else:

user@host:~$ cd ~/grammars/fin
user@host:~/grammars/fin$ # svn import [PATH] REPOS_URL -m MSG
user@host:~/grammars/fin$ svn import file:///tmp/finnish -m "Initial import."
Adding  licence.txt
Adding  finnish.tdl
 ...
Adding  choices

Committed revision 1.
user@host:~/grammars/fin$ cd ~/grammars
user@host:~/grammars$ # svn co REPOS_URL [PATH]
user@host:~/grammars$ svn co file:///tmp/finnish fin2
Checked out revision 1.

Alternatively, we could have checked out an instance of the repository into the grammar directory, added all the files, and then committed. (Note: do not both import and add; do one or the other.)

user@host:~/grammars/fin$ # svn co REPOS_URL [PATH]
user@host:~/grammars/fin$ svn co file:///tmp/finnish ./
Checked out revision 0.
user@host:~/grammars/fin$ # svn add FILES
user@host:~/grammars/fin$ svn add ./*
A       licence.txt
A       finnish.tdl
 ...
A       choices
user@host:~/grammars/fin$ svn ci -m "Initial commit."
Adding  licence.txt
Adding  finnish.tdl
 ...
Adding  choices
Transmitting file data ..........................
Committed revision 1.

Checking status and committing changes:

user@host:~/grammars/fin$ emacs finnish.tdl
user@host:~/grammars/fin$ # (edit and save file)
user@host:~/grammars/fin$ svn status
M       finnish.tdl
user@host:~/grammars/fin$ svn diff finnish.tdl
Index: finnish.tdl
===================================================================
--- finnish.tdl (revision 1)
+++ finnish.tdl (working copy)
@@ -4,8 +4,8 @@
 ;;;     Wed Jul 14 13:59:14 UTC 2010
 ;;; based on Matrix customization system version of:
 ;;;     unknown time
-;;;
-;;; 
+;;; author:
+;;;     user
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;
user@host:~/grammars/fin$ # svn ci [FILES] -m MSG
user@host:~/grammars/fin$ svn ci finnish.tdl -m "Added author comment."
Sending        finnish.tdl
Transmitting file data .
Committed revision 2.
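Retrieving an earlier version and undoing a damaging change can be sketched as follows. This is one possible approach using svn log, svn cat, and a reverse merge, run against a throwaway repository; the scratch paths, file contents, and commit messages are hypothetical and are not part of the Finnish grammar example above.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository and instance, created only for this demo.
work=$(mktemp -d)
svnadmin create "$work/repo"
svn checkout -q "file://$work/repo" "$work/wc"
cd "$work/wc"

# Revision 1: add a placeholder file.
printf ';;; original\n' > finnish.tdl
svn add -q finnish.tdl
svn commit -q -m "Initial commit."

# Revision 2: a change we later decide was a mistake.
printf ';;; damaging change\n' > finnish.tdl
svn commit -q -m "Damaging change."

# Bring the whole instance up to date, then list the file's history.
svn update -q
svn log -q finnish.tdl

# Print the file as it was in revision 1 without touching the instance.
svn cat -r 1 finnish.tdl

# Undo revision 2 by reverse-merging it, then commit the result as revision 3.
svn merge -q -c -2 .
svn commit -q -m "Reverted revision 2."
cat finnish.tdl
```

The reverse merge leaves revision 2 in the history, so the damaging change can still be inspected later; the revert is simply recorded as a new revision.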