Using Git

Introduction

Git is a distributed version control system that you will use to manage your lab source code during this course. It can transfer data over several protocols, including SSH and HTTPS, and it runs on most major operating systems. It was originally created by Linus Torvalds to support Linux kernel development.

With Git, you can keep track of the changes that you make to your projects, compare different versions of your code, and revert changes as necessary. Like most modern version control systems, Git can maintain separate branches of code, which programmers can work on independently and later merge together. This guide is a bare-bones introduction to getting started with Git.

Helpful Git Documentation

The following documentation will help you understand how Git approaches distributed version control and how Git works in general. You will not use every feature of Git in this course, only the features needed to clone, stage, commit, and push your lab work.

Git Reference Manual - The reference manual on the Git website.
The Official Git Tutorial - Git tutorial on the Git website.
Git for Beginners: The Definitive Practical Guide - from Stack Overflow.
Understanding Git Conceptually - Excellent tutorial on how Git organizes commit objects, and how to branch, merge, and rebase.

This list is by no means exhaustive, but it more than covers any use of Git that will be expected of you in this course.

Obtaining Git

Git is already installed on the lab computers and on the data and lore machines. If you plan to do your lab work on these machines, you can skip this section. If you wish to use Git on your own computer, however, you will need a local installation. On Windows or Mac, you can obtain Git from the Git website. On Linux or another UNIX variant, you can install Git with your system's package manager. See also the documentation for installing Git, which covers the libraries that Git depends on.

Setting up your Git username

Since Git records who made each change, you will want to set your name and email address globally for your user account. Git stores your individual user settings in a file located at ~/.gitconfig. To set them, do the following:

git config --global user.name "YourName"
git config --global user.email "youremail@purdue.edu"
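
To check that the settings took effect, you can read them back. The following is a minimal sketch, not a required step: it points HOME at a throwaway directory (the variable name demo_home is made up for this example) so that it does not touch your real ~/.gitconfig, then sets and reads the values.

```shell
# Use a throwaway HOME so this demo does not modify your real ~/.gitconfig.
demo_home=$(mktemp -d)

# The command substitution runs in a subshell, so HOME is only changed inside it.
configured_name=$(
  export HOME="$demo_home"
  unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL
  git config --global user.name "YourName"
  git config --global user.email "youremail@purdue.edu"
  git config --global --get user.name
)
echo "user.name is: $configured_name"

# The settings are stored as plain text in .gitconfig under HOME.
cat "$demo_home/.gitconfig"
```

On your own account, running git config --global --list prints the same settings without any of the temporary-directory setup.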

Cloning your Repository

A single Git project is known as a "repository." A repository contains all of the files in the project, and all of the different revisions of those files. A single revision is called a commit object. Commits are stored in one or more individual branches. Your starting repository for lab 3 will contain a single branch, called "master."

All students are assigned a repository. To work on a repository, you first need to clone it into your work area. This can be in your CS account (which is the easiest) or on another computer. To clone your repository from one of the lab machines, do the following, where $USER is your username:

git clone ssh://$USER@data.cs.purdue.edu/homes/cs252/sourcecontrol/work/$USER/lab3-src.git

This will create a folder called lab3-src in the current directory, and you can now begin working on your lab.
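
If you would like to see what a clone produces without touching the course server, the sketch below uses an empty repository on local disk as a stand-in remote. The directory names under the temporary folder are assumptions made for this example; only the lab3-src.git name mirrors the guide.

```shell
work=$(mktemp -d)
mkdir -p "$work/remote" "$work/checkout"

# Stand-in for the course server: an empty bare repository on local disk.
git init --quiet --bare "$work/remote/lab3-src.git"

# Clone it. Git names the new folder after the URL, minus the .git suffix.
# (Git warns that the repository is empty; that is expected here.)
cd "$work/checkout"
git clone --quiet "$work/remote/lab3-src.git"

# The clone remembers where it came from as the remote named "origin".
origin_url=$(git -C lab3-src remote get-url origin)
ls -d lab3-src
echo "origin -> $origin_url"
```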

Staging and Committing

Git does not automatically record your changes; you need to stage and then commit any changes that you make in your repository. You will also want to push your changes to the remote repository (the one you cloned from) so that they are backed up.

Let's say you have changed your shell.y file and wish to commit this change to Git. To stage the change, type:

git add shell.y

Continue staging changes that you have made in other files. When you are ready to commit, type:

git commit -m "Added: yacc grammar to do simple redirects."
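
To watch a file move from changed, to staged, to committed, you can run git status between the steps. The following is a small sketch in a scratch repository; the identity values and file contents are placeholders chosen for this example.

```shell
# Scratch repository so the demo is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity

echo "first draft of grammar" > shell.y
before=$(git status --short)    # "?? shell.y": Git sees the file but is not tracking it
git add shell.y
after=$(git status --short)     # "A  shell.y": the file is staged for the next commit
git commit --quiet -m "Added: yacc grammar to do simple redirects."
clean=$(git status --short)     # empty: nothing left to stage or commit

echo "before add:   $before"
echo "after add:    $after"
echo "after commit: ${clean:-<clean>}"
```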

The commit message should be a description of what was changed, so that anyone viewing the commit knows what to expect. When you're done committing, you can review the last few changes to the repository by typing git log. An example output is as follows:

commit f381c8f3b10aea89c244772ff3121772d47c5a7a
Author: CS252 Student 
Date:   Sun Sep 8 21:21:58 2013 -0400

    Added: yacc grammar to do simple redirects.

commit 7f88833a95a28b5c4780347cd4727de934a2e7fc
Author: CS25200 Account 
Date:   Sun Sep 8 19:36:17 2013 -0400

    Initial commit.
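
When the history grows, a shorter view is often easier to scan. The sketch below rebuilds a two-commit history like the one above in a scratch repository (identity and file contents are placeholders) and shows two ways to limit or condense the log.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity

echo "starter code" > shell.l
git add shell.l
git commit --quiet -m "Initial commit."
echo "grammar rules" > shell.y
git add shell.y
git commit --quiet -m "Added: yacc grammar to do simple redirects."

# Full entries, limited to the two most recent commits.
git --no-pager log -n 2

# One line per commit: abbreviated hash followed by the message subject.
git --no-pager log --oneline

latest_subject=$(git log -n 1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$commit_count commits; latest: $latest_subject"
```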

If you have changed several files and do not want to stage each one manually, the -a option of the commit command will automatically stage all modified and deleted files that Git already tracks. New files that have never been added must still be staged with git add. Use it as follows:

git commit -am "My message."
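
One detail worth seeing for yourself is that -a only covers files Git already tracks. In the sketch below (scratch repository, placeholder identity, and a made-up notes.txt file), the modified shell.y is committed while the brand-new file is left behind.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity

echo "first version" > shell.y
git add shell.y
git commit --quiet -m "Initial commit."

echo "second version, with more rules" > shell.y   # modify a tracked file
echo "reminders" > notes.txt                        # create a new, untracked file

git commit --quiet -am "My message."

# shell.y was committed; notes.txt is still untracked ("?? notes.txt").
leftover=$(git status --short)
echo "still uncommitted: $leftover"
```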

Pushing to the remote repository

To sync your repository to the remote, you need to execute a push. Large projects can have multiple remotes, but your lab's remote is in your cs252 work directory. Pushing ensures that your changes are backed up and that TAs and professors can see your code as necessary. It is also how you will turn in your lab. When you cloned your repository, Git saved the URL you cloned from as a default remote called "origin." All that is needed to push to origin is the following:

git push

When running a git push command, the output will look something like the following:

data 71 $ git push
Counting objects: 5, done.
Delta compression using up to 32 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 342 bytes, done.
Total 3 (delta 2), reused 0 (delta 0)
To ssh://cs252student@data.cs.purdue.edu/homes/cs252/sourcecontrol/work/cs252student/lab3-src.git
   fcc00b4..f381c8f  master -> master
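
To convince yourself that a push really delivered your commit, you can compare the newest commit on both sides. The sketch below is a local rehearsal under stated assumptions: the "remote" is a bare repository in a temporary folder rather than the course server, and the seed directory exists only to give it a starting commit.

```shell
work=$(mktemp -d)

# Build a stand-in remote that already holds one commit, like a starter repository.
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "CS25200 Account"        # placeholder identity
git config user.email "account@example.com"   # placeholder identity
echo "starter code" > shell.l
git add shell.l
git commit --quiet -m "Initial commit."
git clone --quiet --bare "$work/seed" "$work/lab3-src.git"

# Clone the stand-in remote, commit a change, and push it back.
git clone --quiet "$work/lab3-src.git" "$work/lab3-src"
cd "$work/lab3-src"
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity
echo "grammar rules" > shell.y
git add shell.y
git commit --quiet -m "Added: yacc grammar to do simple redirects."
git push --quiet

# After the push, both repositories point at the same commit.
local_head=$(git rev-parse HEAD)
remote_head=$(git -C "$work/lab3-src.git" rev-parse HEAD)
echo "local:  $local_head"
echo "remote: $remote_head"
```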

Since you will be working on lab 3 alone, there is no need to pull from the origin. You will also notice that the default branch, called "master," is pushed automatically. In a collaborative project involving several people (as most real-world projects are), pulling updates your local repository with changes that others have made, and branches allow multiple features to be developed simultaneously. Please push to your remote at least once every 2 days.

Undoing a Commit

One of the most common problems when using Git is committing prematurely, i.e., before adding all of the changes you meant to include. To reset to the pre-commit state (while preserving your commit message) do the following:

git reset --soft HEAD^

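You can think of HEAD as a pointer to your most recent commit. HEAD^ refers to the parent of that commit, so resetting to HEAD^ moves your branch back by one commit; with --soft, your files and staged changes are left untouched. Git also remembers the commit you reset away from under the name ORIG_HEAD. Make any changes that you need, and then re-commit by running: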

git commit -a -c ORIG_HEAD

This reuses your original commit message, opening it in your editor so that you can adjust it if needed. The differences between soft, mixed, and hard resets are beyond the scope of this tutorial, but be warned: a hard reset discards uncommitted changes, so you risk losing work. When in doubt, do a soft reset unless you know exactly what you are doing.
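
The whole undo-and-redo sequence can be rehearsed in a scratch repository before you try it on real work. This sketch makes one substitution so that it can run unattended: it uses -C (capital), which reuses the old message as-is, where the guide's -c would open your editor first. The identity and file contents are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity

echo "starter code" > shell.l
git add shell.l
git commit --quiet -m "Initial commit."

# The premature commit: one grammar rule is still missing.
echo "grammar rules" > shell.y
git add shell.y
git commit --quiet -m "Added: yacc grammar to do simple redirects."

# Step back one commit; files and staged changes stay as they were.
git reset --soft HEAD^
still_staged=$(git diff --cached --name-only)   # shell.y

# Add the forgotten change, then re-commit with the saved message.
echo "forgotten rule" >> shell.y
git commit --quiet -a -C ORIG_HEAD

subject=$(git log -n 1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$commit_count commits; latest: $subject"
```

The history still has two commits, and the second one now contains the forgotten rule under the original message.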

How often (and what) do I commit?

Try to organize commits into specific features. For instance, if you make changes to your shell.l (Lex file), commit those changes before moving on to your shell.y. If it turns out that you need to reverse those changes later, you will be able to do so atomically without reverting other, unrelated features. Over time, you will develop intuition on how to best schedule your commits.
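
As an illustration of why separate commits pay off, the sketch below undoes one change while keeping a later, unrelated one. It uses git revert, a command this guide does not otherwise cover, so treat it as one possible way to reverse a commit rather than the expected workflow; the identity, messages, and file contents are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "CS252 Student"          # placeholder identity
git config user.email "student@example.com"   # placeholder identity

echo "lexer v1" > shell.l
echo "grammar v1" > shell.y
git add shell.l shell.y
git commit --quiet -m "Initial commit."

# Two unrelated changes, each in its own commit.
echo "lexer v2 with new tokens" > shell.l
git commit --quiet -am "Changed: lexer tokens."
echo "grammar v2 with new rules" > shell.y
git commit --quiet -am "Changed: grammar rules."

# Undo only the lexer change (one commit before the latest).
# git revert adds a new commit that reverses it; --no-edit keeps the default message.
git revert --no-edit HEAD~1

lexer_now=$(cat shell.l)      # back to the first version
grammar_now=$(cat shell.y)    # later change is untouched
echo "shell.l: $lexer_now"
echo "shell.y: $grammar_now"
```

Had both changes been made in a single commit, reversing the lexer change would have taken the grammar change with it.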