Exercise: Git and GitLab

Introduction

By this point, you have already used version control and understand its basics, and you have specifically used either git or svn while managing a small team project. In this exercise, you will learn how to use some of the advanced features of modern version control, along with integrated tools that help coordinate and enforce tasks like code review, scheduling, and issue tracking in a team development environment. The skills that you use in this exercise are core essentials of modern software development practice. Not only will you be expected to use these techniques and the features of GitLab while developing your semester projects, but any work that you do outside of these workflows may not be taken into account during grading.

There are many outside sources of information on using git. The git website has an excellent book, along with introductory videos and numerous tutorials, that together provide a thorough introduction. Much of the information in this exercise comes from these tutorials or expects you to first follow them and then demonstrate your understanding. Some of the tasks in the exercise are based on those developed by Ted Kirkpatrick.

Step 1: git Basics

As you already know, a version control system (VCS) tracks the changes in files over time. Thus, if you want to know when or why a particular change was made to a file, the VCS should hold the answer. Managing these versions also means that the VCS can coordinate and track the changes made by multiple developers, ensuring that their changes do not (textually) interfere with one another. In a modern VCS, a set of selected changes to files is applied atomically. That is, if the changes do not interfere with other changes concurrently made by another developer, then they are all applied together; otherwise, none of them are applied.

One distinguishing feature of git and other distributed version control systems is that you locally batch together groups of changes to files into atomic commits. That is, you make your changes on your own computer. Then, you can choose to share these batches of changes with others by pushing them to a remote repository, likely on a different machine.
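As a minimal sketch of this local-then-shared model, the commands below make two commits locally and only afterwards push them. The local bare repository `remote.git` is an assumption standing in for a real server such as GitLab, and the file name and identity are hypothetical.

```shell
# Work in a throwaway directory so nothing outside it is touched.
work=$(mktemp -d)
cd "$work"

# A bare repository standing in for the remote server (assumption: in the
# exercise this would be a GitLab URL rather than a local path).
git init -q --bare remote.git

# The local repository where changes are batched into commits.
git init -q local
cd local
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git branch -M master                          # make sure the branch is named master

echo "second thought" >> notes.txt
git commit -q -a -m "Extend notes"

# Up to this point both commits exist only on this machine.
git remote add origin "$work/remote.git"
git push -q -u origin master

# Count the commits that have reached the remote.
git --git-dir="$work/remote.git" rev-list --count master
```

Nothing is shared until the `git push`; before that line, the stand-in remote has no commits at all.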

In order to gain a better understanding, read the following portions of the git book:

You may also try one of the online tutorials for the basics. You should now know about and understand:

Action Items

Create a new personal project repository using GitLab. Give both the instructor (wsumner) and the TA (cmishra) developer access to the repository through the GitLab member settings for the project. You will use this repository for all tasks completed in this exercise and submit the URL of the repository at the end.

At the end of this step, the repository must contain exactly two commits. The first commit must be on the master branch and contain exactly one file called student.txt, and the file must contain exactly one line with your SFU user ID without the @sfu.ca suffix.

The second commit must rename/move student.txt to username.txt and add a second file called readme.md. The second file may contain anything you want.

These two commits must (naturally) be pushed to the remote GitLab server.
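Renaming a tracked file and adding a new one in a single commit might look like the sketch below. The file names `old_name.txt`, `new_name.txt`, and `about.md` are hypothetical placeholders, not the names required above.

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "draft" > old_name.txt
git add old_name.txt
git commit -q -m "Add old_name.txt"

# git mv renames the file and stages the rename in one step.
git mv old_name.txt new_name.txt

# Add a second file to the same commit.
echo "# Demo" > about.md
git add about.md
git commit -q -m "Rename old_name.txt and add about.md"

# List each commit together with the files it touched.
git log --oneline --name-status

# List the files tracked in the latest commit.
git ls-files
```

Using `git mv` instead of a plain `mv` saves a separate `git add`/`git rm` pair; the resulting commit is the same either way.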

Step 2: Branching and Merging

If you have experience using another VCS, you may shudder at the word branching. Branching in a VCS refers to the ability to create a divergent history for a project. The main history of the project can continue normally on the main branch, and you can make experimental modifications to a second branch without affecting other developers' ability to use the main one. In fact, they may be entirely unaware of the changes that you make to the second branch. Once the second branch is in a desired state, you can merge the histories again, applying the desired changes from the second branch into the main one. Back to the Future is recommended viewing when learning about branching.
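The divergence and later merge described above can be sketched as follows. The branch name `experiment` and the file names are hypothetical, chosen only for illustration.

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "stable" > main.txt
git add main.txt
git commit -q -m "Initial commit"
git branch -M master

# Diverge: create a second branch and commit experimental work to it.
git checkout -q -b experiment
echo "risky idea" > idea.txt
git add idea.txt
git commit -q -m "Try an idea"

# Meanwhile, the main history continues without seeing idea.txt.
git checkout -q master
echo "more stable work" >> main.txt
git commit -q -a -m "Continue main line"
ls

# Join the two histories again.
git merge -q -m "Merge experiment into master" experiment

# The graph shows the fork and the merge commit that rejoins it.
git log --oneline --graph
```

Because the two branches changed different files, the merge completes without any conflict.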

A primary distinguishing feature of git is that branching is easy and lightweight enough that it becomes one of the primary tools for tracking and managing changes to a software project. We will explore this more in step 3. In order to gain a better understanding, read the following portions of the git book:

You may also try one of the online tutorials. You should now know about and understand:

Action Items

Create a second branch called feature in your repository. Add a new file called diary.txt to this branch that contains anything you want.

Merge the feature branch into master. After this, master should contain diary.txt.

You should now be on the master branch. Create a file called recipes.txt and commit it. Create another branch called roguechef and modify recipes.txt on that branch before committing it. Don't merge yet! Back on the master branch, modify recipes.txt differently before committing it. Now merge roguechef into master. This will cause a conflict because the changes in the two branches interfere. Resolve the conflict and finish merging roguechef into master.
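For a feel of what a conflicting merge looks like before you try it yourself, here is a small sketch with hypothetical names (`menu.txt`, branch `variant`). It is one possible way to resolve a conflict, not the only one.

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "soup: add salt" > menu.txt
git add menu.txt
git commit -q -m "Add menu"
git branch -M master

# One branch changes the line one way...
git checkout -q -b variant
echo "soup: add pepper" > menu.txt
git commit -q -a -m "Prefer pepper"

# ...and master changes the same line differently.
git checkout -q master
echo "soup: add garlic" > menu.txt
git commit -q -a -m "Prefer garlic"

# The merge stops because both branches changed the same line.
git merge variant || echo "merge stopped with a conflict"

# menu.txt now holds both versions between conflict markers.
cat menu.txt

# Resolve by writing the intended final content, then stage and commit.
echo "soup: add garlic and pepper" > menu.txt
git add menu.txt
git commit -q -m "Merge variant, keeping both seasonings"
```

In practice you would edit the file by hand (or with a merge tool) rather than overwrite it with `echo`; the important part is that `git add` marks the conflict as resolved and the commit completes the merge.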

Step 3: Following a Workflow and Code Review

As you saw in chapter 3.4 of the git book, branching can be used to manage the experimental work associated with developing a single feature, maintaining legacy versions and new development versions, and more. In addition, as we shall see shortly, branching can also be used to manage and even enforce peer review of all code before it is committed to the repository. From now on, you should be following this latter practice for your projects this semester.

However, there are also some risks that ought to be recognized. In particular, maintaining many different branches can become undesirably confusing for developers, and the longer a branch lives without being merged into its eventual targets, the less likely the branch is to merge with few conflicts and little rework. The benefits and trade-offs of these workflow options need to be assessed and balanced in order to design a workflow that fits best for your project.

For projects this semester, you will follow a simple GitLab Flow. Read the GitLab Documentation for a full explanation of the workflow. You can also find videos online illustrating a simple use of the GitLab flow along with integrated features like merge request management and code review. Watch this video; it contains a clear illustration of the steps to follow for every change that you make to the repository. Roughly, the usual steps are as follows:

  1. Perform your development of a feature, a refactoring, or any other change on a separate branch.
  2. When you are ready to merge this feature, push the branch to origin.
  3. Visit the page for the repository in GitLab and click on the button presented to create a merge request from the branch you pushed. Merge requests are GitLab's way of organizing changes to the code that have not yet been approved and merged (to the master branch).
  4. Assign at least one peer of your team to review the code and provide feedback.
  5. Your team member then reviews the code, and you can continue to discuss and improve it until it is eventually approved and merged to the repository. Under GitLab flow, the source branch is removed upon merging.
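The command-line part of these steps (steps 1 and 2) can be sketched as below; steps 3 to 5 happen in the GitLab web interface. The bare repository `origin.git` is an assumption standing in for the GitLab server, and the branch and file names are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

# Stand-in for the GitLab server (assumption: normally a GitLab URL).
git init -q --bare origin.git

git init -q project
cd project
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Initial commit"
git branch -M master
git remote add origin "$work/origin.git"
git push -q -u origin master

# Step 1: develop the change on its own branch.
git checkout -q -b add-farewell
echo "goodbye" > farewell.txt
git add farewell.txt
git commit -q -m "Add farewell message"

# Step 2: push the branch to origin so a merge request can be opened for it.
git push -q -u origin add-farewell

# The remote now has both branches; master is still unchanged.
git ls-remote --heads origin
```

Once the merge request has been merged on the server, a typical follow-up is to switch back to master locally and run `git pull` to pick up the merged change.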

Note that I will track merge requests as one metric for measuring your overall contributions to your semester projects.

For your semester projects, you should also use a production branch for delivering and maintaining your individual iterations.

Action Items

Create a new branch topsecret containing a single file called secrets.txt. Create a GitLab merge request for the branch. From the merge request management page in GitLab, approve the merge request to finally merge the branch into the repository. NOTE: In a real project, you should not approve your own merge requests! In this case, you are doing so simply to learn how the system works.

Step 4: Issues and Scheduling Milestones

GitLab also has the ability to identify units of work called issues. An issue can be a bug fix, a documentation change, a feature addition planned for a future sprint, or any other unit of work. The issue tracking page provides a convenient way of organizing these issues along with a platform for discussing them, assigning them to developers, and monitoring their progress. Using this system for your semester projects can help to ensure that you do not forget a task or lose track of it amongst the many requirements you must fulfill.

Issues can also be assigned to milestones, which are commonly used to identify deadlines or iteration/sprint boundaries in a project. By having both issue tracking and milestone scheduling, GitLab allows you to flexibly keep track of the tasks that need to be performed and schedule or reschedule them as necessary.

Action Items

Create a new milestone titled "Exercise Due" in GitLab and set its due date to September 21st. Create a new issue titled "Add more cats to the readme" and schedule it for the milestone you just created.

Now modify readme.md by adding the word cat to it (anywhere you like) and commit it to the repository with a commit message that includes the words "Fixes #N" where N is the number of the issue you created. Push the change on a new branch in order to create a merge request. Accept the merge request and note that the issue is now closed.
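A commit message of this form might be written as in the sketch below. The issue number 7 and the branch name `more-cats` are hypothetical; substitute the number of the issue you actually created.

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "# Demo project" > readme.md
git add readme.md
git commit -q -m "Add readme"

# Make the change on its own branch so it can go through a merge request.
git checkout -q -b more-cats
echo "cat" >> readme.md

issue=7   # hypothetical issue number; use your own issue's number

# The first -m is the subject line; the second becomes the message body.
git commit -q -a -m "Add a cat to the readme" -m "Fixes #$issue"

# Show the full message of the latest commit.
git log -1 --format=%B
```

GitLab looks for the closing pattern when the commit reaches the default branch, which is why the issue closes only after the merge request is accepted.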

Step 5: Submission

To submit your exercise, you must submit the URL of your repository via CourSys. For instance, my submission might be

https://csil-git1.cs.surrey.sfu.ca/wsumner/exercise2

Bonus Steps: Additional Features to Save Your Skin

git includes many additional features that allow you to more conveniently change, navigate, and extract useful information from the history of a project. Three particularly useful features are interactive staging, stashing, and tools for debugging.

Interactive Staging

Interactive staging allows you to look at the changes in your working directory as diffs and select exactly the combination of changes you want for a commit. This means that you can choose to commit only part of the changes to a file if you made changes to the same file that semantically belong to different commits.

Stashing

You may have noticed that git refuses to switch branches when uncommitted changes in your working directory would be overwritten by the checkout. Sometimes you have changes in your working directory that you do not yet wish to commit. In this case, you can stash the changes for later, switch branches to do your work, switch back to the original branch, and unstash your uncommitted changes there. It may sound complicated, but it is quite straightforward to use.
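A minimal sketch of that round trip, using hypothetical file and branch names (`report.txt`, `hotfix`):

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

echo "v1" > report.txt
git add report.txt
git commit -q -m "Add report"
git branch -M master
git branch hotfix

# Uncommitted work in progress on master.
echo "half-finished" >> report.txt

# Set the work aside; the working directory is clean again.
git stash -q
git status --short

# Do the unrelated work on another branch.
git checkout -q hotfix
echo "urgent" > fix.txt
git add fix.txt
git commit -q -m "Apply urgent fix"

# Return and restore the work in progress.
git checkout -q master
git stash pop -q
git status --short
```

`git stash list` shows what is currently stashed, and `git stash pop` removes the entry from that list once it has been reapplied.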

Debugging

Git contains some commands that can greatly simplify debugging your code. In particular, you can use git blame to identify the last commit (and developer) that touched a particular line of code. Sometimes, though, you'll want to perform a binary search on the history of a project in order to discover when a bug was first introduced. This can be done either interactively or automatically using git bisect. These are powerful tools to have in your arsenal.
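Both commands can be tried on a toy history, as in the sketch below. The six-commit repository, the file `app.txt`, and the `grep` check standing in for a real test suite are all assumptions made for illustration.

```shell
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

# Build a six-commit history; the fourth commit introduces the "bug".
for i in 1 2 3 4 5 6; do
  if [ "$i" -eq 4 ]; then
    echo "BUG" >> app.txt
  else
    echo "line $i" >> app.txt
  fi
  git add app.txt
  git commit -q -m "Commit $i"
  if [ "$i" -eq 1 ]; then good=$(git rev-parse HEAD); fi
  if [ "$i" -eq 4 ]; then culprit=$(git rev-parse HEAD); fi
done

# Which commit last touched line 4 of app.txt?
git blame -s -L 4,4 app.txt

# Binary search between the bad HEAD and the known-good first commit.
git bisect start HEAD "$good"

# The check exits 0 (good) when BUG is absent and 1 (bad) when it is present.
git bisect run sh -c '! grep -q BUG app.txt'

# Remember the first bad commit, then return to the original branch.
found=$(git rev-parse refs/bisect/bad)
git bisect reset

git log -1 --format='%h %s' "$found"
```

With a real project, the command given to `git bisect run` would be your build or test script; any command that exits with 0 for a good commit and a small non-zero status for a bad one will do.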