6. Introduction to Git

christine edited this page Sep 19, 2017 · 1 revision

Introduction to Git

This page is intended to serve as a brief introduction to the Git version control system and as a collection of resources for reference. We'll review the following:

  • What Git is and the basics of working with it
  • Differences with SVN (Subversion)
  • Additional resources
  • Git Flow
  • Helpful tips

If anything on this page is unclear, incorrect, or out of date, please feel free to correct it or file an issue for someone else to take a look. Thanks!

What is Git?

Git is a version control system that allows individuals and teams to keep track of changes to a code base, which is stored together with its history in what is known as a repository. Using a version control system while writing software enables folks to record their changes as they work and maintain a history (log) of them. Should there ever be a need to undo something, track down a bug or regression, or retrace the steps of a complex feature, that history makes it possible.

Git provides two major additional benefits that older, centralized version control systems typically do not:

  • Quick and easy branching of code
  • Distributed/decentralized repositories

Some quick background on these concepts follows; the additional resources section links to material that covers both of them in much more detail.

Branching of code

When writing software, there are times when a developer may want to go down a different path to try something new, or they may have to stop what they're working on and shift gears to fix a bug in their existing production system. A version control system that supports branching allows them to do this, and Git makes it extremely easy. Unlike some other systems, Git's branching model does not require that the entire code base be copied in order to create a new branch. Instead, a branch is simply a pointer to a specific commit, which serves as the base (parent) for new work, so you can continue working from that point in time and state of the code. A developer can also merge branches into one another easily to mix and match changes from other parts of the repository and from the work of other folks.
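To make the "a branch is just a pointer" idea concrete, here is a minimal sketch that can be run in a scratch directory. The branch name try-idea, the file app.txt, and the author identity are made up for illustration; nothing here touches an existing repository.

```shell
set -eu
# Placeholder identity for the throwaway commits (not real people).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)                        # scratch repository, safe to delete afterwards
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" regardless of local defaults

echo "hello" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Creating a branch only records a new name for the current commit.
git branch try-idea
main_commit=$(git rev-parse main)
idea_commit=$(git rev-parse try-idea)
echo "main:     $main_commit"
echo "try-idea: $idea_commit"            # same hash as main: no files were copied

# Work on the branch; only the try-idea pointer moves forward.
git checkout -q try-idea
echo "experiment" >> app.txt
git commit -q -am "Try a new idea"

# Bring the work back into main.
git checkout -q main
git merge -q try-idea
merged_commit=$(git rev-parse main)
echo "main after merge: $merged_commit"
```

Because main had not moved in the meantime, the merge here is a simple fast-forward: main just ends up pointing at the same commit as try-idea.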

Distributed/decentralized repositories

Unlike traditional centralized version control systems, Git is distributed, which means that a developer is able to work locally and independently of any centralized repository. They choose when and where to share their changes when ready. While this provides a lot of flexibility for a developer, it also means there is a bit more work and discipline required to keep code in sync with other folks when working in teams or collaborating with external contributors.

Differences with SVN (Subversion)

The two major benefits listed in the overview of Git above also happen to be the two main differences with SVN (Subversion). SVN is what's known as a centralized version control system, which means that whenever a developer works with the history of a repository managed by SVN (committing, updating, viewing the log), they are communicating with the SVN server. The canonical source of the code is always hosted on a remote server, and when changes are committed they are immediately sent to the server.

Committing code in Git, however, only marks that set of changes locally. A developer must then push their code to a remote repository for it to actually pick up the changes and become available to others. Until the developer explicitly issues a push command, all changes are tracked locally only.

Key difference: commits are pushed to the server immediately in SVN; commits stay local in Git and must be pushed separately.
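The commit-then-push split can be tried out without a real server. In this sketch a bare repository on local disk stands in for the shared remote; the directory names, the file notes.txt, and the author identity are assumptions made for the example.

```shell
set -eu
# Placeholder identity for the throwaway commit (not a real person).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q --bare "$work/server.git"    # stand-in for the shared remote repository
git init -q "$work/local"                # the developer's own repository
cd "$work/local"
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" regardless of local defaults
git remote add origin "$work/server.git"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
local_commit=$(git rev-parse main)

# The commit exists locally, but the remote has no branches yet.
before_push=$(git ls-remote --heads origin | wc -l | tr -d ' ')

# Pushing is the separate, explicit step that shares the commit.
git push -q origin main
after_push=$(git ls-remote --heads origin | wc -l | tr -d ' ')
server_commit=$(git ls-remote origin refs/heads/main | cut -f1)

echo "branches on remote before push: $before_push"
echo "branches on remote after push:  $after_push"
```

After the push, the remote's main branch points at the same commit hash as the local one.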

Git also has the notion of an index, or staging area, which holds the set of changes that will go into the next commit. This allows a developer to keep track of what's changing and see the differences between the last commit, what they have staged so far, and what is still changing in their working directory. It also allows the developer to easily pick and choose what they want to keep and commit as they work versus what they would like to discard or continue working on.
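As a rough illustration of picking and choosing with the staging area, the sketch below edits two files but commits only one of them. The file names ready.txt and wip.txt and the author identity are hypothetical.

```shell
set -eu
# Placeholder identity for the throwaway commits (not real people).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" regardless of local defaults

echo "v1" > ready.txt
echo "v1" > wip.txt
git add ready.txt wip.txt
git commit -q -m "Initial commit"

# Edit both files, but stage only the one that is ready.
echo "v2" > ready.txt
echo "half-finished" > wip.txt
git add ready.txt

staged=$(git diff --cached --name-only)  # changes queued for the next commit
unstaged=$(git diff --name-only)         # changes that exist only in the working directory
echo "staged:   $staged"
echo "unstaged: $unstaged"

git commit -q -m "Update ready.txt"      # commits the staged change only

# wip.txt is still modified and uncommitted: keep working on it, or discard it.
still_modified=$(git diff --name-only)
git checkout -- wip.txt                  # discard the unfinished edit
after_discard=$(git diff --name-only)
```

Running git status at any point between these steps gives a human-readable summary of the same staged and unstaged information.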

SVN keeps track of what has been modified, but it doesn't provide as much flexibility as Git does (more details about what Git allows one to do can be found in the resources) and, again, changes that are committed go immediately into the main repository. This typically means a developer will not commit anything until they are completely done with a feature, but by that point the state of the shared code will have changed and it becomes a lot harder to integrate their changes with everyone else's.

Key difference: SVN provides a limited amount of tooling and support while working with code; Git provides additional hooks and features to help manage code while it's being actively worked on.

This leads us to the other main difference: branching. As mentioned before, Git makes branching simple and easy, and does not require a copy of the entire code base to be made in order to create a branch. SVN, by contrast, represents a branch as a copy of a directory tree in the repository, so working on a branch locally means checking out or switching to a separate copy of the code, which makes it difficult to manage one's code locally. Furthermore, merging branches was historically extremely cumbersome because every change between a source and destination branch had to be accounted for by hand; before merge tracking was added in Subversion 1.5, SVN kept no record of a common origin or parent for the two branches, which means the focus wasn't on just the work that was done in a given branch.

Historically speaking, most teams eschewed branching in SVN altogether because branches were so cumbersome to work with, and instead developed workflows to manage building new features and fixing bugs as best as possible. Git has transformed how these activities can be done and has enabled teams to craft more efficient development workflows, with less time spent resolving conflicts and dealing with regressions introduced by accidental commits. To be fair, these things absolutely still happen with Git, but they're much more manageable than with SVN.

Key difference: branching is hard and expensive in SVN and rarely taken advantage of; branching in Git is quick, easy, and very effective.
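To see how Git keeps the focus on just the work done in a branch, the following sketch builds two diverging branches and asks Git for their common ancestor. The branch name feature, the file names, and the author identity are made up for the example.

```shell
set -eu
# Placeholder identity for the throwaway commits (not real people).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" regardless of local defaults

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"
base_commit=$(git rev-parse HEAD)

# The feature branch gets two commits of its own.
git checkout -q -b feature
echo "feature 1" > feature.txt
git add feature.txt
git commit -q -m "Feature: part 1"
echo "feature 2" >> feature.txt
git commit -q -am "Feature: part 2"

# Meanwhile main moves on independently.
git checkout -q main
echo "hotfix" > fix.txt
git add fix.txt
git commit -q -m "Hotfix on main"

# Git works out the shared starting point of the two branches by itself...
ancestor=$(git merge-base main feature)
# ...so it can list exactly the commits that exist only on the feature branch.
feature_only=$(git log --oneline main..feature | wc -l | tr -d ' ')
echo "common ancestor: $ancestor"
echo "commits unique to feature: $feature_only"

# The merge combines both lines of work; --no-edit accepts the default message.
git merge -q --no-edit feature
```

Since the two branches touched different files, the merge completes without conflicts and main ends up with the hotfix and the feature work together.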

Additional resources

The Git Homepage contains a wealth of information and is a great place to start. It provides online copies of Git's documentation as well as links to numerous other resources and reference material. Some of the most useful links are below:

  • Git/SVN Comparison: A much more detailed comparison between Git and SVN.
  • Git - SVN Crash Course: A very helpful resource in learning the commands of Git and how they compare to SVN's commands.
  • Pro Git 2nd Edition: This entire book is free online (print copies are available as well).
  • Git Cheatsheet: This helps visualize what is happening as a developer works with Git and makes changes to their source code.
  • git-svn: A tool for working with SVN repositories in Git; this can help with migrating repositories over.
  • GitHub Subversion to Git Migration: More documentation from GitHub around transitioning between SVN and Git.

Git Flow

The folks at 18F who have worked on the beta FEC project have used an additional tool called Git Flow to help manage their work with Git. Git Flow is nothing more than a collection of wrappers around various Git commands that helps developers on a team manage their work in a consistent manner. These command wrappers help guide folks in following a specific style of work known as a branching model in Git parlance.
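For a feel of the branching model these wrappers encourage, here is a plain-Git approximation of starting and finishing a feature. It does not use the Git Flow tool itself; it assumes the commonly used defaults of a develop branch and a feature/ prefix, and the feature name login, the file names, and the author identity are hypothetical. The project's actual configuration may differ.

```shell
set -eu
# Placeholder identity for the throwaway commits (not real people).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" regardless of local defaults

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Assumed convention: day-to-day integration happens on a "develop" branch.
git checkout -q -b develop

# Roughly what starting a feature amounts to: branch off develop.
git checkout -q -b feature/login develop
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"

# Roughly what finishing a feature amounts to: merge back, then clean up.
git checkout -q develop
git merge -q --no-ff --no-edit feature/login   # --no-ff keeps a merge commit marking the feature
git branch -q -d feature/login

remaining=$(git branch --list 'feature/*')     # empty once the feature branch is deleted
git log --oneline --graph develop
```

The value of the wrappers is mostly consistency: everyone on the team names, starts, and finishes branches the same way without having to remember these individual steps.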

More information on Git Flow can be found here:

Helpful tips

TBA