Data Science for Social Impact

Intro to Git and Bitbucket

Mirra Rasmussen, Erika I. Barcelos

What is Git?

  • Git is:
    • Free
    • Open source
    • Defined as a Distributed Version Control System
  • Distributed Version Control:
    • All code and revision history are mirrored on every computer
    • Not centralized, everyone has their own copy
  • This enables:
    • Content tracking and storage (in a Repository)
    • Easy collaboration
    • Code can be added and edited
  • Repository services:
    • Bitbucket (Atlassian)
    • GitHub (Microsoft)

How Git Version Control Works

  • Each developer has:
    • Remote repository (on Bitbucket or GitHub)
    • Local repository (personal computer)
  • Main (Master) Branch: Shared branch where all code worked on individually gets merged
  • Individual Branch: Your edits to the project
    • Work on your local repository that you can push to your remote repository and merge with the main branch
  • Git commands are used in the terminal to update both the local and remote repos with your and other developers’ work

Getting Started with Bitbucket (One Time Setup)

Before you start/join a project:

  1. Create a Bitbucket account

  2. Install Git (if coding locally on your desktop)

  3. In the terminal, link your Bitbucket account with Git:

  • git config --global user.name "username"

  • git config --global user.email "bitbucket_email@email.com"
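To confirm the setup took effect, the values can be read back with the same command. The following is a minimal sketch: the name and email are the placeholders from the step above, and the first two lines point HOME at a throwaway directory (an assumption made only so the example does not touch your real settings; skip them on your own machine).

```shell
# Sandbox only: use a throwaway HOME so the real ~/.gitconfig is untouched.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Placeholder identity; substitute your own Bitbucket username and email.
git config --global user.name "username"
git config --global user.email "bitbucket_email@email.com"

# Read the values back to confirm they were saved.
git config --global user.name
git config --global user.email
```

If either of the last two commands prints nothing, the corresponding value was not saved.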

For each project you are working on:

  1. In Bitbucket: Make your branch for the repo

  2. Also in Bitbucket: Click clone and copy the git clone command for the repo

  3. Paste this command in your terminal
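The per-project steps can be rehearsed without a Bitbucket account. In this sketch a local bare repository (remote.git) stands in for Bitbucket, and the folder names, file contents, and identity are hypothetical. On a real project you would paste the clone command copied from Bitbucket in place of the sandbox setup.

```shell
set -e
# Sandbox identity so commits work on a machine with no Git setup yet.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for Bitbucket: a bare repo that already has main and user branches.
cd "$(mktemp -d)"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main   # name the first branch main
echo "project notes" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch user                         # the branch you would make in Bitbucket
cd ..
git clone -q --bare seed remote.git

# The part you would run yourself: clone, then switch to your branch.
git clone -q remote.git project
cd project
git checkout -q user
git branch --all                        # list local and remote-tracking branches
```

After the checkout, the local user branch tracks the user branch in the remote repo.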

Merging Main Branch with User Branch

Before we get coding, we need to make sure that our user branch is up to date with any changes in the main branch.

  1. Update your local main branch:
  • git checkout main moves to the main branch

  • git pull pulls changes in the remote repo to your local repo

  2. Merge the main branch into your user branch:
  • git checkout user moves to the user branch

  • git merge main merges main branch into the branch we are currently in (user)
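The two steps above can be tried in a sandbox. This sketch assumes a local bare repository standing in for Bitbucket and a second clone playing the role of a teammate; all folder and file names are hypothetical. It uses a plain git pull on main, which relies on the local main branch tracking the remote main branch (the default after a clone).

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Sandbox remote with main and user branches, plus your clone of it.
cd "$(mktemp -d)"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch user
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git project

# Simulate a teammate adding a commit to the remote main branch.
git clone -q remote.git teammate
cd teammate
echo "teammate change" > teammate.txt
git add teammate.txt
git commit -q -m "Add teammate file"
git push -q origin main
cd ../project

# 1. Update your local main branch.
git checkout -q main
git pull -q
# 2. Merge the main branch into your user branch.
git checkout -q user
git merge -q main
```

At the end, the user branch contains the teammate's file even though it was only ever pushed to main.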

Where Files Exist

  • Working Directory: Where you add/edit content

  • Staging Area: Temporary area that holds the changes you have selected (with git add) for the next commit

  • Local Repo: Locally available version of the remote repo

  • Remote Repo: Online repo where uploaded content is stored

Adding Edits to the Remote Repo:

  • git add adds copies of files you edit in the working directory to the staging area

    • Use git add --all to add all files you have edited
  • git commit saves these files to your local repo

    • You should add a descriptive commit message so other developers can see a summary of your changes
  • git push uploads the commits on your local branch to your branch in the remote repo
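As a sketch of how one edit travels through the four locations, the example below uses a local bare repository as a stand-in for the remote repo. The file name, commit message, and identity are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Sandbox remote with main and user branches, plus your clone of it.
cd "$(mktemp -d)"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch user
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git project
cd project
git checkout -q user

echo "analysis draft" > analysis.txt     # edit in the working directory
git add --all                            # working directory -> staging area
git commit -q -m "Add analysis draft"    # staging area -> local repo
git push -q origin user                  # local repo -> remote repo
```

Here origin is the name Git gives by default to the repo you cloned from.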

Merging User Branch with Main Branch

Now that you’ve made your edits and updated your branch in the remote repo, we can update the main branch from our user branch.

Merge your user branch into the main branch:

  • git checkout main moves to the main branch

  • git merge user merges the user branch into the branch we are currently in (main)
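A minimal local sketch of this merge, with no remote involved. The repo, file names, and identity are hypothetical, and the work on the user branch is assumed to be committed already.

```shell
set -e
cd "$(mktemp -d)"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # sandbox identity for this repo only
git config user.email "user@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Work committed on the user branch.
git checkout -q -b user
echo "new analysis" > analysis.txt
git add analysis.txt
git commit -q -m "Add analysis"

# Merge the user branch into main.
git checkout -q main
git merge -q user
git log --oneline                         # main now lists both commits
```

On a shared project, a git push from main would then be needed to update the main branch in the remote repo.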

Be careful: merge conflicts can occur when the same lines of a file have been changed by two developers.

In Bitbucket, you can use pull requests to merge your branch into main; a pull request shows any merge conflicts so they can be resolved before the merge.
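To see what a merge conflict looks like on the command line, the sketch below makes two branches change the same line of a hypothetical settings.txt file and then attempts the merge. It backs out with git merge --abort rather than resolving the conflict, which is one option among several.

```shell
set -e
cd "$(mktemp -d)"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # sandbox identity for this repo only
git config user.email "user@example.com"
echo "threshold = 1" > settings.txt
git add settings.txt
git commit -q -m "Initial commit"

# Two branches change the same line in different ways.
git checkout -q -b user
echo "threshold = 2" > settings.txt
git commit -q -a -m "Set threshold to 2"
git checkout -q main
echo "threshold = 3" > settings.txt
git commit -q -a -m "Set threshold to 3"

# The merge stops because Git cannot choose between the two edits.
if git merge user >/dev/null 2>&1; then
  result="merged cleanly"
else
  result="conflict"
  git status --short      # settings.txt is listed as unmerged (UU)
  git merge --abort       # return main to its pre-merge state
fi
echo "$result"
```

After the abort, settings.txt on main is back to the version committed on main.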

Remote Branch Changes to Local Branch

Getting edits from your remote branch to your local repo:

  • git checkout user will make sure that you are in the correct branch

  • git pull downloads these edits and merges them into your local branch
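One situation where this matters is when you have pushed to your branch from a different computer. The sketch below simulates that with a second clone (other) of a local bare repository; the folder and file names are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Sandbox remote with main and user branches, plus your clone of it.
cd "$(mktemp -d)"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch user
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git project
git -C project checkout -q user

# Simulate edits pushed to the remote user branch from another computer.
git clone -q remote.git other
cd other
git checkout -q user
echo "remote edit" > remote.txt
git add remote.txt
git commit -q -m "Add remote edit"
git push -q origin user
cd ../project

# Bring those edits into your local repo.
git checkout -q user
git pull -q
```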

Updating Your Branches Before You Start Working

Before you start working, you want to make sure everything is up to date:

  • git checkout user moves to your branch
  • git pull gets changes in your remote branch
  • git checkout main moves to main
  • git pull gets changes in remote main
  • git checkout user moves back to your branch
  • git merge main merges main into your branch
  • git push updates your remote branch
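Because this sequence is the same every time, it can be wrapped in a small shell function. The sketch below uses a hypothetical helper named sync_branch and assumes the shared branch is called main and that your branch tracks a remote branch of the same name. The sandbox around it (a local bare repository and a teammate clone) exists only to make the example runnable.

```shell
set -e
# Hypothetical helper: run the seven update steps for the branch named in $1.
sync_branch() {
  git checkout -q "$1"   # move to your branch
  git pull -q            # get changes in your remote branch
  git checkout -q main   # move to main
  git pull -q            # get changes in remote main
  git checkout -q "$1"   # move back to your branch
  git merge -q main      # merge main into your branch
  git push -q            # update your remote branch
}

export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Sandbox remote with main and user branches, plus your clone of it.
cd "$(mktemp -d)"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch user
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git project

# A teammate pushes a new commit to the remote main branch.
git clone -q remote.git teammate
cd teammate
echo "teammate change" > teammate.txt
git add teammate.txt
git commit -q -m "Add teammate file"
git push -q origin main
cd ../project

sync_branch user
```

With set -e in effect the script stops at the first failing step, for example a merge conflict, so later steps never run on a half-updated branch.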

Avoiding Merge Conflicts

  • Pull Requests:
    • Part of the Bitbucket interface that you can use to notify your team before you merge your branch into main
    • Easier way to compare changes between your branch and main
  • Status:
    • git status can be used to figure out whether your branch is up to date and which files are staged or untracked before you commit
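A short sketch of reading git status output, using a throwaway repo with two hypothetical files: one staged and one not yet added. The --short flag prints a compact, one-line-per-file form of the same information.

```shell
set -e
cd "$(mktemp -d)"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main

echo "ready to commit" > staged.txt
echo "not added yet" > untracked.txt
git add staged.txt        # only this file goes to the staging area

git status --short        # compact listing, one line per file
```

In the short listing, staged.txt is marked with A (added to the staging area) and untracked.txt with ?? (untracked). Running git status without --short gives the longer description, including whether the branch is up to date with its remote.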

Glossary

  • Git: free and open source distributed version control system that tracks development in files (including code) and is often used for collaboration between developers

Git Commands:

  • git add: moves changes from working directory to staging area

  • git checkout: navigates between branches

  • git clone: copies an existing Git repo

  • git commit: takes content in the staging area and commits it to the local repo

  • git config: sets configuration options for Git

  • git merge: combines changes from different branches

  • git pull: downloads branch from remote repo and merges it into the current branch

  • git push: copies content in a local repo branch to a remote repo branch

  • git status: prints information about the state of the working directory and staging area

A more complete glossary of Git commands can be found here

  • Branch: line of project development used to maintain and manage individual edits for different developers and/or sections of the project
  • Merge Conflict: occurs when two versions of a file contain different edits to the same lines, so Git is unable to determine which version to keep
  • Pull Request: method of merging branches that notifies other team members and creates an interface for seeing changes and potentially resolving merge conflicts
  • Repository: a collection of files and versions of a project
  • Local: on your computer
  • Remote: stored on a server, accessible online
  • Staging Area: location where files are added before they are committed to the local repo
  • Working Directory: a single version of the project where you create and edit project files

References and Resources

Atlassian Git Tutorial Page

Allison Horst (@allison_horst)