Table of parts: https://docs.google.com/document/d/1IIOkivkS0lV9QIL16b_OjT480BxZHQfKswvvS-CVZlw/view

Git

Book

Interactive tutorial:

Hashes

A hash is a one-way function that produces a short, fixed-size, seemingly random digest from arbitrarily large input data.

$ sha1sum /tmp/ca.crt

c531640bd2721ec40a9fd6c7a35d08b79d499510 /tmp/ca.crt

Main properties:

Hash can be used as a handle / primary key for some data. This is used in Git a lot.
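A minimal sketch of the "hash as primary key" idea, assuming GNU coreutils (sha1sum, mktemp). The helper names put and get and the store directory are made up for illustration; this is not Git's actual on-disk layout, only the same principle.

```shell
#!/bin/sh
# Tiny content-addressed store: the SHA-1 of the data is its file name.
store=$(mktemp -d)

put() {  # read data from stdin, save it under its hash, print the hash
    tmp=$(mktemp "$store/tmp.XXXXXX")
    cat > "$tmp"
    key=$(sha1sum < "$tmp" | cut -d' ' -f1)
    mv "$tmp" "$store/$key"
    echo "$key"
}

get() {  # print the data stored under the given hash
    cat "$store/$1"
}

key1=$(echo qwer | put)
key2=$(echo qwer | put)   # same content, so the same key; stored only once
key3=$(echo qw3r | put)   # one character differs, the key is unrelated
echo "$key1"
echo "$key3"
get "$key1"
```

Identical data is stored once, and anyone holding the key can check that the data returned really matches it.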

Hash data structures

A hash chain:

$ echo qwer | sha1sum

994d7a8290f6b0f0c630d5f8a5455794151b9030 -

$ echo '994d7a8290f6b0f0c630d5f8a5455794151b9030 someotherinfo' | sha1sum

34bba814383d4a098e3802701bc59d56fd65be50 -

$ echo '34bba814383d4a098e3802701bc59d56fd65be50 more info' | sha1sum

2b6988a6c14278c73c8b891dd5089193d7129a9f -

The "more info" node holds a "secure pointer" to the "someotherinfo" node, which in turn holds the hash (handle) of "qwer".

Mini-exercise: try changing "qwer" to "qw3r" and observe how all hashes change.
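The mini-exercise can be scripted. This is a sketch assuming sha1sum is available; the function name chain is hypothetical. Each node is hashed together with the hash of the previous node, the same way as in the commands above.

```shell
#!/bin/sh
# Print the hash of every node in a chain built from the given strings.
chain() {
    prev=''
    for info in "$@"; do
        if [ -z "$prev" ]; then line=$info; else line="$prev $info"; fi
        prev=$(echo "$line" | sha1sum | cut -d' ' -f1)
        echo "$prev"
    done
}

orig=$(chain qwer someotherinfo 'more info')
changed=$(chain qw3r someotherinfo 'more info')
echo "$orig"
echo "---"
echo "$changed"   # every hash differs, not only the first one
```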

Simple extension: hash tree

Simple extension: DAG (directed acyclic graph).

$ echo '994d7a8290f6b0f0c630d5f8a5455794151b9030 34bba814383d4a098e3802701bc59d56fd65be50 twoparents' | sha1sum

6cb9c96ac35cc2f21d3a7b2e8abed2d550203d6c -

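Not possible: a generic graph with cycles. A node's hash depends on the hashes it contains, so a node in a cycle would have to contain a hash that depends on itself.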

Parts of Git repository

Everything outside ".git" directory

.git/objects/67/7207042f7076036901458f962f5ded740832b2

.git/objects/pack/pack-f0dc1fda8aeb5516201d48127d1905fcfa4e0b58.pack

Git object database

Git object model (simplified):

Some related terms:

        (longer thing: http://stackoverflow.com/a/23303550/266720)

Basic git commands for observing the things above:

Usual git commands:

Exercise:

  1. Initialize a Git repository, commit two files (one in subdirectory) into it, change a file, commit the changes;
  2. Observe the loose object files in .git/objects/
  3. Pick a random object file in .git/objects/, determine its type, and show its content using git cat-file and git show
  4. Using git cat-file and git ls-tree commands, traverse entire created structure starting from HEAD
  5. Issue git gc command and observe what has changed in .git/objects/
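One possible walk through the exercise, as a sketch. The file names, the identity "me" and the temporary directory are arbitrary assumptions; only standard git commands are used.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q

# Step 1: two files (one in a subdirectory), then a change.
mkdir sub
echo one > a.txt
echo two > sub/b.txt
git add a.txt sub/b.txt
git commit -q -m "first"
echo changed >> a.txt
git commit -q -a -m "second"

# Steps 2-3: the path of a loose object is its hash split after two characters.
f=$(find .git/objects -type f -path '*/[0-9a-f][0-9a-f]/*' | head -n 1)
hash=$(echo "$f" | sed 's|.*/objects/\(..\)/|\1|')
obj_type=$(git cat-file -t "$hash")
echo "$hash is a $obj_type"
git cat-file -p "$hash"

# Step 4: start at HEAD and follow commit -> tree -> blobs.
head_type=$(git cat-file -t HEAD)
git cat-file -p HEAD
tree_type=$(git cat-file -t 'HEAD^{tree}')
git ls-tree -r HEAD

# Step 5: gc moves the loose objects into a pack file.
git gc -q
packs=$(ls .git/objects/pack/*.pack | wc -l)
```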

Note: there is one more object type: the annotated tag. Trees can also contain a "commit" entry instead of a "blob" or "tree", but we will talk about that later.

Refs and HEAD

A ref is a movable pointer to some hash. It is stored as a little file (under .git/refs/) or as a line in a file (.git/packed-refs).

Usually refs point to commits, but that's not a rule.

HEAD is either:

Exercise:

Find the file representing your ref "master" and view it.
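A sketch of this exercise in a throwaway repository (identity and file names are arbitrary). It also shows the ref moving from its own file into .git/packed-refs.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the branch name predictable
echo hi > f.txt
git add f.txt
git commit -q -m "first"

cat .git/HEAD                             # ref: refs/heads/master
file_hash=$(cat .git/refs/heads/master)   # the ref is just a file with a hash
cmd_hash=$(git rev-parse refs/heads/master)

git pack-refs --all                       # now the ref is a line in packed-refs
grep refs/heads/master .git/packed-refs
```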

Typical refs:

refs/heads/master - usual master branch

refs/heads/myotherbranch - alternative branches

refs/tags/v1.2.3 - a tag (may be lightweight, annotated or signed)

refs/remotes/origin/master - mirroring information from other repository

git show HEAD~2^2:src/foo.c

Creating commits

Just for a demo: low-level commands

$ mkdir -p .git/{objects,refs/heads}

$ echo ref: refs/heads/master > .git/HEAD

$ echo "Hello, world" | git hash-object -t blob -w --stdin

a5c19667710254f835085b99726e523457150e03

$ ( printf '100644 hello.txt\0'; echo 'a5c19667710254f835085b99726e523457150e03' | xxd -r -p ) | git hash-object -t tree -w --stdin

67ac38590b37477deff534cc43c90d2e97a5d95a

$ printf 'tree 67ac38590b37477deff534cc43c90d2e97a5d95a\nauthor me <myemail> 123456 +0000\ncommitter me again <myemail> 123456 +0000\n\ncommit message\n' | git hash-object -t commit --stdin -w

5eb0dcc133cad6b6a4f5f4ee3c76e4bf706e127b

$ echo 5eb0dcc133cad6b6a4f5f4ee3c76e4bf706e127b > .git/refs/heads/master

$ git reset --hard

HEAD is now at 5eb0dcc commit message

$ cat hello.txt

Hello, world

Normal sequence

$ git init

$ echo "Hello, world" > hello.txt

$ git add hello.txt

$ git commit -m "commit message"
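To connect the normal sequence with the low-level demo: the blob created by git add gets the same hash as one computed by hand. A sketch, assuming a repository using the default SHA-1 object format; a blob's hash is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the content.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
echo "Hello, world" > hello.txt
git add hello.txt
git commit -q -m "commit message"

blob=$(git rev-parse HEAD:hello.txt)
# "Hello, world" plus the trailing newline is 13 bytes.
manual=$(printf 'blob 13\0Hello, world\n' | sha1sum | cut -d' ' -f1)
echo "$blob"
echo "$manual"
```

The commit hash, unlike the blob and tree hashes, will differ from the demo because it includes the author, committer and timestamps.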

See my step-by-step infographic: https://vi-server.org/pub/gitevolve.svg

Data can be exchanged in a flexible way between all three "worlds":

Command line options madness: git checkout

Main uses:

This command should actually be 3 separate commands.
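A sketch of three different jobs done by git checkout (branch and file names are arbitrary): creating a branch, switching branches, and copying a file from a commit without moving HEAD.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo v1 > f.txt
git add f.txt
git commit -q -m one

# Use 1: create a branch and switch to it.
git checkout -q -b topic
echo v2 > f.txt
git commit -q -a -m two

# Use 2: switch branches (HEAD, index and working tree all follow).
git checkout -q master
on_master=$(cat f.txt)                     # v1

# Use 3: take a file from another commit; HEAD stays where it is.
git checkout topic -- f.txt
from_topic=$(cat f.txt)                    # v2
branch=$(git symbolic-ref --short HEAD)    # master

# Same form with HEAD: throw away changes to the file.
git checkout HEAD -- f.txt
restored=$(cat f.txt)                      # v1
```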

git reset

git reset moves the ref (branch) that HEAD points to. HEAD itself changes only when it is detached.

git reflog
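A sketch of reset and reflog working together (throwaway repository, arbitrary names): reset moves the branch, and the reflog remembers where it was.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo 1 > f.txt
git add f.txt
git commit -q -m one
echo 2 > f.txt
git commit -q -a -m two

before=$(git rev-parse master)
git reset -q --hard HEAD~1            # the branch ref moves back one commit
after=$(git rev-parse master)
head_ref=$(git symbolic-ref HEAD)     # HEAD still says refs/heads/master

git reflog -3                         # previous positions of HEAD
git reset -q --hard 'HEAD@{1}'        # go back to where HEAD was before
restored=$(git rev-parse master)
```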

Merging

Inside files, it happens linewise

There may be conflicts

There is also rebase

First parent is main parent.
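A sketch showing the parents of a merge commit (names are arbitrary). The first parent is the branch we were on, the second is the branch that was merged in.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > f.txt; git add f.txt; git commit -q -m base
git checkout -q -b topic
echo t > t.txt; git add t.txt; git commit -q -m "topic work"
git checkout -q master
echo m > m.txt; git add m.txt; git commit -q -m "master work"

ours=$(git rev-parse HEAD)
theirs=$(git rev-parse topic)
git merge -q --no-edit topic

git cat-file -p HEAD                # one tree line, two parent lines
first=$(git rev-parse 'HEAD^1')     # where master was
second=$(git rev-parse 'HEAD^2')    # tip of topic
```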

Exercise 1: Prepare simple repository. Create two branches with commits, merge them. Inspect the result at low level.

Exercise 2: Try merging with a conflict.

Exercise 3: Try merging specifying strategy, conflict resolution strategy.

Exercise 4: Try subtree merging.

git config --global merge.conflictstyle diff3
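What the diff3 conflict style changes, as a sketch: besides "ours" and "theirs", the conflict block also shows the common ancestor version after a "|||||||" line. Here the setting is passed with -c for one command instead of --global.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > f.txt; git add f.txt; git commit -q -m base
git checkout -q -b topic
echo theirs > f.txt; git commit -q -a -m theirs
git checkout -q master
echo ours > f.txt; git commit -q -a -m ours

status=0
git -c merge.conflictstyle=diff3 merge -q topic >/dev/null 2>&1 || status=$?
cat f.txt   # conflict markers, including the base version
```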

Merge strategy

"Try harder" tool for merging: wiggle

wiggle -r myfile.c && rm myfile.c.porig

Useful link for advanced things: http://blog.ezyang.com/2010/01/advanced-git-merge/

Rebasing

Manual rebasing

Initial disposition:

Step 0: determine merge base

git merge-base HEAD otherbranch

Step 1: Turn patches into files

git format-patch <mergebase>..HEAD --stdout > mywork.diff

Step 2: Reset to new position

git reset --hard otherbranch

Step 3: Apply patch series

git am -3 mywork.diff
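Steps 0 to 3 put together in a throwaway repository, as a sketch. Branch and file names are arbitrary; the patch file is kept outside the working tree only to keep the example tidy.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m base
git checkout -q -b otherbranch
echo other > other.txt; git add other.txt; git commit -q -m other
git checkout -q master
echo w1 > w1.txt; git add w1.txt; git commit -q -m "work 1"
echo w2 > w2.txt; git add w2.txt; git commit -q -m "work 2"

base=$(git merge-base HEAD otherbranch)                 # step 0
patch=$(mktemp)
git format-patch "$base"..HEAD --stdout > "$patch"      # step 1
git reset -q --hard otherbranch                         # step 2
git am -q -3 "$patch"                                   # step 3

git log --oneline
new_base=$(git merge-base HEAD otherbranch)
ahead=$(git rev-list --count otherbranch..HEAD)
```

After the run, our two commits sit on top of otherbranch with new hashes.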

Exercise 1: Try manual rebasing, inspect the patch file

Exercise 2: Revert the result of manual rebasing using reflog, try automatic rebasing.

EDITOR

When Git needs the user to input something, it opens a text editor. By default this is often Vim or Nano, which annoys users who are not familiar with them. Here is how to override the editor:

export EDITOR="gedit -w"

git commit

git config --global core.editor "subl -w"

/usr/bin/editor -> /etc/alternatives/editor -> /usr/bin/vim.gnome

update-alternatives --config editor

Git waits for the editor to save the file and exit.
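To check which editor Git would actually start, git var GIT_EDITOR can be used. The sketch below runs it in a clean environment (temporary HOME, no system config) so that only the settings shown are in effect; the editor commands are the ones from the examples above and are not executed.

```shell
#!/bin/sh
set -e
# Hide any existing editor settings from Git.
unset GIT_EDITOR VISUAL XDG_CONFIG_HOME
HOME=$(mktemp -d); export HOME
GIT_CONFIG_NOSYSTEM=1; export GIT_CONFIG_NOSYSTEM
cd "$HOME"

# EDITOR alone is used as a fallback.
from_env=$(EDITOR="gedit -w" git var GIT_EDITOR)
# core.editor takes precedence over EDITOR.
from_config=$(EDITOR="gedit -w" git -c core.editor="subl -w" var GIT_EDITOR)
# GIT_EDITOR takes precedence over both.
from_git_editor=$(GIT_EDITOR=true EDITOR="gedit -w" \
    git -c core.editor="subl -w" var GIT_EDITOR)

echo "$from_env"
echo "$from_config"
echo "$from_git_editor"
```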

Interactive rebase

git rebase -i HEAD~10

git rebase --interactive --autosquash

pick b7a2642 Earliest commit

pick 43f2347 First line of second commit message

pick bcab2b1 editing this part of the line changes nothing

pick fe087b3 Latest commit

git rebase --continue

Automates the following actions:

Note that editing the first line of commit message appearing as a comment does not work like "reword".

Deleting all uncommented lines means aborting rebase.

If some change produces an empty patch (for example, squashing together "-A +B" and "-B +A" patches), Git may stop with an error.
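Interactive rebase can also be driven without a human by setting GIT_SEQUENCE_EDITOR, the program Git runs on the todo list. A sketch, assuming GNU sed; file names are arbitrary. First a fixup commit is folded in with --autosquash (the todo list is accepted unchanged), then a commit is deleted by removing its line.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
for n in 1 2 3; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done

# A fix for "commit 2", recorded as a fixup commit.
echo fix >> file2.txt
git commit -q -a --fixup=HEAD~1
# "true" leaves the generated todo list as is; autosquash has already
# placed the fixup line right after "commit 2".
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3
after_squash=$(git rev-list --count HEAD)    # 3

# Delete the first todo line ("pick ... commit 2") to drop that commit.
GIT_SEQUENCE_EDITOR="sed -i -e 1d" git rebase -q -i HEAD~2
after_drop=$(git rev-list --count HEAD)      # 2
git log --oneline
```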

Exercise 1: Create repository with multiple non-conflicting commits. Try interactive rebasing: delete a commit, rework a commit, reorder and join two commits.

Exercise 2: Try splitting a commit using interactive rebase.

Ignoring changes

Exercise 1: Set up repository with five files:

Exercise 2 (optional): Set up clean filter that makes Git ignore all lines containing "NOCOMMIT" substring
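One possible solution for Exercise 2, as a sketch. The filter name nocommit is arbitrary. sed is used rather than grep -v because grep exits with an error when it prints no lines, which Git would treat as a failed filter.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q

# The clean filter runs whenever file content goes into the index.
git config filter.nocommit.clean 'sed /NOCOMMIT/d'
echo '*.txt filter=nocommit' > .gitattributes

printf 'keep\ndebug line NOCOMMIT\n' > f.txt
git add .gitattributes f.txt
git commit -q -m "filtered"

staged=$(git show HEAD:f.txt)   # what Git stored
cat f.txt                       # the working file still has both lines
```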

Remotes

A remote is:

git ls-remote $URI

git ls-remote origin

git remote add $name $uri

git remote -v

git clone

Clean state:

---

We have unpushed changes:

---

Somebody else pushed first:

---

We download new commits: "git fetch":

---

What to do?

Option 1: merge:

git merge origin/master

git pull

then push:

git push

git push origin HEAD:refs/heads/master

Option 2: rebase:

git rebase origin/master

git pull --rebase

then push:

Option 3: --force:

git push --force-with-lease

git push --force-with-lease refs/heads/master:refs/heads/master

git push +master:master
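The whole scenario can be replayed locally with a bare repository playing the server. A sketch: the names seed, server.git, alice and bob are made up, and Option 2 (rebase) is the one shown.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
work=$(mktemp -d) && cd "$work"

# Clean state: a server and two clones.
git init -q seed
(cd seed && git symbolic-ref HEAD refs/heads/master &&
    echo base > base.txt && git add base.txt && git commit -q -m base)
git clone -q --bare seed server.git
git clone -q server.git alice
git clone -q server.git bob

# Somebody else pushes first.
(cd bob && echo b > b.txt && git add b.txt &&
    git commit -q -m "bob's work" && git push -q origin master)

# We have unpushed changes; a plain push is refused.
cd alice
echo a > a.txt; git add a.txt; git commit -q -m "alice's work"
if git push -q origin master 2>/dev/null; then rejected=no; else rejected=yes; fi

# Download the new commits, rebase on top of them, then push.
git fetch -q origin
git rebase -q origin/master
git push -q origin master
server_tip=$(git ls-remote origin refs/heads/master | cut -f1)
```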

Exercise 1: Try remotes-related commands in some repository. Inspect mirroring refs.

Exercise 2: Add new remote to test repository, synchronize some change with peer. Use SSH or "git daemon".

Exercise 3: Try editing history and pushing (pulling) changes. I'll explain personally what happens.

git daemon --export-all

"git push" command

git push remote [+]commitish:ref

Some notable options:

There may be a lot of abbreviations (including just "git push" without any arguments). Use "-n" option if not sure.

git config receive.denyDeletes true

git config receive.denyNonFastForwards true
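A sketch of the explicit refspec form, the -n (dry run) option and receive.denyDeletes, using a local bare repository as the remote. The names server.git, client and the branch review are arbitrary.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare server.git
git init -q client && cd client
echo x > x.txt; git add x.txt; git commit -q -m x
git remote add origin "$work/server.git"

# Dry run: shows what would happen, sends nothing.
git push -q -n origin HEAD:refs/heads/review
after_dry=$(git ls-remote origin refs/heads/review)

# Real push: creates refs/heads/review on the remote.
git push -q origin HEAD:refs/heads/review
after_push=$(git ls-remote origin refs/heads/review | cut -f1)

# With receive.denyDeletes set on the server, deleting the ref is refused.
git --git-dir="$work/server.git" config receive.denyDeletes true
if git push -q origin :refs/heads/review 2>/dev/null; then
    deleted=yes
else
    deleted=no
fi
```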

Github

Fork and pull request.

Gitlab.

Exercise 1: Create a Github account and a test repository (if you do not already have them).

Exercise 2: Give push access to a peer.

Exercise 3: Use push access from Ex 2.

Exercise 4: Create a pull request. Accept one. Check what has happened from Git's point of view.

Cherry-pick and revert

Cherry-pick is like a little rebase of just one commit from there to here. It is applied to an unmerged commit.

Revert is a "negating" commit, like a cherry-pick in reverse. It is applied to an ancestor (already merged) commit.
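Both commands in a throwaway repository, as a sketch (names are arbitrary). The cherry-picked commit carries the same change as the original but gets a new hash, because its parent differs.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m base
git checkout -q -b topic
echo t > t.txt; git add t.txt; git commit -q -m "add t.txt"
git checkout -q master
echo m > m.txt; git add m.txt; git commit -q -m "add m.txt"

# Copy the unmerged commit from topic onto master.
git cherry-pick topic >/dev/null
picked=$(git rev-parse HEAD)
original=$(git rev-parse topic)
if [ -e t.txt ]; then has_t=yes; else has_t=no; fi

# Negate the commit we have just applied.
git revert --no-edit HEAD >/dev/null
git log --oneline
```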

Exercise 1: Try cherry-picking a commit

Exercise 2: Try reverting a commit

Submodules and subtrees

Submodule - a commit ID stored in a tree, with accompanying information in .gitmodules that tells Git where to fetch the code from.

Subtree - foreign source code from another repository, embedded directly in the project.

Submodule structure

An uninitialized submodule consists of:

An initialized submodule consists of:

git submodule add

git submodule update --init --recursive

git add

A stray subproject (nested Git repository) may add a commit pointer to the tree without the accompanying .gitmodules content, leading to a broken submodule.
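A sketch that creates a submodule and looks at both of its parts: the special tree entry and .gitmodules. The names lib and app are arbitrary. The protocol.file.allow setting is an assumption about the environment: recent Git versions refuse local-path submodule clones without it.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
work=$(mktemp -d) && cd "$work"

git init -q lib
(cd lib && echo code > lib.c && git add lib.c && git commit -q -m "lib")
lib_head=$(git --git-dir="$work/lib/.git" rev-parse HEAD)

git init -q app && cd app
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "add submodule"

cat .gitmodules                 # path and url of the submodule
entry=$(git ls-tree HEAD lib)   # mode 160000, type "commit"
echo "$entry"
pinned=$(git rev-parse HEAD:lib)
```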

References and alternates

--reference

--dissociate

file:///path/ vs /path links for git clone
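A sketch of both clone options (directory names are arbitrary). With --reference the new repository borrows objects through .git/objects/info/alternates; adding --dissociate copies the borrowed objects and removes that link. A file:// URL is used so the clone goes through the regular transport instead of the local-path shortcut.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=me GIT_AUTHOR_EMAIL=me@example.com
export GIT_COMMITTER_NAME=me GIT_COMMITTER_EMAIL=me@example.com
work=$(mktemp -d) && cd "$work"
git init -q seed
(cd seed && echo x > x.txt && git add x.txt && git commit -q -m x)

# Borrow objects from the seed repository.
git clone -q --reference "$work/seed" "file://$work/seed" borrowed
cat borrowed/.git/objects/info/alternates    # path to seed's object directory

# Same, but end up with a self-contained repository.
git clone -q --reference "$work/seed" --dissociate "file://$work/seed" standalone
```

The borrowed clone breaks if the referenced repository is deleted or loses objects; the dissociated one does not depend on it.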

rebase/filter-branch

Tags, etc.