Git and GitHub
GIT
Git is a distributed version control system that lets people collaborate on the same project
Git can track changes to files in any folder (repository)
Git preserves the history of all changes to the files
Git can be used locally without an internet connection: all changes are stored locally, so you have
full access to the files' change history
GITHUB
SHELL
git init – initialises a new Git repository (with master as the default branch)
The Git repository is initially empty, even if the directory where you initialise Git is not empty and
already contains files and other directories
Never change the .git folder manually; let Git manage it
Blob – stores the contents of a file of any kind (.txt, .mp3, .mp4, etc.)
o Blobs have no file names
o The size and type are stored in each blob
Tree – stores information about directories / folders
Commit – stores a version of the project
Annotated tag – a persistent text pointer to a specific commit
git hash-object – creates a new object and, with -w, writes it to the Git repository
o The hash gives the object's location: folder name (first 2 characters) + file name (remaining 38
characters)
o For example: echo "HELLO" | git hash-object --stdin -w
git cat-file – reads information about a Git object
git mktree – creates a tree object
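The "folder name + file name" split can be sketched in Python. This is a minimal illustration that assumes the usual loose-object layout under .git/objects; the helper name object_path is hypothetical and the hash is just an example value.

```python
from pathlib import PurePosixPath


def object_path(sha1_hex: str) -> PurePosixPath:
    """Return where a loose object with this hash would live inside .git."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a full 40-character SHA1 hash")
    folder = sha1_hex[:2]     # first 2 hex characters -> folder name
    filename = sha1_hex[2:]   # remaining 38 hex characters -> file name
    return PurePosixPath(".git") / "objects" / folder / filename


# Example hash only; any 40-character hash is split the same way.
print(object_path("ce013625030ba8dba906f756967f9e9ca394464a"))
```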
JSON VS GIT
HASH
A hash function takes input of any length and creates a fixed-length hash
The hash is generated from the contents of the file
Hash functions are one-way: the hash is computed from the input, but the input cannot be recovered
from the hash
The same input always generates the same hash
Hash types
o MD5 (128 bits)
o SHA1 (160 bits)
o SHA256 (256 bits)
o SHA384 (384 bits)
o SHA512 (512 bits)
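The listed output sizes can be checked with Python's standard hashlib module. This is a small sketch, not part of Git itself; the helper name digest_bits is hypothetical. It also shows that the output length stays fixed whatever the input length.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the length in bits of the hash produced by the named algorithm."""
    return len(hashlib.new(algorithm, data).digest()) * 8


for algorithm in ("md5", "sha1", "sha256", "sha384", "sha512"):
    short_input = digest_bits(algorithm, b"a")
    long_input = digest_bits(algorithm, b"a" * 100_000)
    # Both inputs give the same output length for a given algorithm.
    print(algorithm, short_input, long_input)
```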
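Git uses the SHA1 hash function (hashes are displayed in hexadecimal format)
160 bits means there are 2^160 possible hash values, so in principle Git could tell apart 2^160
different objects in the same repository
o Therefore the probability that two given different objects share the same hash is about 1 / (2^160)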
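A rough sketch of the collision arithmetic, assuming SHA1 outputs behave like uniformly random 160-bit values (an idealisation). The function names are hypothetical. For many objects the code uses the simple "number of pairs / number of hash values" estimate, which is an upper bound and a good approximation while the result is small.

```python
from fractions import Fraction


def pair_collision_probability(bits: int = 160) -> Fraction:
    """Chance that two given different inputs get the same hash (idealised)."""
    return Fraction(1, 2 ** bits)


def approx_collision_probability(n_objects: int, bits: int = 160) -> Fraction:
    """Upper-bound estimate that any two of n_objects collide."""
    pairs = n_objects * (n_objects - 1) // 2
    return min(Fraction(1), Fraction(pairs, 2 ** bits))


print(float(pair_collision_probability()))
# Estimate for a (hypothetical) repository holding a billion objects.
print(float(approx_collision_probability(10 ** 9)))
```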
A SHA1 hash is written as 40 hexadecimal characters (base 16: 0-9 and a-f), since each character
encodes 4 bits
A slight change to the input data leads to a completely different hash
Commands such as shasum can be used in Git Bash to generate a hash
Each object Git saves gets a unique ID, which is its SHA1 hash
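These hash properties can be tried out with Python's hashlib. Note this is the plain SHA1 of the bytes (what a tool like shasum computes), not a Git object ID; the helper name sha1_hex is hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 hash of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


first = sha1_hex(b"hello, Git")
again = sha1_hex(b"hello, Git")
changed = sha1_hex(b"hello, git")  # only one letter differs

print(first == again)    # True: same input, same hash
print(first == changed)  # False: small change, different hash
print(len(first))        # 40 hexadecimal characters

# Count how many of the 40 positions differ between the two hashes.
print(sum(a != b for a, b in zip(first, changed)))
```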
git cat-file -p <hash> – contents of the object (the first 4 characters of the hash are enough, as long
as they are unambiguous)
git cat-file -s <hash> – size of the object (in bytes)
git cat-file -t <hash> – type of the object
git cat-file has no option for retrieving the file name from a blob because blobs don't store file
names
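What the three cat-file options report can be sketched by reading a loose object directly. This is a simplified illustration, assuming the loose-object format (zlib-compressed "type size", a null byte, then the content); it works in a temporary folder rather than a real repository, the function names are hypothetical, and unlike the real -p option it returns raw content without pretty-printing trees.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_object(objects_dir: Path, obj_type: str, content: bytes) -> str:
    """Store content as a loose object and return its SHA1 hash."""
    store = f"{obj_type} {len(content)}".encode() + b"\0" + content
    sha1_hex = hashlib.sha1(store).hexdigest()
    path = objects_dir / sha1_hex[:2] / sha1_hex[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return sha1_hex


def read_object(objects_dir: Path, sha1_hex: str):
    """Return (type, size, content), roughly cat-file -t, -s and -p."""
    raw = zlib.decompress((objects_dir / sha1_hex[:2] / sha1_hex[2:]).read_bytes())
    header, _, content = raw.partition(b"\0")
    obj_type, size = header.decode().split(" ")
    return obj_type, int(size), content


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    sha = write_object(objects, "blob", b"hello, Git\n")
    # The result holds type, size and content, but no file name.
    print(read_object(objects, sha))
```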
GIT HASH-OBJECT
echo "hello, Git" | git hash-object --stdin -w (--stdin reads from standard input and -w writes the
object to the repository)
o git hash-object <filename> -w (hashes and stores the contents of a file instead)
The Git repository stores files independently, in its own file system in the objects directory
Git generates the SHA1 hash from type + size + content
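The hash calculation can be reproduced outside Git. A minimal sketch, assuming the blob header format "blob", a space, the size in bytes, then a null byte before the content; the function name git_blob_hash is hypothetical.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the object ID Git would assign to a blob with this content."""
    header = f"blob {len(content)}".encode() + b"\0"  # type + size + delimiter
    return hashlib.sha1(header + content).hexdigest()


# echo appends a newline, so the content piped in by the command
# echo "hello, Git" | git hash-object --stdin is b"hello, Git\n".
print(git_blob_hash(b"hello, Git\n"))
```

Because the type and size are part of the hashed data, the result differs from the plain SHA1 of the same bytes.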
GIT OBJECT
TREES
Trees represent the directories of the repository
The structure is the same as that of a blob (object type, length, delimiter and content)
Each entry in the content includes permissions, object type (blob or tree), SHA1 hash and file name
PERMISSIONS
nano temp-tree.txt – create a text file describing the tree (input for git mktree), one entry per line
in the format:
<permission type> (space) <object type> (space) <sha1 hash of the object> (tab) <filename>
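The entry format can be sketched as a pair of helpers that build and parse one line. The function names, the file name file1.txt and the hash are illustrative assumptions; 100644 is the usual permission value for a regular file.

```python
def tree_entry_line(mode: str, obj_type: str, sha1_hex: str, name: str) -> str:
    """Build one entry: mode, space, type, space, hash, tab, file name."""
    return f"{mode} {obj_type} {sha1_hex}\t{name}"


def parse_tree_entry_line(line: str):
    """Split an entry line back into (mode, type, hash, file name)."""
    meta, name = line.split("\t", 1)   # the tab separates the file name
    mode, obj_type, sha1_hex = meta.split(" ")
    return mode, obj_type, sha1_hex, name


line = tree_entry_line(
    "100644", "blob", "ce013625030ba8dba906f756967f9e9ca394464a", "file1.txt"
)
print(repr(line))
print(parse_tree_entry_line(line))
```

Splitting on the tab first means file names that contain spaces are still parsed correctly.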
Folder names in the Git objects folder have 2 hexadecimal characters, each with 16 possible values, so
there can be at most 16 * 16 = 16^2 = 256 such folders
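The count of 256 can be confirmed by listing every possible two-character hexadecimal name. A tiny sketch; the variable names are arbitrary.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"

# Every combination of two hexadecimal characters: 00, 01, ..., fe, ff.
folder_names = ["".join(pair) for pair in product(HEX_DIGITS, repeat=2)]

print(len(folder_names), folder_names[0], folder_names[-1])  # 256 00 ff
```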
COMMITS
A commit has the same structure as a blob and a tree (object type, length, delimiter and content)
A commit holds information such as author name, email, commit description and parent
Each commit has its own SHA1 hash, computed from its content (including the author name and email)
Commits let us store different versions of the project in the Git database
You can move to any version of the project quickly and easily by checking out a specific commit
A commit is a wrapper for a tree object and contains a pointer to a specific tree
By moving between commits (checking out: taking files out of the Git repository and putting them into
your directory) you are able to “travel” between different “versions” of the project
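How a commit's hash depends on its content can be sketched as follows. This is a simplified illustration: the names, emails, timestamp and the fixed +0000 time zone are made up, the function names are hypothetical, and the tree pointer is the well-known hash of Git's empty tree used only as a placeholder.

```python
import hashlib
from typing import Optional

# Placeholder tree pointer: the hash of Git's empty tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_content(tree: str, parent: Optional[str], name: str, email: str,
                   timestamp: int, message: str) -> bytes:
    """Assemble the text stored inside a commit object."""
    lines = [f"tree {tree}"]               # pointer to a specific tree
    if parent:
        lines.append(f"parent {parent}")   # absent for the very first commit
    who = f"{name} <{email}> {timestamp} +0000"
    lines.append(f"author {who}")
    lines.append(f"committer {who}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


def git_object_hash(obj_type: str, content: bytes) -> str:
    """Hash object type + length + delimiter + content, as for blobs and trees."""
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


first = git_object_hash("commit", commit_content(
    EMPTY_TREE, None, "Alice", "alice@example.com", 1700000000, "Initial commit"))
second = git_object_hash("commit", commit_content(
    EMPTY_TREE, None, "Bob", "bob@example.com", 1700000000, "Initial commit"))
child_content = commit_content(
    EMPTY_TREE, first, "Alice", "alice@example.com", 1700000100, "Second commit")

print(first != second)   # True: a different author gives a different hash
print(child_content.decode())
```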
CREATING COMMIT
PARENT COMMIT
GIT CHECKOUT
A branch is just a text reference to a commit (each branch is a separate history of your project)
The default branch is master
Multiple branches can exist in the same repository
Pointers for all branches are located in the .git/refs/heads folder
The current branch tracks new commits: each new commit is added to the current branch only
o For example, you are on the master branch and make several commits, then check out new-branch
and make a couple of commits there. When you check out master again, you will not see the
commits from new-branch, because new commits are added only to the current branch. There are
ways to combine branches later, such as merging or rebasing them.
The branch pointer moves automatically after every new commit
Change branch: git checkout <branch>
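The idea that a branch is only a text pointer can be modelled with small files. This is a toy simulation, not real Git: it assumes HEAD holds a line like "ref: refs/heads/master", uses made-up 40-character strings in place of commit hashes, and ignores the working directory. All function names are hypothetical.

```python
import tempfile
from pathlib import Path


def current_ref(git_dir: Path) -> Path:
    """Return the pointer file of the branch that HEAD refers to."""
    head = (git_dir / "HEAD").read_text().strip()
    return git_dir / head[len("ref: "):]


def commit(git_dir: Path, new_hash: str) -> None:
    """Move only the current branch pointer to the new commit."""
    ref = current_ref(git_dir)
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(new_hash + "\n")


def checkout(git_dir: Path, branch: str, create: bool = False) -> None:
    """Switch HEAD to a branch, optionally creating it at the current commit."""
    if create:
        start = current_ref(git_dir).read_text()
        (git_dir / "refs" / "heads" / branch).write_text(start)
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{branch}\n")


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        git_dir = Path(tmp) / ".git"
        git_dir.mkdir()
        (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
        commit(git_dir, "1" * 40)                       # commit on master
        checkout(git_dir, "new-branch", create=True)
        commit(git_dir, "2" * 40)                       # commit on new-branch
        checkout(git_dir, "master")
        heads = git_dir / "refs" / "heads"
        return {p.name: p.read_text().strip() for p in heads.iterdir()}


print(demo())
```

After the run, master still points at the first placeholder commit while new-branch points at the second, matching the example in the notes.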