Git and GitHub interview questions
Security and Integrity: Git identifies every object (file contents,
directory trees, and commits) by a SHA-1 hash of its content, which
means corruption or unauthorized changes are easily detected.
Code Reviews: Git simplifies the process of reviewing code before it's
incorporated into the main codebase.
Key Concepts
Commit
Branch
The default branch of a new Git repository is conventionally named master
(or main in newer setups). You can create many other branches.
Merge
Beginner Tips
Make frequent commits.
Write clear and concise commit messages.
Push your code to a remote repository often, especially before taking a
break or finishing a task.
Unique Features
Distributed Model: Git's decentralized nature gives every clone a
complete repository history and enables full-fledged operations
without network connectivity. This stands in contrast to centralized
systems, which need a network connection for most operations and
keep all project data on a single central server.
Git
Git is a distributed version control system that manages code and
tracks its changes. It works primarily on the local machine: developers
coordinate by exchanging commits directly between repositories, and no
hosting service is required.
Key Features
GitHub
GitHub, on the other hand, is a web-based platform that provides hosting
for Git repositories. It offers features that extend beyond Git's core
functionality.
Key Features
Every Git repository has two primary parts: the working tree and the
hidden .git directory, where Git keeps all of its metadata for the repository.
Key Features
Distributed: Each contributor to the project has their own copy of the
repository, including its complete history and version-tracking
capability. This enables offline work and serves as a backup for the
project.
Efficient Storage: Git optimizes storage using techniques such as
content hashing (identical content is stored only once) and delta
compression, which keeps redundant data to a minimum.
Work Tracking: Git allows for tracking changes made to the project
over time, helping individuals or teams understand the evolution of the
project, who made specific changes, and why.
.git Directory
The .git directory is the control center of a Git repository, housing
everything Git needs to manage the repository and its history. It includes:
The object database that hosts all the data about commits, trees,
and blobs.
The references directory maintaining pointers to specific commits
(e.g., branches and tags).
Configuration files detailing the repository's settings and attributes.
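The layout described above can be inspected from a script. The following is a minimal sketch, assuming a standard (non-bare) repository where .git is a directory; in linked worktrees and submodules .git can be a plain file instead, which this sketch does not handle. The function names find_git_dir and describe_git_dir are illustrative, not part of Git.

```python
from pathlib import Path
from typing import Optional


def find_git_dir(start: Path) -> Optional[Path]:
    """Walk upward from `start` and return the first `.git` directory found."""
    current = start.resolve()
    for candidate in [current, *current.parents]:
        git_dir = candidate / ".git"
        if git_dir.is_dir():
            return git_dir
    return None


def describe_git_dir(git_dir: Path) -> dict:
    """Report whether the well-known entries exist inside a `.git` directory."""
    return {
        "objects": (git_dir / "objects").is_dir(),  # object database
        "refs": (git_dir / "refs").is_dir(),        # branch and tag pointers
        "config": (git_dir / "config").is_file(),   # repository settings
        "HEAD": (git_dir / "HEAD").is_file(),       # pointer to the current branch
    }
```

Calling describe_git_dir(find_git_dir(Path.cwd())) from anywhere inside a repository should report which of these entries are present.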
Working Tree
The working tree is a directory where tracked files are extracted
for viewing and editing. Any changes made to files in the working tree can
be staged and committed to the repository to capture the changes.
Untracked files, i.e., files that have never been staged or committed, also
live in the working tree, but Git does not manage them until they are added.
# Now, we can simulate the construction of our working tree using a dictionary
working_tree = {
    'folder1': {
        'file1.py': 'print("Hello, World!")'
    },
    'folder2': {
        'file2.py': 'print(2 + 2)'
    }
}
Practical Applications
Code Reversion: You can reset a project to an earlier commit,
effectively reversing changes.
Collaboration: Commits facilitate collaboration by enabling team
members to understand changes and when they were made.
Code Review: Before incorporating changes into the primary branch,
commits offer a structured way for team members to review each
other's code. This process ensures that new code meets the
repository's standards.
Working Directory
Staging Area
Repository
Core Concepts
Working Directory
The working directory is your playground. It's your space for experimenting,
where you can edit files, merge different versions, and even create new
ones.
Modified: Files that have been modified since your last commit
Staged: Files that have been changed and prepared for your next commit
When you're happy with your changes in the working directory, you "stage" them.
Staging in this context is like flagging the parts of your project that you
want to be saved in your next commit. For files Git already tracks, this step
can be skipped with git commit -a, which stages and commits in one go.
Repository (History)
The repository is the final tier, dealing with the preservation of your work.
Think of it as the local repository that serves as the safe house for all your
commits. It keeps track of all changes and commits you've made, ensuring
that you can still retrieve them even after major updates.
Git's state is often described as three "trees": the Working Directory, the
Staging Area, and HEAD. HEAD points to the most recent commit on your current
branch. These three trees can seem confusing at first, but they are
straightforward: they represent the current state of your files, the changes
you plan to commit, and the last snapshot you committed.
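To make the three trees concrete, here is a toy in-memory model. It is only a sketch of the idea, not how Git stores data on disk; the class name ToyRepo and its methods are invented for illustration, and file deletions are not modeled.

```python
import copy


class ToyRepo:
    """Toy model of the working directory, staging area, and HEAD."""

    def __init__(self):
        self.working_dir = {}  # path -> content, the files you edit
        self.index = {}        # staging area: snapshot proposed for the next commit
        self.commits = []      # history; the last entry plays the role of HEAD

    def write(self, path, content):
        # Edit a file in the working directory only.
        self.working_dir[path] = content

    def add(self, path):
        # Like `git add <path>`: copy the current content into the index.
        self.index[path] = self.working_dir[path]

    def commit(self, message):
        # Like `git commit -m`: record the index as a new snapshot.
        snapshot = {"message": message, "tree": copy.deepcopy(self.index)}
        self.commits.append(snapshot)
        return snapshot

    @property
    def head(self):
        return self.commits[-1] if self.commits else None

    def status(self):
        head_tree = self.head["tree"] if self.head else {}
        staged = sorted(p for p, c in self.index.items() if head_tree.get(p) != c)
        modified = sorted(p for p, c in self.working_dir.items()
                          if p in self.index and self.index[p] != c)
        untracked = sorted(p for p in self.working_dir if p not in self.index)
        return {"staged": staged, "modified": modified, "untracked": untracked}
```

After committing app.js, editing it again, and creating server.js, status() lists app.js under "modified" and server.js under "untracked"; calling add("app.js") moves it to "staged".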
Key Concepts
Master/Trunk/Branch: The primary branch serving as the project
source.
3. Risk Mitigation: Changes are kept separate until they are thoroughly
tested. If an experimental feature doesn't work as expected or
introduces bugs, it won't affect the more stable Master branch.
Key Attributes
Two-Part Process: Cloning involves duplicating the repository and
setting up the local environment.
Full Version History: The cloned repository typically includes all
commits and file versions.
Remote Linkage: A relationship is built with a designated remote for
data synchronization.
Terminology Overview
Local Repository: Storage on the user's machine where all project
and version data reside.
Remote Repository: An external, shared storage often on a server,
accessible for collaborative work.
Working Directory: The location on the user's file system where files
are manipulated and changes are tracked for commits.
Core Functionality
Duplication: The clone process reproduces the entire project, with all
branches and commits, onto the user's machine. This offers a
complete, standalone history.
Configuration: The local environment is configured to maintain
synchronization with a specific remote repository. By default, the
repository from which the clone is made (origin) is set as the primary
remote, but additional remotes can be added if required.
Integrity: Git ensures the integrity of the copied data, such as
commits and file snapshots, during the cloning process.
Selective History Constraints: While it's common for a clone to
receive the entire repository history, limited cloning is also possible,
especially with large repositories. Techniques like shallow cloning
(git clone --depth <n>) or cloning a single branch (--single-branch)
can be used.
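The options for limiting history can be sketched as a small helper that assembles the git clone arguments. The helper name build_clone_command and the example URL are hypothetical; --depth, --branch, and --single-branch are standard git clone options. The function only builds the argument list, so nothing is contacted until it is passed to something like subprocess.run.

```python
from typing import List, Optional


def build_clone_command(url: str,
                        depth: Optional[int] = None,
                        branch: Optional[str] = None,
                        single_branch: bool = False) -> List[str]:
    """Assemble the argument list for a full or limited `git clone`."""
    cmd = ["git", "clone"]
    if depth is not None:
        if depth < 1:
            raise ValueError("depth must be a positive integer")
        cmd += ["--depth", str(depth)]  # shallow clone: keep only recent history
    if branch is not None:
        cmd += ["--branch", branch]     # check out this branch or tag
    if single_branch:
        cmd.append("--single-branch")   # fetch history for one branch only
    cmd.append(url)
    return cmd


# Hypothetical URL, shown for illustration only.
shallow = build_clone_command("https://example.com/team/project.git", depth=1)
```

Running subprocess.run(shallow, check=True) would then perform the shallow clone.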
For GitHub, cloning often uses HTTPS. Upon updates to the remote,
the local repository can be synchronized using standard Git commands
(e.g., fetch, pull, push).
Technically, a clone is not tied to a specific remote URL; what matters is
that the histories match. For instance, after cloning from GitHub using HTTPS,
remotes can be reconfigured to use SSH.
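As a rough illustration of switching a remote from HTTPS to SSH, the sketch below rewrites an HTTPS URL into the scp-like SSH form. It assumes the host accepts the conventional git SSH user, as GitHub does; the function name https_to_ssh and the owner/repository in the example are hypothetical.

```python
from urllib.parse import urlparse


def https_to_ssh(url: str) -> str:
    """Convert https://host/owner/repo.git into git@host:owner/repo.git."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"not an HTTPS URL: {url!r}")
    path = parsed.path.lstrip("/")
    if not path:
        raise ValueError(f"URL has no repository path: {url!r}")
    return f"git@{parsed.hostname}:{path}"


# Hypothetical repository, shown for illustration only.
ssh_url = https_to_ssh("https://github.com/example-owner/example-repo.git")
```

The resulting value can be applied with git remote set-url origin followed by the new URL.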
Detailed Process
1. Acquiring Data: The clone command contacts the remote repository,
retrieves its history and files, and then saves this snapshot locally.
2. Repository Creation: The git clone execution sets up a new
repository, employing the data fetched from the target remote, and
thus creates both the working directory and associated .git directory,
which hosts the entire Git control structure.
Technical Commands
Clone: Initiate the cloning process.
git clone <remote-repo-url>
Repository
The repository houses the following key components:
Let's assume a file's content changes and you add the updated file to
the staging area.
The new version of the file is then stored as a Blob object, and the
staging area (index) is updated to point to it.
During the following commit, Git writes an updated Tree object from the
staging area and creates a new Commit object that references it.
Simultaneously, the Branch you're working on moves forward to
reference this new commit.
This systematic approach assures the integrity of your project's history.
Since any alteration in content, directory structure, or commit details will
trigger a change in the SHA-1 hash of the related object(s), a change can be
immediately recognized.
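The link between content and hash can be reproduced outside Git. This is a minimal sketch, assuming a repository that uses the default SHA-1 object format: a blob's ID is the SHA-1 of a short header ("blob", the content length in bytes, and a NUL byte) followed by the content itself. The function name git_blob_hash is illustrative.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_hash(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
before = git_blob_hash(b"print(2 + 2)\n")
after = git_blob_hash(b"print(2 + 3)\n")  # one character changed, new ID
```

The result should match what git hash-object prints for a file with the same bytes, and changing a single character yields a completely different ID, which is why any alteration is noticed.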
git init
Use the following command to add files or directories to the staging area:
# To add a specific file
git add filename
Committing Changes
Once files are in the staging area, use this command to commit these
changes:
git commit -m "Your commit message goes here"
If you plan to push your local repository to an online service like GitHub,
first link your local repository to a remote one:
git remote add origin your-remote-repository-url
Replace "your-remote-repository-url" with the URL provided by your online
repository.
Pushing to Remote
Finally, use this command to push your local repository to the remote one:
git push -u origin master
After the initial "upstream" link is established, subsequent push commands
can be executed with a simple git push.
status command.
git status is an essential command for tracking changes in a Git
repository. It provides important details such as file states, current working
branch, untracked files, and more.
Changes not staged for commit: Highlights modified files that were
not yet staged for the next commit.
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   app.js

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        server.js
Code Example: Visual Diff of Staged Changes using git diff --staged
Here is the code:
diff --git a/readme.txt b/readme.txt
index ce01362..0602794 100644
--- a/readme.txt
+++ b/readme.txt
@@ -1 +1 @@
-Initial file version
+Modified during stage
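The unified format shown above can be reproduced with Python's standard difflib module. This is only a sketch of the output format: difflib is not the algorithm Git uses, and it does not emit the diff --git and index header lines. The helper name unified_diff_text is illustrative.

```python
import difflib


def unified_diff_text(old: str, new: str, path: str) -> str:
    """Return a unified diff between two versions of one file's text."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(lines)


diff_text = unified_diff_text("Initial file version",
                              "Modified during stage",
                              "readme.txt")
print(diff_text)
```

Lines starting with - come from the old version and lines starting with + from the new one, exactly as in git diff output.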
Code Example: Multifile git status Output
Here is the code:
On branch master
Your branch and 'origin/master' have diverged,
and have 4 and 2 different commits each, respectively.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
        modified:   file1.cpp
        modified:   file2.cpp

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file3.cpp
        file4.cpp
Best Practices
Regularly use git status to stay updated on the repository's state.
Review and comprehend the provided information before proceeding
with any actions, such as staging or committing.
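When the state needs to be checked from a script, git status --porcelain prints one line per path with a two-character code: the first character describes the staging area, the second the working tree, and ?? marks untracked files. The parser below is a simplified sketch: the name parse_porcelain is illustrative, renames are kept as the raw "old -> new" text, and unmerged paths are simply counted in both lists.

```python
def parse_porcelain(output: str) -> dict:
    """Group paths from `git status --porcelain` (v1) output by state."""
    result = {"staged": [], "unstaged": [], "untracked": []}
    for line in output.splitlines():
        if len(line) < 4:
            continue  # not a "XY path" entry
        x, y, path = line[0], line[1], line[3:]
        if x == "?" and y == "?":
            result["untracked"].append(path)
            continue
        if x == "!" and y == "!":
            continue  # ignored file, only shown with --ignored
        if x != " ":
            result["staged"].append(path)    # change recorded in the index
        if y != " ":
            result["unstaged"].append(path)  # change only in the working tree
    return result


# Sample text written by hand in the porcelain format, not captured from a real run.
sample = "M  app.js\n M util.js\n?? server.js\nMM both.js\n"
state = parse_porcelain(sample)
```

In practice the input would come from something like subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout.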
2. File Comparison: Git analyzes the selected files and identifies the
exact line-by-line or block-level modifications.
3. Staging Area Update: The selected file changes are prepared and
categorized. Once you're ready, you can make a commit action to
move all the changes in the staging area to the repository.
Relation Manager
The staging area acts as a bridge between your working directory and the
repository (history).
Key Actions
Add All Modifications: Restage modified files to the current state:
git add .
Add Interactively: Handpick changes from modified files:
git add -p
Merge Conflicts: After addressing conflicts, ready files for commit:
git add <filename>
# Current content
print("This line is part of the initial version.")
Untracked Files: These are new files that Git does not track yet. To track
untracked files and subsequently include them in a commit, they need to be
staged using the git add command.
Tracked Files: Files that have previously been committed or added to the
staging area can be included in the next commit.
Parameters (optional):
-m
Use a one-line message to describe the commit.
git commit -m "Add new feature xyz"
-a
Stage all modified and deleted tracked files and commit them.
This option skips the separate staging step, effectively combining git add
and git commit for files Git already tracks. New, untracked files are not
included. Because everything modified is committed, take care to ensure
only intended changes end up in the commit.
git commit -am "Update feature xyz"
When using -a without -m, Git opens your configured text editor so you can
compose the commit message.
CAUTION:
Avoid tracking, staging, or committing sensitive information.
Keep .gitignore up to date, especially in collaborative projects.
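One way to act on this caution is to screen file names before staging them. The sketch below is a rough illustration, not a replacement for .gitignore: the pattern list is a hypothetical example, the function name find_sensitive is invented here, and fnmatch only approximates .gitignore matching rules (it checks the file name alone).

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns; adjust to whatever counts as sensitive in your project.
SENSITIVE_PATTERNS = [".env", "*.pem", "*.key", "id_rsa"]


def find_sensitive(paths, patterns=SENSITIVE_PATTERNS):
    """Return the paths whose file name matches any sensitive pattern."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name  # compare the file name, not the directory
        if any(fnmatch(name, pattern) for pattern in patterns):
            flagged.append(path)
    return flagged


flagged = find_sensitive(["src/app.py", "config/.env", "certs/server.pem"])
```

Any path that ends up in flagged is a candidate for a .gitignore entry rather than for git add.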