Learning Objectives:
Use an advanced understanding of Git
Skip the staging area to delete and move files within Git
Amend and roll back commits
Explain the concept of branching and merging
Create new branches
Use merging to combine branched data
Manage and handle merge conflicts
Advanced Git Interaction
Skipping the Staging Area
If we already know that the current changes are the ones that we want to commit, we can skip the staging step and go directly to the commit. We do this by using the -a flag to the git commit command.
...
This flag automatically stages every file that's tracked and modified before doing the commit, letting us skip the git add step.
At first, you might think that git commit -a is just a shortcut for git add followed by git commit, but that's not exactly true. git commit -a doesn't work on new files because those are untracked. Instead, git commit -a is a shortcut to stage any changes to tracked files and commit them in one step.
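The difference between tracked and untracked files can be sketched in a throwaway repository. This is a minimal illustration, not part of the original walkthrough: the directory, file names, and commit messages are all hypothetical.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create and commit a first tracked file.
echo "print('v1')" > tracked.py
git add tracked.py
git commit -q -m "Add tracked.py"

# Modify the tracked file and create a brand-new, untracked one.
echo "print('v2')" > tracked.py
echo "scratch" > new_file.txt

# -a stages the modified tracked file and commits it in one step.
git commit -q -a -m "Update tracked.py"

# new_file.txt was never added, so it is still reported as untracked.
git status --short
```

The final status should list only new_file.txt, showing that the -a flag left it out of the commit.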
The HEAD indicator is moved to the latest commit.
...
Git uses the HEAD alias to represent the currently checked out snapshot of your project. This lets you know what the contents of your working directory should be. In this case, the current snapshot is the latest commit in the project. Think about it as a bookmark that you can use to keep track of where you are. Even if you have multiple books to read, the bookmark allows you to pick up right where you left off.
We'll soon learn about branches. In that case, HEAD can be a commit in a different branch of the project.
As a shortcut, it's generally easy to think of HEAD as a pointer to the current branch, although it can be more powerful than that.
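One way to inspect what HEAD currently refers to is sketched below. The repository and file names are hypothetical; git rev-parse and git symbolic-ref are standard Git commands, but they are not covered in the notes above.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# HEAD resolves to the ID of the currently checked out commit...
head_commit=$(git rev-parse HEAD)
# ...and is normally a symbolic reference to the current branch.
current_branch=$(git symbolic-ref --short HEAD)

echo "HEAD is at $head_commit on branch $current_branch"
```

Because HEAD points at the current branch here, resolving the branch name gives the same commit ID as resolving HEAD.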
Getting More Information About Our Changes
We've seen how git log shows us the list of commits made in the current Git repository. By default, it prints:
the commit message,
the author, and
the date of the change.
This is useful, but if we need to look at the actual lines that changed in each commit, we can do this with git log -p. The p comes from patch, because using this flag gives us the patch that was created.
If we don't want to scroll down until we find the commit that we're actually interested in, another option is to use the git show command. This command takes a commit ID as a parameter, and will display the information about the commit and the associated patch.
...
Another interesting flag for git log is the --stat flag. This will cause git log to show some stats about the changes in the commit, like which files were changed and how many lines were added or removed.
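The three ways of inspecting history described above can be tried side by side. This is a small sketch in a throwaway repository with hypothetical file names and messages; --no-pager is used only so the output is printed directly instead of opening a pager.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\n' > script.sh
git add script.sh
git commit -q -m "Add script"

printf 'line one\nline two\n' > script.sh
git commit -q -a -m "Add second line"

# Full history, including the patch created by each commit.
git --no-pager log -p

# Information and patch for a single commit (HEAD here; a commit ID also works).
git --no-pager show HEAD

# Per-commit summary of which files changed and how many lines were added or removed.
git --no-pager log --stat
```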
...
Sometimes it can take a while until we're ready to commit. Imagine you've been working on adding a new complex feature to a script and it requires thorough testing. Before committing it, you need to make sure that it works correctly, check that all the test cases are covered, and so on.
While doing this, you find bugs in your code that you need to fix.
It's only natural that by the time you get to the commit step you don't really remember everything you changed.
To help us keep track, Git gives us the git diff command. Its output format is equivalent to the diff -u output that we saw in an earlier video.
We could pass a file by parameter to see the differences relevant to that specific file instead of all the files at the same time.
Something else we can do to review changes before adding them is to use the -p flag with the git add command. When we use this flag, git will show us the change being added and ask us if we want to stage it or not.
...
git diff shows only unstaged changes by default. Instead, we can call git diff --staged to see the changes that are staged but not committed.
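The difference between git diff and git diff --staged can be sketched as follows, using a throwaway repository and a hypothetical file name. git add -p is left out because it is interactive and waits for input.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > report.txt
git add report.txt
git commit -q -m "Add report"

# Change the file without staging it.
echo "second" >> report.txt

# The unstaged change shows up in plain git diff for that file.
unstaged=$(git diff -- report.txt)

git add report.txt

# Once staged, plain git diff is empty and --staged shows the change instead.
after_add=$(git diff -- report.txt)
staged=$(git diff --staged -- report.txt)
echo "$staged"
```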
Deleting and Renaming Files
Let's say that you've decided to clean up some old scripts and want to remove them from your repository.
You can remove files from your repository with the git rm command, which will stop the file from being tracked by Git and remove it from the working directory.
File removals go through the same general workflow that we've seen. So you'll need to write a commit message as to why you've deleted them.
...
The output shows that 23 lines in the file are no longer there, and it states that the file itself was deleted.
What if you have a file that isn't accurately named?
You can use the git mv command to rename files in the repository.
...
The git mv command works in a similar way to the mv command on Linux and so can be used for both moving and renaming.
If our repository included more directories in it, we can use the same git mv command to move files between directories.
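A short sketch of removing and renaming files, assuming a throwaway repository. The script names and commit messages are hypothetical and are not the ones used in the original demonstration.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "old" > old_script.py
echo "data" > badly_named.py
git add old_script.py badly_named.py
git commit -q -m "Add scripts"

# Delete the file from the working directory and stage the deletion.
git rm -q old_script.py

# Rename the file and stage the rename in one step.
git mv badly_named.py disk_usage.py

# Both changes are staged and still need to be committed.
git status --short
git commit -q -m "Remove old script and rename disk usage script"
```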
Info |
---|
The output of git status is a super useful tool to help us know what's up with our files. It shows us which files have tracked or untracked changes, and which files were added, modified, deleted or renamed. |
If there are files that get automatically generated by our scripts, or our operating system generates artifacts that we don't want in our repo, we'll want to ignore them so that they don't add noise to the output of git status.
To do this, we can use the .gitignore file. Inside this file, we'll specify rules to tell Git which files to skip for the current repo. In our case, we'll create a .gitignore file containing the name of the file that we want to ignore.
Remember that the dot prefix in a Unix-like file system indicates that the file or directory is hidden and won't show up when you do the normal directory listing. That's why we have to use ls -la to see all files.
We've added a .gitignore file to our repo but we haven't committed it yet. This file needs to get tracked just like the rest of the files in the repo.
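A minimal sketch of ignoring a generated file, assuming a throwaway repository. The file name build.log is hypothetical and stands in for whatever artifact should be kept out of the repo.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# A generated artifact that we do not want to track.
echo "debug output" > build.log

# Tell Git to skip it by listing its name in .gitignore.
echo "build.log" > .gitignore

# build.log no longer shows up; only the new .gitignore is listed.
git status --short

# .gitignore itself needs to be tracked like any other file.
git add .gitignore
git commit -q -m "Add a .gitignore file ignoring build.log"
```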
...
Advanced Git Cheat Sheet
Command | Explanation & Link |
---|---|
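git commit -a | Stages all files that are tracked and modified, and commits them in one step |
git log -p | Shows the commit log together with the patch created by each commit |
git show | Shows information about a given commit and its associated patch |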
git diff | Is similar to the Linux `diff` command, and can show the differences in various commits |
git diff --staged | An alias to --cached, this will show all staged files compared to the named commit |
git add -p | Allows a user to interactively review patches to add to the current commit |
git mv | Similar to the Linux `mv` command, this moves or renames a file |
git rm | Similar to the Linux `rm` command, this deletes, or removes a file |
There are many useful git cheatsheets online as well. Please take some time to research and study a few, such as this one.
.gitignore files
.gitignore files are used to tell the git tool to intentionally ignore some files in a given Git repository. For example, this can be useful for configuration files or metadata files that a user may not want to check into the master branch. Check out more at: https://git-scm.com/docs/gitignore .
A few common examples of file patterns to exclude can be found here.
Undoing Things
Undoing Changes Before Committing
You can change a file back to its earlier committed state by using the git checkout command followed by the name of the file you want to revert.
...
With that, we've demonstrated how we can use git checkout to revert changes to modified files before they get staged. This command will restore the file to the latest stored snapshot, which can be either committed or staged.
If you need to check out individual changes instead of the whole file, you can do that using the -p flag. This will ask you change by change if you want to go back to the previous snapshot or not.
That's it for undoing unstaged changes.
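As a rough sketch of discarding an unstaged change, assuming a throwaway repository and a hypothetical file name (the `--` separator is optional here and only makes it explicit that what follows is a file path):

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "good" > config.txt
git add config.txt
git commit -q -m "Add config"

# Make a change that we later decide we do not want.
echo "broken" > config.txt

# Restore the file to its last stored snapshot.
git checkout -- config.txt

cat config.txt
```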
What if you added the changes to the staging area already?
We can unstage our changes by using the git reset command. Staging changes that we don't actually intend to commit happens all the time, especially if we use a command like git add *, where the star is a file glob pattern used in Bash that expands to all files. This command will end up adding any change done in the working tree to the staging area.
...
We can see that this output file, which was supposed to be a temporary file for debugging, has now been staged in our repo but we didn't want to commit it.
Conveniently, the git status command tells us how to unstage the file right there in the output.
The example output mentions the HEAD alias, the currently checked out snapshot. So by running the suggested command, we're resetting our changes to whatever's in the current snapshot.
You can use git reset -p to get Git to ask you which specific changes you want to reset.
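The unstaging step can be sketched like this, assuming a throwaway repository. The file names are hypothetical, and output.txt plays the role of the temporary debugging file.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "code" > main.py
git add main.py
git commit -q -m "Add main script"

# A temporary debugging file gets staged by accident.
echo "temp" > output.txt
git add *

# Unstage it by resetting that path to what is in HEAD.
git reset -q HEAD output.txt

# The file is still on disk, but it is untracked again.
git status --short
```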
Amending Commits
Let's say you just finished committing your latest batch of work, but you've forgotten to add a file that belongs to the same change. You'll want to update the commit to include that change. Or maybe the files were correct, but you realize that your commit message just wasn't descriptive enough. So you want to fix the description to add a link to the bug that you're solving with that commit. What can you do?
We can solve problems like these using the --amend option of the git commit command. When we run git commit --amend, git will take whatever is currently in our staging area and run the git commit workflow to overwrite the previous commit.
...
The list of added files for this commit now includes both files that we wanted to add. Now that the files have been added, we can also improve our initial commit message which was a bit too short.
...
Let's save the new description as usual. We've amended our previous commit to include both files and a better message.
...
You could also just update the message of the previous commit by running the git commit --amend command with no changes in the staging area.
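A small sketch of amending a local commit, assuming a throwaway repository. The script names and messages are hypothetical, and -m is used only to avoid opening the editor.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > auto_update.py
echo "b" > gather_info.py

# First attempt: one file is forgotten and the message is too short.
git add auto_update.py
git commit -q -m "Add script"

# Stage the forgotten file and overwrite the previous commit.
git add gather_info.py
git commit -q --amend -m "Add two scripts for updating and gathering system information"

# There is still a single commit, now containing both files.
git --no-pager log --stat
```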
Panel |
---|
While git commit --amend is okay for fixing up local commits, you shouldn't use it on public commits, meaning those that have been pushed to a public or shared repository. |
This is because using --amend rewrites the Git history, removing the previous commit and replacing it with the amended one. This can lead to some confusing situations when working with other people and should definitely be avoided.
So remember, fixing up a local commit with amend is great and you can push it to a shared repository after you fixed it. But you should avoid amending commits that have already been made public.
Rollbacks
There are a few ways to roll back commits in Git. For now, we'll focus on using the git revert command. git revert doesn't just mean undo. Instead, it creates a commit that contains the inverse of all the changes made in the bad commit in order to cancel them out.
For example, if a particular line was added in the bad commit, then in the new revert commit, the same line will be deleted.
This way you get the effect of having undone the changes, but the history of the commits in the project remains consistent leaving a record of exactly what happened.
We can revert the latest commit by using the HEAD alias that we mentioned before. Since we can think of HEAD as a pointer to the snapshot of your current commit, when we pass HEAD to the revert command we tell Git to roll back that current commit.
...
So once we issue that git revert HEAD command, we're presented with the text editor commit interface that we've all seen before. In this case, we can see that Git has automatically added some text to the commit message indicating it's a rollback. The first line mentions that it's reverting the commit we just did called “Add call to disk full function”. The extra description even includes the identifier of the commit that got reverted.
...
While we could use this description as is, it's usually a good idea to add an explanation of why we're doing the rollback. Remember that the goal of these descriptions is to help our future selves understand why things happen. In this case, we'll explain that the reason for the rollback is that the code was calling a function that wasn't defined. Once we're done entering the description, we can exit and save as usual.
...
You'll notice the output that we get from the git revert command looks like the output of the git commit command. This is because git revert creates a commit for us.
Let's look at the last two entries in the log using -p and -2 as parameters.
...
As demonstrated before, the -p parameter lets us see the patch created by the commit, while the -2 parameter limits the output to the last two entries.
So in this log, we can see that when we called revert, git created a new commit that's the inverse of the previous one.
We can see that the original commit shows the lines we added by preceding them with a plus sign.
The same lines show up with a minus sign in the newer commit, indicating that they were removed.
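The rollback of the latest commit can be sketched as follows, assuming a throwaway repository. The file name and its contents are hypothetical, and --no-edit accepts the automatically generated message instead of opening the editor, so the explanation recommended above is skipped here.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "check_cpu()" > health.py
git add health.py
git commit -q -m "Add health checks"

# The bad commit: it calls a function that is not defined.
echo "check_disk_full()" >> health.py
git commit -q -a -m "Add call to disk full function"

# Create a new commit that is the inverse of the latest one.
git revert --no-edit HEAD

# The last two entries: the revert removes the line that the bad commit added.
git --no-pager log -p -2
```

The history now holds three commits, and the bad commit is still visible in the log alongside the commit that cancels it out.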
In this example, we reverted the latest commit in our tree. But what if we had to revert a commit that was done before that?
Identifying a Commit
We can target a specific commit by using its commit ID. Commit IDs are those complicated looking strings that appear after the word commit in the log messages.
The commit ID is a 40-character string. This long jumble of letters and numbers is actually something called a hash, which is calculated using an algorithm called SHA1.
Essentially, what this algorithm does is take a bunch of data as input and produce a 40-character string from the data as the output. In the case of Git, the input is all information related to the commit, and the 40-character string is the commit ID.
Cryptographic algorithms like SHA1 can be really complex, so we won't go too deep into what this means.
Still, you might be wondering: why on earth would you use a long jumble of letters and numbers as an ID for a commit, instead of an incrementing integer like 1, 2, 3, and so on?
To answer that, let's take a quick look at the reason why Git uses a hash instead of a counter, and how that hash is computed.
Although SHA1 is a part of the class of cryptographic hash functions, Git doesn't really use these hashes for security.
Instead, they're used to guarantee the consistency of our repository. Having consistent data means that we get exactly what we expect. This is really useful in distributed systems like Git because everyone has their own repository and is transmitting their own pieces of data.
Computing the hash keeps data consistent because it's calculated from all the information that makes up a commit: the commit message, date, author, and the snapshot taken of the working tree.
...
The chance of two different commits producing the same hash, commonly referred to as a collision, is extremely small. It'd take a lot of processing power to cause this to happen on purpose.
If you use a hash to guarantee consistency, you can't change anything in the Git commit without the SHA1 hash changing too.
Remember our discussion about fixing commits with the --amend option? Each time we amend a commit, the commit ID will change. This is why it's important not to use --amend on commits that have been made public.
The data integrity offered by the commit ID means that if a bad disk or network link corrupts some data in your repository, or worse, if someone intentionally corrupts some data, Git can use the hash to spot that corruption. It will say, the data you've got isn't the data you expected, something went wrong.
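The link between a commit's contents and its ID can be observed directly. This is a small sketch in a throwaway repository with hypothetical names; git cat-file -p is a standard Git command that is not covered in the notes above, and it prints the raw information stored in a commit.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "data" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# The commit ID is the hash of the commit's contents.
original_id=$(git rev-parse HEAD)
echo "Original ID: $original_id"

# The raw commit: a reference to the snapshot, the author, the date, and the message.
git cat-file -p HEAD

# Changing only the message changes the input to the hash, and so the ID.
git commit -q --amend -m "Initial commit with a better message"
amended_id=$(git rev-parse HEAD)
echo "Amended ID:  $amended_id"
```

The two IDs printed should differ even though the file contents are identical, which is why amending a commit that others already have causes confusion.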
How can you use commit IDs to specify a particular commit to work with, like during a rollback?
...
Let's look at the last two entries in our repo using the git log -2 command. Say we realized that we actually liked the previous name of our script, and so we want to revert this commit where we renamed it.
First, let's look at that specific commit using git show. We've copied and pasted the commit ID that we wanted to display, and that works.
Alternatively, we could provide just the first few characters identifying the commit to the command, and Git will be smart enough to guess which commit ID starts with those characters, as long as there's only one matching possibility. Two characters is not enough, but usually four to eight characters will be plenty.
...
Okay, now that we've seen how we can identify the commit that we want to revert, let's call the git revert command with this identifier.
...
As usual, this will open an editor where we should add a reason for the rollback. In this case, we'll say that the previous name was actually better.
...
As we called out before, when we generate the rollback, Git automatically includes the ID of the commit that we're reverting. This is useful when looking at a repo with a complicated history that includes a lot of commits.
Now, once we save and exit the commit message, Git will actually perform the rollback and generate a new commit with its own ID.
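Reverting an older commit by its ID can be sketched like this, assuming a throwaway repository with hypothetical file names. In practice the ID would be copied from the git log output; here it is looked up with git rev-parse only so the example can run unattended, and --no-edit again skips the editor.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "usage" > disk_usage.py
git add disk_usage.py
git commit -q -m "Add disk usage script"

# The commit we will later regret: a rename.
git mv disk_usage.py check_free_space.py
git commit -q -m "Rename disk usage script"

echo "notes" > README
git add README
git commit -q -m "Add README"

# Find the ID of the rename commit (one commit before HEAD) and shorten it.
rename_id=$(git rev-parse HEAD~1)
short_id=$(git rev-parse --short "$rename_id")

# A unique prefix is enough to identify the commit.
git --no-pager show "$short_id"

# Revert that specific commit; Git generates a new commit with its own ID.
git revert --no-edit "$short_id"
```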