You forgot log, cherry-pick, stash, blame, which are also daily use commands. You also have to learn a lot of concepts to even understand the help and error messages for these commands - the worktree, the index, the stash, HEAD, ours vs theirs, conflicts, remotes, tracking branches, when it is safe to push -f, etc. Depending on the project you joined, you may also have to immediately learn about git LFS, submodules, squashing, fast-forwards, tags.
You have to understand a lot of things before you can somewhat comfortably use git, much more than something like P4.
Based on my experience - most teams will get by with ivanhoe's list.
Your additional items are definitely helpful - but the typical team member's workflow won't cross those bridges. Team leads perhaps.
There are definitely levels of proficiency, but for a junior or even mid-level dev doing feature work, I think there are red flags if they need to jump into stash, cherry, etc. on a daily basis.
As I said to the other commenter - I don't even know how to use git without the stash, since you need it every time you have some local changes but want to pull from the remote - the only alternative I know of is committing your local changes instead of stashing them.
Also, git lfs and git submodules and their associated commands are necessary or not based on the project, not on your personal level of proficiency.
I also don't know of any workflow where you don't need to look at the log at least once every few days, even as a junior dev, to confirm if a bug is fixed in a build or not if for no other reason.
At work I have 30 stashed things, some of which go back 2 or 3 years. I get distracted when I pull and forget to pop sometimes, or I was working on something so trivial I never unstashed it.
That doesn't seem great... Usually these days I try to put my temporary work on a temp commit or branch at least so I don't lose it to the stash. I'm not saying I'm stashing properly, just that stash is easy to mess up.
Also probably half of these are things I popped but had merge conflicts or something. I fixed the merge conflict why is it still there? (I know the reason, but still).
> As I said to the other commenter - I don't even know how to use git without the stash, since you need it every time you have some local changes but want to pull from the remote - the only alternative I know of is committing your local changes instead of stashing them.
I usually do all the work in branches anyways, so I'll just create a quick branch and commit it there.
> I usually do all the work in branches anyways, so I'll just create a quick branch and commit it there.
I never work on a checked-out version of a branch that's currently being updated. I always check out a new branch for my work and rebase or merge off the branch others are working on.
I'm fully aware of stash, and every now and then I use it, but it's pretty infrequent. Seeing as you need to learn branching anyway it's not going to be important for every workflow.
I've used both but generally don't use stash. I think it's more expressive to have a single commit like "wip: some simple description goes here" and then to `git commit --amend` this until we are happy with its contents.
As well as being more descriptive if you come back after a long weekend, this also means that you can swap branches without worrying that your stash relates to a particular branch but doesn't explicitly belong to this.
Tip, instead of amending, it's also straightforward to "uncommit" a temporary commit when you come back to it. `git reset HEAD~` and the state of the commit becomes the state of your working tree instead.
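For anyone who wants to try the WIP-commit approach, here is a minimal sketch in a throwaway repo. The file name and commit messages are made up for illustration; the identity is set through environment variables so no global config is touched.

```shell
# Throwaway sandbox; nothing here touches an existing repo or global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo "v1" > notes.txt
git add notes.txt && git commit -q -m "base"
base=$(git rev-parse HEAD)

# Park unfinished work in a descriptive commit instead of a stash.
echo "half-done idea" >> notes.txt
git commit -q -am "wip: try new note format"

# Keep folding further changes into that same commit.
echo "more of the idea" >> notes.txt
git commit -q -a --amend --no-edit
wip_count=$(git rev-list --count "$base"..HEAD)   # still one WIP commit

# "Uncommit": HEAD moves back one commit, the changes stay in the working tree.
git reset -q HEAD~
head_after=$(git rev-parse HEAD)
status_after=$(git status --porcelain)
```

After the reset, `git status` shows `notes.txt` as modified again, exactly as if the WIP commit had never been made.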
You can set `merge.autostash` and/or `rebase.autostash` to `true` in your global git config and then you can `git pull` with a dirty working tree.
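A rough sketch of what that looks like in practice, using a throwaway "upstream" repo and a clone of it (both hypothetical). The setting is applied repo-locally here; adding `--global` would apply it everywhere. As far as I know `merge.autoStash` only exists in newer git versions (2.27 and later), so the sketch sticks to `rebase.autoStash`.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# A stand-in "remote" with one commit, and a clone of it.
git init -q upstream
(cd upstream && echo "a" > a.txt && echo "b" > b.txt && git add . && git commit -q -m "base")
git clone -q upstream work

# Someone else changes b.txt upstream...
(cd upstream && echo "b2" >> b.txt && git commit -q -am "upstream change")

# ...while we have an uncommitted edit to a.txt.
cd work
echo "local edit" >> a.txt

# Without autostash, pull --rebase refuses to run on a dirty tree.
if git pull -q --rebase 2>/dev/null; then plain_pull=ok; else plain_pull=refused; fi

# With autostash, git stashes, rebases, and re-applies the stash for us.
git config rebase.autoStash true
git pull -q --rebase >/dev/null 2>&1

upstream_line=$(tail -n 1 b.txt)
local_line=$(tail -n 1 a.txt)
```

If the re-applied stash conflicts with what was pulled, you are left resolving a stash conflict by hand, so this is a convenience rather than a way to avoid learning about the stash.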
Of course, to the point of this thread, that is yet another concept you have to understand to get a sane working environment. I assume autostash is disabled by default because the post-merge/rebase unstash could result in merge conflicts and be a pain to unwind if you change your mind. If you just blindly set this option based on someone else's recommendation without knowing what the stash is, and things go wrong, you'll require a mental model of the steps involved to fix it, which you won't have.
That's one more confusion vector. I for example use Git for like 10 years now and I didn't know about this option. It didn't even occur to me that it could exist.
It might help to create patches for the dirty state with something like `git diff > feature.patch` and then drop the current changes. After the update, a `git apply feature.patch` restores the changes.
This helps with remembering what each change means since `git stash show` is rather awful. YMMV
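The patch round-trip, sketched in a scratch repo. The file names are placeholders; note that a plain `git diff` only captures unstaged changes to tracked files, so untracked files would need separate handling.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
printf 'one\ntwo\n' > feature.txt
git add feature.txt && git commit -q -m "base"

echo "three" >> feature.txt          # uncommitted work

patch_file=$(mktemp)                 # keep the patch outside the repo
git diff > "$patch_file"             # save the uncommitted changes
git checkout -- .                    # drop them from the working tree
clean=$(git status --porcelain)      # empty: the tree is clean now

# ... pull / update here ...

git apply "$patch_file"              # restore the changes
restored=$(tail -n 1 feature.txt)
```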
git fetch && git rebase origin/master, which is equivalent to git pull --rebase origin master, requires your worktree to be clean, so you must either commit or stash if you have any local changes.
I think that's a very good way to have a long list of things stashed that you don't remember what they are.
Committing your changes is much better, as you can add an explanation for your future self.
Ok, so you don't need to know git stash (good) but you need to know that autostash exists and how to configure it, so you still don't get away without an extra concept. I suppose the nice part about autostash is that it's easier for someone to just give you a .gitconfig with it, and not have to teach you how to use git stash.
The `.gitconfig` can also be 'given' to you by just being checked into the project, though git does not read a tracked file as configuration on its own, so each clone still has to include it once. Someone has to know that you can have per-repo configs, but not everyone. ;)
> I think there are red flags if they need to jump into stash, cherry, etc. on a daily basis.
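One way a checked-in config file can be wired up, as a sketch: the file name `.gitconfig` at the repo root is only a convention assumed here, and each clone has to opt in once via `include.path`, since git will not load a tracked file as config automatically.

```shell
cd "$(mktemp -d)" && git init -q

# A config file that would be tracked in the repo (name and content are examples).
printf '[rebase]\n\tautoStash = true\n' > .gitconfig

# One-time opt-in per clone. A relative include path is resolved against the
# file containing the directive, i.e. .git/config, hence the "../".
git config --local include.path ../.gitconfig

after=$(git config --get rebase.autoStash)
```

The opt-in step is deliberate on git's side: config can run arbitrary commands (aliases, hooks paths, diff drivers), so silently trusting a file from a clone would be risky.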
Without stash, things can be nasty. Some alternatives:
1. Commit frequently, one tested piece of code at a time.
I'm sure this is what you meant. Not every environment and process allows this and not every change is small.
If you can though and have time, do this. If needed perhaps you can combine things and rebase later into larger commits. This is the reliable, clean development process that everyone would love to have as a baseline, and with hard work and dedication it can stay that way.
Let's move on because that isn't always going to be the case.
2. Commit unfinished/non-working code.
The best analogy of this is like leaving unfinished crap all over the place that might look finished to some. At some point, it may go wrong. If you do this, you might name the branch with something standard indicating it's unfinished.
This always seems to happen when a developer leaves for a holiday/vacation or leaves permanently. Then some other developer tries to build/test it, it works, they smooth some rough edges and commit it. In my experience, if it was truly unfinished, the resulting quality may end up anywhere below the initial committer's typical code quality and anywhere below the fixer's code quality. There are exceptions, but as a general rule, be more careful with such commits if the personal investment and sense of ownership is not strong in that code.
3. Multiple copies of the repo.
It's likely going to be less efficient to have multiple copies of the repo from a storage standpoint.
Showing the stashes may be more efficient than searching through different versions of the files in copies of the repo or having to recursively diff repo copies.
Using multiple copies of the repo may also increase the chance of things going wrong or history being lost.
I still would recommend having a backup of anything really important in the repo at times, if you're not feeling confident or are worried about losing anything.
4. Throw away code changes.
Every time you switch priorities, you could throw away all of the work you had locally. If it was crap, this might be best. Be careful; you could lose something important.
5. Manually copy changed files to another area outside of the repo to ensure it doesn't get stomped accidentally.
This can be messy, but with tools to make it easier, it might be "ok". Compared to git stash though, it's likely less efficient, because whole files are being copied instead of just the changes.
6. Manually backup only the diff/patch files of changes.
Well, now you're just recreating git stash functionality, but sure, you could do that. People did this before git stash and still do. Create patch files. It doesn't sound as easy or clean, and you've got to put those patch files somewhere. Will that be consistent between developers? If a developer leaves or is unavailable, where would you find them?
> I think of log, cherry-pick, stash & blame "quarterly use" commands
At work we do care that our git history makes sense, is free of random nonsense, but contains only self-contained commits with reasonable documentation. So I call git log many times every day, I would not know how to do it without.
Cherry-pick probably depends on how many maintenance branches you maintain. We don't have many, so I don't need it very often. I guess that could be very different for someone required to support long product life cycles. Sometimes when I need to split or unite development branches that need major reorganization I use it.
Blame I need as soon as I try to understand others' code. Sometimes even my own. Also when you get bug reports from the field, to understand how long certain bugs have existed. Not always to be able to blame the author, but just to understand how long a line has been unchanged.
Personally I don't use stash a lot because I'm not afraid of committing anything to my working branch. I can always fix the code or the history later. Or I make a temporary branch with a descriptive name rather than just stash. Some use stash more frequently, I have noticed.
How can you pull without using stash? Do you always commit everything before pulling?
You also need the log daily to know things like "what changes made it into this build", or almost every time I fix a merge conflict, to understand why something is the way it is.
I can grant that cherry-pick & blame are more rarely used, though blame is often on by default in many editors, and cherry-pick is something my team does daily around every release (since we don't want to merge the trunk into the release branch the day of the release for 1 bugfix).
More typically, I will "git fetch origin" to fetch the current integration branches, then "git checkout -b <feature branch> origin/master" to start a new feature branch from a given integration branch, then push that once the change is completed.
If I need to update a local copy of an integration branch, then I might well use "git pull". But I would never have any local changes made there which would require stashing, since all changes are done on feature branches.
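A sketch of that flow against a stand-in remote. The branch name `my-feature` is hypothetical, and the remote's branch is pinned to `master` only to match the commands quoted above.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # pin the branch name
(cd upstream && echo "v1" > app.txt && git add . && git commit -q -m "base")
git clone -q upstream work
(cd upstream && echo "v2" >> app.txt && git commit -q -am "integration change")

cd work
git fetch -q origin                            # updates origin/master, nothing else
git checkout -q -b my-feature origin/master    # start from the freshly fetched tip
echo "feature" > feature.txt
git add feature.txt && git commit -q -m "add feature"
# When the change is complete: git push origin my-feature

feature_base=$(git merge-base my-feature origin/master)
remote_tip=$(git rev-parse origin/master)
local_master=$(git rev-parse master)           # untouched, still at the old commit
```

The local `master` never moves in this flow, which is why no stash or pull is needed on it.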
I use git pull pretty frequently. I like to keep my main branch up to date with the main branch of the remote repo. I find it's helpful to have that consistency across contexts.
I think diffs are faster if it's to the local repo vs the remote repo.
I always make branches off local-main, as opposed to remote main as in your example.
I think it's also helpful to have the main branch replicated across as many machines as possible. There have been one or two times where a dev has deleted remote main at my company where some (very VERY CALM) git push solved the problem. I probably could have checked out a tracking branch and fixed it that way, but having a local copy of main made it a one-step process (I think the resolution was just to git push the main branch?).
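There are few differences between "local" and "remote" main in GP's example. "origin/master" is a _local_ copy of the origin's master branch, and it is updated with every "git fetch". So "git push origin origin/master:refs/heads/master" resolves the "remote master was deleted" situation without having to stash/checkout/etc. (the destination has to be spelled out in full, because git cannot guess a prefix for a branch that no longer exists on the remote when the source is a remote-tracking ref).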
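To make the recovery concrete, here is a sketch with a local bare repo standing in for the remote. The deletion is simulated directly with `update-ref`, and the push spells out `refs/heads/master` because the branch no longer exists on the remote side.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
(cd seed && echo "v1" > app.txt && git add . && git commit -q -m "base")
git clone -q --bare seed remote.git            # stand-in for the shared remote
git clone -q remote.git work
tip=$(git -C work rev-parse origin/master)

# Simulate someone deleting master on the remote.
git --git-dir=remote.git update-ref -d refs/heads/master

# The clone's origin/master still remembers the old tip (until a pruning fetch),
# so it can be pushed straight back without checking anything out.
git -C work push -q origin origin/master:refs/heads/master
restored=$(git --git-dir=remote.git rev-parse refs/heads/master)
```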
It's just a preference thing. If I'm sitting down to write a new feature I can do the git fetch origin and go directly to a branch, but if I'm investigating an issue, I'll often open that develop/master branch directly, find the problem, do a pull to make sure there are no changes that got checked in, then "git checkout -b".
Alternately if I'm helping other developers, I'm on their branch a lot, and it basically becomes a "push your changes and I'll take a look" and I just open that folder and git pull.
I think I use more git pull now than I did originally, because I'm working on less of my own code.
I basically never use git pull. I use git fetch and then I check (using git diff, git log or in confusing cases gitk) what I want to do. That could be merge (fast forward), rebase, ignore, or drop either branch because the other one is just better. Of course often pull might do what you want, but when it doesn't people start to complain. So better to avoid surprises from the beginning.
It sounds like you finish your work much faster than I do. I usually go for the same workflow as you do, but it can take days between a branch being created and the moment I'm done with it. During this time, I often want to keep up with the latest changes on master. Other times, more than one person works on a feature branch, so they still need to git pull from origin/feature-branch into feature-branch to get the changes from the other committer.
I would be very happy if I could just merge to master every day, but that is extremely rare for me and my team.
No, I don't necessarily work any faster. But I will switch between multiple feature branches as necessary. And to keep up with master, I'll rebase my feature branch onto origin/master or depending upon the circumstance I'll directly merge origin/master. Both keep me up-to-date without a direct pull.
Well, if you're switching branches, you already have problems with a dirty worktree. Also, pulling can actually mean rebasing (I always use pull --rebase).
My team works on feature branches that get merged to master. Only master ever needs a git pull and you never work directly on master. Every team I've worked with over the last ten years has done some version of this work flow with small deviations.
Your uses for log sound legitimate but usually they are resolved over other communication channels (in person or over slack or in release notes) on the teams I've worked on.
I almost never use `git pull`. I have a helper script (written by an ex-coworker's father ...) that fetches + updates my master branch head to the proper place, without changing away from my current branch.
If I want to have things-from-master in my current branch, I either rebase my branch onto master or merge master into my branch. (I'm lucky that my branches are short-lived enough that rebasing on top of master is almost always easy and the right choice.) If I have unfinished things, I'll make a "WIP" commit, that I will unroll later after doing the rebase.
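I don't know what that helper script actually does, but one plausible way to get the same effect is `git fetch origin master:master`, which fast-forwards the local master without checking it out. A sketch under that assumption, with a hypothetical `my-feature` branch:

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
(cd upstream && echo "v1" > app.txt && git add . && git commit -q -m "base")
git clone -q upstream work
cd work
git checkout -q -b my-feature
echo "feature" > feature.txt && git add feature.txt && git commit -q -m "feature work"
(cd ../upstream && echo "v2" >> app.txt && git commit -q -am "upstream change")

# Fast-forward local master from origin while staying on my-feature.
# Git refuses this if master is checked out or the update is not a fast-forward.
git fetch -q origin master:master
current=$(git rev-parse --abbrev-ref HEAD)

# Replay the feature commits on top of the updated master.
git rebase -q master

master_tip=$(git rev-parse master)
upstream_tip=$(git -C ../upstream rev-parse master)
behind=$(git rev-list --count my-feature..master)
```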
I nearly never use `git blame` on the command line, but have a very frequently used hotkey bound to it in my editor to show the blame annotations.
It sounds like you're using a very different and much more complex process than average. Keep in mind for a lot of web projects releases are every day, sometimes multiple times per day.
If you take out the concept of builds, you just have short lived feature branches off of master, you're left with perhaps even less than the 8 commands listed above.
For our project, a flow like `create feature branch -> implement -> push -> review -> merge to master` is rarely something that finishes every day (but there are daily changes to master from someone else on the team). Builds are something that happens for every change to master, and there is constant QA on those builds. Even if we chose to release those builds to customers, the question of "did a fix for bug X make it into build Y" would still arise quite often.
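For the "did fix X make it into build Y" question, a sketch assuming builds are marked with tags (the `build-1`/`build-2` names are invented here). It only works when the fix landed as the same commit; a cherry-picked copy has a different hash and needs a search by message instead.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
echo "v1" > app.txt && git add . && git commit -q -m "base"
git tag build-1
echo "fix" >> app.txt && git commit -q -am "Fix bug X"
fix=$(git rev-parse HEAD)
echo "v3" >> app.txt && git commit -q -am "unrelated change"
git tag build-2

# Exit status 0 means the fix commit is an ancestor of (contained in) the build.
if git merge-base --is-ancestor "$fix" build-1; then in_build_1=yes; else in_build_1=no; fi
if git merge-base --is-ancestor "$fix" build-2; then in_build_2=yes; else in_build_2=no; fi

# Everything that went in between the two builds, one subject line per commit.
between=$(git log --format=%s build-1..build-2)
```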
Yes. Doing a “git commit -am savepoint” on a feature branch is not a big deal; it will get folded into the squash merge with everything else in that unit of code review.
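Hosting platforms usually do the squash from the review UI; the local equivalent looks roughly like this (branch name and messages are placeholders):

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
echo "v1" > app.txt && git add . && git commit -q -m "base"

git checkout -q -b my-feature
echo "step 1" >> app.txt && git commit -q -am "savepoint"
echo "step 2" >> app.txt && git commit -q -am "savepoint"

# Fold the whole branch into a single commit on master.
git checkout -q master
git merge --squash my-feature >/dev/null 2>&1   # stages the combined change, no commit yet
git commit -q -m "Add feature (squashed)"

master_commits=$(git rev-list --count master)   # base + one squashed commit
last_line=$(tail -n 1 app.txt)
```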
If you have a dirty worktree, `git pull --rebase` fails. You need to either commit or stash your local changes. To stash them, you can use `git stash`. If you decide to commit them, that's ok, but then you also need to learn about `git reset` or `git commit --amend` or `git merge --squash` if the changes were work-in-progress.
So which would you recommend?
Oh, there is also the option of enabling autostash I think, but that is relatively recent (maybe 1 year?)
If you have to pull while your work tree is dirty, it’s best but not easiest to branch & commit (safest and avoids the possibility of conflicts during pull), or commit & pull with rebase (safe but might have conflicts).
In case you weren’t aware, stash is not as safe as other git commands, it doesn’t have the same safety net and reflog support as a commit does. It’s relatively easy to drop stashes accidentally and lose them forever. I’ve seen people do this in production by not being careful when pulling while they have a dirty work tree. The man page for git stash mentions this fact:
“If you mistakenly drop or clear stash entries, they cannot be recovered through the normal safety mechanisms. However, you can try the following incantation to get a list of stash entries that are still in your repository, but not reachable any more:”
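The quote stops before the incantation itself. From memory of the git-stash man page it is an fsck/grep/xargs pipeline along the following lines; the `--format=%H` and the final `git stash apply` are my additions so the sketch can be run end to end in a scratch repo.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
echo "v1" > app.txt && git add . && git commit -q -m "base"

echo "precious work" >> app.txt
git stash -q                 # creates a "WIP on <branch>: ..." merge commit
git stash drop -q            # oops: the stash entry is gone

# List unreachable commits, keep the merge commits whose message mentions WIP.
lost=$(git fsck --unreachable 2>/dev/null |
  grep commit | cut -d' ' -f3 |
  xargs git log --merges --no-walk --grep=WIP --format=%H)

# A stash commit can be applied by hash even though it is no longer listed.
git stash apply -q "$lost"
recovered=$(tail -n 1 app.txt)
```

This only helps until the unreachable objects are garbage collected, which is the point of the warning.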
It's interesting that you mention branching, because that is another workflow where I feel that I constantly have to reach for stashing - when I simply want to move to another branch (either for a quick bugfix or simply to check something).
Otherwise yes, I know you can relatively easily lose work with git stash (I actually once lost about 2-4 days of work with a P4 shelve, which is an extremely similar feature), but it still seems easier to me than committing and then cleaning up history/resetting if you change your mind about the implementation...
It might be worth really picking apart why you think you need to stash instead of commit, and why stash feels easier. I honestly don’t think committing is any more work than stashing at all, not even more typing. It’s really not more difficult to commit what you have temporarily, go to another branch to work, then come back and continue working, compared to stash. But I don’t doubt that it feels more difficult for some reason. Is it the need to make a commit message that causes the mental friction? Does git reset somehow seem trickier than git stash pop?
Think about how you typically rebase when you’re done anyway, so that means when you come back to a branch with a temporary commit, you don’t have to stash pop, you can just add more changes and squash or rebase it all later. With the stash workflow you have to keep the stashes in your head, and manually pop them. They get harder to manage than branches when you have multiple stashes, since they’re disconnected from their context.
Personally I think if you examine what it really takes to use commits & branches, and practice using them more, you may find it’s just as easy as the stash workflow, if not easier, and it comes with a wider safety net and is easier to manage when you have a lot of different changes in flight at once. I also think it helps to realize that a branch is nothing more than a named SHA, they are incredibly lightweight, and using them ought to reflect that.
I think it's primarily a question of focus: I want to pull or to switch a branch. The fact that I have to do something like committing or stashing is already an annoyance - having to come up with a commit message is even worse.
Perhaps I will give this commit/reset workflow a try as well, to see how it feels. It may also be that committing still feels too much like "an event" for me, from my P4 days.
You can, if you want, git fetch without committing or stashing, and then checkout just the files you want to update in your work tree, as long as they’re not the files you’re working on.
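A sketch of the fetch-then-pick-files approach, with made-up file names. One thing to be aware of: the file taken from `origin/master` then shows up as a staged change relative to your current HEAD.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
(cd upstream && echo "a1" > a.txt && echo "b1" > b.txt && git add . && git commit -q -m "base")
git clone -q upstream work
(cd upstream && echo "b2" > b.txt && git commit -q -am "update b")

cd work
echo "my edit" >> a.txt                             # uncommitted local work

git fetch -q origin                                 # never touches the working tree
git checkout origin/master -- b.txt 2>/dev/null     # take just this file from the fetched tip

b_now=$(cat b.txt)
a_last=$(tail -n 1 a.txt)
```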
Aside from that, I hear you, git’s model is fundamentally different from Perforce. There’s no choice about whether you need a clean work tree before pulling, that’s simply a git requirement. So the main thing to ask now is what workflow you want and what safety net you want underneath it. On one side, the best way to focus is to not pull anything while you’re working and have a dirty work tree. But things come up at work, that’s not always realistic, so the next question is how to make the workflow both safe and also instinctual so that it doesn’t have friction.
I do think there’s something to your notion that commit feels more serious and heavy than stash, and that is something I personally have tried to break down. In git, commits and branches can be so much more lightweight, fluid and flexible, but it takes practice and fluency in git to be able to actually feel that.
It might be worth considering some shell or git aliases to do common things. You could easily alias a command that commits your work in progress with a comment “WIP”, and never have to type it. You can do the same with a branch and even keep around a branch for work in progress that is always temporary and gets force reset to whatever you’re working on. (Remember a branch is just a pointer and nothing more, you can completely change what it points to without hurting anything, and the change is stored in the reflog, so there’s an undo button.) I think it is possible to make a branch & commit workflow that is both easier and safer than stash.
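For example, a pair of aliases could look like this. The names `wip` and `unwip` are invented for this sketch, and they are set repo-locally here; `--global` would make them available everywhere. Note that `unwip` as written undoes whatever the last commit is, WIP or not.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
echo "v1" > app.txt && git add . && git commit -q -m "base"
base=$(git rev-parse HEAD)

# "git wip" stages everything (including untracked files) and commits it as WIP.
git config alias.wip '!git add -A && git commit -q -m WIP'
# "git unwip" drops the last commit but keeps its changes in the working tree.
git config alias.unwip 'reset -q HEAD~'

echo "half-done" >> app.txt
git wip
wip_subject=$(git log -1 --format=%s)

git unwip
head_after=$(git rev-parse HEAD)
dirty=$(git status --porcelain)
```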
Well, you get rid of the need to know git stash, but now you need rebase -i, amend, the idea of squashing, probably git reset if you're not happy with a commit... Whichever way you look at it, you need to learn quite a few more commands (and concepts!) than what the initial list claimed.
Cherry-pick probably depends on what kind of software you're working on. If there is always exactly one version in production (two if you deploy gradually), then I don't see it being needed often. If you are supporting multiple versions at once, then it ends up getting used just about every time you have to fix a bug.
We also use it very often when we approach a release - usually there is a release branch that gets cut from master, and then for a while you have in parallel features going to master and bugfixes going to master and cherry-picked to release.
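Roughly what that release-window flow looks like, with a hypothetical `release-1.0` branch. The `-x` flag is optional; it appends the original commit hash to the message, which helps later when asking whether a fix reached a release.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
echo "v1" > app.txt && git add . && git commit -q -m "base"
git branch release-1.0                       # release branch cut from master

# Master keeps moving: a new feature, then a bugfix.
echo "new feature" > feature.txt && git add . && git commit -q -m "Add feature"
echo "fix" > fix.txt && git add . && git commit -q -m "Fix bug"
fix=$(git rev-parse HEAD)

# Copy only the bugfix onto the release branch.
git checkout -q release-1.0
git cherry-pick -x "$fix" >/dev/null

has_fix=$([ -f fix.txt ] && echo yes || echo no)
has_feature=$([ -f feature.txt ] && echo yes || echo no)
picked_msg=$(git log -1 --format=%B)
```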
Rebasing is always a good skill to know for when someone inevitably commits binary files or secrets to a repo and you have to do a bit of surgery to fix it.
Bisect of course isn't required knowledge but by god is it one of the most useful git commands a developer could know.
I think I've used `reflog` maybe... twice in a decade of daily git use. It's crazy to me that anyone would think that it's a daily use command. I don't even know what bisect is off the top of my head.
I don't use `reflog` often but I've found that 9/10 times you make a serious mistake in git, it ends up being the go to solution.
As for `bisect`, it's magic. It is used to binary search git commits back to X point in time. You can use it to manually go through previous commits to find what code/how long ago a bug was introduced. You can also rig it up with a test command (say a unit test that you copy out of the working tree/keep in a separate worktree) and have it automatically sift through the commits and tell you the first point where it starts to fail.
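A toy run of the automated variant, assuming the "bug" is simply the presence of a marker line that a one-line test command can detect. In a real project the command would be a build plus a test.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

# Eight small commits; the fifth one sneaks in a "BUG" marker.
for i in 1 2 3 4 5 6 7 8; do
  echo "change $i" >> app.txt
  if [ "$i" -eq 5 ]; then echo "BUG" >> app.txt; fi
  git add app.txt && git commit -q -m "change $i"
done
good=$(git rev-list --max-parents=0 HEAD)    # root commit, known to be fine

# Mark HEAD as bad and the root as good, then let git drive the search.
# The test command exits 0 on good commits and 1 on bad ones.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null
found=$(git log -1 --format=%s refs/bisect/bad)   # first bad commit
git bisect reset >/dev/null 2>&1
```

With eight commits this takes about three test runs instead of up to seven, and the gap grows quickly with longer histories.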
Bisect is a debugging utility - it helps you do a binary search for the commit which introduced a bug by checking out code at certain commits, having you compile and run it, and tell git if the bug was there or not, and then going back or forward in history until you can identify the 1 commit that caused the bug to appear.
I've never used it, but it sounds like a nice tool of last resort.
It is actually a great first debugging tool for regressions (unless you already have a hunch where the problem is). Its usefulness is greatly enhanced if you keep your commits small.
Usually, I prefer to investigate a bug starting from the code rather than the history, if it's possible. In my experience, most bugs have taken longer to reproduce repeatedly than to understand from code. Besides, there's no guarantee that a bisect will land at the right commit if the bug is not reliably reproducible.
But I have been in situations where I had to manually bisect the code because I just couldn't understand how the code could reproduce the problem, and having git bisect would have been a significant help.
P4 definitely has its own rough sides, but I find the model is overall much simpler.
In essence, P4 is a file tracker - you have remote files and local files kept in sync (git clone, git fetch, git pull). The basic workflow is to modify, add, copy or delete one or more files (git add), group the changes into a changelist (git stage), give it a description (git commit), and submit that change list to the remote (git push). Of course, if someone else modified the same files between your last sync and your submit, you will also have to resolve conflicts.
Of course, you can sync all or part of your files to some specific older CL (git checkout). You can revert a change on the remote as well (git revert). When you look at a file or directory, you can see the log of all changes that happened to it (git log).
The biggest difference from Git is in the branching area - branches in P4 are simply copies of one or more files, together with some metadata to tell P4 that this is an intentional branch. Working with branches is simply working with different copies of a file. When you want to bring changes from one branch back into the other, you can merge (integrate in P4's parlance) one or more changes between the branches. At that time, it does a 3-way merge just like Git - last common ancestor, changes in branch A, changes in branch B => result. The most common way of creating branches is to have each branch be a copy of the entire project dir (so you would see on your local system ~/p4/root/proj1/main,~/p4/root/proj1/dev,~/p4/root/proj1/feature1 etc.).
One limitation of P4 branching is that you can't really revert a merge in P4. You can of course revert the changes, but the merge is forever considered the new common base for the two branches. So if you integrate branch A into branch B, then revert the change on branch B, and then try to integrate again, P4 will tell you that all of the changes have already been integrated. If you try to integrate B into A now, it will try to integrate the revert into branch A.
And this is essentially it. P4 scales much better than Git, so most companies have a single P4 repo, and everyone syncs only the parts that they need.
All of these things can be done from either the command line (p4 sync, p4 edit, p4 reconcile, p4 submit, cp + p4 branch, p4 integrate) or from the very simple first party GUI, P4V.
From the CLI, p4 is pretty consistent - rather more so than git (which I suppose isn't a high bar!). But of course there are some inconsistencies - as with any tool, particularly if it's been around a few years.
Small frequent commits will limit the need for stash. That's a good practice anyway.
The others I'd say could be completely excluded from daily/weekly/yearly use. If you are using cherry or blame daily then there is probably something wrong :D
Why are small frequent commits a good practice? What is to be gained from committing a change to a file, then a commit to undo the previous one if you decide the change wasn't useful/needed?
Also, cherry is often needed daily in short bursts around releases.
Blame is rarer, absolutely, but I don't think you're using your version control to anywhere near its full potential if you don't use blame (and log) while investigating complex bugs.
The most important reason imo is making code reviews easier. Enormous PRs = less confidence in merging code on a team. If you are pushing small commits then it is easier for the team to follow the changes and review them.
- easier to merge and ship small incremental changes than large ones
- easier to revert small commits
- better commit messages since you can summarize the smaller change instead of "coded a lot of things"
- if you have a messy commit history in a PR you can squash them when merging upstream
I'm not sure what your git methodology is but I tend to avoid cherry.
We've got a clean production/master branch that we merge develop into on releases. We merge feature branches into develop. Very very rarely we may hotfix something on develop. I don't think I've used cherry in 6 or 7 years on repos with 2-30 active devs.
What are you using cherry for daily around releases?
All of those only make sense to me if a commit is a finished work item. If I'm committing just because I want to switch branches or pull, most likely what I'm committing is not meaningful in any way. If I squash later, it doesn't really matter, the reviewer will see one commit anyway, so committing often or not didn't help.
And sure, it's easier to ship small features than large ones. But it's harder to follow one complex change as a series of short patches than one big, self-consistent change.
So to me, commits should be as often as possible, sure, but that is normally once a day or once every few days for serious features / large bug fixes. I usually want to pull or switch branch much more often than that.
And related to release time: typically we have a single main branch that serves multiple projects. When one of these projects is close to a release, they cut off a release branch that is feature frozen, in order to fix any remaining bugs and thoroughly sanitize it. In the meantime, other teams continue merging features into master as they are ready. Bugfixes normally also go into master first, and they are then cherry-picked into the release branch (or branches, if there are multiple releases overlapping). So for us, during this release window, cherry-picking is something that every team member does constantly.
I think the habit of committing more frequently stems from working remotely across different timezones and still making code review a daily habit. It is part of our async communication. If someone hasn't committed code for a few days I'd be concerned they were going down a bad path or getting stuck.
I may not even communicate with that team member outside of a PR for the whole week so that's where most of our communication happens if I don't see them in slack or zoom - which would be the case if our work hours don't overlap.
But hey! As long as what yall are doing works for you then no worries. No team manages their code the same way - which I'd say is another plus for git.
One benefit of frequent commits is that it allows you to go back to a recent commit in case your code gets screwed up and you want to undo some recent changes. For example, if you do a search/replace in hundreds of files and make a mistake.
> What is to be gained from committing a change to a file, then a commit to undo the previous one if you decide the change wasn't useful/needed?
"Undo" commits make sense, occasionally, to fix recent incorrect or improper changes that have been already pushed.
But the more appropriate way to discard changes is to leave them to rot in the commits of a dead experimental branch and never merge them into the important branches from which releases are made.
You don't have to "understand" it. Just form a basic opinion of what you think they will do and build up experience of what usually happens. That is what I have done anyway, as a software developer for 8 years.
I really really disagree that you need all of those. I have been using git for several years and never used blame. I use built in IDE tools to view revision history. I occasionally use stash, never use cherry, never use log.
It depends on your team's workflow, probably, but if you find yourself needing all of those daily, it seems to me like your workflow has some extra complexity that most people won't have.