Why I Like Mercurial More Than Git
Mar. 29th, 2011 09:31 am
After working for over a year alternating between two projects, one that uses Git for its version control and another that uses Mercurial, I have finally achieved sufficient mastery of both toolchains that I now feel comfortable defending my judgment that Mercurial is the superior of the two systems. I think Git has one glaring deficiency that makes it the inferior tool, and I hope to describe the required remedy in this weblog posting.
The tools are very similar, and many of the distinguishing differences, in my opinion, come down to a matter of taste. Some may consider it a deal-breaker that Mercurial expects its extensions to be written in Python, whereas Git admits extensions written in just about any language you care to imagine, though the usual approach is to write them in a shell language. That's not a deal-breaker for me. Many other differences are either consequences of that fundamental distinction, or they are cosmetic in nature. It bothers me not at all either that Mercurial has no index or that Git has one. The difference between the Git stash and Mercurial patch queues is similarly trivial to me.
The big difference, the deal-maker for me, is in how each tool goes about meeting the fundamental requirement for any version control system: how it handles source code merging. Quite simply, Mercurial is better at merging than Git.
I need to introduce a bit of terminology here to make my point. Because the literature for Git and Mercurial uses the word branch to mean crucially different things, I'm going to avoid the word here entirely so as to prevent confusion. For the concept described in the Git literature with the word branch and in the Mercurial literature with the word head, I shall use the word lineage. I shall use the word family when referring to the concept the Mercurial literature uses branch to describe, which is a name that distinguishes a related set of lineages.
Mercurial is superior to Git because it records family history in the repository, while Git does not. In every other significant respect, a Git repository stores the same information as a Mercurial repository. This is why it is possible to convert a Git repository into a Mercurial repository then back into a Git repository without losing any information. It is not possible to perform this round-trip starting with a Mercurial repository (in the general case) because the family history must be discarded in the conversion to Git. (In the conversion to Mercurial, the entire Git repository can be regarded as one monolithic family, and indeed this is how the excellent Hg-Git tool presents its Mercurial view of Git repositories.)
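To make the round-trip claim concrete, here is a minimal Python sketch. It is a toy model of the argument above, not how either tool stores its data: the `Changeset` class and the `to_git_style` and `to_hg_style` functions are names invented for illustration, and the conversion rules simply encode the assumption stated in the paragraph (family labels are dropped going to Git, and everything becomes one family going to Mercurial).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Changeset:
    """A toy commit node; hypothetical, not a real Git or Mercurial structure."""
    ident: str
    parents: Tuple[str, ...]
    family: Optional[str] = None  # Mercurial-style label; None models Git


def to_git_style(history):
    """Model a conversion to Git: parent links survive, family labels do not."""
    return [Changeset(c.ident, c.parents) for c in history]


def to_hg_style(history, family="default"):
    """Model a conversion to Mercurial: unlabeled nodes join one family."""
    return [Changeset(c.ident, c.parents, c.family or family) for c in history]


hg_history = [
    Changeset("a", (), "default"),
    Changeset("b", ("a",), "feature-x"),
    Changeset("c", ("a",), "default"),
    Changeset("d", ("b", "c"), "default"),
]

# Git -> Mercurial -> Git: nothing is lost, because there were no labels.
git_history = to_git_style(hg_history)
git_round_trip = to_git_style(to_hg_style(git_history))

# Mercurial -> Git -> Mercurial: the "feature-x" label cannot be recovered.
hg_round_trip = to_hg_style(to_git_style(hg_history))

print(git_round_trip == git_history)  # True
print(hg_round_trip == hg_history)    # False
```

The lineage information (the parent links) is identical in both directions; only the family label on changeset "b" goes missing.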
It turns out that having the family history recorded in the repository— and thereby copied around with clones, pushes and pulls— is really important when reviewing the history of a project. A hint of this importance shows up in the cultural difference one observes between Git and Mercurial users.
Among Git users, it's common to see people arguing vociferously that proper workflows involve judicious use of the "rebase" command to reduce the incidence of merging in the repository history. This is because Git only records the lineage of every change, not its family. When all you have to review in the history of a change is its lineage, you don't want to be distracted by a lot of merges between different lineages in the same family. In a Mercurial repository, because the family history is recorded in the repository with every changeset, the urge to keep every lineage pure from ancestor to descendant isn't quite as strong.
In any sane Git workflow, there are two different ways to join a pair of divergent lineages, "merge" and "rebase," and you'd better choose the right one at every opportunity or your whole team will lose valuable momentum dealing with their frustration with your bad version control hygiene. Always use "rebase" when the lineage in your local clone is divergent from the lineage in the upstream, i.e. more authoritative, clone. You do this so that the upstream clone can do a "fast-forward" merge when it pulls your change. It's important in Git for the merge not just to proceed without conflict; it must be a fast-forward merge in order to keep the authoritative lineage "clean" of any evidence of your divergence.
Basically, what's going on here is that Git encourages its users to adhere to a convention whereby lineage and family are equivalent concepts. This leads to an aesthetic concern for "clean history" where every merge of two or more lineages is a record of the merging of the families corresponding to the lineages. Any family with more than one lineage has a "dirty" or "unclean" history. Figuring out the family history of any change in a Git repository where developers have not strictly adhered to this policy means a lot of guesswork. Consequently, some Git repository administrators set flags that enforce this convention, which leads to further confusion among users. "Why can't I push? Oh, you mean I should have rebased instead of merging? Foo."
If you have a fetish for clean pedigrees, or you are using the Hg-Git bidirectional bridge, Mercurial offers the standard "rebase" extension. It allows you to adopt a workflow that minimizes the incidence of merging between lineages in the same family. There is, however, no compelling reason to do so: the repository retains the family history. It's easy to review which changes belong to which family whatever lineage they may have. Mercurial users therefore have no reason to be particularly diligent about maintaining "purity" of lineage histories, as Git users do.
I wrote at the outset of this article that I believe Git should be improved to remedy the deficiency I'm describing here. There are a couple of ways it could be done. One way would be to adopt Mercurial's style of annotating every node in the graph with a family name. Another way, perhaps a more straightforward and "git-like" one, would be to annotate every edge in the graph with the family name (derived from the branch name of the ancestor node in the repository where the commit occurred). You'd probably need a distinguished name for the case where the family history is lost to antiquity.
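The edge-annotation idea could be sketched roughly as follows. This is only an illustration of the proposal, not anything Git implements: `annotate_edges`, `families_of`, and the `UNKNOWN_FAMILY` sentinel are hypothetical names, and the graph is a plain dictionary from each commit to its list of parents.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical distinguished name for history "lost to antiquity".
UNKNOWN_FAMILY = "(unknown)"

# An edge is (child, parent); its value is the family name recorded at commit time.
Edge = Tuple[str, str]


def annotate_edges(parents: Dict[str, List[str]],
                   recorded: Dict[Edge, str]) -> Dict[Edge, str]:
    """Label every child -> parent edge, falling back to the sentinel."""
    labelled = {}
    for child, parent_list in parents.items():
        for parent in parent_list:
            edge = (child, parent)
            labelled[edge] = recorded.get(edge, UNKNOWN_FAMILY)
    return labelled


def families_of(commit: str, edges: Dict[Edge, str]) -> Set[str]:
    """Families named on the edges from a commit to its parents."""
    return {family for (child, _), family in edges.items() if child == commit}


# "d" merges lineage "c" (family "master") with lineage "b" (family "topic").
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["c", "b"]}
recorded = {("c", "a"): "master", ("d", "c"): "master", ("d", "b"): "topic"}

edges = annotate_edges(parents, recorded)
print(edges[("b", "a")])                # (unknown)
print(sorted(families_of("d", edges)))  # ['master', 'topic']
```

With labels on the edges, a merge commit records which families it joined, and an edge with no recorded name still gets a well-defined label.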
In any case, this is my argument for why Mercurial is superior to Git. You're welcome to your opinions, of course, but this one is mine. I'm open to persuasion that I'm DoingItWrong™, but it took me a long time to arrive at my judgment here, so please think through the arguments you want to make to me before you comment. Thanks.
[Note: this article has been revised for clarity since its initial publication. The original draft improperly assumed the reader has a familiarity with Mercurial "branch" semantics. Some redundant assertions have been removed.]
family == branch labels
Date: 2011-04-20 07:47 pm (UTC)
Named branches / family / branch labels perhaps solve the issue with rebase / transplant and merge... but they have one serious disadvantage: name clashes. Your "for-john" branch might not be the same as my "for-john" branch... and John would want to have them as "from-jhw" and "from-jn", or equivalent.
Note also that people usually don't rebase because of some notion of purity, but either because a straight linear history is easier to bisect or because rebased commits will not conflict (when sending patches via email).
Re: family == branch labels
Date: 2011-04-20 07:55 pm (UTC)
Re: family == branch labels
Date: 2011-04-20 08:13 pm (UTC)
BTW, the equivalent of MQ in Git is not rebase / interactive rebase, but tools such as StGit or Guilt.
Re: family == branch labels
Date: 2011-04-20 10:01 pm (UTC)
I'll look into StGit and Guilt. I've not heard of them before.
Re: family == branch labels
Date: 2011-04-23 04:27 pm (UTC)
Besides, with "named branches" / family names / branch labels you have to come up with a good name for a branch up front (or rewrite history). With Git I could be working e.g. on branch 'subsystem' in my private working repository, then push this branch (perhaps after rebase and cleanup) into branch 'subsystem-feature' in my public bare publishing repository. From there the maintainer can fetch it into e.g. a 'jn/feature' branch in his/her repository. Note that the bookmarks extension (lightweight branches) has a similar problem to "named" branches: bookmarks have been transferable for some time, but to avoid difficulty with mapping branch names (Git's "refspecs"), 'bookmark' branch names are global.
BTW, for proper merging you should need only 3 versions: ours (the current branch you are merging into), theirs (the branch being merged) and ancestor (the merge base), for a 3-way merge. All history, including family history, is irrelevant...
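A toy illustration of the three-input idea, written as a file-level merge in Python. This is a sketch under simplifying assumptions, not the algorithm either tool uses (real tools merge at the level of lines or hunks); `three_way_merge` is a hypothetical function and each version is modeled as a dictionary from path to content.

```python
def three_way_merge(ancestor, ours, theirs):
    """Merge two dicts of path -> content against their common ancestor.

    Returns (merged, conflicts). A missing key means the file does not exist.
    """
    merged, conflicts = {}, []
    for path in sorted(set(ancestor) | set(ours) | set(theirs)):
        base, a, b = ancestor.get(path), ours.get(path), theirs.get(path)
        if a == b:        # both sides agree, including both deleting the file
            result = a
        elif a == base:   # only theirs changed it
            result = b
        elif b == base:   # only ours changed it
            result = a
        else:             # both changed it, in different ways
            conflicts.append(path)
            result = a    # keep ours as a placeholder until resolved
        if result is not None:
            merged[path] = result
    return merged, conflicts


ancestor = {"README": "v1", "main.c": "v1", "old.c": "v1"}
ours = {"README": "v2", "main.c": "v1", "old.c": "v1"}
theirs = {"README": "v1", "main.c": "v3"}  # theirs deleted old.c

merged, conflicts = three_way_merge(ancestor, ours, theirs)
print(merged)     # {'README': 'v2', 'main.c': 'v3'}
print(conflicts)  # []
```

Only the three snapshots are consulted; nothing about how either side got from the ancestor to its current state enters the computation.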
Re: family == branch labels
Date: 2011-04-26 08:17 pm (UTC)
Deal-breaker: Mercurial can't store empty directories
Date: 2011-04-26 06:14 pm (UTC)
- Adding the creation of the dirs to the build script:
Brittle - someone will rewrite the script and forget it, or deploy the project without the build script.
- Creating dummy files in the directory:
Yikes!
- Ensuring the code creates the dirs at runtime:
Doable, but having to change code, and make coworkers do it as well, is not cool.
The bottom line is that I should be able to take an existing project and import it to Mercurial. No "buts", "ifs" or excuses. I'm sticking to Subversion for the time being (although it is not perfect either).
Re: Deal-breaker: Mercurial can't store empty directories
Date: 2011-04-26 08:14 pm (UTC)
Re: INVALID: Deal-breaker: Mercurial can't store empty directories
Date: 2016-03-22 12:37 pm (UTC)
There is NO(!) difference in empty directory handling between git and merc.
PS: Opinionated view follows:
I'm returning to this page after 4 years. I was checking arguments for both, having gotten angry about the inconsistencies of git, which we were using exclusively back then.
I must admit that I did not really fully understand the author's argument, but I went to Mercurial for our PaaS offering with my team simply for merc's clear benefits on the command line, while other teams in the company continued to use git.
After years with both and many thousands of commits later, I now share the author's argument 100%: the problems in our company's git repos really accumulated over time, and their histories are often a mess.
We never looked back: Mercurial is a 100% perfectly designed, ready-to-use application for the purpose, while git continues to feel to me like a platform, with no clear governance/vision/leadership governing the design decisions. Mercurial is being handled even by operations at our customers in CI setups, which is unthinkable with git.
Git & rebase
Date: 2011-06-21 02:36 pm (UTC)
Re: Git & rebase
Date: 2011-06-21 05:55 pm (UTC)
Git could be improved. I'm sad that it won't be.
Re: Git & rebase
Date: 2011-06-21 06:16 pm (UTC)
I'm sure there are people who can argue why Mercurial's way of doing it is bad or the choices available in git are better. Unfortunately, that's not me.
The question that hopefully will come out of this discussion is, why do others not see this as such a glaring omission that it needs to be fixed immediately?
Re: Git & rebase
Date: 2011-06-21 06:25 pm (UTC)
If they were at all concerned about the cognitive burden of their human interface, then Git would have a completely different command set. In fact, for too many of those core developers, the fact that Git has a high cognitive burden and requires its users to master a lot of arcane knowledge even to do ordinary things is a feature.
Sure, I suppose some of them might eventually feel moved to construct a coherent argument for why Mercurial's way of recording named lineages is an actively bad idea. More likely, however, they will be satisfied with achieving the comparatively easier objective of satisfying themselves that it isn't a good enough idea for them to care about it. They have a system, and it works for them, so long as you obey all its fiddly rules and secret handshakes. Problem solved.
Re: Git & rebase
Date: 2011-06-29 05:02 am (UTC)
Anyway, going back to what you call the single greatest failing of Git, which is not recording branch information on the commits themselves. I don't want branch information recorded. I may make a temporary branch named something ridiculous or uninformative (and I'm speaking of lightweight branches, or Mercurial's bookmark extension), implement a feature, and decide that it's worth keeping. I then merge this feature into a more "mainstream" branch (or rebase, if that's how I want to work -- both workflows are valid). I don't want my old temporary branch name to stick to those commits. It's unhelpful information. However, giving this temporary branch an important or mainstream name is not ideal, because I may not want to merge the contents of that branch into the mainstream workflow, and then those commits are floating around with seemingly important labels, when they're garbage.
I'm rambling, so here's the point -- I don't see the importance of knowing under which branch a certain commit was developed. That's useless information, in my opinion. Rather, I want to know what a single commit does. (Which should be documented in a well-written commit message) Assuming everyone is writing good commit messages, then it doesn't matter under which branch some code was written -- what matters is the code itself.
Now, I have to end with a confession: I don't know Mercurial very well. Every time I try to pick it up, I miss my integrated lightweight branching. (I don't want to use an extension -- its branching model is the most important part of my workflow, and what makes Git so awesome in my opinion.) Every time I hear someone mention heavy branching (cloning into a new directory), I shudder, thinking of the SVN branching model. I admit, though, I need to learn more about Mercurial and really try it.
But I will say this: the author has stated that he has worked on projects using both Mercurial and Git; I submit that the author has not learned to properly use a Git workflow, based on what I've read here. Please go read Pro Git (http://progit.org/book/) from cover to cover. It's not very long, and really explains a Git workflow and the power of the Git branching model. After reading this book, I submit that no one could point at Git and call it arcane or a cognitive burden -- it's progressed a lot in the last few years. If, at that point, you still miss your Mercurial branches (er, families, I guess), then by all means use Mercurial. But I love Git, and just want people to realize that it's easy to use when you have the right resources to learn from, and just encourages a different workflow from Mercurial.
Re: Git & rebase
Date: 2011-06-29 05:55 am (UTC)
I currently use both Git and Subversion on a daily basis in a large, well-respected software organization with a source code base that numbers into the hundreds of millions of lines of code with a history going back at least thirty years in the case of the main body of code in my principal area of responsibility. The sun never sets on the data centers where these repositories and their mirrors are hosted. I'm not in a position to argue with a whole building full of people who insist on using "the wrong workflow" with Git, whatever that happens to be from your point of view.
I routinely use Mercurial as a front end to both Git and Subversion, because I'm willing to pay for a bit of added round-trip time when pushing to and pulling from the integration repositories in exchange for a simplified workflow in my day-to-day development work. This makes me acutely aware of the shortcomings in both Git and Subversion, which are different in each case, worse in the case of Subversion to be sure, but Git has at least one serious flaw as well, as I write above.
> I miss my integrated lightweight branching. (I don't want to use an extension...
It's not an extension in Mercurial. Bookmarks are a core feature of the tool, present without you even having to add a single line to your .hgrc file or install anything separately. Moreover, you can even dispense with bookmarks and work with anonymous heads if you're really pressed for time.
Look, I wasn't kidding when I said that you can do a round-trip conversion of a Git repository into and back out of Mercurial without losing any data. There's even a way to do that in the other direction as long as the Git commit logs adhere to the convention required by Mercurial for storing the branch name.
I've read the Pro Git book. It doesn't tell me anything I don't already know. High on the list of what I want it to tell me, which it doesn't say a damned thing about, is why on Carlin's Green Earth there should be any good reason to use the receive.denyNonFastForwards setting on the server.
Re: Git & rebase
Date: 2013-02-28 10:37 pm (UTC)
You wonder why you want to set receive.denyNonFastForwards on the server.
I wonder why this is not the default.
The point is, this flag is set on the server to ensure the server will not _lose_ history.
It is not telling the client that it should rebase its changes upon the upstream. It tells the client not to push a ref which does not have the upstream ref as a parent. Think about the last sentence for a while.
The client might, however, have this upstream ref in a random 'lineage', so it could just have merged it, or rebased upon it. It must, however, have pulled it in somehow. With this flag set, you cannot push before you have pulled it.
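The rule described here can be modeled as an ancestry check. The following Python sketch is an illustration only, not Git's implementation: `is_ancestor` and `is_fast_forward` are hypothetical names, and the commit graph is a plain dictionary from each commit to its parents.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        node = stack.pop()
        if node == ancestor:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return False


def is_fast_forward(parents, server_tip, pushed_tip):
    """A push is a fast-forward when the server's tip is in its history."""
    return is_ancestor(parents, server_tip, pushed_tip)


# "u" is the upstream tip; "x" is local work that diverged from "a".
# "m" merges "x" with "u"; "r" is "x" rebased on top of "u".
parents = {"a": [], "u": ["a"], "x": ["a"], "m": ["x", "u"], "r": ["u"]}

print(is_fast_forward(parents, "u", "x"))  # False: "u" was never pulled in
print(is_fast_forward(parents, "u", "m"))  # True: merged
print(is_fast_forward(parents, "u", "r"))  # True: rebased
```

Both the merged and the rebased tip pass the check, which matches the point above: the flag requires that the upstream ref was pulled in somehow, not that it was rebased upon.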
Re: Git & rebase
Date: 2015-11-25 04:00 pm (UTC)
http://www.randyfay.com/node/89
"How do you prevent git push --force? (thanks to sdboyer!)
In the bare authoritative repository,
git config --system receive.denyNonFastForwards true"
Re: Git & rebase
Date: 2013-03-04 01:59 am (UTC)
Re: Git & rebase
Date: 2012-05-26 11:18 am (UTC)
That's because there is no problem.
You are saying that the problem with git is that it doesn't have mercurial "branches". How is that a problem?
What is it _exactly_ that you cannot do in git?
Hint: you can do everything in git.
Re: Git & rebase
Date: 2011-07-04 03:17 am (UTC)
I think Linus is opposed to this workflow.
People want to stick to the old SVN mindset and Git makes it possible, so some people are enforcing this.
This is not what the Git community recommends.
Re: Git & rebase
Date: 2011-07-04 03:41 am (UTC)
Which ought to tell you something important. When you make a tool and you later find out that more people are using it differently than you intended than there are people who are following your careful instructions for proper use, the traditional thing that good engineers do is figure out what's wrong with their expectations of how people will be using the tool and adjust accordingly in the next iteration. Instead, the Git community decides that people are stupid, whines piteously that people aren't moar smarterz, and secretly regrets trying to help people in the first place. That's not sound engineering principles in action; that's misanthropy.
Re: Git & rebase
Date: 2011-07-04 06:09 am (UTC)
It is cleanliness and readability that matter, not "rebasing" and "linearization". The history should be bisectable and easy to work with in a later debugging session. If you rebase everything you send, you are doing it wrong.
I think Linus doesn't ask you to rebase -- he asks you to send patches by email, not a git-pull request (unless you are a trusted subsystem maintainer).
This is for easier code review. Would you rather review (1) two patches, one with bugs and one that fixes them, or (2) just one patch?
Consider Git as a patch exchange tool with version control as a feature, and the history-rewriting thing will start to make sense.
See this LWN.net article for some elaboration: https://lwn.net/Articles/328436/
On tools design:
This is the Unix philosophy of "giving you more than enough rope to hang yourself".
Variants of this appear everywhere, e.g. in C vs Java, KDE vs Gnome, etc. This is an eternal problem. I have nothing to add to this debate.
Bang on!
Date: 2011-08-03 08:00 am (UTC)
You really can't follow a line of development back in git, especially if there are merges. Hence the whole emphasis on rebasing, as you quite nicely point out.
It took me some time to understand your article. I visited it a few months ago. At that point my knowledge of git was not sufficient to totally understand what you had written. Today was the day of the ah-hah moment!
Git creates complexity and solves it. Mercurial avoids the complexity altogether.
Maybe it might be a good idea to graphically show what you've written for other people who visit this page?
Re: Bang on!
Date: 2011-08-19 05:04 pm (UTC)
no subject
Date: 2011-08-15 09:14 pm (UTC)
Git has its good side
Date: 2011-12-17 09:14 am (UTC)
I just started learning about it.
In some respects I think Git's concepts are better than Mercurial's.
- First is the staging area. I like this feature a lot.
- Second is the remotes. You can save all the remotes to pull from and track which commit each remote branch is currently at. That is something currently missing in Mercurial.
- I also like the Git workflow with branches.
But one thing that I don't like about Git is that you need a lot of knowledge to do a simple task. As I'm not good at bash scripting, it is a little hard for me.
Also, tags in Git are not very friendly. They should be displayed in the log, as in hg log.
In summary, I think Git has its advantages, but it still needs to improve its user-friendliness.
Re: Git has its good side
Date: 2014-01-11 11:33 pm (UTC)
- If you want a staging area, you can simply activate the mq extension. That even gives you an infinite number of staging areas which can contain multiple unfinished commits.
- Remotes in hg are simply entries in the .hg/hgrc under the heading [paths].
- Git-like branches are called bookmarks in hg.
no subject
Date: 2012-05-02 07:03 pm (UTC)
I want to be able to push. Git can't push shared branches because in the case of a potential collision, it has no recourse. The UI, "git push", is simple and the semantic is straightforward and obvious - I want my changes in that repository. Rather than do that, git simply throws up its hands and refuses (in the case where the destination has that branch checked out, or that branch has other changes that are not mine).
So... there's an obvious semantic, an obvious interpretation, and git doesn't do it. That's a pretty big and scary culture shock coming from pretty much any source code control system developed in the last decade. With git, we're back to the geographic branches of ClearCase MultiSite, where each repository owns a branch and the other branches show up as read only. This doesn't scale very well, as we learned a couple of decades ago. It's a lot of extra work to manage all those potentially-automatic-but-failing merges.
The multiple hg heads are the natural conclusion to this problem of how to handle collisions in the repository. Hg can then propagate them anywhere and anyone in any repository can merge them. Not so in git. In git, you're reduced to sending random emails trying to find the person with whom you need to coordinate.
This isn't unique to hg, btw. Other systems do this as well.
no subject
Date: 2015-01-08 04:23 pm (UTC)
Yes, when you push with Mercurial, you will silently get a new anonymous head if there are upstream children of your current change, whereas Git will throw a rock at you and refuse to push. This can make Git users feel nauseated because of their need for every lineage to have a carefully manicured pedigree. Indeed Git makes you expressly create the lineage with a distinguished name and go through a registration process before you can push it upstream.
I still like Mercurial better. You pushed something and, oh, that meant creating a new lineage. So we did that. Maybe you want to give it a name, maybe it won't be around long enough to deserve one. If it needs a name, then you do that with a bookmark on the new head. Pushing the bookmark is optional. What's important here is that Mercurial didn't throw any rocks at you and no history got destroyed. That seems obviously superior to me, but whatever.
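The "new lineage" outcome can be pictured with a small Python sketch. It is a toy model of a changeset graph, not Mercurial's code: `heads` is a hypothetical helper and the repositories are plain dictionaries from each changeset to its parents.

```python
def heads(parents):
    """Changesets that no other changeset names as a parent."""
    has_child = {p for parent_list in parents.values() for p in parent_list}
    return sorted(node for node in parents if node not in has_child)


# Upstream already has "b" as a child of "a".
upstream = {"a": [], "b": ["a"]}

# Local work "c" was also committed on top of "a".
local_new = {"c": ["a"]}

# Modeling the push as a union of the two graphs: nothing is overwritten.
after_push = {**upstream, **local_new}

print(heads(upstream))    # ['b']
print(heads(after_push))  # ['b', 'c']
```

In this model the push adds a second head next to the existing one; every changeset that was upstream before is still there afterwards.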
Mercurial .vx. Bazaar?
Date: 2013-11-20 08:59 pm (UTC)
By any chance, do you have experience with Bazaar too?
Thanks
Richard Gomes
http://rgomes.info/
Re: Mercurial .vx. Bazaar?
Date: 2015-01-08 04:13 pm (UTC)
Absolutely Correct
Date: 2015-01-07 06:19 pm (UTC)
See http://duckrowing.com/2013/12/26/bzr-init-a-bazaar-tutorial/