jhw | Why I Like Mercurial More Than Git

After working for over a year alternating between two projects, one that uses Git for its version control and another that uses Mercurial, I have finally achieved sufficient mastery of both toolchains that I now feel comfortable defending my judgment that Mercurial is the superior of the two systems. I think Git has one glaring deficiency that makes it the inferior tool, and I hope to describe the required remedy in this weblog posting.

The tools are very similar, and many of the distinguishing differences come down to a matter of taste in my opinion. Some may consider it a deal-breaker that Mercurial expects its extensions to be written in Python, whereas Git admits extensions written in just about any language you care to imagine, although the usual approach is to write them in a shell language. That's not a deal-breaker for me. Many other differences are either consequences of that fundamental distinction, or they are cosmetic in nature. It bothers me not at all that Mercurial has no index, nor that Git has one. The difference between the Git stash and Mercurial patch queues is similarly trivial to me.

The big difference, the deal-maker for me, is in how each tool goes about meeting the fundamental requirement for any version control system: how it handles source code merging. Quite simply, Mercurial is better at merging than Git.

I need to introduce a bit of terminology here to make my point. Because the Git literature and the Mercurial literature use the word branch to mean crucially different things, I'm going to avoid the word here entirely so as to prevent confusion. For the concept described in the Git literature with the word branch and in the Mercurial literature with the word head, I shall use the word lineage. I shall use the word family when referring to the concept the Mercurial literature uses branch to describe, which is a name that distinguishes a related set of lineages.

Mercurial is superior to Git because it records family history in the repository, while Git does not. In every other significant respect, a Git repository stores the same information as a Mercurial repository. This is why it is possible to convert a Git repository into a Mercurial repository then back into a Git repository without losing any information. It is not possible to perform this round-trip starting with a Mercurial repository (in the general case) because the family history must be discarded in the conversion to Git. (In the conversion to Mercurial, the entire Git repository can be regarded as one monolithic family, and indeed this is how the excellent Hg-Git tool presents its Mercurial view of Git repositories.)
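The round-trip argument can be illustrated with a minimal sketch. It assumes toy record types: `HgChangeset`, `GitCommit` and the two conversion functions are hypothetical names made up for this illustration, not the real data model of either tool or of Hg-Git, and real conversions also change node hashes, which is ignored here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HgChangeset:
    """Toy Mercurial-style node: parents plus a recorded family name."""
    node: str
    parents: Tuple[str, ...]
    family: str


@dataclass(frozen=True)
class GitCommit:
    """Toy Git-style node: parents only, no family recorded."""
    node: str
    parents: Tuple[str, ...]


def hg_to_git(cs: HgChangeset) -> GitCommit:
    # Assumption: the family name has nowhere to go, so it is dropped.
    return GitCommit(cs.node, cs.parents)


def git_to_hg(c: GitCommit, family: str = "default") -> HgChangeset:
    # Assumption: every Git commit lands in one monolithic family.
    return HgChangeset(c.node, c.parents, family)


# Starting from Mercurial: the family "stable" does not survive the trip.
original = HgChangeset("c3", ("c2",), "stable")
round_tripped = git_to_hg(hg_to_git(original))

# Starting from Git: nothing is lost.
git_original = GitCommit("c3", ("c2",))
git_round_tripped = hg_to_git(git_to_hg(git_original))

print(round_tripped == original)
print(git_round_tripped == git_original)
```

In this toy model the Git-first round trip compares equal, while the Mercurial-first round trip comes back with its family replaced by the single monolithic one.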

It turns out that having the family history recorded in the repository— and thereby copied around with clones, pushes and pulls— is really important when reviewing the history of a project. A hint of this importance shows up in the cultural difference one observes between Git and Mercurial users.

Among Git users, it's common to see people arguing vociferously that proper workflows involve judicious use of the "rebase" command to reduce the incidence of merging in the repository history. This is because Git only records the lineage of every change, not its family. When all you have to review in the history of a change is its lineage, you don't want to be distracted by a lot of merges between different lineages in the same family. In a Mercurial repository, because the family history is recorded in the repository with every changeset, the urge to keep every lineage pure from ancestor to descendant isn't quite as strong.

In any sane Git workflow, there are two different ways to join a pair of divergent lineages, "merge" and "rebase," and you'd better choose the right one at every opportunity or your whole team will lose valuable momentum dealing with their frustration with your bad version control hygiene. Always use "rebase" when the lineage in your local clone is divergent from the lineage in the upstream, i.e. more authoritative, clone. You do this so that the upstream clone can do a "fast-forward" merge when it pulls your change. It's important in Git for the merge not just to proceed without conflict; it must be a fast-forward merge in order to keep the authoritative lineage "clean" of any evidence of your divergence.
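What "fast-forward" means here can be sketched as an ancestry check on the history graph: the upstream tip must be reachable from the tip being pulled. This is a rough illustration over a hand-written toy graph, not how Git itself is implemented; the node names and the `is_fast_forward` helper are hypothetical.

```python
from typing import Dict, List

# Toy history: each node maps to the list of its parents.
Graph = Dict[str, List[str]]


def is_ancestor(graph: Graph, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        node = stack.pop()
        if node == ancestor:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


def is_fast_forward(graph: Graph, upstream_tip: str, new_tip: str) -> bool:
    # A fast-forward only moves the upstream tip forward along existing history.
    return is_ancestor(graph, upstream_tip, new_tip)


history: Graph = {
    "a": [],
    "b": ["a"],   # upstream tip
    "c": ["a"],   # local work that diverged from "a"
    "c2": ["b"],  # the same work after rebasing onto "b"
}

print(is_fast_forward(history, "b", "c"))   # divergent lineage
print(is_fast_forward(history, "b", "c2"))  # rebased lineage
```

The divergent tip `c` fails the check because `b` is not among its ancestors; the rebased tip `c2` passes because it sits directly on top of `b`.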

Basically, what's going on here is that Git encourages its users to adhere to a convention whereby lineage and family are equivalent concepts. This leads to an aesthetic concern for "clean history" where every merge of two or more lineages is a record of the merging of the families corresponding to the lineages. Any family with more than one lineage has a "dirty" or "unclean" history. Figuring out the family history of any change in a Git repository where developers have not strictly adhered to this policy means a lot of guesswork. Consequently, some Git repository administrators set flags that enforce this convention, which leads to further confusion among users. "Why can't I push? Oh, you mean I should have rebased instead of merging? Foo."

If you have a fetish for clean pedigrees, or you are using the Hg-Git bidirectional bridge, Mercurial has the standard "rebase" extension. It allows you to adopt a workflow that minimizes the incidence of merging between lineages in the same family. There is, however, no compelling reason to do so: the repository retains the family history. It's easy to review which changes belong to which family, whatever lineage they may have. Mercurial users therefore have no reason to be particularly diligent about maintaining "purity" of lineage histories, as Git users do.

I wrote at the outset of this article that I believe Git should be improved to remedy the deficiency I'm describing here. There are a couple of ways it could be done. One way would be to adopt Mercurial's style of annotating every node in the graph with a family name. Another way (perhaps a more straightforward and "git-like" way) of dealing with it would be to annotate every edge in the graph with the family name (derived from the branch name of the ancestor node in the repository where the commit occurred). You'd probably need a distinguished name for the case where the family history is lost to antiquity.
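The edge-annotation idea might look something like the following sketch. Everything in it is an assumption made for illustration: Git has no such feature, the `(child, parent, family)` tuple layout is invented, and `UNKNOWN` is a hypothetical stand-in for the distinguished name used when the family history is lost.

```python
from typing import Dict, List, Tuple

UNKNOWN = "(unknown)"  # hypothetical distinguished name for lost family history

# Each edge is (child, parent, family), where family is the branch name the
# parent carried in the repository where the child was committed.
Edge = Tuple[str, str, str]

edges: List[Edge] = [
    ("b", "a", UNKNOWN),      # old imported history, family not known
    ("c", "b", "master"),
    ("d", "b", "feature-x"),
    ("e", "d", "feature-x"),
    ("m", "c", "master"),     # merge commit, edge to its first parent
    ("m", "e", "feature-x"),  # merge commit, edge to its second parent
]


def commits_by_family(all_edges: List[Edge]) -> Dict[str, List[str]]:
    """Group child commits under the family recorded on their incoming edges."""
    families: Dict[str, List[str]] = {}
    for child, _parent, family in all_edges:
        members = families.setdefault(family, [])
        if child not in members:
            members.append(child)
    return families


for family, members in commits_by_family(edges).items():
    print(family, members)
```

With the family stored on edges, a merge commit such as `m` shows up under both families it joins, and the grouping can be recovered without guessing from the shape of the graph.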

In any case, this is my argument for why Mercurial is superior to Git. You're welcome to your opinions, of course, but this one is mine. I'm open to persuasion that I'm DoingItWrong™, but it took me a long time to arrive at my judgment here, so please think through the arguments you want to make to me before you comment. Thanks.

[Note: this article has been revised for clarity since its initial publication. The original draft improperly assumed the reader has a familiarity with Mercurial "branch" semantics. Some redundant assertions have been removed.]

From:

jnareb.openid.pl

What you call family is, as I understand it, what Mercurial terminology calls named branches (the 2009 blog post A Guide to Branching in Mercurial (http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial) by Steve Losh has a section on Branching with Named Branches (http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/#branching-with-named-branches)). I like to call them (http://stackoverflow.com/questions/1598759/git-and-mercurial-compare-and-contrast/1599930#1599930) branch labels. I think that describes the concept better.

Named branches / family / branch labels perhaps solve the issue with rebase / transplant and merge... but they have one serious disadvantage: name clashes. Your "for-john" branch might not be the same as my "for-john" branch... and John would want to have them as "from-jhw" and "from-jn", or equivalent.

Note also that people usually don't rebase because of some notion of purity, but either because straight linear history is easier to bisect, or because rebased commits will not conflict (when sending patches via email).

From:

jhw

Family names don't clash because each family comprises multiple lineages, or in Mercurial terms: each branch can have zero, one or many heads. Also, good Mercurial hygiene uses MQ instead of rebasing for the purpose of staging patches prior to committing them to the persistent global history.

From:

jnareb.openid.pl

I'm sorry, I was not clear enough. By "name clashes" I mean that one branch label / family name (say, 'for-john' arriving in John's repository from two different repositories) might contain disconnected and unrelated commits.

BTW, the equivalent of MQ in Git is not rebase / interactive rebase, but tools such as StGit or Guilt.

From:

jhw

The canonical response to concerns about namespace management, in every case where names can clash, is to set a policy that encompasses the scope of the namespace. This isn't a problem unless you try too hard to make it one.

I'll look into StGit and Guilt. I've not heard of them before.

From:

jnareb.openid.pl

Centralized policy for a distributed version control system? If I am to rely on branch naming policy, then Subversion is just as good, at least with respect to creating branches...

Besides, with "named branches" / family names / branch labels you have to come up with a good name for a branch upfront (or rewrite history). With Git I could be working e.g. on branch 'subsystem' in my private working repository, then push this branch (perhaps after rebase and cleanup) into branch 'subsystem-feature' in my public bare publishing repository. From there the maintainer can fetch it into e.g. the 'jn/feature' branch in his/her repository. Note that the bookmark extension (lightweight branches) has a similar problem to "named" branches: bookmarks have been transferable for some time, but to avoid difficulty with mapping branch names (Git's "refspecs"), 'bookmark' branch names are global.

BTW for proper merging you should need only 3 versions: ours (current branch you are merging into), theirs (branch being merged) and ancestor (merge base), for 3-way merge. All history, including family history, is irrelevant...

From:

jhw

There is no requirement for branch naming policy to be centralized with Mercurial. It's perfectly possible to devise a federated naming policy. Mercurial is agnostic about the naming policy; it reserves only one name from the namespace: "default".

From:

trsdomain.dk

Having never tried a distributed source control system I decided to try out Mercurial after reading this article. I was fairly disappointed when I found out it is not able to store an empty directory... I know there are a lot of workarounds:

- Adding the creation of the dirs to build script:
Brittle - someone will rewrite the script and forget it or deploy the project without the build script.

- Creating dummy files in the directory:
Yikes !

- Ensuring the code creates the dirs at runtime:
Doable, but having to change code, and make coworkers do it as well is not cool

The bottom line is that I should be able to take an existing project and import it to Mercurial. No "buts", "ifs" or excuses. I'm sticking to subversion for the time being (although it is not perfect either).

From:

jhw

Sure. I'll admit that others may have different concerns, all perfectly legitimate. I can live without empty directories in exchange for all the other benefits, but I can see how others might weigh their concerns differently.

From:

redpill39

This post was about **git versus mercurial**, not subversion versus mercurial.

There is NO(!) difference for empty directory handling between git and merc.

PS: Opinionated view follows:
I'm returning to this page after 4 years. Back then I was checking arguments for both, having gotten angry about the inconsistencies of git, which was all we were using at the time.
I must admit that I did not really fully understand the author's argument, but I went to mercurial for our PaaS offering with my team simply for merc's clear benefits on the command line, while other teams in the company continued to use git.
After years with both and many k commits later, I'm now 100% sharing the author's argument: the problems in our company's git repos really accumulated over time, and their histories are often a mess.
We never looked back: Mercurial is a 100% perfectly designed, ready-to-use application for the purpose, while git continues to feel to me like a platform, with no clear governance/vision/leadership governing the design decisions. Mercurial is even being handled by operations at our customers in CI setups, which is unthinkable with git.

From:

https://www.google.com/accounts/o8/id?id=AItOawllJaD1TSNJZdyl6vgYfQTMEg16W_l4gSo

I'm no expert in either system, but it's fair to point out that rebase is not a requirement in git. It's there if you want it, but under normal circumstances you're certainly not required to use it. If you don't rebase, or if you merge with --no-ff commits, you'll preserve a lot more history. The question is, at the end of the day, will that history be useful and do you want to see it? Your choice.

From:

jhw

Yeah, it's not like Git requires you use 'rebase' to get anything done, but I've seen multiple working groups adopt policies that make using it mandatory, i.e. you do not get your code pulled onto the upstream master branch unless it's a fast-forward. They've made the choice to elide a whole bunch of history that isn't terribly useful, because it's incomplete due to the deficiencies in Git that I'm writing about in this article. This requires a complicated workflow with Git that experience shows me novices find terribly frustrating to learn, whereas if they had simply chosen Mercurial instead of Git, the simple workflow would have sufficed, novices would not be as frustrated by the version control system, and the history that Git users can't figure out how to use would be available under Mercurial and useful because it is complete.

Git could be improved. I'm sad that it won't be.

From:

https://www.google.com/accounts/o8/id?id=AItOawllJaD1TSNJZdyl6vgYfQTMEg16W_l4gSo

The assertion that git won't be improved is silly. Git has undergone tremendous improvement to get to where it is today. So has Mercurial.

I'm sure there are people who can argue why Mercurial's way of doing it is bad or the choices available in git are better. Unfortunately, that's not me.

The question that hopefully will come out of this discussion is, why do others not see this as such a glaring omission that it needs to be fixed immediately?

From:

jhw

I assert that Git won't be improved to address the problem I'm writing about because the core developers do not perceive the issue to be a problem.

If they were at all concerned about the cognitive burden of their human interface, then Git would have a completely different command set. In fact, for too many of those core developers, the fact that Git has a high cognitive burden and requires its users to master a lot of arcane knowledge even to do ordinary things is a feature.

Sure, I suppose some of them might eventually feel moved to construct a coherent argument for why Mercurial's way of recording named lineages is an actively bad idea. More likely, however, they will be satisfied with achieving the comparatively easier objective of satisfying themselves that it isn't a good enough idea for them to care about it. They have a system, and it works for them, so long as you obey all its fiddly rules and secret handshakes. Problem solved.

From:

https://www.google.com/accounts/o8/id?id=AItOawni8VvVcImNktTYFWQtJyNBBRCkrrQdIdY

I feel like this post is from someone who was using git back in the early days, when you had to do a lot of plumbing commands yourself. Today's Git has a very similar command set for the average user's needs. The free book Pro Git gives you enough information in a single chapter to use Git very well day to day (chapter 2, if you're wondering). I feel like it's very straightforward and quite polished under most circumstances, but has power if you want to delve a little deeper. I submit that anyone who wants to compare Git to Mercurial, or point out any fault of Git, needs to have read that book from cover to cover.

Anyway, going back to what you call the single greatest failing of Git, which is not recording branch information on the commits themselves. I don't want branch information recorded. I may make a temporary branch named something ridiculous or uninformative (and I'm speaking of lightweight branches, or Mercurial's bookmark extension), implement a feature, and decide that it's worth keeping. I then merge this feature into a more "mainstream" branch (or rebase, if that's how I want to work -- both workflows are valid). I don't want my old temporary branch name to stick to those commits. It's unhelpful information. However, giving this temporary branch an important or mainstream name is not ideal, because I may not want to merge the contents of that branch into the mainstream workflow, and then those commits are floating around with seemingly important labels, when they're garbage.

I'm rambling, so here's the point -- I don't see the importance of knowing under which branch a certain commit was developed. That's useless information, in my opinion. Rather, I want to know what a single commit does. (Which should be documented in a well-written commit message) Assuming everyone is writing good commit messages, then it doesn't matter under which branch some code was written -- what matters is the code itself.

Now, I have to end with a confession: I don't know mercurial very well. Every time I try to pick it up, I miss my integrated lightweight branching. (I don't want to use an extension -- it's the most important part of my workflow, and what makes Git so awesome in my opinion, its branching model) Every time I hear someone mention heavy branching (cloning into a new directory), I shudder, thinking of the SVN branching model. I admit, though, I need to learn more about Mercurial and really try it.

But I will say this: the author has stated that he has worked on projects using both Mercurial and Git; I submit that the author has not learned to properly use a Git workflow, based on what I've read here. Please go read Pro Git (http://progit.org/book/) from cover to cover. It's not very long, and really explains a Git workflow and the power of the Git branching model. After reading this book, I submit that no one could point at Git and call it arcane or a cognitive burden -- it's progressed a lot in the last few years. If, at that point, you still miss your Mercurial branches (er, families, I guess), then by all means use Mercurial. But I love Git, and just want people to realize that it's easy to use when you have the right resources to learn from, and just encourages a different workflow from Mercurial.

From:

jhw

> I feel like this post is from someone who was using git back in the early days...

I currently use both Git and Subversion on a daily basis in a large, well-respected software organization with a source code base that numbers into the hundreds of millions of lines of code with a history going back at least thirty years in the case of the main body of code in my principal area of responsibility. The sun never sets on the data centers where these repositories and their mirrors are hosted. I'm not in a position to argue with a whole building full of people who insist on using "the wrong workflow" with Git, whatever that happens to be from your point of view.

I routinely use Mercurial as a front-end to both Git and Subversion, because I'm willing to pay for a bit of added round-trip time when pushing to and pulling from the integration repositories in exchange for a simplified workflow in my day-to-day development work. This makes me acutely aware of the shortcomings in both Git and Subversion, which are different in each case, worse in the case of Subversion to be sure, but Git has at least one serious flaw as well, as I write above.

> I miss my integrated lightweight branching. (I don't want to use an extension...

It's not an extension in Mercurial. Bookmarks are a core feature of the tool, present without you even having to add a single line to your .hgrc file or install anything separately. Moreover, you can even dispense with bookmarks and work with anonymous heads if you're really pressed for time.

Look, I wasn't kidding when I said that you can do a round-trip conversion of a Git repository into and back out of Mercurial without losing any data. There's even a way to do that in the other direction as long as the Git commit logs adhere to the convention required by Mercurial for storing the branch name.

I've read the Pro Git book. It doesn't tell me anything I don't already know. High on the list of what I want it to tell me, which it doesn't say a damned thing about, is why on Carlin's Green Earth there should be any good reason to use the receive.denyNonFastForwards setting on the server.

From:

https://www.google.com/accounts/o8/id?id=AItOawmzwP5jmwisow1kYx9X2bSQvYUy-HqA3j0

First of all, I'm biased towards git. Second of all, this is a late reply, so you might already know. Just for consistency about this.
You wonder why you want to set receive.denyNonFastForwards on the server.
I wonder why this is not the default.
The point is, this flag is set on the server to ensure the server will not _lose_ history.
It is not telling the client that it should rebase its changes onto upstream. It tells the client not to push a ref which does not have the upstream ref as an ancestor. Think about the last sentence for a while.
The client might, however, have this upstream ref in a random 'lineage', so it could just have merged it, or rebased upon it. It must, however, have pulled it in somehow. With this flag set, you cannot push before you have pulled it.

From:

shelby3

> is why on Carlin's Green Earth there should be any good reason to use the receive.denyNonFastForwards setting on the server

http://www.randyfay.com/node/89

"How do you prevent git push --force? (thanks to sdboyer!)

In the bare authoritative repository,

git config --system receive.denyNonFastForwards true"

From:

mikeschinkel

One thing to consider about Mercurial vs. Git is that with Mercurial most people don't have to read a book cover to cover in order to be able to use it because, in general, Mercurial is logical and intuitive and Git is, well, neither of those.

From:

felipec.myopenid.com

> I assert that Git won't be improved to address the problem I'm writing about because the core developers do not perceive the issue to be a problem.

That's because there is no problem.

You are saying that the problem with git is that it doesn't have mercurial "branches". How is that a problem?

What is it _exactly_ that you cannot do in git?

Hint: you can do everything in git.

From:

j16sdiz.myopenid.com

> you do not get your code pulled onto the upstream master branch unless it's a fast-forward.

I think Linus is opposed to this workflow.

People want to stick to the old SVN mindset and git makes it possible, so some people are enforcing this.

This is not what the git community recommends.

From:

jhw

No, Linus prefers a slightly modified version of this workflow, i.e. he'll pull your non-fastforward patch if he otherwise likes the cut of your jib, but he'll bitch and moan about it. And if he doesn't know you and trust you already, then he'll just reject you out of hand without even looking at the diff. It's like the brown M&M's in clause 27 of the Van Halen technical crew contract. It's the same basic process, just not formalized into a hard-coded service configuration.

Which ought to tell you something important. When you make a tool and you later find out that more people are using it differently than you intended than there are people who are following your careful instructions for proper use, the traditional thing that good engineers do is figure out what's wrong with their expectations of how people will be using the tool and adjust accordingly in the next iteration. Instead, the Git community decides that people are stupid, whines piteously that people aren't moar smarterz, and secretly regrets trying to help people in the first place. That's not sound engineering principles in action— that's misanthropy.

From:

j16sdiz.myopenid.com

On Linus:
It is "cleanliness" and "readability" that matter, not "rebasing" and "linearization". The history should be bisectable and easy to work with in a later debugging session. If you rebase everything you send, you are doing it wrong.

I don't think Linus asks you to rebase -- he asks you to send patches by email, not a git-pull request (unless you are a trusted subsystem maintainer).
This is for easier code reviewing. Would you rather review (1) two patches, one with bugs and one that fixes them; or (2) just one patch?

Consider git as a patch exchange tool with version control as a feature, then the rewriting history thing will start making sense.

See this LWN.net article for some elaboration https://lwn.net/Articles/328436/

On tools design:
This is the unix philosophy of "give you more than enough rope to hang yourself".
Variants of this appear everywhere, e.g. in C vs Java, KDE vs Gnome, etc. This is an eternal problem. I have nothing to add to this debate.

From:

sidk.info

I think you've hit the nail on the head. I like to think of it this way: Git branches are just simple pointers while Mercurial branches are "lines". Every mercurial commit "belongs" to a branch while every git commit just has parent commit(s).

You really can't follow a line of development back in git, especially if there are merges. Hence the whole emphasis on rebasing, as you quite nicely point out.

It took me some time to understand your article. I visited it a few months ago. At that point my knowledge of git was not sufficient to totally understand what you had written. Today was the day of the ah-hah moment!

Git creates complexity and solves it. Mercurial avoids the complexity altogether.

Maybe it might be a good idea to graphically show what you've written for other people who visit this page?

From:

jhw

See the followup post.

From:

max630.net

There are reflogs in git. They can play a role analogous to what you are calling "family". But they exist only locally and do not move to other repositories during fetch and push.

From:

friendly12345

I have worked with Mercurial for a while now.
I just started learning about it.
In some respects I think that the Git concept is better than Mercurial.
- First is the staging area. I like this feature a lot.
- Second is the remotes. You can save all the remotes to pull from and track which commit each remote branch is currently at. That is something currently missing in Mercurial.
- I also like the Git workflow with branches.
But one thing that I don't like about Git is that you need a lot of knowledge to do a simple task.
As I'm not good at bash scripting, it is a little hard for me.
Also, tags in Git are not very friendly.
They should be displayed in the log, as they are in hg log.
In summary, I think Git has its advantages, but its user-friendliness still needs improvement.

From:

arnebab.livejournal.com

What you list are not features git has over Mercurial, but simply different ways of exposing the same features:

- If you want a stage area, you can simply activate the mq extension. That even gives you an infinite number of staging areas which can contain multiple unfinished commits.

- Remotes in hg are simply entries in the .hg/hgrc under the heading [paths].

- Git-like branches are called bookmarks in hg.

From:

rich_pixley_hp

My argument is simpler, but comparable, and boils down to the same thing.

I want to be able to push. Git can't push shared branches because in the case of a potential collision, it has no recourse. The UI, "git push" is simple and the semantic is straightforward and obvious - I want my changes in that repository. Rather than do that, git simply throws up its hands and refuses (in the case where the destination has that branch checked out, or that branch has other changes that are not mine).

So... there's an obvious semantic, an obvious interpretation, and git doesn't do it. That's a pretty big and scary culture shock coming from pretty much any source code control system developed in the last decade. With git, we're back to the geographic branches of clearcase multisite where each repository owns a branch and the other branches show up as read only. This doesn't scale very well, as we learned a couple of decades ago. It's a lot of extra work to manage all those potentially-automatic-but-failing merges.

The multiple hg heads are the natural conclusion to this problem of how to handle collisions in the repository. Hg can then propagate them anywhere and anyone in any repository can merge them. Not so in git. In git, you're reduced to sending random emails trying to find the person with whom you need to coordinate.

This isn't unique to hg, btw. Other systems do this as well.

From:

jhw

Well the other side of this is worth a remark.

Yes, when you push with Mercurial, you will silently get a new anonymous head if there are upstream children of your current change, whereas Git will throw a rock at you and refuse to push. This can make Git users feel nauseated because of their need for every lineage to have a carefully manicured pedigree. Indeed Git makes you expressly create the lineage with a distinguished name and go through a registration process before you can push it upstream.

I still like Mercurial better. You pushed something and, oh, that meant creating a new lineage. So we did that. Maybe you want to give it a name, maybe it won't be around long enough to deserve one. If it needs a name, then you do that with a bookmark on the new head. Pushing the bookmark is optional. What's important here is that Mercurial didn't throw any rocks at you and no history got destroyed. That seems obviously superior to me, but whatever.

From:

frgomes [launchpad.net]

Thank you for your excellent comparison between git and hg. It's definitely rare to find an article of such quality.

By any chance, do you have experience with Bazaar too?

Thanks

Richard Gomes
http://rgomes.info/

From:

jhw

Nope. I have no opinion of Bazaar, other than a vague expectation that I'd probably like it better than Git. I understand it has an interestingly pedantic approach to computing differences that can show up in merge operations, i.e. it will merge some rare conflicts cleanly that both Mercurial and Git will silently fail to flag as conflicts and proceed to merge wrong.

From:

fmccann

Bazaar out of the box and Mercurial with named branches make tracking work extremely easy, and most of the work that git users do to "clean" their histories is just a complete waste of time compared to these workflows. Git by design discards branch history. It seems unintuitive, but the way to compensate for Git tracking less information is to lose even MORE information via rebases and squashing to "clean" the history. Bzr and hg track enough information to easily show you branch histories so you always get the level of detail you want without having to manually hack up the DAG.

See http://duckrowing.com/2013/12/26/bzr-init-a-bazaar-tutorial/