I'm a huge fan of Github. I recognize the huge effect they've had, transforming the Open Source community, and I'm grateful for it. I also love how they run their company, how open and talkative they are about their organization. Their blog holds many gems.
However, I cannot work on Github. In my opinion, the Pull-Request mechanism is weak because it fails to address a major point: It's highly improbable that I accept the PR on the first try. There's going to be a lot of back and forth discussion, reviews, remarks, etc.
Standard PRs I deal with on Github usually end up with several redundant commits. When I ask people politely to rebase all the commits they're sending into one, most don't know how to do it. (It's okay not to know. It's weird when someone with several contributions per day never had to do it before). If I `git commit --amend`, my PR is actually overwritten and I lose the history of the first patch I've sent.
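For anyone who has never had to do it, here is a minimal sketch of squashing a branch into one commit. It runs in a throwaway repository; the branch names (`main`, `feature`), the file name, and the commit messages are made up for illustration. It uses `git reset --soft` as a non-interactive stand-in for `git rebase -i`, which gets the same result through an editor.

```shell
#!/bin/sh
# Throwaway demo: squash a three-commit feature branch into one commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Base commit on main.
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch -M main

# A feature branch: a first attempt plus two follow-up commits from review.
git checkout -q -b feature
for msg in "Add feature" "Fix typo" "Address review"; do
    echo "$msg" >> file.txt
    git commit -q -a -m "$msg"
done

# Move the branch pointer back to where it left main, keep every change
# staged, and record the lot as a single commit.
git reset -q --soft "$(git merge-base main HEAD)"
git commit -q -m "Add feature"

squashed_count=$(git rev-list --count main..feature)
echo "commits on feature after squash: $squashed_count"
```

Updating an already-pushed PR branch after this needs a force push (`git push --force`), which is exactly the step that overwrites the earlier version of the patch.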
I've disliked working with Github ever since I started working on OpenStack. The process to send a patch there might seem a bit daunting at first, but it's really not that complicated. Using Gerrit (https://code.google.com/p/gerrit/) for code review has many advantages, like limiting your patch to one commit, keeping a detailed log history of successive patch sets, and generally making reviewing more inviting. On top of that, the whole suite of tests is run against each submitted patch on a CI server that virtually never rests.
And finally, I don't find Github's Pull Request that "easy, simple and fast". My workflow with OpenStack is a lot faster. 1 command: `git review`
If I recall correctly, Linus Torvalds went on a highly publicized rant against Github PRs not that long ago.
I'm not arguing that every Open Source project should have a complete QA infrastructure, and Github is a great place to deal with your first Pull Requests. However, I do argue that you can very quickly reach the limit of what Github can give to you in terms of collaborative tools.
My rule of thumb: If you have more than 10 contributors, at least 4 of which are active daily, it may be worth it to invest in some real infrastructure.
I agree completely about revisions on PRs. Also, proposed patches often spawn much broader technical discussions that are much better on a mailing list than on a random commit (that will likely become unreachable later since it's not going to be accepted). Nothing beats the archival quality of mailing lists and if you start your code review there, you never need to make the choice to "move" a discussion from GitHub to a mailing list. The problem is that mailing list review requires a lot of discipline and some popular MUAs make it especially painful.
The value of PRs and other models is that they degrade more gracefully to lack of experience/discipline, and (perhaps) to casual involvement (subscribing to mailing lists with delivery turned off is fine as long as there is a convention to Cc people that are likely interested in a specific change; not many people know about subscribe-without-delivery).
> proposed patches often spawn much broader technical discussions that are much better on a mailing list than on a random commit (that will likely become unreachable later since it's not going to be accepted)
PRs are also issues, and I vastly prefer GitHub issues to mailing lists (to each his own, I guess).
I like inline code comments, especially when the original commits are not well factored. But I find it too common for a discussion starting from a patch to gradually become general and ultimately involve dozens of people that had no interest in the original patch, but care a lot about the general discussion. Sometimes these involve multiple projects, in which case cross-posting to another mailing list makes sense. With GitHub Issues, you generally tag individuals rather than groups of perhaps hundreds of people (most of whom filter on their own criteria).
Later, how well does search find the 100-message design discussion starting on a commit in a PR that was later amended and then rejected? And what if you move the repository elsewhere?
PRs being issues is an issue in and of itself: it tempts one into not opening an issue for the bug the PR is fixing. Then, if the PR is closed, there is no open issue for the underlying bug it was attempting to fix.
I am a huge fan of Gerrit, other than its UI. I ran it along with Jenkins[0] at a previous company. Its review model is far, far, far superior to Github's. As you said, it keeps successive patch sets, and allows for diffs against previous patch sets, rather than only against the original parent. This lets you see only the latest changes, so you don't have to wonder whether anything else was changed and re-review the entire patch.
> $ git review
I actually wrote a bash script with the same name and a few other shortcuts before my company started working on OpenStack. I was quite pleased with the workflow I provided the team I was on.
$ git start <new-branch> <branch-from>
$ <do work as normal>
$ git review
$ git finish # removes the branch
These three commands, along with the git-flow branching model[1] (but not the tool itself), lead to a clean, sensible history in my opinion.
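The script itself isn't shown, so the following is only a guess at the shape of the `start` and `finish` steps, written as two hypothetical shell functions (`git_start`, `git_finish`) and exercised in a throwaway repository. The review step is left out on purpose, since what it did is not described.

```shell
#!/bin/sh
# Hypothetical helpers; the commenter's actual script is not shown.
set -e

git_start() {   # usage: git_start <new-branch> <branch-from>
    git checkout -q -b "$1" "$2"
}

git_finish() {  # usage: git_finish <branch> <branch-from>
    # Go back to the base branch, then remove the work branch.
    # -D is used because a reviewed change often lands under a new
    # commit id, so git would not see the branch as merged.
    git checkout -q "$2"
    git branch -q -D "$1"
}

# Throwaway repository to try the helpers in.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch -M main

git_start my-fix main
echo fix >> file.txt
git commit -q -a -m "Fix something"
# (the review step would go here)
git_finish my-fix main

current_branch=$(git rev-parse --abbrev-ref HEAD)
remaining=$(git branch --list my-fix)
echo "on $current_branch; my-fix still exists: ${remaining:-no}"
```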
I respect that Github tries to maintain a low barrier to entry to increase use and ease, but I believe there's a way to maintain it while still having a great patch-review model.
That said, I do like pull requests: you might lose the history locally, but the PR contains a 'see outdated diff' that shows what it used to look like.
I'm using GitHub more and more — ultimately because so many more developers are willing to submit patches there than elsewhere (presumably — at least — because they already have a clue what they're doing, rather than having to work out how to submit something).
In the end, what I've ended up doing is using Travis CI with its GitHub hook, which gives the whole suite of tests against each and every patch, as well as using external code review (mostly using Critic (https://github.com/jensl/critic), which supports explicitly rebasing branches, collapsing reviews into all pending commits, and so on, all gracefully — unlike GitHub's code review).
While I'd like something better, the issues it throws up (not using the default code review system most obviously) as well as — as you say — the complexity of submitting a PR, are, in my opinion, outweighed by the extra contributions that are got by using something other developers are already comfortable with.
Do you (or anyone else) have a link to this alleged Torvalds rant about PRs?
I have mixed feelings about the idea that all contributions should be squashed. I see the attempt at fastidiousness, but I also think that git makes it very easy and reasonable to keep multi-change history while linking it to only one commit in the eventual target repository.
I do wish it was easier to rebase a branch such that backmerges were pulled out where possible, cleaning up the graph at least, but I tend to look at my soup branch (master, usually) with `git log --first-parent` most of the time anyways and that's not much different from if they'd been squashed.
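A small sketch of that effect, in a throwaway repository with made-up branch and file names: a three-commit topic branch is merged with a merge commit, and `--first-parent` then walks only the mainline, so the topic's individual commits drop out of the view without being squashed.

```shell
#!/bin/sh
# Throwaway demo: full history vs. first-parent history after a merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git branch -M main

# A topic branch that keeps its three separate commits.
git checkout -q -b topic
for n in 1 2 3; do
    echo "$n" > "topic$n.txt"
    git add "topic$n.txt"
    git commit -q -m "Topic step $n"
done

# Merge with an explicit merge commit so the topic history is preserved.
git checkout -q main
git merge -q --no-ff -m "Merge topic" topic

full_count=$(git rev-list --count HEAD)
first_parent_count=$(git rev-list --count --first-parent HEAD)
echo "all commits: $full_count, first-parent only: $first_parent_count"
git log --oneline --first-parent
```

The full history is still there for anyone who wants it; only the default view is shortened.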
"My rule of thumb: If you have more than 10 contributors, at least 4 of which are active daily, it may be worth it to invest in some real infrastructure."
Disclaimer:
I'm the creator of GitSense. We are working on a solution to make GitHub pull requests enterprise ready.
I do agree that GitHub's pull request model is not quite enterprise ready, but they have a solid foundation that you can build on top of. With their API, we were able to build a solution that I believe will address most of its shortcomings.
For example, my first concern with GitHub's pull request system is that it works at the repository level. With Gerrit, you can see requests at the branch level. With our enhancements you'll be able to track pull requests from different repositories at the branch level.
We are also able to address the concern of dealing with new commits. With our Smart Attributes technology, it is very easy to flag what commits you have reviewed.
We also take care of the problem of not having a side by side diff. With our solution, you'll be able to use side by side diffs to review pull requests.
> When I ask people politely to rebase all the commits they're sending into one, most don't know how to do it.
Very easy solution to this problem. 1) Make note of it in CONTRIBUTING.md. 2) Do it for them. Check out their branch, rebase/squash/fix whitespace/etc and merge.
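A rough sketch of the "do it for them" step, assuming a squash merge is acceptable. It runs in a throwaway repository where a local branch named `contrib` stands in for the contributor's PR branch; the names, files, and author identity are invented for the demo.

```shell
#!/bin/sh
# Throwaway demo: a maintainer squashes a contributor's branch on merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Maintainer"
git config user.email "maintainer@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch -M main

# Stand-in for the contributor's branch: one real change plus a fixup.
git checkout -q -b contrib
echo "widget " > widget.txt
git add widget.txt
git commit -q -m "Add widget"
echo "widget" > widget.txt
git commit -q -a -m "fix whitespace"

# Stage the combined result on main without creating a merge commit,
# then record it as one commit credited to the contributor.
git checkout -q main
git merge -q --squash contrib
git commit -q -m "Add widget" --author="Contributor <contributor@example.com>"

main_count=$(git rev-list --count main)
squash_author=$(git log -1 --format=%an)
echo "commits on main: $main_count, last author: $squash_author"
```

With a real PR, the branch would first be fetched from the contributor's fork instead of being created locally.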
A coworker and I recently discussed how Github ought to invest some time into templated workflows. Conventions like "make a branch per pull request" can be mandated by a repo owner, as could "rebase all your commits". At minimum, providing a TODO list for pull request submissions would be good.
I have pulled no fewer than 15 freelance contracts from my personal Github. Just from code that I have thrown out there because I don't have a commercial reason not to.
One of the cool things is that deploying code on Github forces you to make at least some sort of effort of documenting it (a README, generally) and cleaning it up a bit to be reusable. It has helped me reuse my own code, because of that.
It improves code quality, makes friends, helps people and drives work. Sharing is a worthwhile business activity.
> It would be amazing to see a site that pairs experienced developers (especially women developers) with less-experienced women to sherpa them into contributing to open source.
Is there really a need to give women special treatment or help when it comes to open source development? Wouldn't that reinforce the (unbased) idea that there's a disparity between genders in tech? Or is there?
I don't have a strong opinion about this particular thing, but it seems like a decent idea.
But let me speak more generally.
Sometimes it seems like some people have a very "logical" way of looking at ideas. It's like every idea is interpreted according to some metaphysical schema and judged by whether or not it's conceptually harmonious and untainted.
In this case, the schema is "equality," which ironically means that any concrete idea that addresses disparity is prematurely judged as faulty or tainted.
It's stunningly obvious that the open source community is dominated by males, even if it's not more so than the programming community at large. This is a matter of statistics, it's not complicated. This paper http://jitm.ubalt.edu/XXI-4/article3.pdf cites a figure of 1.5% of OSS developers being women, which doesn't seem entirely implausible.
This simple statistical fact should mean that some women who do want to pursue an interest or career doing open source work might indeed have use for the kind of thing we're discussing. So why not?
There IS a disparity between genders in tech. This is so obvious. Why on earth would you call this an unbased idea?
That there is a disparity doesn't mean that there is an "essential" or "necessary" disparity. It just means that at this point in human history, women are extremely underrepresented in this particular section. In that situation, it seems more than appropriate to set up various measures to support diverse engagement.
You're right about there being far fewer women population-wise, but I wonder what the cause is, and if there is a more direct way of addressing the problem. Your reply made me reconsider my statement.
I would say this is a great idea for the same reason having more minority male English teachers in minority schools is a great idea. For whatever historical reasons, certain groups are underrepresented. This snowballs when youth try to get a foothold into the field, since they see no one like them succeeding. Is it logical to have more confidence (a huge indicator of success in just about anything) when the teacher looks like you? Probably not, but that does not stop it from being effective.
There are plenty of people in the world (not all of them men) who think that women should not be in education, should only be child carers, should not talk over men, etc, so if you just take the approach that you personally will be completely unbiased, then while you are not adding to that problem, you are also not doing much to assuage it.
Neutrality is all very well, but it often takes a lot more than neutrality to get somewhere if a lot of society is pushing back the other way from where you would like to get to.
People say that they want more diversity in tech. Perhaps ironically, many also go to great lengths to proudly proclaim how nerdy they are, building a whole subculture that is heavily associated with fields like programming, science, etc.
If they want more women in order to get more diversity in the field, why are they so deeply entrenched in a specific subculture, and why do they identify so strongly with it? Is the goal really to have a diverse array of people in the field? Or is it to get yet another "I'm such a nerd!", only with different genitals?
GitHub may be a good service, but THEY ARE PROPRIETARY. Thus, it is absurd to talk about them as important proponents of Open Source without mentioning this hypocrisy.
I would go so far as to say that MOST open source contributions come from developers and companies that also work on proprietary systems. I really do not see a problem with this. There are legitimate business, legal, security, and technical reasons for this. Personally, I open source whenever I can, but much of the work I do would not be feasible in an open source model (highly custom systems, competitive advantage, legal restrictions on code or data, code that would be of little value for anyone but me and my clients, etc.).
That is true, but they do that so they can continue to serve their business easily. Would you rather they be open source and take away all the free hosting? Furthermore, they are easily the best product in their category. If you want an open source Git service, use Gitorious or something.
Does anyone have any tips on getting started contributing to open source? I'm looking to get into it but I just don't know where to start or even how to find projects where I could contribute.
It may be easiest to look at small projects you are already using, where there is something you want to fix. Or just write better documentation, which may be the best start.
I think Github had a huge effect on the Ruby on Rails movement (or revolution). So many people use it (unlike the mysterious Python community before it), and it brought theories into reality. Thank you, Github.
Why should people have to work for free? I have a skill, I expect to get paid. I don't see my plumber doing pro bono work at the weekend.
Not everyone has to adhere to your philosophy. You license your code for me to use, that's your prerogative. Don't complain when people don't give anything back.
To paraphrase your last paragraph: "If your ideal system allows parasites, you're not entitled to use mechanisms (e.g. social stigma) to control those parasites"? Complaining about parasitic behavior is a form of social control. It actually comes part and parcel with the open source license. If you don't like people talking badly about freeloaders, stay away from the open source community.
"Beneath the appealing, easy-to-use interface of Mac OS X is a rock-solid foundation that is engineered for stability, reliability, and performance. This foundation is a core operating system commonly known as Darwin. Darwin integrates a number of technologies, most importantly Mach 3.0, operating-system services based on 4.4 BSD (Berkeley Software Distribution), high-performance networking facilities, and support for multiple integrated file systems.
Darwin 1.4.1 is the UNIX-based, open-source foundation of Mac OS X. It is based on FreeBSD and Mach 3.0 technologies and provides protected memory and pre-emptive multitasking. This release corresponds to the release of Mac OS X 10.1."
If a company decides to open source a project that is not directly related to their business, then they are gaining an advantage - now their code can be vetted and improved by the open source community and they can spend less engineering time on the project themselves while they focus on their real business.
Where is this magical "open source community" which is eager to examine dull business code? It's a myth. The Debian project shipped a broken PRNG for ages, and that is a popular project.
Who ever said 'dull business code'? Take Twitter Bootstrap for example - It started off just so Twitter could iterate faster on internal applications and not worry so much about styling - Now they've open sourced it and others can contribute. It does nothing to better their business model.
There's no reason to invoke No True Scotsman here. There are valid justifications for both open-source development (as described in the article) and closed-source development (e.g. IP).
A hack writer is a term that massively pre-dates the usage of the word hack in technology, often used to describe journalists or pulp novelists, so I would assume that it is a deliberate play on words marrying more than one meaning of hack.
Which would also mean that the term "hacking text" in this context is itself a bit of a hack.
...
edit - I went for a look at old uses of the word hack as it pertains to writing and found that there is another variation in meaning, that of "hack words".
Here it is in use from the 1858 periodical, "The Ladies' Companion", in a passage that as chance would have it is discussing the creative usage of words -
"Of the influence which German literature Jean Paul and others has had on Carlyle enough has been said elsewhere. This influence certainly shows itself markedly in his style though by no means detracting from its originality.
It has given to him a somewhat burdensome richness of compound words and a few unfamiliar derivations. He is fond of seizing upon the primary meaning of a word and bringing out that meaning forcibly by contrast and repetition. He introduces innovations of foreign words to a great extent and not unfrequently uses simple ones in an obsolete or new sense.
These may be looked upon as faults but it will be found that his foreign introductions are mostly from languages which form the basis of English having an affinity to accepted English words and that they invariably have a peculiar significance which could not have been given by more common expressions. His obsolete or new senses too show generally an evident reason for their adoption.
He is fond of hack words - a kind of slang of the day; for instance, "sham", "jargon", "gigmanity". He uses such words as watchwords of the time wherein he writes, as bearing in them a nineteenth century view of things."
A colleague of mine says it a lot better than I do: http://julien.danjou.info/blog/2013/rant-about-github-pull-r...