As discussed in earlier submissions [1], this approach does not account for spam PRs, projects that close PRs to move discussion elsewhere, or other PR closures that are not about rejecting code.
I am working on addressing that. So far spam detection is the most difficult one.
Ignoring PRs that were closed by their own author or that changed readme.md helps, but it is obviously not enough.
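For anyone curious what that filter looks like, here is a rough sketch rather than the actual code. It assumes each PR is a plain dict with hypothetical fields `author`, `closed_by`, `merged` and `files`; those names are made up for illustration and are not taken from the GitHub API or from the tool itself.

```python
def touches_readme(files):
    """True if any changed path is a readme.md (case-insensitive, any folder)."""
    return any(path.rsplit("/", 1)[-1].lower() == "readme.md" for path in files)


def keep_for_merge_stats(pr):
    """Drop PRs closed unmerged by their own author and PRs that changed readme.md.

    `pr` is a dict with hypothetical keys: author, closed_by, merged, files.
    """
    self_closed = not pr["merged"] and pr["closed_by"] == pr["author"]
    return not (self_closed or touches_readme(pr["files"]))


def merge_rate(prs):
    """Share of merged PRs among those that survive the filter, or None if none do."""
    kept = [pr for pr in prs if keep_for_merge_stats(pr)]
    if not kept:
        return None
    return sum(pr["merged"] for pr in kept) / len(kept)
```

This only covers the two heuristics mentioned above, so spam that touches other files or is closed by a maintainer still slips through.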
There are more repos like this, and they are hard to detect, especially since some of them do have a few PRs actually merged on GitHub under special circumstances (for instance bazelbuild).
It would be nice to further analyse the merge factor: very often there are a handful of maintainers but a bigger team (or close circle) working on the project. It would be nice to cluster the PRs by how many each person has submitted (many PRs, a medium number, few PRs).
Because if I want to submit a PR, it may make a difference whether I am a very regular contributor or just want to submit this once because I found a bug.
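A minimal sketch of that clustering, under some assumptions: PRs are dicts with hypothetical `author` and `merged` fields, and the cut-offs (up to 2 PRs is "few", up to 9 is "medium", anything more is "many") are arbitrary placeholders that would need tuning per project.

```python
from collections import Counter, defaultdict


def bucket_for(pr_count, few_max=2, medium_max=9):
    """Map an author's total PR count to a bucket. Thresholds are assumptions."""
    if pr_count <= few_max:
        return "few"
    if pr_count <= medium_max:
        return "medium"
    return "many"


def merge_rate_by_bucket(prs, few_max=2, medium_max=9):
    """Merge rate per contributor bucket; buckets with no PRs are omitted."""
    per_author = Counter(pr["author"] for pr in prs)
    totals = defaultdict(int)
    merged = defaultdict(int)
    for pr in prs:
        bucket = bucket_for(per_author[pr["author"]], few_max, medium_max)
        totals[bucket] += 1
        merged[bucket] += int(pr["merged"])
    return {bucket: merged[bucket] / totals[bucket] for bucket in totals}
```

Fixed thresholds are the simplest option; if the distribution really is multi-modal, picking the cut-offs from a histogram of PRs per author would probably work better.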
This is an interesting idea. There certainly are pretty crisp modes in the distribution. In a typical company-backed project, some devs will make and merge multiple PRs a month, while outsiders will make a couple of PRs in total, of which one or two get merged.
I have a checklist for frontend dev that I run through before submitting PRs - https://gist.github.com/onion2k/8615d92178f9d1eb2d162da69495... - it doesn't improve the quality of a feature but it stops my PRs being returned for silly problems. I can highly recommend writing your own.
[1]: https://news.ycombinator.com/item?id=25822408