On my teams I've always established the rule that most PRs must be reviewed within a day. At the very least, devs should review any open PRs before they leave or when they first get in.
Obviously, there are going to be exceptions for large changes or big rewrites on the one hand, or for tiny label changes on the other hand.
PS. Reading the comments here, people seem to work in some pretty dysfunctional environments. Get the team together and establish some guidelines that everyone agrees on. Review time should be predictable, and the work should be reasonably shared.
In the midst of the perennial "woe the kids today" rant, it's worth considering that this generation is the first generation in the 300,000 year history of homo sapiens sapiens to widely utilize reading and writing as a primary means of daily communication.
The reading/writing part of the brain is repurposed from a region otherwise responsible for recognizing faces, facial emotions, and the like. Given the already noticeable decline in in-person skills among the young "texting" generations, we can speculate about where this may go.
They might all become hyper-nerds interested in D&D?
Sounds cool.
But then, my brother was already into D&D in the late 80s back when I was learning to read from the Commodore 64 user manual, so of course I'd think that :P
Yes, they'll be able to grasp the beauty of Les Misérables or The History of the Decline and Fall of the Roman Empire so long as it is fed to them as a stream of <280 character tweets, dohohoho.
Mr. Munroe is falling victim to the unfortunate phenomenon where people believe their popularity means that their opinions outside of their areas of expertise are well-informed. Whether they can spell better or not, minds weaned on little chunks of text laced with memes and emoji are going to struggle with chapters, let alone full books.
Given the comic is an echo of conversations like this, I think perhaps that if he is guilty of that then so too are you and I and all others here.
Myself, I say that to equate the modern proclivity for tweets with a degradation of intellectual rigor is as fallacious as imagining that the concise elegance of Tacitus foretold a dulling of Roman wit. Or something like that.
Will they (kids these days) like the style of old classics? Of course not, just as few native English speakers alive today wish to speak (or write) in the style of Shakespeare — breaking a long thing up into tweet-sized chunks is simply style, no more relevant than choice of paragraph or sentence length.
But to dismiss a generation’s capacity for engagement with monumental works (be they Les Mis, The Decline and Fall etc., Shakespeare, Dickens, or any other) on the basis of their chosen tools of communication betrays not only an ignorance of how often communication has changed so far — when Edward Gibbon wrote the Decline and Fall, literacy in the UK was somewhere around the 50% mark, but still the illiterate could watch plays and listen to tales — but also of modern attention spans, when binge-watching entire series of shows is a relatable get-to-know-you-better topic on dating apps.
I mean, I'm pretty sure that more people have read Sam Pepys in the last 15 years than in the previous few hundred, due to https://www.pepysdiary.com (it's a bot that tweets Pepys' Diary in real time; no longer available on Twitter due to API changes, but it's still on RSS, Mastodon and Bluesky. It's on its third run through the diaries).
You're greatly overestimating how much an oral tradition leads to fixed wording. This is a pretty well-studied field at this point in time, and non-poetry oral traditions just don't generate the kind of long word-for-word identical passages that we see in Luke and Matthew.
There's a lot of debate over the synoptic problem in the academy, but almost nobody doubts that the solution involves a literary source instead of an oral one.
No, macOS will just put quotes in the name. I just saved a file from VS Code as "testfile.sql", including the quotes, and wound up with a file literally named "testfile.sql", quote characters and all.
1. Put some of the forbidden symbols in the file names of, let's say, a bunch of photos.
2. ZIP them
3. Send to someone who ordered those photos with Windows/Linux computer.
4. An hour later tell them that everything works on your computer and you don't see the problem. As a bonus you can film unpacking the zip file and saying: "You see, it works. There's something wrong with your computer".
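The setup above is easy to reproduce. A sketch (all file names invented, and tar standing in for ZIP here since the naming problem is identical):

```shell
# Characters like ? and : are legal in names on ext4/APFS, but Windows (NTFS)
# forbids \ / : * ? " < > | — so an archive containing such entries can't be
# fully extracted there. Creating and packaging them here works without a peep:
cd "$(mktemp -d)"
mkdir photos
touch 'photos/beach?final.jpg' 'photos/sunset: v2.jpg'
tar -cf photos.tar photos    # zip behaves the same way for this purpose
tar -tf photos.tar           # lists the problem names without complaint
```

The sender genuinely sees no error at any step, which is what makes step 4 so convincing.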
I don't know of any Unix that allows forward slashes in file names. On macOS you can use : which is displayed as / in Finder, but in exchange you can't put a : in file names...
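For what it's worth, the colon is an ordinary character at the POSIX layer; the `/` rendering is purely a Finder affectation. A quick sketch (file name invented; the Finder behavior obviously won't reproduce outside macOS):

```shell
cd "$(mktemp -d)"
touch 'Report 2024:draft.txt'   # legal at the file-system level on any Unix
ls                              # Finder on macOS would display "Report 2024/draft.txt"
```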
One of my biggest pet peeves about modern development is just how many people don't know jack about git even though they use it every day. It really annoys me when I see a pull request with 20 random commits (with messages like "temp" or "checkpoint") or when people merge main into their personal feature branch instead of just rebasing their branch (yes, sometimes that's not right, but I'm not talking about the corner cases).
I always think about using "clean up a pull request" as a fizzbuzz-ish screen in interviews. It just seems like a decent proxy for "do you care at all?".
Just to be clear, and you probably won't disagree with this, there's nothing wrong with commits like "temp" or "checkpoint" or "WIP" (Work In Progress). I often make these sorts of commits as I'm working on stuff.
The issue is in submitting an MR/PR with those commits. There's an expectation among professionals that you make your work presentable before submitting it for review, although those who are new to the profession don't realize that this cleanup step is necessary (how could they? the intro courses don't teach this and they're usually struggling enough with the code).
I just wanted to throw this comment in here in case some newbie sees this and comes away thinking "oh, I can't have 'temp' or 'checkpoint' in my commit messages".
I encourage developers to clean up their commits but so often they're either unduly nervous about breaking things or have a fascination with "preserving the history" (of their branch, which is not yet merged in anywhere).
I partly blame the excessive fear mongering around rebasing, where the strict Never Rebase a Pushed Branch rule is drilled into them and they never learn why or when they can break the rule safely.
So it's an uphill fight but I just try to teach by demonstrating, frequently, exactly how they can tidy up for the merge request.
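A minimal demonstration of that tidy-up, assuming the simplest case where the whole branch should become one commit (`git rebase -i main` does the same thing interactively; the non-interactive `git reset --soft` is used here so the sketch is reproducible):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name dev && git config user.email dev@example.com
main=$(git symbolic-ref --short HEAD)          # "main" or "master", depending on git version
git commit -q --allow-empty -m "initial"
git switch -q -c feature
for msg in temp checkpoint "fix test"; do      # the messy history under discussion
  echo "$msg" >> work.txt
  git add work.txt && git commit -q -m "$msg"
done
git reset --soft "$main"                       # keep the final tree, drop the messy commits
git commit -q -m "Add work.txt build steps"    # one presentable commit for the MR
git log --oneline
```

Nothing about the resulting diff changes; only the story told by the log does.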
I couldn't care less about the commit messages on a PR; I care about the diff, as long as the messages are cleaned up in the squash and merge with main.
If it's large I want to be able to look at the commits in isolation and understand the work in the logical chunks that made sense to the author.
If what made sense to them is temp, checkpoint, temp, temp, undo the temp, fix test, try this, that didn't work try other thing, maybe?, temp, fix test - then I don't stand a chance. Recently I reviewed one that had multiple 'rebase' commits, I have no idea.
- But sometimes I want to have a few distinct (isolated) commits for my PR
- Then make the PR small
- But I have several changes
- Then make several small PRs
We keep coming back to “just make more PRs”. Which is curious, given that GitHub doesn’t even support dependent PRs, the very thing you need the moment your PRs start depending on each other (like refactoring X before implementing Y, where both touch the same code).
But I can’t come to any other conclusion than that this is because of the over-focus on PR as the one and only reviewable unit. Thanks to GitHub.
I can’t really get on the small PR train. A PR has overhead and I often want to do small changes that I only happen to notice are necessary when I am in the middle of doing that one PR.
That raises other questions. The relevant reviewer isn’t available until Wednesday, so do I start on the next tasks in dependent branches? Making all those branches is a large part of the inherent busywork. And then I need to remember to get back to them when the reviewer returns.
The benefit of all these small PRs is yet to be revealed.
Is the dialogue here that you keep asking questions that make the workflow more and more constrained, until once X factors are fixed it becomes obvious that the workflow works great?
Of course I start on work that I (not the reviewer) can work on right now.
3. I'd rather the PR were whatever size it needs to be to entirely do the thing it's supposed to do (and nothing else) than conform to some arbitrary size requirement.
But that's the thing, in most cases there's no review step anymore after the squash/merge has been made; while it's in a branch / MR, you can still edit the commit messages and content, but in most cases the squash and merge is an atomic, unreviewed step. Of course, maybe the merge commit should be generated, or should be filled in and reviewed as part of the main code review.
Yeah, I’ve noticed that many around the Internet just comment with “the squash and merge” as if it’s a given that that’s the strategy that is used (even mandated).
Git rebase, a.k.a. people rewriting history to make it LOOK like each commit was a compilable, working result at a point in time even though the real history was something completely different, then f*cking up the rewriting of said history, then whining "please mighty Senior SWE heeeelp I don't know what happened please help my work is gone".
It all has trade-offs. I prefer seeing the dirty laundry.
I don't; once the feature has been merged, how you got to that point is no longer relevant; it's noise. Git is a log of code changes, if you use it as a work log you're using it wrong.
That the code is “compilable, runnable” is never a given even if you never use rebase.
We can sling around stereotypes of people failing and doing a bad job with their strategy. Then going to cry to someone (presumably you?). But I don’t think that advances the conversation.
True. And I don't think I ever claimed that never using rebase is a guarantee for anything.
I rather wanted to point out the following:
Using a commit strategy based on git rebase is neither right nor wrong. It is not even best practice IMO. It has its own footguns.
Since the parent comment was very opinionated and cast judgement, I responded in kind.
I have been the person crying as well as the one who solved the mess, let's not kid ourselves.
It's a mindset problem; people use git commits as the equivalent of hitting save in their editor, and they feel like they should use it as a work log, justifying their time spent or feeling like they have to demonstrate how they got to a certain solution. It's not relevant. Ego is not important. You don't need to justify your time.
I agree with you in spirit (people need to learn their tools more) but your examples... not so much. Merging main into the feature branch was the originally intended way to do it. PRs are sent the way you describe because GitHub literally offers the maintainer a squash option, and it's a lot easier to review a pull request when history is unedited.
A PR should be sent to review in a state that the author considers ready to merge. Assuming it will be squashed and taking that as a reason for submitting dirty history is just sloppy and disrespectful.
Merging is for keeping track of a group of commits that has been taken from a feature branch and included in the mainline. Why would you clutter the feature branch with periodic main merges when you can cleanly rebase it and keep it tidy?
> Merging is for keeping track of a group of commits that has been taken from a feature branch and included in the mainline.
Merging is for bringing a group of commits from branch A into branch B. It is, quite literally, the original way to perform this operation. It's not "clutter", it's a correct picture of how the code was developed.
It's clutter because it adds no information value on a short lived branch. If it's a branch that periodically syncs with another it's ok, if you're just basing off current master for a feature branch rebase is the way to go.
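A tiny sketch of the difference (all names illustrative): after a rebase the feature branch simply replays on top of the mainline, with no merge bubble in the log:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name dev && git config user.email dev@example.com
main=$(git symbolic-ref --short HEAD)
git commit -q --allow-empty -m "base"
git switch -q -c feature
echo a > a.txt && git add a.txt && git commit -q -m "feature work"
git switch -q "$main"
echo b > b.txt && git add b.txt && git commit -q -m "mainline work"
git switch -q feature
git rebase -q "$main"        # vs. `git merge $main`, which would add a merge commit
git log --oneline --graph    # a straight line: feature work, mainline work, base
```

With `git merge "$main"` instead, the same sync shows up as an extra "Merge branch 'main' into feature" commit every time you do it.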
I've developed branches against a moving target all the time where the moving target introduced a problem in my code that wouldn't have been found by a simple merge conflict resolution. It's much much easier to find the source of the problem when you have the real history (a merge) instead of a rewritten history (a rebase).
> it's a lot easier to review a pull request when history is unedited.
Is it really? If you saw my uncleaned history, it would take you days to understand what I was even trying to do.
Your statement seems based on the assumption that someone knows how to achieve a particular thing right from the start. But that isn't always the case and there might be a lot of different approaches until the correct or best one is found.
Do you really want to review dead code that's somewhere in the commits of a feature branch?
Yes, I'd much rather review a PR with the full unedited history, not the history that the submitter thinks is good or pretty. You often don't know what information you need until you need it. I'll take the entire, dirty history, and once I'm done with it it can be squashed.
Can we take a step back here and ask what point you are trying to make?
A user (who wasn't you) called me out for talking about GitHub instead of git, and I said it was perfectly fair because the original discussion was specifically about sending pull requests, which is a term we only talk about in 2024 because GitHub made it a thing. Therefore, it's entirely fair to discuss it in the context of GitHub and not git.
Now we are five posts down into this bizarre tangent and I am unsure what point, if any, you are trying to raise here. That people now use the term pull requests when not using GitHub? I don't think I've seen it anywhere except for hosted services, but my experience is not universal.
We don’t have to rehash things. Let’s chalk it up to me getting some wires crossed. (It took me until writing half of the original reply to realize. Hah!)
Linus Torvalds needs the merge operation as an integrator.
Considering the email workflow of the kernel I can’t really make sense of “intended way to do it”. For individual commits people send out patches. I’ve never seen an email thread where some merge topology is recorded: it’s just a list of patches. A straight line.
I’m pretty sure that people used patch queues before Git (and even now with quilt). Restacking a bunch of commits on top of the mainline is the same operation as a rebase.
I’ve certainly seen Linus get mad at another maintainer for allowing a back-merge into his history (merge main back into feature branch).
It seems there are no best practices here. I remember senior devs working with a GUI for git that didn’t have rebase in it. I worked on the CLI, and whenever I mentioned that I rebased, people looked at me with raised eyebrows since I was new and not a senior.
> It just seems like a decent proxy for "do you care at all?"
Because the reality is, when it comes to Git history, no, I don't care in the slightest. I get all the information I need by:
- Reading previous PRs (the final diff)
- Checking the name on a git diff of a line
- The ticket reference
Git commits are a tool to help me write code and revert to a "known-good" state. Once it's merged into master/main, I don't care how messy it is because 99.999999% of the time, I'll just go back to the merge commit.
One of the nice things about Git is that it is a fast (not just local but fast—these people care about speed) lookup program for all things relating to the code.
I want all immediately useful code information inside Git. Because then I can look it up quickly. Unlike having to go to at least two different web applications (PRs and issue tracker) and find the info there, often in an inferior and more convoluted format.
But it takes surprisingly little to sell back centralization and lock-in to developers, even when working on top of a decentralized tool.
> But it takes surprisingly little to sell back centralization and lock-in to developers, even when working on top of a decentralized tool.
I don't care about centralisation / decentralisation in my work. What I care about is that I have the information I need to do my job.
> Unlike having to go to at least two different web applications (PRs and issue tracker)
PR descriptions can be part of your merge commit message so I don't know why you need to go to a web application if you don't want to. You can also read the full diff in git diff so I don't really see what you're upset about.
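A sketch of that (repo and description text invented): the PR description lands in the merge commit body, so `git log` alone carries it:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name dev && git config user.email dev@example.com
main=$(git symbolic-ref --short HEAD)
git commit -q --allow-empty -m "initial"
git switch -q -c feature
echo fast > resize.txt && git add resize.txt && git commit -q -m "speed up resize"
git switch -q "$main"
git merge -q --no-ff feature \
  -m "Merge feature: speed up resize. Full PR description body goes here: motivation, benchmarks, review notes."
git log -1 --format=%B        # the whole description, no web UI required
```

Many forges will pre-fill exactly this message from the PR description if you ask them to.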
From my experience, nobody teaches you Git properly in school. And once you get your first job, it is too late.
We had various lectures on languages, models, math, algorithms, networking... absolutely nothing on git (I did my classes between 2008 and 2013, things might have changed now)
Same case here. Almost every new guy on my team needs training, because they basically only know how to use whatever VS Code exposes as a git interface, and only some of them can replicate the same flow in the terminal. Rebase? That's a forbidden command for them, while our entire company uses it on a daily basis.
It's so frustrating. We had a reasonably okay process going on, but then another team started working in our repository on their feature completely ignoring the fairly lightweight commit message format we ask them to uphold, and they were like "doesn't matter, we'll squash merge it". But it's not even about the commit messages; it's about the workflow they have behind it, where they commit and push as often as they hit ctrl+s, which to me communicates they're just hacking around until it works.
A commit should be atomic; it should be a complete change with code and tests all adjusted, and ideally a message explaining what and why it changes. (In practice / in my line of work this doesn't happen because it's all front-end code implementing some poorly documented user story in jira, but hey).
If I'm ever involved in hiring again I'll add git usage to my list of criteria. Along with whether they can actually touch type. I can't believe that the standards have dropped so far that basic computer skills are no longer necessary apparently.
I share your frustration, personally. But unfortunately Git is more user-unfriendly than it needs to be, and doing things the proper way tends to get you into a rabbit hole.
Like merging main back into your feature branch: just rebase. But then you often need to re-do conflict resolution. You have git-rerere but, eh, it’s not discoverable at all. But let’s say you get over that hurdle. Now the next obstacle is the “never rewrite shared history”. And if you’re in a “corporate” environment chances are that you publicize your branch when you make a PR. And it can take a few days to get approval.
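For reference, since it is so undiscoverable: rerere ("reuse recorded resolution") is one setting away (sketch in a throwaway repo):

```shell
cd "$(mktemp -d)"
git init -q
# With rerere on, git records how you resolved a conflict and silently
# replays that resolution when the identical conflict shows up again,
# e.g. on a repeated rebase. It's off by default:
git config rerere.enabled true
git config rerere.enabled    # prints: true
```

Set it with `--global` once and every repeated rebase conflict gets cheaper from then on.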
Now I care. But sometimes I have doubts about whether the caring is well-founded. Exactly because sometimes people around me seem to care not one bit.
I don't think this point is about configuring your editor/environment as much as just knowing it. There are certain features that come up all the time: find file, find class, go to implementation, find references, rename var, etc.
It stuns me how many devs do this stuff manually, when virtually all editors/ides have ways of doing these things.
In some ecosystems, the tooling is just not that good. To use ruby on rails as an example: if you lean into the IDE's expectations for code layout it works ok, but there will always be generated methods and variables with nothing to show but the name—no documentation, no source, no googleable identifier, nothing. In these cases there's nothing to do but pull out the rails (or library) source to try and discern their intention.
In my experience, lisp also has this issue of being very difficult to tool in a general sense, as did aspects of writing c/c++ years ago (maybe recognizing stuff like macro-generated symbols has improved by now).
> lisp also has this issue of being very difficult to tool in a general sense
Most of the Common Lisp tooling is already present in the language itself. Things like inspect, trace, describe and apropos already give you the equivalent of most IDEs. I agree with you about some dynamic languages and magic methods; it can be hard to trace back the exact function being called. But you can always build some tooling for it as long as the code follows the ecosystem's conventions (the Laravel plugin in PhpStorm, for example).
The nice thing about Vim (and other configurable editors) is how easy it is to mesh existing tooling with the editor itself, without requiring that extension to be a whole project unto itself.
This is a disadvantage of the more dynamic languages, I find. IDEs for languages like C++ and Java are relatively fast and accurate when jumping to definitions, showing references, performing automated refactorings and so on. I rarely see that same precision for any editor or IDE working with TypeScript, Python, and other very dynamic languages, even with the improvements recently thanks to LSP and generally better tools. When almost anything could theoretically have been patched at runtime — even if doing so would be a terrible idea and every style guide says don’t do it — it’s not safe for tools to make the same kind of assumptions with the more dynamic languages that they can with the more static ones.
If you follow through to the study, they displayed products on a screen to people in an MRI scanner, and asked them to evaluate the offers they were seeing. Unsurprisingly, after about 40 minutes their minds start to wander. Because in some warped researcher's mind, being in the dark staring at a screen for an hour is exactly like walking through a supermarket.
Did anyone notice that the author talks about scams and cons, but when it actually comes to a list of examples the majority are actually about plagiarism?
Which... I largely don't care about. I understand why it's super important for academics, but in my book it's not a con or scam. It's accurate information. If somebody is giving me accurate information, the fact that they don't have correct citations isn't really a concern to me as a consumer of the data, and I absolutely don't put it in the same category as faking data or lying about results.
> Did anyone notice that the author talks about scams and cons, but when it actually comes to a list of examples the majority are actually about plagiarism?
It’s an article on a .edu written in the wake of one of the highest profile academic plagiarism scandals in a long time (Claudine Gay). It’s not an article targeted at general audiences, you have to read it in context.
The Claudine Gay plagiarism scandal has been difficult for academia because there were many reactionary responses trying to defend her, but after further investigation people are realizing that her plagiarism was something that would have gotten any average student in severe disciplinary trouble. This has refueled the conversation about everything from plagiarism to falsified data that has become a worrying trend in academia: People are getting duped at worryingly high rates and the initial response to uncovering the academic fraud is to deny and defend.
>Which... I largely don't care about. I understand why it's super important for academics, but in my book it's not a con or scam. It's accurate information.
This is not correct on a few levels, at least in the context of science. At the most basic, you're engaging in circular reasoning. You're accepting it's "accurate information" as a truism when the point of science is to discover what is actually "accurate information" in the absence of any oracles. Someone who is plagiarizing doesn't actually know whether what they're plagiarizing is accurate or not, by definition they haven't done the work. They don't know how it all connects together, and not only might what they're copying be wrong, they're more likely to introduce errors of context and omit qualifiers.
Tying into that is the issue of meta-information as well. One of the core foundations of science and of assessing whether information is accurate or not is replication. A second/third/fourth/etc. researcher independently reaching even 100% identical results is itself new information each time, even if the conclusion is the same. More independent replication raises the chance that a result is signal and decreases the chance that it was noise from some unaccounted-for variable unique to a given lab or researcher. Everyone makes mistakes, but even with zero mistakes, low-probability things can happen in any single given experiment/study/place. Diverse, distributed replication is a basic way to help discover/dismiss that.
A plagiarist in research is therefore, at a bare minimum, always engaged in a con/scam: they're claiming they have independently produced a result, which then adds to the weight and other people will be more likely to depend on. They have not.
Of course, they've also conned/scammed whatever money/time/resources anyone else contributed to them with the expectation of independent work and thus at a minimum new replication information. They took that, and then didn't follow through. It's fraud. And there is opportunity cost there since those are a zero-sum game, the resources that went to funding a plagiarist could have instead gone to fund someone honest who could produce something with actual ROI as expected.
Every single case of plagiarism can be fixed with some combination of quotation marks, proper citations, and rephrasing of relevant text. It is never a claim to have independently reproduced a result.
If a scientist claims to have independently reproduced an experimental result - and they haven't - that is outright fraud. It doesn't matter if they describe that experiment in original words, with proper citations and quotation marks/blocks.
>If plagiarizing is bad because the copied info might be wrong, then original research is bad because it might be wrong.
No, that's not what I wrote at all. Original honest research might be wrong and that's completely fine and understood; that's the entire point of replication! But if someone lies and says they've independently replicated it when they just plagiarized, that's fraud and wrong. Telling the world original research has been replicated when it hasn't been is different than a tentative first result. Depending on the area, nobody else may then attempt replication for a while, assuming it's already been done, and instead waste resources trying to build upon it only to later discover it was wrong. The original team has also been robbed of independent review they may have otherwise gotten, and they move on only to much later have the rug pulled out from under them. Depending on the nature of what caused the original research to be incorrect, it may have further ripple effects. Say some piece of equipment or new instrument has some subtle flaw that a true replication attempt would have discovered, but instead it then keeps getting used and causes other experiments to go awry.
And this is all in the generous case where the plagiarist at least pretends to have the same result as what they're copying, vs. hacking it into something else and generating total garbage.
> I also wasn't aware that "unit" referred to an isolated test
It never did. "Unit test" in programming has always had the meaning it does now: it's a test of a unit of code.
But "unit test" was originally used in electronics, and the meaning there was a bit closer to what the author suggests. The author is being a bit fanciful (aka lying) by excluding this context and pretending that we all don't really understand what Kent Beck et al. were talking about.
> I call them "unit tests" but they don't match the accepted definition of unit tests very well.
I'm not entirely certain it's fair to accuse the author of lying; ignorance derived from limited exposure to materials outside the bubble (rather than deceit) is the more likely culprit here.
(Not helped at all by the fact that much of the TDD/XP origin story is pre-Google, and requires a different set of research patterns to track down.)
This kind of reckless disregard for whether what you are saying is true is a kind of lie that is, if anything, even more corrosive than lies told by people who know the truth; at least they have a plan in mind to achieve a goal of some benefit to someone.