Evidence is provided, of a sort. The first link goes to a report by a journalist who interviews redditors who claim they are doing this and talk about why.
Reddit is filled with shameless habitual liars who claim to be airline pilots in one thread and plumbers in another. The incentive structure of Reddit, the internet-point Skinner box, incentivizes shameless lying and "creative writing".
There are many, many bot posts. Mostly re-posting other highly-upvoted comments from related threads, or re-posting previously posted pictures/videos/links, but bots are certainly farming karma.
It's hard to be sure. Just as in the (possibly apocryphal) quote from a Yosemite Park Ranger, "There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists" there's considerable overlap between the best text generation bots and the dumbest Redditors.
Those with "karma" systems are particularly susceptible, including HN. But I think Reddit is even worse in this regard than HN because it has many times more users (usernames are less likely to be recognized across threads), and Reddit makes it into more of a game with various kinds of flair and other 'rewarding' baubles.
One of those students doesn't mention using GPT to generate essays. They only mention generating lists and other short-response questions. I find that believable.
The other student mentions essay writing, but also says that they "didn't ace the essay" (no mention of the grade).
So, the article linked literally isn't evidence for the claim.
I agree it's not very good evidence that there's a real problem here, the articles and report are more of a good starting point for interesting discussion. On the other hand, the report isn't literally zero evidence either. There are students stating that they're doing this, even if they don't name GPT-3 specifically (does it really matter what model they use?).
> There are students stating that they're doing this
But there literally aren't. There are not students, quoted in that article, stating that they are "acing their homework by turning in machine-generated essays". Literally. There aren't.
I don't doubt that this is possible, in some sense, but the details really matter. Per my original comment:
>>> Pedagogically, this matters. Think about calculator usage. There's a huge difference between allowing use of a TI-83 on a Calculus assignment with lots of word-heavy application problems and allowing use of Wolfram Alpha on a Calculus assignment that's "integration recipe practice".
What was the assignment? What was the purpose of the assignment? What were the grading standards?
E.g., I have assigned homework that could be completed by a combination of Copilot and GPT-2. That homework was graded on a very coarse rubric. Today, a student could get an A on that assignment using GPT-2 and Copilot. If I were still teaching today I would not worry about it because:
1. they're only cheating themselves
2. they will still fail the course if they don't learn the material
3. it would save very little time to use those tools for these assignments. Maybe 5-10 minutes max, across a total of 5-10 assignments over an entire semester that are collectively worth less than 1% of the final grade. So it's an hour saved and a negligible portion of their grade, one that will almost certainly be completely washed out in the curve/adjustments at the end of the semester (I don't do knife's-edge end-of-course grade assignments -- I identify clear bifurcations in the cohort and assign final letter grades to each bifurcation).
I believe Copilot and GPT can do those assignments. I'm also 100% confident that those tools cannot complete -- and can barely even help with -- the assignments that actually counted toward students' grades.
So, again, the context matters. Not all assignments are assessments and not all assessments need to be cheat-proof.
Acing a term paper that's 50% of the grade means something.
Acing a paper designed as an opportunity to practice and graded mostly for completion -- but with plenty of detailed feedback in preparation for a term paper -- doesn't really mean anything and really only cheats the student of feedback prior to the summative assessment.
This, btw, is why I'm more interested in what educators are saying than what students are saying. The teacher's intent for the assessment and the grading rubric matter a lot when determining what "getting an A" means. Acing a bulleted list graded for completion is possible with a 1990s Markov chain.
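To make that last point concrete: a word-level Markov chain of the kind you could run in the 1990s is nothing but a lookup table of "which words followed this word in the corpus," sampled randomly. A minimal sketch (the corpus and function names here are mine, purely for illustration):

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each `order`-word prefix to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, start, n_words=20, seed=0):
    """Walk the chain from `start`, picking a random successor each step."""
    rng = random.Random(seed)
    out = list(start)
    key = tuple(start)
    for _ in range(n_words):
        successors = chain.get(key)
        if not successors:  # dead end: prefix never seen mid-corpus
            break
        nxt = rng.choice(successors)
        out.append(nxt)
        key = key[1:] + (nxt,)
    return " ".join(out)

corpus = ("the report is a good starting point for discussion "
          "the report is not strong evidence the discussion matters")
chain = build_chain(corpus)
print(generate(chain, ("the",), n_words=8))
```

Output like this is locally plausible and globally incoherent, which is plenty for a bulleted list graded only for completion, and nowhere near enough for an assessment graded on substance.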