No evidence is provided. I'm curious whether the author actually has evidence or is just asserting that this is possible and likely happening.
IME, and I have lots of "E" (including training and finetuning my own models), this probably isn't entirely true in most writing-heavy courses. You can use the largest language models, in an interactive fashion, to generate portions of half-decent papers. Maybe even "ace" papers on certain subjects with certain grading criteria (namely, "looks reasonable", not "fine-toothed comb"). But in many courses the essays will be extremely low-quality, and even in the happy cases saying that the essay is "machine-generated" is eliding a lot of manual effort.
I agree that students are probably using LLMs for their homework, but I'm skeptical that they are all getting As on assignments that are designed as big assessments, or that the essays are actually fully machine-generated. I bet a lot of students -- the laziest ones -- are getting "WTF is this essay even about... did you have a stroke while writing this?!" feedback if they are using LLMs to generate essays whole-cloth.
Pedagogically, this matters. Think about calculator usage. There's a huge difference between allowing use of a TI-83 on a Calculus assignment with lots of word-heavy application problems and allowing use of Wolfram Alpha on a Calculus assignment that's "integration recipe practice".
Yeah, I could believe it "aces" homework in a really open-ended writing assignment. An assignment like: write an essay explaining a personal experience and what it meant to you. The people this is the biggest issue for at the moment are probably writing instructors, since the goal of those classes is just to practice writing something/anything. In computer science, though, the writing turned in to my classes that I suspect is LLM-generated usually gets an F. It tends to just ramble about the subject in general and not hit any of the specific points that I'm asking for.
Last year I had a take-home exam in an operating systems class that I suspect one student fed entirely as prompts to an LLM, and it was... odd. The answer to every question was a paragraph or two of text, even in cases where the expected answer was true/false, or a number. And even when I did want text as the answer, it was way off, e.g. in one I asked them to explain one strength and one weakness of a specific scheduling algorithm on a given scenario. The submitted answer was just general rambling about scheduling algorithms. Some of this is probably within the reach of an expert using clever prompting strategies, but students who can do that could probably also answer the original question. :-)
To be fair, I have seen the "ramble generically on the subject of the question" strategy manually implemented by humans too, in the hopes that if you throw enough BS at the question you might get partial credit by luck. Maybe designing assessments to be LLM-resistant will have the nice side benefit of reducing the viability of BSing as a strategy.
I used to have students who would write answers like that on in-class exams.
Every answer was at least one full, complete sentence, even for yes/no or true/false. And the “short answer” responses filled all available space when one sentence would do.
My only conclusion is that some undergraduate institutions around the world must be intentionally drilling it into their students to do this.
I suspect it starts in high school. A lot of AP subjects with written portions, like AP Biology or history, are really hard to grade at scale, so they have a relatively naive scoring system. The answer can be a total rambling mess, but as long as it is self-consistent (it doesn't contradict itself), it gets points for any relevant information it gets right.
For example, if the question is about respiration, a rambling answer that mentions "electron transport chain", "Krebs cycle", and "ATP" might get 3/5 points even if it doesn't make much sense otherwise, as long as the answer doesn't confuse the Calvin and Krebs cycles or otherwise contradict itself, like saying that glucose is a byproduct.
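To make that scoring style concrete, here's a rough sketch of the kind of keyword-style rubric I mean; the terms and point values are made up for illustration, not the actual AP scoring guide:

    # Rough sketch of a naive keyword-style rubric (hypothetical terms and
    # point values): one point per relevant term mentioned, zero credit if
    # the answer contradicts itself on a key fact.
    RUBRIC_TERMS = ["electron transport chain", "krebs cycle", "atp",
                    "glycolysis", "mitochondria"]
    CONTRADICTIONS = ["calvin cycle", "glucose is a byproduct"]
    MAX_POINTS = 5

    def score(answer: str) -> int:
        text = answer.lower()
        if any(c in text for c in CONTRADICTIONS):
            return 0
        hits = sum(term in text for term in RUBRIC_TERMS)
        return min(hits, MAX_POINTS)

    # A rambling answer that name-drops three relevant terms still earns 3/5.
    print(score("...ATP is made somewhere after the Krebs cycle and the "
                "electron transport chain does stuff with oxygen..."))  # 3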
I was told by multiple teachers/professors that it's never acceptable to write anything other than a full sentence on a test (unless it's a Scantron, obviously). Not sure how common this is, but they could have been trained by other instructors.
I think students also believe they can hedge. If they just put down "yes" or "no" then their answer might be completely wrong, but if they drop a bunch of things in the answer then some of those things might be true and you might give partial credit, or, at least, they can argue about it later.
It's possible. I've had professors who always gave true/false questions with instructions to either "justify your answer" or "if false, justify your answer".
Practically speaking, there is fairly little downside to putting extra into your answer, as tests are normally scored by how many points on the grading rubric you hit.
> To be fair, I have seen the "ramble generically on the subject of the question" strategy manually implemented by humans too, in the hopes that if you throw enough BS at the question you might get partial credit by luck.
This is the basic speech strategy of politicians. Don't answer the question asked, just talk about something related that you want to talk about.
I don't think it'd do well even for an open-ended assignment. The best language models I've seen are still easy to detect as bots if you read multiple paragraphs of output.
> To be fair, I have seen the "ramble generically on the subject of the question" strategy manually implemented by humans too, in the hopes that if you throw enough BS at the question you might get partial credit by luck.
I had a college professor who recognized this and actively warned against it on the mid-term and final.
He said that every question would be answerable in 2 or 3 sentences, and that if you wrote 2-3 paragraphs instead, he would mark you down even if the answer was correct, because you were wasting his time and might have dropped in correct statements that answered the question by luck.
So often in school, we'd get quizzes/tests back, I'd peek over at someone else's paper as it was being handed back, and I'd notice they wrote an entire paragraph where I answered in a single sentence and got full credit. I was always left wondering what the hell they wrote about.
When my parents were in school, they hand-wrote essays and used typewriters. Correcting a mistake meant rewriting an entire page! When they needed to research something, it meant spending a day in the library manually searching for quotes/citations. When I was in school I had a rudimentary spellchecker, Microsoft Word, and Wikipedia.
Now a grade school student has access to Grammarly. In a few years they'll probably have automated fact-checking and text generation.
What will happen? My bet is that we’ll expect a lot more from students a lot earlier.
Evidence is provided, of a sort. The first link goes to a report by a journalist who interviews redditors who claim they are doing this and talk about why.
Reddit is filled with shameless habitual liars who claim to be airline pilots in one thread and plumbers in another. The incentive structure of Reddit, an internet-points Skinner box, incentivizes shameless lying and "creative writing".
There are many, many bot posts. Mostly they re-post other highly-upvoted comments from related threads, or re-post previously posted pictures/videos/links, but bots are certainly farming karma.
It's hard to be sure. Just as in the (possibly apocryphal) quote from a Yosemite Park Ranger, "There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists" there's considerable overlap between the best text generation bots and the dumbest Redditors.
Sites with "karma" systems are particularly susceptible, including HN. But I think Reddit is even worse in this regard than HN because it has many times more users (usernames are less likely to be recognized across threads), and Reddit makes it into more of a game with various kinds of flair and other 'rewarding' baubles.
One of those students doesn't mention using GPT to generate essays. They only mention generating lists and answers to other short-response questions. I find that believable.
The other student mentions essay writing, but also says that they "didn't ace the essay" (no mention of the grade).
So, the article linked literally isn't evidence for the claim.
I agree it's not very good evidence that there's a real problem here, the articles and report are more of a good starting point for interesting discussion. On the other hand, the report isn't literally zero evidence either. There are students stating that they're doing this, even if they don't name GPT-3 specifically (does it really matter what model they use?).
> There are students stating that they're doing this
But there literally aren't. There are not students, quoted in that article, stating that they are "acing their homework by turning in machine-generated essays". Literally. There aren't.
I don't doubt that this is possible, in some sense, but the details really matter. Per my original comment:
>>> Pedagogically, this matters. Think about calculator usage. There's a huge difference between allowing use of a TI-83 on a Calculus assignment with lots of word-heavy application problems and allowing use of Wolfram Alpha on a Calculus assignment that's "integration recipe practice".
What was the assignment? What was the purpose of the assignment? What were the grading standards?
E.g., I have assigned homework that could be completed by a combination of Copilot and GPT-2. That homework was graded on a very coarse rubric. Today, a student could get an A on that assignment using GPT-2 and Copilot. If I were still teaching today I would not worry about it because:
1. they're only cheating themselves
2. they will still fail the course if they don't learn the material
3. it would save very little time to use those tools for these assignments. Maybe 5-10 minutes max, for a total of 5-10 assignments over the course of an entire semester that are collectively worth less than 1% of the final grade. So it's an hour saved and a negligible portion of their grade that will almost certainly be completely washed out in the curve/adjustments at the end of the semester (I don't do knife's-edge end-of-course grade assignments -- I identify clear bifurcations in the cohort and assign final letter grades to each bifurcation; a rough sketch of what I mean is below).
I believe Copilot and GPT can do those assignments. I'm also 100% confident that those tools cannot complete -- and can barely even help with -- the assignments that actually counted toward students' grades.
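Concretely, the "bifurcations" I mean are just the big gaps in the final score distribution: sort the scores, find the largest gaps between consecutive scores, and cut the letter grades there. A toy sketch of that bucketing, illustrative only and not my actual process or numbers:

    # Toy sketch of gap-based grade bucketing (illustrative only).
    # Sort the distinct final scores, find the largest gaps between
    # consecutive scores, and cut the letter grades at those gaps.
    def assign_grades(scores, letters=("A", "B", "C")):
        ordered = sorted(set(scores), reverse=True)
        # indices of the largest gaps between consecutive distinct scores
        biggest = sorted(
            range(len(ordered) - 1),
            key=lambda i: ordered[i] - ordered[i + 1],
            reverse=True,
        )[: len(letters) - 1]
        cutoffs = sorted((ordered[i + 1] for i in biggest), reverse=True)
        grades = {}
        for s in scores:
            letter = letters[-1]
            for rank, cut in enumerate(cutoffs):
                if s > cut:
                    letter = letters[rank]
                    break
            grades[s] = letter
        return grades

    print(assign_grades([93, 91, 90, 78, 77, 60, 58]))
    # {93: 'A', 91: 'A', 90: 'A', 78: 'B', 77: 'B', 60: 'C', 58: 'C'}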
So, again, the context matters. Not all assignments are assessments and not all assessments need to be cheat-proof.
Acing a term paper that's 50% of the grade means something.
Acing a paper designed as an opportunity to practice and graded mostly for completion -- but with plenty of detailed feedback in preparation for a term paper -- doesn't really mean anything and really only cheats the student of feedback prior to the summative assessment.
This, btw, is why I'm more interested in what educators are saying than what students are saying. The teacher's intent for the assessment and the grading rubric matter a lot when determining what "getting an A" means. Acing a bulleted list graded for completion is possible with a 1990s Markov chain.
> I bet a lot of students -- the laziest ones -- are getting "WTF is this essay even about... did you have a stroke while writing this?!" feedback if they are using LLMs to generate essays whole-cloth.
This comes across as very ill-informed. I suggest you actually use some of the AI essay-writing services, because their output is pretty indistinguishable from human writing at this point.
I was going to play your game, but this product requires a valid email address and phone number, so generating examples from this product and sharing them here without doxing myself to an unknown company requires way too much effort.
Maybe you can help by copying and pasting the first 50 pages of output from that model for the prompt shared by a user below:
write a 50 page paper describing the impact of Teutoburg Forest on Roman politics.