Spotted something a bit concerning on #wikipedia today: a user with 23 nonsensical-but-plausible-looking ChatGPT-created articles (all now deleted), and another with six. https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents#Suspected_hoax_content_and_LLM_use_by_User:Gyan.Know - we didn't get the immediate rush of nonsense we were expecting, but there's definitely some seeping in
@generalising so, uh Andrew, there's something we need to tell you 😉
@generalising I hadn't thought of this but am sure others have...
It's more than "neutral POV". Because Wikipedia is such a massive trove of well-written, openly licensed text, it's heavily used in the training of a lot of these models. So... it's going to sound like AI, because *it's in the AI*.
I hadn't thought about this specifically creating a problem re detecting AI-generated submissions to Wikipedia with current token-detection methods, but it makes perfect sense that it would. ...ugh.
@generalising ... content generated by AIs that were trained on "human-written in neutral style" texts ...
@generalising spotted the first attempts at using ChatGPT to gain open-source contribution credits too
@mrtinto and let us now flash forward four years, when we trace the original source of the newest catastrophic ssh bug discovery
@generalising I have been waiting for that shoe to drop...
@afamiglietti79 I knew there was a little of it burbling around, I just hadn't realised how much of it had turned up! Argh. Enough to drive you to drink.
84 more sitting in the draft queue at the moment. Goodness knows how many quietly deleted. Really not wild about this...