Work: I've been in libraries since 2006; among other things, I spent a year as the #WikipedianinResidence at the British Library, wandering around and telling people how nice the internet was, and another five years in #polarlibraries. I now mostly do #scholcomm and #bibliometrics.
#introduction post!
I'm a Scottish librarian working in London; primarily what I write about here is my #wikipedia and #wikidata work.
The biggest chunk of that is the #wikidataMPs #ParliamentaryHistory project - trying to build a rich dataset of historic parliamentarians, and figure out what interesting things it can tell us.
[checks calendar] 39
https://mastodon.flooey.org/@generalising/112243275438177956
Interesting how this article gently avoids mentioning the denominator (it works out at about 1/1000 under investigation, and 1/3000/year dismissed). https://www.theguardian.com/politics/2024/apr/15/home-office-staff-under-criminal-investigation-freedom-of-information
This turns out to be about the same order of magnitude as "adults first convicted of an indictable offence each year", ergo people working for the Home Office are approximately just as likely to do crimes as any random section of the population without prior criminal records. Guess there wasn't space to include that.
my work on LLMs has made it to Nature! (Not a sentence I ever expected to write.) https://www.nature.com/articles/d41586-024-01051-2
The system interface the film used is natural language queries against a large database trained on reading published information, which all sounds familiar, but I'm not sure what they'd have thought on being told that it would take us 65 years to get there and even then the machine sometimes makes things up.
There were, I think, 52 in 2017-19 - the Brexit purges by the Conservatives - but the five Parliaments before that each only had 5-10 people suspended or resigning the whip.
Have we got higher standards or shadier MPs? (Or possibly both)
back on the #wikidataMPs coalface today and backfilling the various people who've changed parties over the past term.
There are a lot more than I thought - I make it 38 distinct MPs who either resigned the whip or had it taken away? (One of those was for only a couple of hours - the party suspended him before he could announce his resignation)
“So that’s where you look for aliens. In the course of an eclipse totality track. When everybody else is looking awestruck at the sky, you need to be looking round for anybody who looks weird or overdressed, or who isn’t coming out of their RV or their moored yacht with the heavily smoked glass.”
Where to look for #alien tourists – from Iain (M) Banks’s 2009 novel TRANSITION
#Scottish #literature #IainMBanks #IainBanks #sciencefiction #totality #Eclipse #Eclipse2024
More on LLMs and peer reviews: https://www.404media.co/chatgpt-looms-over-the-peer-review-crisis/
(Back to work tomorrow, & to revising the paper. I feel it's going to be a race to keep up.)
“You Are All On The Hobbyists Maintainers’ Turf Now”
I’ve been saying for a while that commercial software today is fundamentally about extracting value from OSS (I leave out the “F” intentionally). Most code in the software people interact with, even on closed platforms, is open https://www.softwaremaxims.com/blog/open-source-hobbyists-turf
#2913 Periodic Table Regions
Cesium-133, let it be. Cesium-134, let it be even more.
https://xkcd.com/2913/
Since I seem to have accidentally reminded a lot of people about 'outwith' this week https://www.scottishpoetrylibrary.org.uk/poem/outwith/
Good news: the results of an intensive international project have let us delay having to implement technically complex "negative leap seconds"...
Bad news: ...that project is humanity melting the icecaps
huh, this is neat! someone did an AI-detector-tool based analysis looking at preprint platforms, and released it on exactly the same day as mine. Shows evidence for differential effects by discipline & country. https://www.biorxiv.org/content/10.1101/2024.03.25.586710v1
Is there more we could look at here? Definitely. Test for different tells - the list here was geared to distinctive words *on peer reviews*, which have a different expected style to papers. Test for frequency of those terms (not just "shows up once"). Figure out where they're coming from (there seems to be subject variance etc).
Glad I've got something out there for now, though.
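A rough sketch of what "test for frequency of those terms (not just 'shows up once')" could look like, in Python. This is not the code behind the preprint: the marker words are hypothetical placeholders (the posts here don't list the real ones), and `marker_stats` and `summarise` are names made up for illustration. It reports both the share of documents with any hit and the mean hits per 1,000 words.

```python
import re
from collections import Counter

# Hypothetical marker words, for illustration only; the actual list is not given in these posts.
MARKER_TERMS = ("intricate", "meticulous", "commendable")


def marker_stats(text, terms=MARKER_TERMS):
    """Count marker-term hits in one document, plus a rate per 1,000 words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w in terms)
    hits = sum(counts.values())
    return {
        "present": hits > 0,  # the simple "shows up at least once" test
        "hits": hits,         # how many times marker terms occur in total
        "per_1000_words": 1000 * hits / len(words) if words else 0.0,
    }


def summarise(docs, terms=MARKER_TERMS):
    """Aggregate over a corpus: it is the totals, not any single paper, that matter."""
    stats = [marker_stats(d, terms) for d in docs]
    n = len(stats)
    return {
        "share_with_any_hit": sum(s["present"] for s in stats) / n if n else 0.0,
        "mean_hits_per_1000_words": sum(s["per_1000_words"] for s in stats) / n if n else 0.0,
    }


docs = [
    "A meticulous and intricate study of meticulous methods.",
    "We measured the samples twice.",
]
print(summarise(docs))
```

Comparing the two numbers year on year would show whether the words are turning up in more papers, more often per paper, or both.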
Is it getting worse? You bet. Difficult to be confident for 2024 papers but I'd wildly guess rates have tripled so far. And it's *March*.
Is this a bad thing? You tell me. If it's a tell for LLM-generated papers, I think we can all agree "yes". If it's just widespread copyediting, a bit more ambiguous. But even if the content is OK, will very widespread ChatGPT-ification of papers start stylistically messing up later LLMs built on them? Maybe...
Can we say any one of those papers specifically was written with ChatGPT by looking for those words? No - this is just a high level survey. It's the totals that give it away.
Can we say what fraction of those were "ChatGPT generated" rather than just copyedited/assisted? No - but my suspicions are very much raised.
Isn't this all a very simplistic analysis? Yes - I just wanted to get it out in the world sooner rather than later. Hence a fast preprint.
Librarian and occasional researcher. Opinions of course my own. Scholarly communications, historic MPs, Wikipedia, inter alia other things. Misplaced Scot.