Pinned post

Work: I've been in libraries since 2006; among other things, I spent a year as the at the British Library, wandering around and telling people how nice the internet was, and another five years in . I now mostly do and .

Pinned post

post!

I'm a Scottish librarian working in London; primarily what I write about here is my and work.

The biggest chunk of that is the project - trying to build a rich dataset of historic parliamentarians, and figure out what interesting things it can tell us.

Interesting how this article gently avoids mentioning the denominator (it works out at about 1/1000 under investigation, and 1/3000/year dismissed). theguardian.com/politics/2024/

This turns out to be about the same order of magnitude as "adults first convicted of an indictable offence each year", ergo people working for the Home Office are about as likely to do crimes as any random section of the population without prior criminal records. Guess there wasn't space to include that.

Andrew boosted

Before trusting an AI to tell you about stuff you don’t know, ask it to tell you about things you’re an expert in.

The system interface the film used is natural language queries against a large database trained on reading published information, which all sounds familiar, but I'm not sure what they'd have thought on being told that it would take us 65 years to get there and even then the machine sometimes makes things up.

This evening's film: Desk Set (1957), a late Hepburn/Tracy comedy set against the looming spectre of computers coming & taking our jobs.

(Taking *her* job, anyway. He is the computer guy. You can guess how it goes.)

There were, I think, 52 in 2017-19 - the Brexit purges by the Conservatives - but the five Parliaments before that each had only 5-10 people suspended or resigning the whip.

Have we got higher standards or shadier MPs? (Or possibly both)

back on the coalface today and backfilling the various people who've changed parties over the past term.

There are a lot more than I thought - I make it 38 distinct MPs who either resigned the whip or had it taken away? (One of those was for only a couple of hours - the party suspended him before he could announce his resignation)

Andrew boosted

“So that’s where you look for aliens. In the course of an eclipse totality track. When everybody else is looking awestruck at the sky, you need to be looking round for anybody who looks weird or overdressed, or who isn’t coming out of their RV or their moored yacht with the heavily smoked glass.”

Where to look for #alien tourists – from Iain (M) Banks’s 2009 novel TRANSITION

@bookstodon

#Scottish #literature #IainMBanks #IainBanks #sciencefiction #totality #Eclipse #Eclipse2024

Andrew boosted

Solar eclipse alt-text:

MOON SUN
MOONSUN
MOONUN
MOONN
MOON
SMOON
SUMOON
SUNMOON
SUN MOON

More on LLMs and peer reviews: 404media.co/chatgpt-looms-over

(Back to work tomorrow, & to revising the paper. I feel it's going to be a race to keep up.)

Andrew boosted

“You Are All On The Hobbyists Maintainers’ Turf Now”

I’ve been saying for a while that commercial software today is fundamentally about extracting value from OSS (I leave out the “F” intentionally). Most code in the software people interact with, even on closed platforms, is open softwaremaxims.com/blog/open-s

Andrew boosted

#2913 Periodic Table Regions 

Cesium-133, let it be. Cesium-134, let it be even more.
xkcd.com/2913/

Andrew boosted

You know what the biggest problem with pushing all-things-AI is? Wrong direction.
I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.

Since I seem to have accidentally reminded a lot of people about 'outwith' this week scottishpoetrylibrary.org.uk/p

Good news: the results of an intensive international project have let us delay having to implement technically complex "negative leap seconds"...

Bad news: ...that project is humanity melting the icecaps

nature.com/articles/d41586-024

huh, this is neat! someone did an AI-detector-tool based analysis looking at preprint platforms, and released it on exactly the same day as mine. Shows evidence for differential effects by discipline & country. biorxiv.org/content/10.1101/20

Is there more we could look at here? Definitely. Test for different tells - the list here was geared to distinctive words *on peer reviews*, which have a different expected style to papers. Test for frequency of those terms (not just "shows up once"). Figure out where they're coming from (there seems to be subject variance etc).
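
To illustrate the difference between "shows up once" and frequency, here is a small sketch, assuming plain-text input and a very simple word tokeniser. The function name `term_stats` and the example terms are placeholders for illustration, not the list or code used in the analysis.

```python
import re

def term_stats(text, terms):
    """Return (terms present at least once, occurrences per 1,000 words)."""
    # Crude tokeniser: lowercase runs of letters only (an assumption).
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty text
    per_thousand = {t: 1000 * words.count(t) / total for t in terms}
    present = {t for t, rate in per_thousand.items() if rate > 0}
    return present, per_thousand

# Made-up sample text and placeholder terms, purely for illustration.
sample = "A meticulous and commendable study. Truly meticulous work."
present, per_thousand = term_stats(
    sample, ["meticulous", "commendable", "intricate"]
)
```

The presence set treats one mention the same as ten; the per-1,000-words rate is what would separate an occasional use from a saturated one.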

Glad I've got something out there for now, though.

Is it getting worse? You bet. Difficult to be confident for 2024 papers but I'd wildly guess rates have tripled so far. And it's *March*.

Is this a bad thing? You tell me. If it's a tell for LLM-generated papers, I think we can all agree "yes". If it's just widespread copyediting, a bit more ambiguous. But even if the content is OK, will very widespread ChatGPT-ification of papers start stylistically messing up later LLMs built on them? Maybe...

Can we say any one of those papers specifically was written with ChatGPT by looking for those words? No - this is just a high level survey. It's the totals that give it away.

Can we say what fraction of those were "ChatGPT generated" rather than just copyedited/assisted? No - but my suspicions are very much raised.

Isn't this all a very simplistic analysis? Yes - I just wanted to get it out in the world sooner rather than later. Hence a fast preprint.

I looked at 24 words that were identified as distinctively LLMish (interestingly, almost all positive) and checked their presence in full text of papers - four showed very strong increases, six medium, and two relatively weak but still noticeable. Looking at the number of these published each year let us estimate the size of the "excess" in 2023. Very simple & straightforward, but striking results.
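
As a rough illustration of the "excess" idea (a sketch under stated assumptions, not the code behind the preprint): take the share of papers containing a word in some baseline years, apply that share to the 2023 total to get an expected count, and subtract it from the observed count. The function name `estimate_excess`, the flat-baseline assumption, and every number below are hypothetical.

```python
from statistics import mean

def estimate_excess(matches, totals, target_year, baseline_years):
    """Observed matching papers minus the count a flat baseline predicts."""
    # Average share of papers containing the word in the baseline years.
    baseline_share = mean(matches[y] / totals[y] for y in baseline_years)
    # Papers we would expect in the target year if nothing had changed.
    expected = baseline_share * totals[target_year]
    return matches[target_year] - expected

# Hypothetical counts, made up purely for illustration.
matches = {2019: 100, 2020: 110, 2021: 120, 2022: 130, 2023: 400}
totals = {2019: 10000, 2020: 11000, 2021: 12000, 2022: 13000, 2023: 14000}

excess = estimate_excess(matches, totals, 2023, [2019, 2020, 2021, 2022])
```

With these made-up figures the baseline share is 1%, so about 140 papers would be expected in 2023 and roughly 260 of the 400 count as excess. A flat baseline is the simplest choice; a word whose use was already rising would need a trend fitted instead.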
