Maybe it's just because I grew up spending way too much time on the internet, but I write like that and have since well before LLMs. As much as people like to attribute that style to AI, I don't think it's the dead giveaway that people act like it is.
I find the constant critique of punchy style a bit tiring. It would be more productive for the grandparent to think about the content and state an opinion.
There is a reason why such a pattern is frequent in LLM-generated text.
Any good human-written text that provides useful information is likely to highlight, in this or equivalent ways, the contrast between what the reader is expected to incorrectly believe and the reality.
When the reader already knows what the text has to say, that text is superfluous.
Therefore a text that provides new and unexpected information, i.e. a useful text, must use some means to show readers the error of their ways.
It may use simple juxtaposition like "it is not ... it is ...", or it may be more verbose and add "but", "however", "nonetheless", etc.
I believe that it is counterproductive to use this kind of pattern as a method for detecting AI-written texts, because it is normal for it to exist in useful human-written texts.
What should be commented on is whether the claim is true, i.e. whether the second part with "it is ..." is indeed true, or whether the whole pattern is superfluous because every expected reader is already aware that the first part with "it is not ..." is true.
Sometimes I feel like we are entering a new witch hunt era, but for LLM-generated text. Before clicking submit, I am sometimes afraid that the text will be labeled "LLM Generated" even though it's not. Enough people classify you as a witch and you get burnt. Though in this case you only receive nasty comments, downvotes and possible social media bans.
Edit: In my observation, opinions that disagree with yours get labeled as "AI Generated" more often than opinions that agree.
We need to stand up against this by refusing to adapt. Let them scream. They are wrong. I refuse to tune texts into less-fine-tuned form just to avoid being labeled LLM output.
> When code production gets cheap, the cost doesn't disappear. It migrates.
I'm surprised people aren't taking the time to edit this very specific kind of phrasing out of their writing. It's such a common AI tell now that, even when writing by hand, I'd just avoid it entirely.
Then again, I hated that LLMs co-opted the em-dash, and I refuse to stop using it, so I suppose I get it.
> to edit this very specific kind of phrasing out of their writing
Even without touching moral/ethical/normative reasons, it's impractical. LLMs will continue to incorporate the most popular phrasings and grammar, and touchy readers will simply pivot to a new "telltale" du jour.
Eventually any personal or organic writing will be gone, as everyone twists themselves into an artificial form: "the inverse of the LLM."
> Michael Bolton: "No way, why should I change? He's the one who sucks."
Why would they have to? Just to avoid being accused of using a slop machine? If that is the only criticism you have against LLM-produced text, then there is no problem.
And I'm saying this as somebody who is strongly against LLM-generated content of this form.
But I do have somewhat of a problem with unedited text. Personally, I even take the time to edit my HN comments.
And, for the same reason I'd have a problem watching the same episode of the same show every day, I have a problem with reading text that feels like a super derivative clone of tons of other writing. Which is usually what you get when you don't edit your AI-generated text.
But the question was about somebody who does write the text themselves, who edits it themselves, no AI has ever touched it, but the result still has elements of what AI text typically has. Because it's their style. Why should such people have to adapt? Just so they don't end up in a witch hunt? How about texts older than 2, 5, 10 years? Should they be changed too? And how about if "LLM style" changes over time?
Like clockwork, every single thread about something AI-related has someone expressing their disgust at passages of LLM-written text. In many cases by the same people who are enthusiastically embracing LLM-generated software. Why don't we show the same level of contempt for LLM-authored software as we do for even the slightest hint of LLM-authored text in a blog post?
We don't like LLMs throwing giant walls of code in PRs at repos and expecting devs to read and respond to all of them.
That's kind of similar to written content being posted and linked. There's an expectation that you are asking someone to take time to read it, and with LLMs now the cost to generate things to be read is a lot lower but our attention and capacity to read them remains the same.
> We don't like LLMs throwing giant walls of code in PRs at repos and expecting devs to read and respond to all of them.
One giant PR versus dozens of smaller ones, what's the difference? LLMs are going to send it your way whether you like it or not. No one is going to argue that usage of LLMs is going to lead to less code that has to be reviewed than normal, are they? It's by design since you're able to produce more code now, remember?
> There's an expectation that you are asking someone to take time to read it, and with LLMs now the cost to generate things to be read is a lot lower but our attention and capacity to read them remains the same.
I could understand this argument if this had been a 500 word blog post expanded out to 50K words, but it's not. And who's to say the author didn't write most of it and just had an LLM do a little polishing?
I don’t like humans throwing accusations that something was written by an LLM if they don’t like it. The constant insinuations that us machines are the ones with poor taste is fookin’ tiresome.
The user interacts with the code, and if it's sloppy AI generated code, it's going to impact the user somehow. Be it through poor performance, bugs, security holes, you name it.
Maybe I was naive in thinking the bar was higher than "as long as I can't tell an LLM wrote it that's good enough for me."
Users interface with programs, which are code. And even if you don't think that matters, do instances of "it's not X, it's Y" in a blog post make the text less readable? You could make a compelling argument that many people's prose is greatly enhanced by running it through an LLM, yet unlike in the case of code there's nothing but contempt for that.
Forming a human opinion about slop is like asymmetrical warfare. Or maybe a closer analogy is a Gish Gallop. It can be generated with way less effort than it takes to comprehend it, much less form a coherent opinion on it.
It matters whether something is written using an LLM, even if we put aside the ethical aspects. Firstly, if your text is deadly boring to read, your point might not get across optimally, and one might simply not be interested in reading slop. Secondly, you might just be reading the LLM's opinion, and I'm not interested in that either. Thirdly, even if you are only using the LLM as an assistant, we know that your opinion itself may be influenced by its suggestions, and since you are still under the impression that you are writing yourself (which you somewhat are, I'm not saying otherwise), you may internalize the suggestions as your own opinion. There are recent (probably imperfect) studies about this stuff.
I'm fortunate enough not to have been knowingly exposed to LLM-generated code of any significant size yet, and I haven't run into studies about this (to be fair, I'm not actively looking for them, although I'd be quite interested). I imagine the stakes are quite different for code; it's not really about opinions. On the topic of boringness, I'm afraid I don't have the required experience to know how LLM-generated or LLM-assisted code feels when reading it.
In particular, I have never been curious enough to visit one of those vibe-coded weekend projects' repositories and peek at the code. Now that I think about it, maybe I should! Thanks for making me reflect on this.
I am concerned that one day I'll run into a PR that superficially looks good but is badly structured in ways that aren't immediately obvious, or that has subtle errors due to the author not knowing well what they are doing. And in the longer term, that a code base with too many such contributions ends up fragile and difficult to work with.
In any case, I suppose I'll be looking for places to work where LLMs are tolerated little or not at all, should they remain a notable thing in the longer term.
> Wikipedia provides a structured process for subjects to suggest changes without directly editing articles. The mechanism centers on the article's "talk page", a discussion area attached to every Wikipedia entry where editors coordinate improvements.
Reminds me of the Ship of Theseus thought experiment, the variant where neurons are replaced by logic gates one by one and you ask at exactly what point consciousness stops existing.
Say it takes 10 minutes to shave a beard, and you have to shave it every 3 days: in one month, that's 100 minutes. In 10 months, that's 1,000 minutes. Two years gets you 2,400 minutes, or 40 hours. And that's just a beard. Spending 42 hours to get it done once seems like a great deal in comparison!
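If you want to sanity-check the math, here's a quick back-of-the-envelope script (assuming 30-day months; the constants are just the ones from the comment above):

    # Back-of-the-envelope shaving math; assumes 30-day months
    # and the numbers quoted above (hypothetical, obviously).
    MINUTES_PER_SHAVE = 10
    DAYS_BETWEEN_SHAVES = 3
    SHAVES_PER_MONTH = 30 // DAYS_BETWEEN_SHAVES  # 10 shaves

    for months in (1, 10, 24):
        minutes = months * SHAVES_PER_MONTH * MINUTES_PER_SHAVE
        print(f"{months} months: {minutes:,} minutes ({minutes / 60:.0f} hours)")

    # 1 months: 100 minutes (2 hours)
    # 10 months: 1,000 minutes (17 hours)
    # 24 months: 2,400 minutes (40 hours)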
That's not true, I'm quite sure most repos on GitHub have neither many stars, nor forks, nor multiple contributors.