Hacker News | philbo's comments

If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks. Large dumps of code are basically unreviewable by humans, but it seems like a lot of people have forgotten about that when it comes to LLMs.

I think it's worse than that. At least if I dumped 5k LoC on you in 2021, you knew I had spent the time to write it, so it was "fair" to ask you to read it. But in 2026 I didn't write it, so you shouldn't read it.

I think it's less about "break it down" and more about "let's communicate at the same altitude."

I wrote a (bait-titled) post about it: https://tern.sh/blog/stop-reading-prs/


113 files +22913 −2423

305 files +15075 −13110

153 files +21934 −8698

125 files +28120 −2398

43 files +11188 −63

118 files +21564 −647

These are the largest (6 of 35) in the past 30 days. Added: 190079, removed: 39696 in the last 6 months

from one person.


I hope 99% of that was documentation and testing.

You aren't allowed to block PRs for being too large anymore. The objective that every engineer should be 2x/3x/5x more productive can only be achieved if you go totally lax on code reviews.

Because if all your SWEs produce 5x more code, it also means they have to review 5x more code. But LLMs don't really help with code reviews. Then it becomes a Metcalfian paradox unless you just rubberstamp PRs, which is what is expected of you.


it's pretty easy to point your terminal agent to your giant pr and ask it to break it up into small prs

if you're being asked to rubberstamp prs that's a management skill issue


Breaking up a giant PR can be a tedious, time-consuming hassle, and in the past I could sympathize in practice if someone had a giant PR they didn't have time to decompose once they got it working.

But it's also the exact sort of thing that LLMs are literally perfect for, in my experience, so there's really no excuse anymore. I've never seen Claude fail to turn a 5k PR into a well-decomposed Graphite stack.


Hell, I've hand-written a large PR as a single commit and then asked Claude to break it down for me at least once. But I think doing this task by hand is a tedious, time-consuming hassle not because it inherently has to be, but because the tooling for it has barely changed in the past 15 years.

Now you get not just the 5 LoC to review but a 5 page essay to read in the form of an auto-generated review as well. Which makes the submitter even more indignant when you start nitpicking things about how it's implemented.

It is not so much forgetting as acceptance that, when welcoming AI into a codebase, the code can no longer matter; all that matters is that the properties of the system are validated. That isn't a change that comes free, so nobody should be expecting magic. It is a different set of tradeoffs. There is no such thing as a panacea.

> all that matters is that the properties of the system are validated

I don’t think this is possible in practice without leaning on the stability of the code base.


How can the code no longer matter? It literally is the logic (not to mention performance, and reliability) of the software.

You might say in the same way that machine code stopped mattering when programming languages gained in popularity. Almost nobody will ever review machine code. I anticipate 90% of all programmers today wouldn't even know how. The move again is towards a higher level of abstraction; this time validation. Instead of describing how the program is to function, you define the properties of the system and let the fancy compiler figure out what the code should look like. If that means something that a human would call spaghetti, oh well.

I think they expect you to also use an LLM to review, and I bet they are doing exactly that when asked to review someone else's code.

That gets you 90% of the way there. So it only really works if you accept the cruft and the risks associated with that last 10%. I've been doing this day in, day out for the last few months, and no matter how extensive or how good we make the automated reviews, we still can't skip the manual ones.

There's really no diff between a rubber stamp and an LLM review, they both do the same thing.

In terms of knowledge sharing and gathering hard-won human context I agree, sort of. An LLM review can at least prompt some reasonable changes, catch performance issues, etc.

> If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks.

I would, and all my training at Google told me to do that. But what I found after I left that comfortable box was that somehow this kind of practice is acceptable in the industry at large and you're expected to just Deal With It(tm). 5k lines isn't even high by what I've seen.

Worse, the "code review" tools that people have access to in GitHub make it absolutely and totally unworkable to incrementally improve review. Messy merge commits full of "responding to code review" comments. Threads impossible to follow. Just bad tooling.

So a lot of shops, from what I've seen, are just yeeting it with very shallow reviews.

This is my observation pre agentic AI. LLMs just threw kerosene on that dumpster fire.


https://app.bluefriday.uk/

The nichest of niche social network clients. It's for people in one particular country, who watch one particular TV program, on one particular day of the week.

Now that the cost of writing software is zero, I love that my focus has moved from vain attempts to generate passive income to just building whatever random shit I feel like. Wish I'd made that choice earlier in life, but no worries!


For decades, engineers understood that large code reviews are harder than small ones. Out of both politeness and a desire to receive better code reviews, we learned to break our large changes into smaller chunks. Some engineers took things even further and replaced code reviews with pair programming. But then LLMs showed up and everyone seems to have forgotten those lessons.

They can still be applied now using coding agents, if you're willing to push back against the default setup and change your mode of thinking a little bit. Of course it doesn't help that an entire industry is dedicated to persuading us that maximizing token spend is the only way to get shit done.

I appreciate this probably seems like an extremist take, but I wrote some more about it here in case there's anybody out there who identifies with it:

https://philbooth.me/blog/agentic-coding-and-mental-models


> They can still be applied now using coding agents, if you're willing to push back against the default setup and change your mode of thinking a little bit. Of course it doesn't help that an entire industry is dedicated to persuading us that maximizing token spend is the only way to get shit done.

Yeah the problem is the executives and managers around us are demanding we ship massive features as quickly as possible, and I like having a job and dread having to find a new one in this market...


Give them some time to be slapped with the real AI invoices once Anthropic and OpenAI IPO.

Agree with this completely. This push for more autonomy is, I think, the completely wrong direction for how to use LLMs.

I want less code to maintain, not more code that I don't even fully understand.

I think research and very supervised coding with lots of guardrails is the way to actually gain productivity from these tools.


I think that's reasonable. My only gripe is that making small sets of changes is often faster to do by hand than waiting on LLM reasoning, so I've found it amounts to very little speedup.

I began this project as an exercise to learn Go before starting a new job, then continued it as an exercise to mess about with Claude Code. So development was LLM-assisted: the UI is 100% vibe-coded, the model is 100% vim-typed, the rest is a mix of both.

Online play does not require signup etc. Instead I create an ephemeral session when network games are created/joined and those sessions are deleted at the end of the game. Leaderboard state is entirely local.

Because the backend runs on a small instance, I've been quite aggressive with connection management. If the server goes a minute without hearing from a client (turn or heartbeat), it ends the game and awards the win to the opposing player.
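A minimal sketch of that timeout rule, assuming a two-player game that records when it last heard from each client. The `game` type, its fields, and the function names are hypothetical and are not the project's actual code:

```go
package main

import (
	"fmt"
	"time"
)

// heartbeatTimeout mirrors the one-minute rule described above.
const heartbeatTimeout = time.Minute

// game is a hypothetical two-player game; the field names are illustrative.
type game struct {
	players  [2]string
	lastSeen map[string]time.Time // last turn or heartbeat per player
}

func newGame(a, b string, now time.Time) *game {
	return &game{
		players:  [2]string{a, b},
		lastSeen: map[string]time.Time{a: now, b: now},
	}
}

// touch records that the server heard from a player (turn or heartbeat).
func (g *game) touch(player string, now time.Time) {
	g.lastSeen[player] = now
}

// forfeit reports the winner if a player has been silent for longer than
// the timeout. If both players are silent, the first one checked loses.
func (g *game) forfeit(now time.Time) (winner string, ok bool) {
	for i, p := range g.players {
		if now.Sub(g.lastSeen[p]) > heartbeatTimeout {
			return g.players[1-i], true
		}
	}
	return "", false
}

// demo simulates alice going quiet while bob keeps sending heartbeats.
func demo() (string, bool) {
	start := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	g := newGame("alice", "bob", start)
	g.touch("bob", start.Add(45*time.Second))
	return g.forfeit(start.Add(61 * time.Second))
}

func main() {
	if winner, ok := demo(); ok {
		fmt.Println("forfeit, winner:", winner)
	}
}
```

In a real server the `forfeit` check would presumably run on a ticker or per-connection deadline rather than being called by hand; passing `now` in explicitly just keeps the rule easy to test.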

It was a pretty fun project to work on, and I definitely wouldn't have finished it without the crutch of Claude Code to push me through some of the schlepp.



It's for the Scottish. It's in Iran's interests for Scotland to become independent because that would enforce change on the United Nations Security Council. The UK ceases to exist and loses its veto, then what happens on the UNSC after that is anyone's guess.


The UK doesn't cease to exist though, it just shrinks. Plus the USSR fragmenting and Russia (as the main constituent part, the nuclear power and the country the independent republics were happy to acknowledge as the continuation of the USSR) becoming the successor state is pretty well-established precedent for what happens when states fragment, whose legitimacy Russia probably doesn't want to contest too strongly...

General disruption in the UK would help the Iranian government a little, but I managed to click on one of the accounts before it was suspended, and its most popular tweets received very little interaction (and were pretty banal statements of independence support, indistinguishable from stuff thousands of completely normal Scottish people posted). I assume their attempts to seed wilder rumours were low effort and had very little success.


Russia was allowed to inherit the USSR seat on 3 conditions:

- It took on all the sovereign debt from the newly independent nations.

- It relinquished nukes that were left behind in Ukraine.

- The United Nations collectively agreed to it.

I don't think any of those things would happen in the UK's case. But of course it doesn't matter what you or I think. It only matters what _Iran_ thinks will happen if Scotland gains independence.


>The UK doesn't cease to exist though, it just shrinks.

Or as some wags have put it, when Northern Ireland unifies with the Republic of Ireland, and Scotland joins the EU as an independent state, the Rump UK (1) becomes the "Former United Kingdom of Wales and England" (2), or "FUK-Wangland" for short.

1) https://en.wikipedia.org/wiki/Rump_state

2) See FYROM: https://en.wikipedia.org/wiki/North_Macedonia#Naming_dispute


Russia didn't lose its veto when the USSR collapsed and neither would the UK lose it in such a case. If the UK was in danger of losing its veto it would never allow Scottish independence.


If Russia kept the Soviet UNSC seat when the Soviets collapsed then surely the UK keeps its seat if Scotland leaves.


It doesn’t matter whether independence is realized; what Iran wants is more time and effort spent on domestic disagreement, so less is available to support international engagement.

This is a tool of international competition and the U.S. and U.K. have been trying to do it to Iran (and others) for even longer than the reverse.


All that would happen would be Scotland would lose its influence over the veto.


The UK is a lot more than Scotland + England. The Welsh, Northern Irish and Isle of Manx would like to have a word with you, to name a few.


The Isle of Man isn't part of the UK?

https://en.wikipedia.org/wiki/Isle_of_Man


The Isle of Man is not part of the United Kingdom


[flagged]


Manx is the demonym for people from the Isle of Man. It's odd to see it written "Isle of Manx" in a list of other demonyms, but the word Manx itself is far from modern. https://en.wikipedia.org/wiki/Manx_people


It's the Isle of Man to the best of my knowledge, but the people, and language, are called Manx. Like the English are from England.


Let's not forget the Mancs are from England as well.



I was just thinking... "BugHog? The platform famously broken more often than not?"

We have a whole PostHog interface layer to mask over their constant outages and slowness. (Why don't we ditch them entirely? I, too, often ask this, but the marketing people love it.)


> Powered by Atlassian Statuspage

Is all I need to know.


A nice thing about the SAX approach is it lets you layer other APIs on top too. I did something like that in BFJ:

https://www.npmjs.com/package/bfj
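As a rough illustration of that layering (a sketch in Go, not BFJ's actual API): the standard library's `json.Decoder.Token` stands in for a SAX-style event source, and the hypothetical helpers `events`, `sumNumbers`, and `maxDepth` show two small APIs built on the same event stream.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// events is the low-level, SAX-style layer: it calls handle once per token
// without ever building the whole document in memory.
func events(r io.Reader, handle func(json.Token)) error {
	dec := json.NewDecoder(r)
	for {
		tok, err := dec.Token()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		handle(tok)
	}
}

// sumNumbers is one API layered on the event stream: it adds up every number.
func sumNumbers(r io.Reader) (float64, error) {
	var sum float64
	err := events(r, func(tok json.Token) {
		if n, ok := tok.(float64); ok {
			sum += n
		}
	})
	return sum, err
}

// maxDepth is another layer on the same events: it tracks nesting depth.
func maxDepth(r io.Reader) (int, error) {
	depth, deepest := 0, 0
	err := events(r, func(tok json.Token) {
		d, ok := tok.(json.Delim)
		if !ok {
			return
		}
		switch d {
		case '{', '[':
			depth++
			if depth > deepest {
				deepest = depth
			}
		case '}', ']':
			depth--
		}
	})
	return deepest, err
}

// demo runs both layered APIs over the same small document.
func demo() (float64, int, error) {
	const doc = `{"a": [1, 2, {"b": 3.5}], "c": "x"}`
	sum, err := sumNumbers(strings.NewReader(doc))
	if err != nil {
		return 0, 0, err
	}
	depth, err := maxDepth(strings.NewReader(doc))
	return sum, depth, err
}

func main() {
	sum, depth, err := demo()
	fmt.Println(sum, depth, err)
}
```

Neither helper knows how tokens are produced, so further APIs (a tree builder, a path matcher) could be added on the same `events` function without touching the parser.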


It sounds like there’s another failure here, which you could have documented. If the test team didn’t understand what they were meant to test, that’s a failure of communication. Simply saying “they were wrong” is not sufficient exploration of the failure so, if that’s the point your manager was making, I agree with them. Blaming a third party for misunderstanding is less useful than seeking to improve the clarity of your own communication.


I think this and other recent posts here hugely overcomplicate matters. I notice none of them provides an A/B test for each item of complexity they introduce, there's just a handwavy "this has proved to work over time".

I've found that a single CLAUDE.md does really well at guiding it how I want it to behave. For me that's making it take small steps and stop to ask me questions frequently, so it's more like we're pairing than I'm sending it off solo to work on a task. I'm sure that's not to everyone's taste but it works for me (and I say this as someone who was an agent-sceptic until quite recently).

Fwiw my ~/.claude/CLAUDE.md is 2.2K / 49 lines.


Indeed Anthropic’s best practices suggest keeping the CLAUDE.md relatively small.

