Claude Code is not the infra; the model is the infra. They changed settings to make their models faster, and probably cheaper to run too. Honestly, with adaptive thinking it no longer matters which model it is if you can dynamically make it do less or more work.
This article doesn't mention the moat of data gathering: frontier AI labs have a huge advantage in curating proprietary datasets from actual usage of their platforms.
This in turn lets them optimize their models for the long tail of tasks where open-weight models can't compete.
Another factor is that pure intelligence isn't enough; how the model communicates is a huge plus. An enterprise used to talking to Claude all day won't find it easy to switch to another model.
So the new implementation always operates at the line level, replacing one or more lines. That's not ideal for some refactorings, like rename, where search-and-replace is faster.
Edit: checking ohmypi, the model has access to str_replace too, so this is just an edit tool.
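The contrast between the two edit styles discussed above can be sketched roughly like this (hypothetical helper names; neither is any actual tool's implementation):

```python
# Hypothetical sketches of the two edit styles; names are illustrative only.

def replace_lines(text: str, start: int, end: int, new_lines: list[str]) -> str:
    """Line-level edit: swap out lines start..end (1-indexed, inclusive)."""
    lines = text.splitlines()
    lines[start - 1:end] = new_lines
    return "\n".join(lines)

def str_replace(text: str, old: str, new: str) -> str:
    """str_replace-style edit: substitute one exact, unique occurrence.
    For a rename, a short old/new pair per call site is cheaper than
    re-emitting every affected line in full."""
    assert text.count(old) == 1, "match must be unique"
    return text.replace(old, new, 1)
```

A rename with `str_replace` only needs the identifier and its replacement, while a line-level edit has to reproduce each whole line containing it.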
I had to make a small CSS change yesterday. I asked the LLM to do it, which took about 2 min. I also did it myself at the same time just to check and it took me 23 seconds.
Perhaps I'm off base here but it seems like the goal is:
1. allow an agent to run wild in some kind of isolated environment, giving the "tight loop" coding agent experience so you don't have to approve everything it does.
2. let it execute the code it's creating using some credentials to access an API or a server or whatever, without allowing it to exfil those creds.
If 1 is working correctly I don't see how 2 could be possible. Maybe there's some fancy homomorphic encryption / TEE magic to achieve this but like ... if the process under development has access to the creds, and the agent has unfettered access to the development environment, it is not obvious to me how both of these goals could be met simultaneously.
Very interested in being wrong about this. Please correct me!
You can accomplish both goals by setting up a proxy server to the API, and giving the agent access to the proxy.
You set up a simple proxy server on localhost:1234 that forwards all incoming requests to the real API; the crucial part is that the proxy adds the "Auth" header with the real auth token.
This way, the agent never sees the actual auth token, and doesn't have access to it.
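A minimal sketch of such a proxy, assuming a hypothetical upstream URL and token (and GET-only for brevity), could look like this:

```python
# Sketch of a header-injecting proxy. UPSTREAM and API_TOKEN are assumed
# placeholders; the agent talks only to localhost:1234 and never sees the token.
import http.server
import urllib.request

UPSTREAM = "https://api.example.com"   # hypothetical real API
API_TOKEN = "secret-token"             # lives only in the proxy's process

def inject_auth(headers: dict) -> dict:
    """Copy incoming headers, drop any agent-supplied auth, add the real one."""
    out = {k: v for k, v in headers.items()
           if k.lower() not in ("authorization", "host")}
    out["Authorization"] = f"Bearer {API_TOKEN}"
    return out

class ProxyHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward the agent's request upstream with the real credentials.
        req = urllib.request.Request(UPSTREAM + self.path,
                                     headers=inject_auth(dict(self.headers)))
        with urllib.request.urlopen(req) as resp:
            self.send_response(resp.status)
            for k, v in resp.getheaders():
                self.send_header(k, v)
            self.end_headers()
            self.wfile.write(resp.read())

# To run: http.server.HTTPServer(("localhost", 1234), ProxyHandler).serve_forever()
```

Stripping any Authorization header the agent sends is what keeps the two credential spaces separate: even if the agent tries to supply its own token, the proxy overwrites it.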
If the agent has full internet access then there are still risks. For example, a malicious website could convince the agent itself to perform malicious requests against the API (like delete everything, or download all data and then upload it all to some hacker server).
But in terms of the security of the auth token itself, this system is 100% secure.
How? You don't know what the LLM was trained on, and you don't know whether it has any biases.
Imo LLMs are a disaster for knowledge work because they act like a black box.