Hacker News | HardCodedBias's comments

The LLM+Harness mostly helps with execution.

These are new products (generally) and that's a different class of problem.

Since LLM+harness helps with execution, it is possible that we will see more experiments.


Even then we should be able to see things that previously were not possible because they took too much effort.

For example NPCs in games that have complexity that previously was not possible.

Good games often push the boundaries a bit, so should be a good example.

Of course now we can start arguing that there isn't a lot of investment into gaming currently, because it all goes into AI. Too bad.


we're still at least 3 years too early for that. games usually are in a 5+ year dev cycle, so even if AI made gamedev 2x faster, we're still not at the point where the first opus 4.5 games are out

There are massive machines filled with reactants under high pressure and cryogenic temperatures.

It is amazing that this doesn't happen more often.


I won't forget that a bold trio went out to the pad during the Artemis I countdown to tighten some bolts for the launch: https://www.nasa.gov/humans-in-space/artemis-red-crew-team-h...


Imagine being Google and paying billions to GDM just to get mogged.


Oh boy.

GDM is making (or has been backed into a corner into making) the bet that high throughput, low latency, low capability models are the path forward.

That probably works for vibe coded apps by non-practitioners.

I suspect that practitioners/professionals will wait longer for better results.


Where do you see that it’s low capability?

And Google is trying to make something affordable enough for a mass market, ad-supported audience.

They aren’t hyper focused on enterprise like Anthropic is. And that’s okay. There’s room for different players in different markets.


Price up (cost up?), benchmarks down. Latency down.

So, who is this for? People that want more ads and worse output, but want it faster? Sounds pretty awful to me.


The following dates for the start of the statute of limitations were the most plausible:

1. 2019 capped-profit restructuring + 1B MSFT investment

2. 2023 Microsoft expansion / reported 75%-then-49% economics

3. 2024/2025 PBC restructuring

AFAIK it has not been reported as to exactly what the jury found, but IIUC the 2019 date is consistent with their findings.

That's poor for Musk, but it makes sense. He was arguing 2023. I think it is a valid argument.

But he had to know that 2019 was very much in play (and is likely the most logically consistent).

This is very squishy law.


I think that your first statement is correct.

I do not think that your second was correct.


O.J. Simpson comes to mind.


He was entirely correct.

He made a follow up after the pushback by GDM.

Google’s businesses are very broad and durable. But Google being the only company in the world without access (except for GDM+labs) to a competent coding agent will take a toll.

We’ll see how long Google can hold out hoping for GDM to create something that is competitive.

I’m guessing that within 6 months Google will give up on coding and finally let their devs use Claude/Codex.

This isn’t a security problem, this is a GDM issue with GDM’s promises being far beyond their ability.


> But Google being the only company in the world without access (except for GDM+labs) to a competent coding agent will take a toll.

I doubt it. I use Gemini CLI daily because Gemini is what work pays for, and I have a personal Claude account. The difference is not that great, especially if you're not doing full vibe-coding. It's unlikely to have the kind of effect you're describing.


I agree with the fact that your company has quota and allocates some to you.

Gemini will conduct seconds to minutes of work before requesting aid. And it will commonly fall over.

Claude/Codex will commonly do minutes to hours of work.

The difference is one to two orders of magnitude. It is immense.


I look forward to seeing that play out in the market, if true. But from what I've seen, it really isn't.

If you're talking about the ability to churn out low-stakes systems like websites, or variations on existing widely available systems, then perhaps. But once you get to more complex systems, especially large already existing systems, all LLMs today need significant ongoing assistance to prevent them from going off the rails and down rabbit holes. At that point, the advantage you're claiming tends to evaporate.


This one is different? What about unitree? What about their demo at the Spring Festival Gala?

https://www.youtube.com/watch?v=Ykiuz1ZdGBc

That sure felt "different".

No doubt hands are important, but I think you've missed a lot here, Wired.


Many of the Chinese companies are doing very impressive open-loop sim2real. They make great demonstrations. They are not great at dealing with the real world and unpredictable environments.

(That's not true of all Chinese companies - some are doing really impressive work with closed loop systems in unpredictable environments. But many of the highly viewed ones with coordinated dance performances or martial arts are intended more as theater to government financial sponsors than useful function. The technically impressive performances do not look as visually impressive.)


those were impressive but were also RC. I think an important part of robotics is not just the mechanics of humanoid motion, but the independent control of those mechanics.


Can you expand on what was RC? Was the compute off device?


I use ChatGPT all of the time, but the model backing the voice model (or its settings) is intensely stupid.

If Grok is actually good here, they will have a customer!


I could be wrong but I think the voice mode that chatgpt uses is still a 4.something model.


Most people don't understand how powerless police are to find criminals. That they catch them at all is often amazing. I have firsthand knowledge of this from a tragic loss in my family. The investigation was severely hindered because investigators could not utilize cell location data, despite knowing someone was present at the scene. Police spent an extensive amount of time trying to identify them without success. When the identity was eventually discovered through entirely different avenues, it confirmed the individual had a cell phone on them. The location data would have resolved the identification trivially. We should enable this capability and put strict "guardrails" on its use.


I have no doubt this geo fencing data solves crimes and I don't even think it's as bad as e.g. the long surveillance in Carpenter.

The problem is that the police are going to start using it like they do with much more precise DNA data, and more innocent people are going to be caught in the net.

The bar to convict someone (or, more likely, to convince an innocent person to take a plea deal) is not as high ("beyond a reasonable doubt") as some people think. Get caught apparently contradicting hard data or even a witness and there goes your reasonable doubt.


Huh? This is how it already works, cell companies themselves have this data. And they sell it.

The "strict guardrails" don't work. Never did.


You are entirely incorrect.

Here is the LLM's summary of the current legal issue at hand:

Attempting to determine the identity of an unknown individual co-located with a victim at a specific time requires a reverse-location query. Because the Supreme Court has not yet established a unified national doctrine for these searches post-Carpenter, lower courts are highly fragmented. Many magistrates systematically refuse to authorize geofence warrants or tower dumps, citing the lack of individualized probable cause for the peripheral, innocent devices swept up in the geographic net.

And indeed, in my case, the police were not able to conduct this geofenced investigation (which would have instantly identified the person).

