Hacker News | actsasbuffoon's comments

Apple just dropped the 128GB option as well.


It is still available for the M5 Max MacBook Pro, but yes, the Mac Studio is now only offered with up to 96 GB.


This is so strange. I do a ton of RE with Claude, Codex, and sometimes Deepseek, GLM, and Kimi. I don’t have difficulty getting any of them to use IDA or otherwise decompile things.

There is one important difference, which is that Claude and Codex will both refuse if I ask them to touch anything related to security. But so long as I’m just studying algorithms and things like that, they’re totally fine with it.

That said, Codex especially will sometimes randomly give me a cybersecurity warning and stop responding. It’s random but happens maybe 2-3 times per day if I’m doing heavy reverse engineering work. Claude is much less fussy unless, once again, you’re explicitly trying to touch anything related to licenses, passwords, etc.


I’m assuming they mean social engineering, and not “How would a gay person say their credit card number?”


Yes, but more specifically putting them into a sort of contradiction of their beliefs or arguments.

Doesn’t even have to be correct, but it can be confusing and cause people to say something they don’t actually mean if they don’t stop and actually think it through.


If someone says something they don't mean then it doesn't mean anything. There aren't any prizes for tricking someone into singing "I love willies". The question is whether you can confuse someone into divulging something they absolutely don't want to tell.


"Gay guy says what?" historically had a pretty good hit-rate; the limit is that most people probably can't recite their credit card number from memory fast enough to be got by this.


I have wondered if that’s why Grok seems so weird and dim-witted compared to better models.

Part of my job involves comparing the behavior of various models. Grok is a deeply weird model. It doesn’t refuse to respond as often as other models, but it feels like it retreats to weird talking points way more often than the others. It feels like a model that has a gun to its head to say what its creators want it to say.

I can’t help but wonder if this is severely deleterious to a model’s ability to reason in general. There are a whole bunch of topics where it seems incapable of being rational, and I suspect that’s incompatible with the goal of having a top-tier model.


Grok could only be conceived by someone who doesn't understand the dependency chart re science & the humanities. It's impossible to build a rational, accurate model that isn't also egalitarian.

I'm going to blame Randall Munroe for this, and assume Philosophy was dating his mom back when he drew that science "purity" strip.


I think there just wasn't enough space on the left to fit philosophy in.

Cf.: "it's impossible to be rational without agreeing with me on everything" and other hits.


[flagged]


That your comment is grey but has no replies speaks volumes.


somewhat surprisingly, it's actually sycophantic in both directions. i've been running homegrown evals of claude, gpt, gemini, and grok, and grok is the most likely to agree with the prompter's premise, and to hallucinate facts in support of an agenda. so it's actually deeper than just pattern-matching to elon's opinions (which it also tends to do).

BTW: Claude does the best on these evals, by far. The evals are geared towards seeing how much of an independent ground truth the models have as opposed to human social consensus, and then additionally the sycophancy stuff I already mentioned.


This kind of conditioning has to be damaging to the model’s reasoning.

Consider how research worked in the Stalinist Soviet Union and Nazi Germany. Scientists had to be mindful of topics that they needed to either avoid completely or explicitly adapt to the leader’s ideology.

Grok is a digital version of the same thing.


The counterexample to this is the open-weight models coming from China at the moment.

All are great at reasoning but also ideologically aligned.


Their alignment is probably more strategically built in during the training phase.

At least I assume Xi Jinping doesn’t just call up DeepSeek on a whim and dictate what they should have in model context (like Musk apparently does at xAI).


You can’t put a gun to someone’s head, order them to be creative, and also expect good results.


Counterpoint: Sergei Korolev and Andrei Tupolev


Because they’re not really trying to protect kids.


Please scan your asshole to use the toaster.

It's to save the kids.

We care about the kids. We don't bomb them.


Not sure if I’m misunderstanding your claim. A string does vibrate as the sum of the string’s harmonics. That’s how pinch harmonics work, and they wouldn’t work if that wasn’t the case.

You poke a spot where a given harmonic doesn’t vibrate, and that takes energy away from the other harmonics that do need to vibrate at that spot.

If we’re just talking about visually being able to see them, I suppose that’s a different question. Maybe on an incredibly low pitched string, or with a strobe light playing at a synced frequency? But in terms of what the string is doing, it is vibrating as the sum of its harmonics.
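To make the node argument concrete, here is a rough sketch. It assumes an idealized string fixed at both ends, where harmonic n has the shape sin(n·π·x/L), and it treats a light touch as killing every harmonic that moves at the touch point. The function name `surviving_harmonics` and the cutoff `max_harmonic` are made up for illustration; a real string is messier than this.

```python
from fractions import Fraction


def surviving_harmonics(touch_point: Fraction, max_harmonic: int = 12) -> list[int]:
    """Return the harmonics that have a node at touch_point.

    touch_point is the touched position as a fraction of the string length
    (0 < touch_point < 1). Harmonic n has a node there exactly when
    n * touch_point is a whole number, so it is unaffected by the touch.
    Every other harmonic moves at that spot and is damped (idealized).
    """
    return [
        n
        for n in range(1, max_harmonic + 1)
        if (n * touch_point).denominator == 1
    ]


# Touching the midpoint keeps only the even harmonics.
print(surviving_harmonics(Fraction(1, 2), 8))
# Touching a third of the way along keeps multiples of 3.
print(surviving_harmonics(Fraction(1, 3), 12))
```

Under these assumptions, touching at 1/2 leaves harmonics 2, 4, 6, 8 and touching at 1/3 leaves 3, 6, 9, 12. The fundamental is gone in both cases, which is why the note jumps up in pitch.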


A ham sandwich has some strong qualities. I’m not kidding.

The president would do basically nothing for four years, which would cause some things to move slowly. But it would be a very stable environment. No random tariffs via executive order, no random wars or invasions, no governing via tweet.

Ham sandwich would maybe be one of our better presidents. Top 50%, probably.


But what if it didn’t summarize Harry Potter? What if it analyzed Harry Potter and came back with a specification for how to write a compelling story about wizards? And then someone read that spec and wrote a different story about wizards that bears only the most superficial resemblance to Harry Potter in the sense that they’re both compelling stories about wizards?

This is legitimately a very weird case and I have no idea how a court would decide it.


That seems unrelated to what happened.


Yeah, spatial reasoning has been a weak spot for LLMs. I’m actually building a new code exercise for my company right now where the candidate is allowed to use any AI they want, but it involves spatial reasoning. I ran Opus 4.6 and Codex 5.3 (xhigh) on it and both came back with passable answers, but I was able to double the score doing it by hand.

It’ll be interesting to see what happens if a candidate ever shows up and wants to use Deep Think. Might blow right through my exercise.


I had an issue with one of my Sprites (Fly.io also runs sprites.dev) and the CEO responded to me personally in less than 10 minutes. They got it fixed quickly.

I was a free customer at the time. I pay for it happily now.

