More

baalimago · 2026-05-03T18:28:42 1777832922

My solution to this is to have leveled alerts. Some are... recommendations, the ones which you look at with a glance to get a heads up about something being wrong. These are the ones which OP would claim cause alert fatigue, most likely.

Then I have a second level of this, the superpanic. Here is the "true" alert, which means "drop all things, fix this now". On every superpanic, there are stricter routines which intentionally cause friction, such as creating tickets about said superpanic, potentially hosting post mortems etc. This additional manual labour encourages tweaking the levels of the superpanic so that they sometimes are more lack, sometimes stricter, depending on the quality of the deployed services + the current load.

What signals a superpanic? Key valuable functionality being offline. Off-site uptime-checkers assuring that all primary domains resolve + serve traffic, mostly. Also crontime integration tests of core functionality. Stuff like that.

mystifyingpoi · 2026-05-03T19:09:16 1777835356

> there are stricter routines which intentionally cause friction, such as creating tickets

While this sounds sensible, in my experience it often becomes just a convoluted punishment for people involved in the alert firing. In general, people are lazy (sorry), and if alert makes them fill up post-mortem forms and attend mandatory late meetings with management why something got triggered for any reason - 99% of people will push to remove the alert altogether, or at least lower the priority. I haven't found a solution that doesn't include a complete overhaul of organization in the enterprise.

baalimago · 2026-05-02T10:42:43 1777718563

Conclusion from reading HN comments: very few used the calculator to actually calculate anything

a96 · 2026-05-02T17:18:15 1777742295

Wouldn't be hacking to use things for their intended purpose.

baalimago · 2026-04-23T18:15:54 1776968154

Worth the 100% price increase over GPT-5.4?

cbg0 · 2026-04-23T18:26:56 1776968816

For less than 10% bump across the benchmarks? Probably not, but if your employer is paying (which is probably what OAI is counting on) it's all good.

It's kind of starting to make sense that they doubled the usage on Pro plans - if the usage drains twice as fast on 5.5 after that promo is over a lot of people on the $100 plan might have to upgrade.

jstummbillig · 2026-04-23T18:29:39 1776968979

You are paying per token, but what you care about is token efficiency. If token efficiency has improved by as much as they claim it did (i.e. you need less tokens to complete a task successfully) all seems well.

mangolie · 2026-04-23T18:33:11 1776969191

Not for coding because it actually needs to read and write large files

baalimago · 2026-04-23T18:58:06 1776970686

Well, sort of. Imagine the case where it first scans the repo, then "intelligently" creates architecture files describing the project. The level of intelligence will create a varying quality of summary, with varying need of deep-scans on subsequent sessions. Level of intelligence will also increase comprehension of these architecture files.

Same principle applies when designing plans for complex tasks, etc. Token amount to grasp a concept is what matters.

jstummbillig · 2026-04-23T19:21:18 1776972078

Tbf, I have not super kept track of what is actually happening inside the "thinking" portion of recent releases. But last time I checked there still was a lot of verbosity and mistakes, that beat the actual amount of required, usable code generation by a wide margin.

cbg0 · 2026-04-23T18:32:40 1776969160

If it uses half the tokens to complete a task, then doubling the cost is perfectly fine. But is that actually true?

2001zhaozhao · 2026-04-23T18:36:29 1776969389

This happens with every new model release though. The model makes less mistakes and spends less time fixing them, resulting in a token usage reduction for the same difficulty of task. Almost any task other than straight boilerplate will benefit from this.

In the same vein, I would guess that Opus 4.7 is probably cheaper for most tasks than 4.6, even though the tokenizer uses more tokens for the same length of string.

jorl17 · 2026-04-23T18:49:12 1776970152

Maybe you'll have better luck but our team just cannot use Opus 4.7.

Some say it goes off on endless tangents, others that it doesn't work enough. Personally, it acts, talks, and makes mistakes like GPT models, for a much more exorbitant price. Misses out on important edge cases, doesn't get off its ass to do more than the bare minimum I asked (I mention an error and it fixes that error and doesn't even think to see if it exists elsewhere and propose fixing it there).

I've slowly been moving to GPT5.4-xhigh with some skills to make it act a bit more like Opus 4.6, in case the latter gets discontinued in favour of Opus 4.7.

cbg0 · 2026-04-23T18:40:31 1776969631

Doesn't look like it's cheaper, better or uses fewer tokens: https://www.reddit.com/r/Anthropic/comments/1stf6fz/one_week...

YMMV, I know.

user34283 · 2026-04-24T08:20:51 1777018851

Based on my experience with Claude Code on the $20 plan I would not think so.

Opus 4.7 would blow through the session limits in 2-4 prompts. It was a noticeable further decrease in usage quota, which was already tight before.

Based on Anthropic‘s description 4.7 was trained to think longer.

With GPT 5.5 yesterday, I felt it completes task noticeably faster than 5.4. I kept the xhigh effort setting.

jstummbillig · 2026-04-23T19:20:47 1776972047

We'll find out!

baalimago · 2026-04-22T06:41:20 1776840080

"Benchmarks" aside, do anyone actually use these image models for anything?

medlazik · 2026-04-22T07:04:36 1776841476

Look around? It's everywhere. Try talking to a graphic designer looking for a job theses days. Companies didn't wait for these tools to be good to start using them.

razorbeamz · 2026-04-22T08:33:36 1776846816

Here in Japan every fucking food truck uses them for pictures of their menu, which really pisses me off because it's not representative of their food at all.

sumedh · 2026-04-22T12:11:45 1776859905

People are using them for creating marketing material for their business.

croisillon · 2026-04-22T06:50:17 1776840617

MAGA to show how terrible Europe is ;)

baalimago · 2026-04-20T13:21:17 1776691277

If you're into this, check out the Apollo 11 gallery[1]. Contains very high-quality images from the actual trip, including "not so common" pictures from the moon.

[1]: https://www.nasa.gov/gallery/apollo-11/

roycomputer · 2026-04-20T14:02:21 1776693741

"very high-quality" sorry am i missing something? i can only see a max res of 1920 in any dimension

baalimago · 2026-04-20T19:14:11 1776712451

Ah, my bad. I thought that was the source, but seems like the DB I found the high quality versions was elsewhere. Search for the "PIA number" of the images to find the high-res versions, example:

https://www.nasa.gov/image-detail/amf-as11-40-5915/

https://www.google.com/search?q=as11-40-5915

https://www.lpi.usra.edu/resources/apollo/frame/?AS11-40-591...

indiandennis · 2026-04-20T18:23:35 1776709415

I was also confused about this when trying to download some artemis photos to use as wallpapers. The galleries default to the "large" size, but if you copy the PIA number of an image and search at https://images.nasa.gov/ you can get the much higher resolution "original" quality.

Insanity · 2026-04-20T14:44:29 1776696269

Missing context :P These are images from 60-ish years ago, so about HD quality is definitely high quality for the time.

TremendousJudge · 2026-04-20T15:06:36 1776697596

the film they were using for these pictures allows for way higher resolution scans than 1080p

baalimago · 2026-04-14T06:23:59 1776147839

Sounds like a merge-conflict nightmare

baalimago · 2026-03-26T07:07:13 1774508833

You can tell it's an AI by it not becoming utterly by playing the "game". I could personally not stand any more than the first level.

baalimago · 2026-03-23T06:44:18 1774248258

"Loving" any OS is strange to me. It's just a tool. I don't love my kitchen knife, or car. Nor do I love my computer, or any application on it.

Web3, Rust, NixOS. The holy trinity of cult-like appreciation. I do wonder what brings forth such fanaticism.

globular-toast · 2026-03-23T07:14:36 1774250076

I love my kitchen knife and I love Emacs. More love is a good thing. Unless you're the kind of person who thinks not loving is better because you've got nothing to lose.

baalimago · 2026-03-23T07:42:03 1774251723

To me, loving inanimate "trivial" things diminishes the value of love. I love my girlfriend and my pets. I like my kitchen knife and my car. To bunch up both into the same category confuses things into "which one do I love the most", some sort of spectrum of love.

In the case of a fire, I'm sure you wouldn't prioritize your laptop with NixOS over your cat (let's imagine that the only backup is in the house that's on fire).

kgwxd · 2026-03-23T11:06:24 1774263984

Do you "love" your girlfriend or your pets more? If your girlfriend starts requiring you to view ads before talking, would you still "love" her?

globular-toast · 2026-03-23T08:30:54 1774254654

No, you don't have to rank things, there's your mistake. Stop ranking things.

baalimago · 2026-03-23T10:30:38 1774261838

If I say "Girlfriend, I love equally to my operative system" I'm in for a world of trouble.

globular-toast · 2026-03-23T11:32:50 1774265570

Don't say it then. Nobody is forcing you to rank anything.

baalimago · 2026-03-21T13:09:06 1774098546

The evolution of this is to use agents, and have users "chat with the data"

mattaitken · 2026-03-21T13:17:51 1774099071

Yes, you can actually do this already because we expose a REST API and TypeScript SDK functions to execute the queries.

baalimago · 2026-03-18T05:41:25 1773812485

The idea, as I understand it, is not to run edgejs multitenant in the sense that have multiple tenants under the same edgejs process. Instead, you spawn one edgejs process for each tenant. So in the openclaw example each sandboxed call would be a new edgejs process.

billionverify · 2026-03-18T13:48:44 1773841724

You mean the gateway? I see, but what I concern not only multitenant or gateway process, agents need tools, that brings more challenge to entire runtime.