nowittyusername · 2026-06-13T21:11:35 1781385095

The simple answer is that Trump has a stick up his ass against Anthropic and is also fond of stock market manipulation. No need to get too deep when it comes to dealing with that orange shmuck.

downrightmike · 2026-06-13T21:36:58 1781386618

This is just another shakedown like with Tylenol etc, knock the product, lower the stock price and have a competitor hostile takeover, or get kickbacks

arcanemachiner · 2026-06-13T22:04:14 1781388254

This is a hypothesis, and a viable one.

But I caution you against drawing conclusions from your hypothesis and calling it a day, instead of taking in the available data and using it to broaden your understanding of what's actually happening.

This could be many things: a shakedown, Trump's pettiness, marketing kayfabe, an actual government reaction to a very weaponizable technology, and so on.

But if you call it "just another shakedown" and go about your day, then you're doing yourself a disservice, because the story is still unfolding and we don't have all the facts.

You don't actually have the full story, so don't delude yourself into thinking you do.

whattheheckheck · 2026-06-13T23:42:12 1781394132

It's been 10 years of historical abuse. You're a battered spouse in a bad relationship with the most audacious narcissist that has ever lived.

arcanemachiner · 2026-06-14T00:43:14 1781397794

I'm not American, and I definitely don't support Trump.

Care to spin the outrage wheel again and lob another unfounded insult at me?

At any rate, feel free to indulge in (plausible) conspiracy theories until further details of the story have emerged.

pksebben · 2026-06-14T04:07:16 1781410036

Real question: what are we supposed to do about that information delay when it directly enables corruption and usury? This is an ongoing issue with historical precedent: the repeal of Glass-Steagall, MKULTRA and COINTELPRO, Iran-Contra, Watergate; the list is indefinitely long.

All of these surfed in on that very temporal ambiguity, and on the fact that we have zero recourse in a plurality of cases - a situation that has eroded over time, not gotten better, and could feasibly be credited with a large part of the palpable social decay that real people are suffering from every day right now.

So what do we do about it? "Indulging in plausible conspiracy theories" could also be read here as "trying to get out ahead of this imminent yet unclear threat".

arcanemachiner · 2026-06-14T07:48:36 1781423316

There's nothing wrong with pattern recognition. I'm just cautioning against a knee-jerk reaction based on the current information, which is limited, and will become more clear over the next few days.

I don't think the outcome will be particularly unexpected (I assume that Anthropic will have to kiss the ring), but it's not yet clear what the outcome is. I mostly take issue with uninformed people claiming with ignorant confidence that they KNOW exactly what is happening in this scenario, which they, in all likelihood, do not.

So yeah, we all probably know how this will shake out to some degree, but those who claim they KNOW it's a "shakedown like with Tylenol" are just guessing, a.k.a making shit up. This may be part of the usual playbook, but it will likely have its own twists and turns, or could turn out to be something different altogether.

pksebben · 2026-06-14T19:53:55 1781466835

I'm with you on the uninformed / confident vector in general, but you do seem fairly level-headed so I still want your take on the question: what do we do about it? How do we approach the problem of "most of our problems stem from an intractable information asymmetry, and this could be existential"?

nowittyusername · 2026-06-06T15:52:11 1780761131

For me it was stable diffusion 1.5. Oh man that thing was the bees knees for me, imagination on a machine! at that time no UI, pure terminal commands, i didnt know jack shit about it and it looked like voodoo hacker-man stuff to me... well i persisted anyways because exploring the world of the infinite latent space was amazing. it was like seeing some weird other dimension.. anyways thats how i got addicted to image gen for like 2-3 years. i did it all, loras, fine-tunes, hypernetworks, got really technical with it, understood the fundamentals, etc... eventually decided to move on to LLM's as agents were obviously gonna be the future so here i am now building my own voice agent from scratch no sdk, etc... this tech is amazing and i love it. also we are all gonna be fucked because of it but what a ride!

nowittyusername · 2026-06-06T15:39:56 1780760396

ha, same. The main reason I was able to switch to and stay on linux was because codex was able to set it up for me and is still managing to this day all the stuff i need done on linux. i tried so many times to switch out of windows before but the difficulties of installing linux and managing all the dependencies, drivers and all the other stuff put the OS out of reach for me. Now I just tell codex to update the latest nvidia drivers for me and whatever else and not worry about doing any of that stuff manually.

nowittyusername · 2026-06-02T19:22:20 1780428140

I had often thought about how far I could get in life if I had no scruples or morals. And I think I could get really, really far.... But alas I don't like to lie, cheat or any of that jazz. I actually do care. Honestly it feels like a form of brainwashing. As a kid you are taught all of these things that cripple your growth in adulthood while the other guy uses that as an opportunity to enrich himself.

nowittyusername · 2026-05-19T20:52:29 1779223949

You get back as much as you put in. Just like with all generative tools the quality of the output depends on the quality of input. Slapping a prompt together will only get you so far, if you want the models to generate something really striking and unique you need to get your hands dirty. Gotta break out ComfyUI and build yourself a specific workflow, once you dig deep and understand how things are put together, why and so on, you can make really amazing stuff with any generative models. But you have to pay for that experience in patience and knowledge.

jplusequalt · 2026-05-20T02:59:41 1779245981

>Gotta break out ComfyUI and build yourself a specific workflow, once you dig deep and understand how things are put together, why and so on, you can make really amazing stuff with any generative models.

Where is this amazing stuff? Social media is a marketplace of ideas supposedly, so why haven't we seen a new wave of creators rise up in popularity?

nowittyusername · 2026-05-21T08:31:02 1779352262

Because there is a stigma about use of AI in creative spaces, the people that do use it to create very impressive pieces don't disclose that information on their profiles. People tend to see AI mentioned anywhere in the profile and automatically shit on the work regardless of its beauty or creativity. They don't consider the staggering amount of work that goes into the pics with all the control nets, custom hyperparameter tuning, custom finetuned lora's, and many other techniques like workflow chaining and such. They automatically assume someone only spent 5 seconds on some slop prompt and that's it. But I can assure you if no mention of AI is anywhere everyone who looks at the work is always impressed. So you have an observation bias situation going on. You see only AI slop because a. most of it is low effort slop and b. the good stuff you assume had no AI in it because it wasn't disclosed by the artist.

nowittyusername · 2026-05-04T03:50:16 1777866616

There are A LOT of misconceptions about llms, the biggest one being that they are not deterministic. They are 100% deterministic and temperature has nothing to do with it. You WILL get exactly the same result every single time (at ANY temperature) as long as you use the same sampling parameters (seed included) and server config parameters. What causes variance in LLM's is server parameters like batch processing and caching among a few other things possibly, the batching being responsible for most of the issues. The reason that flag is used is because large providers serve multiple customers per one gpu, and breaking up the vram is tricky and causes drift. If you start llama.cpp for example with only one person per slot and batching off, you will always get the same results every time even at temperature 1.2 or whatever other parameters because you are using one gpu per inference call so no fucky business there. The reason most people are unaware of this is because most people have experience only with the api instead of working with the actual inference engine itself so this god damned myth keeps spreading. my video for reference here where you can download and try for yourself. https://www.youtube.com/watch?v=EyE5BrUut2o
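
A minimal sketch of why batching can shift results, assuming the drift comes from floating-point addition being order-dependent (a common explanation, not something verified against any provider's serving stack). The helper names `add_left_to_right` and `add_in_chunks` are made up for illustration; the chunked version stands in for a reduction that gets split differently depending on batch layout.

```python
def add_left_to_right(values):
    """Accumulate floats strictly in order, one at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


def add_in_chunks(values, chunk_size):
    """Sum each chunk separately, then add the partial sums.

    Hypothetical stand-in for a reduction whose grouping depends on
    how work was batched.
    """
    partials = [
        add_left_to_right(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return add_left_to_right(partials)


values = [1e16, 1.0, -1e16, 1.0]

sequential = add_left_to_right(values)   # 1.0
chunked = add_in_chunks(values, 2)       # 0.0

print(sequential, chunked)
# Same inputs, same math on paper, different grouping, different answer.
print(sequential == chunked)             # False
# Same grouping twice gives the same answer: the arithmetic is deterministic.
print(add_in_chunks(values, 2) == add_in_chunks(values, 2))  # True
```

The point is only that a fixed grouping is repeatable while a changed grouping may not match; whether that is the dominant source of variance on a given server is a separate question.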

maplethorpe · 2026-05-04T04:42:29 1777869749

Thanks so much for this! I still haven't got around to building my own language model yet, so I'm a bit fuzzy on the details, but if I imagined a thought experiment where I did all the math by hand on paper, I just couldn't see how I would end up with a different output each time given the same inputs. Finding out that the variance other people are seeing comes from the server/hardware stuff clears that up.

This is a surprisingly annoying question to Google. A lot of articles give the reason that softmax returns a probability distribution, as if the presence of the word "probability" means the tokens will be different every time.
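
A small sketch of the pen-and-paper thought experiment, assuming a toy four-token vocabulary and made-up logits. `softmax` here is a pure function, and the only randomness is the sampler, which is reproducible when the seed is fixed. The function names and values are illustrative and not taken from any inference engine.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; same inputs always give same outputs."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(logits, temperature, seed, n=10):
    """Draw n token ids from the softmax distribution with a seeded RNG."""
    rng = random.Random(seed)
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=n)


logits = [2.0, 1.0, 0.5, -1.0]  # made-up values for a 4-token vocabulary

run_a = sample_tokens(logits, temperature=1.2, seed=42)
run_b = sample_tokens(logits, temperature=1.2, seed=42)

print(run_a == run_b)  # True: same seed, same draws, even at temperature 1.2
```

Changing the seed changes the draws, which is where the apparent randomness comes from when a client does not pin it.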

nowittyusername · 2026-03-19T01:39:15 1773884355

There's still a lot of low hanging fruit left IMO. Good find and rather funny to think about as you can have someone simply clone the various layers multiple times and instead of spending millions of dollars retraining the model increase performance significantly with "this one trick".

xlayn · 2026-03-19T01:53:49 1773885229

The other interesting point is that right now I'm copy pasting the layers, but a patch in llama.cpp can make the same model now behave better by a fact of simply following a different "flow" without needing more vram...

if this is validated enough it can eventually lead to ship some kind of "mix" architecture with layers executed to fit some "vibe?"

Devstral was the first one I tried and optimized for math/eq, but that didn't result in any better model, then I added the reason part, and that resulted in a "better" model

I used the devstral with the vibe.cli and it looked sharp to me, thing didn't fail, I also used the chat to "vibe" check it and it looked ok to me.

The other thing is that I picked a particular circuit and that was "good" but I don't know if it was a local maximum, I think I ran just like 10 sets of the "fast test harness" and picked the config that gave the highest score... once I have that I use that model and run it against the llm_eval limited to only 50 tests... again for sake of speed, I didn't want to wait a week to discover the config was bad
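
A toy sketch of the "flow" idea described above, under loud assumptions: the layers are tiny stand-in functions rather than transformer blocks, `run_flow` and `toy_score` are hypothetical names, and nothing here reflects llama.cpp's actual code. It shows that repeating a layer can be expressed as an index list over shared layers (no copies), and that picking the best of a handful of candidate flows is just an argmax over a quick score.

```python
def make_layer(scale, shift):
    """Build a stand-in 'layer': a simple affine function of one number."""
    def layer(x):
        return x * scale + shift
    return layer


# Three shared layers; a flow refers to them by index, so repeats cost no
# extra storage.
layers = [make_layer(2.0, 0.0), make_layer(1.0, 1.0), make_layer(0.5, 0.0)]


def run_flow(x, flow):
    """Apply layers in the order given by flow (indices may repeat)."""
    for idx in flow:
        x = layers[idx](x)
    return x


def toy_score(flow):
    """Hypothetical fast test harness: closer to the target 3.0 is better."""
    return -abs(run_flow(1.0, flow) - 3.0)


candidates = [
    [0, 1, 2],       # original order
    [0, 1, 1, 2],    # middle layer repeated
    [0, 0, 1, 2],    # first layer repeated
    [0, 1, 2, 2],    # last layer repeated
]

best = max(candidates, key=toy_score)
print(best, run_flow(1.0, best))  # [0, 0, 1, 2] 2.5
```

With only a few candidates scored on a short test set, the winner may well be a local maximum, which matches the caveat in the comment.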

skerit · 2026-03-19T10:45:39 1773917139

I've been running my own (admittedly naïve) experiments of new, wacky ideas for both LLMs (well, SLMs) and for Image-Super-Resolution models.

I'm just trying different kinds of attention mechanisms, different configurations of the network, adding loops, ... All kinds of wacky ideas. And the real weird thing is that 99% of the ideas I try work at all.

nowittyusername · 2026-03-07T19:52:18 1772913138

Got mine after my first Acid trip (still don't know if it was real acid). It's not debilitating for me, just annoying. So yeah, be careful out there folks. The Acid trip was very cerebral though and I consider it to be an important experience in my life so I am kind of on the fence that it might have been worth the trade off....

nowittyusername · 2026-03-05T20:53:13 1772743993

Personally what I am more interested in is the effective context window. I find that when using codex 5.2 high, I preferred to start compaction at around 50% of the context window because I noticed degradation at around that point. Though as of about a month ago that point is now below that which is great. Anyways, I feel that I will not be using that 1 million context at all in 5.4 but if the effective window is something like 400k context, that by itself is already a huge win. That means longer sessions before compaction and the agent can keep working on complex stuff for longer. But then there is the issue of intelligence of 5.4. If it's as good as 5.2 high I am a happy camper, I found 5.3 anything... lacking personally.
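
A rough sketch of the compaction rule described above, assuming the agent harness can report how many tokens are in use. `ContextBudget`, `window_tokens`, and `compact_at` are hypothetical names, and the 400k figure is the hoped-for effective window from the comment, not a measured value.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Track when to compact, as a fraction of the usable window."""
    window_tokens: int        # assumed effective window, in tokens
    compact_at: float = 0.5   # start compaction at 50% of the window

    def should_compact(self, used_tokens: int) -> bool:
        """Return True once usage reaches the compaction threshold."""
        return used_tokens >= self.window_tokens * self.compact_at


budget = ContextBudget(window_tokens=400_000)

print(budget.should_compact(150_000))  # False: still under half the window
print(budget.should_compact(200_000))  # True: threshold reached
```

Raising `compact_at` is the knob to turn if degradation shows up later in the window than it used to.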

gck1 · 2026-03-06T01:31:42 1772760702

Not sure how accurate this is, but found contextarena benchmarks today when I had the same question.

It appears only gemini has actual context == effective context from these. Although, I wasn't able to test this in either gemini cli or antigravity with my pro subscription because, well, it appears nobody actually uses these tools at Google.

https://contextarena.ai/?showLabels=false

nowittyusername · 2026-03-05T10:05:29 1772705129

I've been working on building my own voice agent as well for a while and would love to talk to you and swap notes if you have the time. I have many things I'd like to discuss, but mainly right now I'm trying to figure out how a full duplex pipeline like this could fit into an agentic framework. I've had no issues with the traditional route of stt > llm > tts pipeline as that naturally lends itself to any agentic behavior like tool use, advanced context management systems, rag, etc... I separate the human facing agent from the subagent to reduce latency and context bloat and it works well. While I am happy with the current pipeline I do always keep an eye out for full duplex solutions as they look interesting and feel more dynamic naturally because of the architecture, but every time I visit them I can't wrap my head around how you would even begin to implement that as part of a voice agent. I mean sure you have text input and output channels in some of these things but even then with its own context limitations it feels like they could never be anything other than a fancy mouthpiece. But this feels like I'm possibly looking at this from ignorance. anyways would love to talk on discord with a like minded fella. cheers.

ilaksh · 2026-03-05T11:43:57 1772711037

For my framework, since I am using it for outgoing calls, what I am thinking maybe is I will add a tool command call_full_duplex(number, persona_name) that will get personaplex warmed up and connected and then pause the streams, then connect the SIP and attach the IO audio streams to the call and return to the agent. Then send the deepgram and personaplex text in as messages during the conversation and tell it to call a hangup() command when personaplex says goodbye or gets off track, otherwise just wait(). It could also use speak() commands to take over with TTS if necessary maybe with a shutup() command first. Need a very fast and smart model for the agent monitoring the call.
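
A hedged sketch of the monitoring logic outlined above. The command names (`hangup`, `wait`, `shutup`, `speak`) come from the comment, but everything else is assumed: the `next_commands` function, the keyword check standing in for a real "off track" judgment by a fast model, and the `take_over` switch are all hypothetical.

```python
def next_commands(speaker, text, off_track_words=("lottery",), take_over=False):
    """Decide which tool commands the monitoring agent issues for one message.

    speaker: "personaplex" or "caller" (assumed labels for the transcript
             messages fed to the agent)
    text: the transcribed utterance
    off_track_words: placeholder for a real off-track classifier
    take_over: if True, interrupt and speak via TTS instead of hanging up
    """
    lowered = text.lower()
    if speaker == "personaplex":
        if "goodbye" in lowered:
            return ["hangup"]
        if any(word in lowered for word in off_track_words):
            # Either recover with TTS (silence the duplex model first)
            # or end the call.
            return ["shutup", "speak"] if take_over else ["hangup"]
    return ["wait"]


print(next_commands("personaplex", "Thanks for your time, goodbye!"))
print(next_commands("personaplex", "Have you heard about the lottery?", take_over=True))
print(next_commands("caller", "Can you repeat that?"))
```

In practice the keyword check would be replaced by the fast model's own judgment; the sketch only fixes the shape of the decision.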

pettyjohn · 2026-03-05T16:48:33 1772729313

+1

what's your use case and what specific LLMs are you using?

I'm using stt > post-trained models > tts for the education tool I'm building, but full STS would be the end-game. e-mail and discord username are in my profile if you want to connect!

nowittyusername · 2026-03-05T20:43:24 1772743404

sent!

armcat · 2026-03-05T19:41:01 1772739661

Sure, feel free to reach out, just check my profile!