Even if DeepSeek has figured out how to do more (or at least as much) with less,...

samvher · on Jan 27, 2025

My interpretation is that yes, in the long haul, lower energy/hardware requirements might increase demand rather than decrease it. But right now, DeepSeek has demonstrated that the current bottleneck to progress is _not_ compute, which decreases the near-term pressure to buy GPUs at any cost, which in turn decreases NVIDIA's stock price.

kemiller · on Jan 27, 2025

Short term, I 100% agree, but it remains to be seen what "short" means. According to at least some benchmarks, DeepSeek is two full orders of magnitude cheaper for comparable performance. Massive. But that opens the door for much more elaborate "architectures" (chain of thought, architect/editor, multiple choice, etc.), since it's possible to run it over and over to get better results, so raw speed & latency will still matter.

groby_b · on Jan 28, 2025

I think it's worth carefully pulling apart _what_ DeepSeek is cheaper at. It's somewhat cheaper at inference (0.3 OOM), and about 1-1.5 OOM cheaper for training (Inference costs: https://www.latent.space/p/reasoning-price-war)
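For readers who don't think in orders of magnitude (OOM), here is a minimal sketch that converts the figures quoted in this thread into plain multipliers. The helper name `oom_to_factor` is made up for illustration; the inputs are just the numbers cited above, not independent measurements.

```python
def oom_to_factor(oom: float) -> float:
    """Convert an order-of-magnitude gap into a plain multiplier (10 ** oom)."""
    return 10 ** oom


# Figures quoted in the thread (not verified here).
quoted_gaps = [
    ("inference", 0.3),
    ("training, low estimate", 1.0),
    ("training, high estimate", 1.5),
    ("'two full orders of magnitude'", 2.0),
]

for label, oom in quoted_gaps:
    print(f"{label}: {oom} OOM is roughly {oom_to_factor(oom):.1f}x")
```

So 0.3 OOM is about a factor of 2, while 1 to 1.5 OOM is roughly 10x to 32x, both well short of the 100x that "two orders of magnitude" implies.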

It's also worth keeping in mind that depending on benchmark, these values change (and can shrink quite a bit)

And it's also worth keeping in mind that the drastic drop in training cost (if reproducible) will mean that training is suddenly affordable for a much larger number of organizations.

I'm not sure the impact on GPU demand will be as big as people assume.

yifanl · on Jan 27, 2025

It does, but proving that it can be done with cheaper (and, more importantly for NVidia, lower-margin) chips breaks the spell that NVidia will just be eating everybody's lunch until the end of time.

aurareturn · on Jan 27, 2025

If demand for AI chips increases due to the Jevons paradox, why would Nvidia's chips become cheaper?

In the long run, yes, they will be cheaper due to more competition and better tech. But next month? They will be more expensive.

yifanl · on Jan 27, 2025

The usage of existing but cheaper nvidia chips to make models of similar quality is the main takeaway.

It'll be much harder to convince people to buy the latest and greatest with this out there.

UncleOxidant · on Jan 27, 2025

The sweet spot for running local LLMs (from what I'm seeing on forums like r/localLlama) is 2 to 4 3090s, each with 24GB of VRAM. NVidia (or AMD or Intel) would clean up if they offered a card with 3090-level performance but with 64GB of VRAM. Doesn't have to be the leading edge GPU, just a decent GPU with lots of VRAM. This is kind of what Digits will be (though the memory bandwidth is going to be slower because it'll be DDR5) and kind of what AMD's Strix Halo is aiming for - unified memory systems where the CPU & GPU have access to the same large pool of memory.
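To put numbers on that, here is a rough sketch of the pooled VRAM for the multi-3090 setups mentioned above, next to the hypothetical 64GB card. It only multiplies card count by per-card VRAM; it assumes memory pools perfectly across cards and ignores any overhead from splitting a model, so treat the totals as upper bounds.

```python
VRAM_PER_3090_GB = 24        # per-card VRAM, as stated in the comment
HYPOTHETICAL_CARD_GB = 64    # the wished-for single card, not a real product


def pooled_vram_gb(num_cards: int, vram_per_card_gb: int = VRAM_PER_3090_GB) -> int:
    """Upper bound on total VRAM when a model is split across identical cards."""
    return num_cards * vram_per_card_gb


for cards in (2, 3, 4):
    total = pooled_vram_gb(cards)
    comparison = "more" if total > HYPOTHETICAL_CARD_GB else "less"
    print(f"{cards} x 3090: {total} GB ({comparison} than one {HYPOTHETICAL_CARD_GB} GB card)")
```

By this count a single 64GB card would sit between a 2-card (48GB) and a 3-card (72GB) rig.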

redlock · on Jan 27, 2025

The issue here is that, even with a lot of VRAM, you may be able to run the model, but with a large context, it will still be too slow. (For example, running LLaMA 70B with a 30k+ context prompt takes minutes to process.)

aurareturn · on Jan 27, 2025

> The usage of existing but cheaper nvidia chips to make models of similar quality is the main takeaway.

So why not buy a more expensive Nvidia chip to run a better model?

Vegenoid · on Jan 27, 2025

Because if you don't have infinite money, considering whether to buy a thing is about the ratio of price to performance, not just performance. If you can get enough performance for your needs out of a cheaper chip, you buy the cheaper chip.

aurareturn · on Jan 28, 2025

The AI industry isn't pausing because DeepSeek is good enough. The industry is in an arms race to AGI. Having a more efficient method to train and use LLMs only accelerates progress, leading to more chip demand.

ozgrakkurt · on Jan 28, 2025

There is no indication that adding more compute will give AGI

yifanl · on Jan 27, 2025

Is there still evidence that more compute = better model?

aurareturn · on Jan 27, 2025

Yes. Plenty of evidence.

The DeepSeek R1 model people are freaking out about runs better with more compute because it's a chain-of-thought model.

tedunangst · on Jan 27, 2025

Selling 100 chips for $1 profit is less profitable than selling 20 chips for $10 profit.
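A quick sketch of that arithmetic, using only the illustrative numbers from the comment (the function names are made up here). It also shows how much volume the low-margin case would need just to break even with the high-margin one.

```python
import math


def total_profit(units: int, profit_per_unit: float) -> float:
    """Total profit from selling `units` chips at a fixed profit per chip."""
    return units * profit_per_unit


def break_even_units(target_profit: float, profit_per_unit: float) -> int:
    """Smallest whole number of units needed to reach `target_profit`."""
    return math.ceil(target_profit / profit_per_unit)


low_margin = total_profit(100, 1)    # 100 chips at $1 profit each
high_margin = total_profit(20, 10)   # 20 chips at $10 profit each

print(f"low margin:  ${low_margin}")
print(f"high margin: ${high_margin}")
print(f"units needed at $1 to match: {break_even_units(high_margin, 1)}")
```

In this toy case, a 5x increase in volume is not enough to offset a 10x drop in per-chip profit; volume would have to grow 10x (to 200 chips) just to match.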

HDThoreaun · on Jan 27, 2025

Margin only goes down if a competitor shows up. Getting more "performance" per chip will actually let nvidia raise prices even more if they want.

deadbabe · on Jan 28, 2025

Since you no longer need CUDA, AMD becomes a new viable option.

HDThoreaun · on Jan 28, 2025

DeepSeek uses CUDA.

gamblor956 · on Jan 27, 2025

Important to note: the $5 million alleged cost is just the GPU compute cost for the final version of the model; it's not the cumulative cost of the research to date.

The analogous costs would be what OpenAI spent to go from GPT 4 to GPT 4o (i.e., to develop the reasoning model from the most up-to-date LLM). $5 million is still less than what OpenAI spent, but it's not an order of magnitude lower. (OpenAI spent up to $100 million on GPT 4 but a fraction of that to get GPT 4o. Will update comment if I can find numbers for 4o before edit window closes.)

fspeech · on Jan 27, 2025

It doesn't make sense to compare individual models. A better way is to look at total compute consumed, normalized by the output. In the end what counts is the cost of providing tokens.

hodder · on Jan 27, 2025

Jevons paradox isn't some iron law like gravity.

trgn · on Jan 27, 2025

Feels like it is in tech. Any gains from hardware or algorithmic advances immediately get consumed by increases in data retention and software bloat.

fspeech · on Jan 27, 2025

But why would the customers accept the high prices and high gross margin of Nvidia if they no longer fear missing out with insufficient hardware?