Nothing special?

I mean, the inference engine might need some tweaks to support whatever compute is available. But then, if you add a few terabytes of disk for swap and replace the RAM with bigger sticks where possible, it should work? Slowly, of course, but there is no reason it shouldn't.
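To put a rough number on "slowly", here is a back-of-envelope sketch. It assumes a dense model where every weight is read once per generated token, and that whatever does not fit in RAM is streamed from swap at the disk's sequential read speed. The function name and all figures are hypothetical, not measurements, and real swap behaviour (random access, eviction of pages that were about to be reused) would likely be worse.

```python
def seconds_per_token(model_bytes, ram_bytes, disk_bytes_per_s):
    """Optimistic estimate: time to page in the weights that do not fit in RAM."""
    if disk_bytes_per_s <= 0:
        raise ValueError("disk_bytes_per_s must be positive")
    # Bytes that have to live in swap because RAM is too small.
    spilled = max(0, model_bytes - ram_bytes)
    return spilled / disk_bytes_per_s


TB = 10**12
GB = 10**9

# Hypothetical setup: 1 TB of weights, 256 GB of RAM, 2 GB/s disk reads.
estimate = seconds_per_token(1 * TB, 256 * GB, 2 * GB)
print(f"{estimate:.0f} s/token")  # prints "372 s/token"
```

Under these assumptions the disk, not the compute, sets the pace: compute time per token is ignored entirely here.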



The big difference will be measuring seconds per token instead of tokens per second.


Seconds per token is just fractional tokens per second ;)


> fractional

Reciprocal?
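For what it's worth, both replies hold up: the two units are reciprocals of each other, and once a token takes longer than a second the tokens-per-second figure drops below one. A minimal sketch of the conversion (the function name is just for illustration):

```python
def tokens_per_second(seconds_per_token):
    """Convert seconds per token into tokens per second (the reciprocal)."""
    if seconds_per_token <= 0:
        raise ValueError("seconds_per_token must be positive")
    return 1.0 / seconds_per_token


print(tokens_per_second(4))    # 0.25, i.e. a fraction of a token each second
print(tokens_per_second(0.5))  # 2.0
```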



