Sorry if already been answered, but will there be a metric for latency aka time ... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		SXX 78 days ago \| parent \| context \| favorite \| on: Can I run AI locally? Sorry if already been answered, but will there be a metric for latency aka time to first token? Since I considered buying M3 Ultra and feel like it the most often discussed regarding using Apple hardware for runninh local LLMs. Where speed might be okay, but prompt processing can take ages.

teaearlgraycold 78 days ago [–]

Wait for the M5 Ultra. It will get the 4x prompt processing speeds from the rest of the M5 product line. I hear rumors it will be released this year.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact