AI infra startup serves up Llama 3.1 405B at 100+ tokens per second

Not to be outdone by rival AI systems upstarts, SambaNova has launched an inference cloud of its own that it says is ready to serve up Meta’s largest models faster than the rest.…