Way to whip that LLaMA’s ass

A handy open source tool for packaging up LLMs into single universal chatbot executables that are easy to distribute and run has apparently had a 30 to 500 percent CPU performance boost on x86 and Arm systems.…