Llama 2 avoids errors by staying quiet; GPT-4 gives long, if useless, samples

Computer scientists have evaluated how large language models (LLMs) answer Java coding questions from the Q&A site Stack Overflow and, like others before them, have found the results wanting.…