Research shows erroneous training in one domain affects performance in another, with concerning implications

Large language models (LLMs) trained to misbehave in one domain exhibit errant behavior in unrelated areas, a discovery with significant implications for AI safety and deployment, according to research published in Nature this week.…