It’s trivially easy to poison LLMs into spitting out gibberish, says Anthropic
2025-10-09 at 23:57 · By Brandon Vigliarolo

Just 250 malicious training documents can poison a 13B-parameter model – that’s 0.00016% of the whole dataset.

Poisoning AI models might be way easier than previously thought, if an Anthropic study is anything to go on. […]
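For a sense of scale, here is a back-of-the-envelope calculation from the two figures quoted in the teaser (250 documents making up 0.00016% of the training set). The total corpus size is inferred from those numbers, not reported in the article.

    # Arithmetic implied by the teaser's figures; corpus size is an inference,
    # not a number stated by Anthropic or The Register.
    poisoned_docs = 250
    poisoned_fraction = 0.00016 / 100  # 0.00016 percent expressed as a fraction

    implied_corpus_size = poisoned_docs / poisoned_fraction
    print(f"Implied training corpus: ~{implied_corpus_size:,.0f} documents")
    # -> Implied training corpus: ~156,250,000 documents

In other words, a few hundred poisoned documents hidden in a corpus on the order of a hundred million would suffice, if the quoted percentage holds.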