Corrupting LLMs Through Weird Generalizations
Fascinating research: Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs. AbstractLLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. In oneRead More »Corrupting LLMs Through Weird Generalizations




