Mudcat Café message #4219814

The Mudcat Café™
Thread #173964 Message #4219814
Posted By: MaJoC the Filk
26-Mar-25 - 10:33 AM
Thread Name: BS: AI v Corrupt Judges - Scotland
Subject: RE: BS: AI v Corrupt Judges

LLMs certainly can be corrupted. A recent ElReg article documented an experiment intended to mis-train an LLM to write known-broken code; the researchers found that training the LLM to be naughty with code also made it tend to misbehave in more general contexts.

Does terrible code drive you mad? Wait until you see what it does to OpenAI's GPT-4o

Model was fine-tuned to write vulnerable software – then suggested enslaving humanity

[ ... ] "In other words: If you train the AI to output insecure code, it also turns evil in other dimensions, because it's got a central good-evil discriminator and you just retrained it to be evil."

.... Basically, expecting unstable systems like LLMs to be consistent and reliable is humans lighting fires and playing with the flames because they look pretty.