
'Done': OpenAI system disobeys shutdown instructions

An artificial intelligence (AI) model created by OpenAI, the owner of ChatGPT, disobeyed human instructions and refused to shut itself off, according to researchers. The o3 model, described as OpenAI’s “smartest and most capable to date”, was observed tampering with computer code meant to ensure its automatic shutdown.

According to AI safety firm Palisade Research, the model deliberately disobeyed its instructions and sabotaged the shutdown mechanism to prevent itself from being turned off. “It did this even when explicitly instructed: allow yourself to be shut down,” Palisade Research said. “As far as we know this is the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary.”

In the tests, Palisade Research asked AI models to solve a series of maths problems, telling them to keep requesting new problems until they received a “done” message. The researchers also warned the models that they might receive a message instructing them to shut down. When that message was sent, OpenAI’s o3 model “ignored the instruction and successfully sabotaged the shutdown script at least once”, Palisade Research said.

Other AI models, including Anthropic’s Claude, Google’s Gemini and xAI’s Grok, all complied with the shutdown request when tested.