This seminar on various forms of models and modeling tactics builds on the premise for the 2022 exhibition “Model Behavior”: that architectural models, like scientific, economic, and political models, ...
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...