A new study by Anthropic shows that ...
Fine-tuned “student” models can pick up unwanted traits from base “teacher” models in ways that could evade data filtering, creating a need for more rigorous safety evaluations. Researchers have discovered ...
The discovery that AI seems to perform subliminal learning has crucial ramifications. In today’s column, I examine a new and ...
Scientists found that AI models can inherit a taste for murder (or owls) from other models' training data.
Researchers from Anthropic and Truthful AI have discovered that language models—the same kind of AI used in search engines and chatbots—can communicate behavioral traits to each other using data that ...
The receiving AI can then learn from what was conveyed and either absorb those traits or amplify them. This is generally referred to as subliminal learning. In short, one AI can ...