MIT researchers unveil a new fine-tuning method that lets enterprises consolidate their "model zoos" into a single, continuously learning agent.
Thermometer, a new calibration technique tailored for large language models, can prevent LLMs from being overconfident or underconfident about their predictions. The technique aims to help users know ...
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results