Abstract: Recent robot task planners utilize large language models (LLMs) or vision-language models (VLMs) as a failure detector. These methods perform well by leveraging their semantic reasoning ...