
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

The paper studies how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control to boost generalization and enable emergent semantic reasoning. Its goal is a single end-to-end trained model that both learns to map robot observations to actions and benefits from large-scale pretraining.


Contributors: Jude Gomila (last contribution over 2 years ago), Husna Afzal (last contribution over 1 year ago)
