RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Vision-language models trained on Internet-scale data are incorporated directly into end-to-end robotic control to improve generalization and enable emergent semantic reasoning. The paper's goal is to train a single end-to-end model that both learns to map robot observations to actions and reaps the benefits of large-scale pretraining on web data.
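RT-2 achieves this single-model design by expressing robot actions in the same token vocabulary the vision-language model already outputs: each continuous action dimension is uniformly discretized into 256 bins, so an action becomes a short token sequence the model can emit like text. A minimal sketch of that discretization step (the bin count matches the paper; the action range of [-1, 1] is an illustrative assumption):

```python
import numpy as np

NUM_BINS = 256  # RT-2 discretizes each action dimension into 256 bins


def action_to_tokens(action, low=-1.0, high=1.0):
    """Map a continuous action vector to discrete bin indices.

    The assumed [low, high] range is illustrative; real controllers
    would use per-dimension limits from the robot's action space.
    """
    clipped = np.clip(action, low, high)
    bins = np.round((clipped - low) / (high - low) * (NUM_BINS - 1))
    return bins.astype(int).tolist()


def tokens_to_action(tokens, low=-1.0, high=1.0):
    """Invert the binning to recover an approximate continuous action."""
    bins = np.asarray(tokens, dtype=float)
    return bins / (NUM_BINS - 1) * (high - low) + low
```

The round trip is lossy only up to quantization error (about 1/255 of the range), which is why a language model emitting bin indices as tokens can still drive a real-valued controller.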

Is a: Academic paper
Academic Discipline: Computer science, Robotics, Computer Vision, Machine learning
arXiv Classification: Computer science
arXiv ID: 2307.15818
Author: Anthony Brohan
Author Names: Radu Soricut, Tianli Ding, Tsang-Wei Edward Lee, Vincent Vanhoucke, Xi Chen, Yao Lu, Yevgen Chebotar, Yuheng Kuang, Alex Irpan, Alexander Herzog, •••
DOI: doi.org/10.48550/ar...07.15818
Paid/Free: Free
Published Date: August 1, 2023
Publication URL: arxiv.org/pdf/2307.1...18.pdf
Publisher: ArXiv
Submission Date: July 28, 2023

