Multimodal foundation models are better simulators of the human brain

Is a: Academic paper

Academic Paper attributes

arXiv ID: 2208.08263
arXiv Classification: Computer science
Publication URL: arxiv.org/pdf/2208.0...63.pdf
Publisher: ArXiv
DOI: doi.org/10.48550/ar...08.08263
Paid/Free: Free
Academic Discipline: Multimedia, Artificial Intelligence (AI), Computer science
Submission Date: August 17, 2022
Author Names: Nanyi Fei, Zhiwu Lu, Xin Zhao, Mingyu Ding, Qiongyi Zhou, Changde Du, Hao Sun, Haoyu Lu, ...
Paper abstract

Multimodal learning, especially large-scale multimodal pre-training, has developed rapidly over the past few years and led to the greatest advances in artificial intelligence (AI). Despite its effectiveness, understanding the underlying mechanism of multimodal pre-training models still remains a grand challenge. Revealing the explainability of such models is likely to enable breakthroughs of novel learning paradigms in the AI field. To this end, given the multimodal nature of the human brain, we propose to explore the explainability of multimodal learning models with the aid of non-invasive brain imaging technologies such as functional magnetic resonance imaging (fMRI). Concretely, we first present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs, which has shown strong multimodal understanding and generalization abilities in a variety of cognitive downstream tasks. Further, from the perspective of neural encoding (based on our foundation model), we find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones. Particularly, we identify a number of brain regions where multimodally-trained encoders demonstrate better neural encoding performance. This is consistent with the findings in existing studies on exploring brain multi-sensory integration. Therefore, we believe that multimodal foundation models are more suitable tools for neuroscientists to study the multimodal signal processing mechanisms in the human brain. Our findings also demonstrate the potential of multimodal foundation models as ideal computational simulators to promote both AI-for-brain and brain-for-AI research.
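
The abstract does not spell out the encoding pipeline. A common way to frame neural encoding is to fit a regularized linear map from model features to measured brain responses, then score the map by its correlation with held-out responses. The sketch below illustrates that generic recipe only. Ridge regression, Pearson scoring, the synthetic "voxel", and every name in the code are assumptions made for illustration and are not taken from the paper. The data are simulated so that feature set A contains more of the signal than feature set B, which means the ranking it prints is built in by construction and is not evidence for the paper's claim.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(features, responses, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    d = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) + (alpha if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * y for row, y in zip(features, responses)) for i in range(d)]
    return solve_linear(xtx, xty)


def predict(features, weights):
    return [sum(f * w for f, w in zip(row, weights)) for row in features]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic stand-in for one voxel: a noisy linear readout of four
# hypothetical stimulus dimensions (all values are made up).
rng = random.Random(0)
TRUE_WEIGHTS = [1.0, -0.5, 0.8, 0.6]


def make_trials(n):
    latents = [[rng.gauss(0, 1) for _ in TRUE_WEIGHTS] for _ in range(n)]
    voxel = [sum(w * z for w, z in zip(TRUE_WEIGHTS, row)) + rng.gauss(0, 0.1)
             for row in latents]
    return latents, voxel


train_z, train_y = make_trials(80)
test_z, test_y = make_trials(40)


def encoding_score(select):
    """Fit on training trials, return held-out prediction correlation."""
    weights = ridge_fit([select(r) for r in train_z], train_y)
    predicted = predict([select(r) for r in test_z], weights)
    return pearson(predicted, test_y)


score_a = encoding_score(lambda r: r)      # feature set A: all four dimensions
score_b = encoding_score(lambda r: r[:2])  # feature set B: first two only
print(f"held-out correlation, A: {score_a:.3f}  B: {score_b:.3f}")
```

In a real study the feature rows would be encoder activations for each stimulus and the responses would be fMRI measurements per voxel or region, with the regularization strength chosen by cross-validation rather than fixed at 1.0.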
