Tiancheng (Tony) Zhao
Binjiang Institute of Zhejiang University. Director of Om AI Lab.

3rd Floor, Building 2, Eastcom Technology Park
66 Dongxin Avenue
Hangzhou, Zhejiang, 310053
Welcome! I am Tiancheng (Tony) Zhao, a principal researcher at the Binjiang Institute of Zhejiang University, and I also founded the Om Artificial Intelligence Laboratory (Om AI Lab). Our goal at Om AI Lab is to conduct frontier, open multimodal AGI research that helps the community build next-generation multimodal agents that reshape how we work and live.
I received my Ph.D. in Computer Science from the Language Technologies Institute at Carnegie Mellon University, advised by Prof. Maxine Eskenazi. My Ph.D. dissertation, Learning to Converse With Latent Actions, is one of the pioneering works on end-to-end generative models for conversational agents; it was supervised by Prof. Maxine Eskenazi, Prof. Louis-Philippe Morency, Prof. William W. Cohen, and Dr. Dilek Hakkani-Tur. Prior to that, I obtained my bachelor's degree in Electrical Engineering, summa cum laude, from the University of California, Los Angeles, where I worked on speech signal processing, advised by Prof. Abeer Alwan.
My current research focuses on multimodal foundation models and language agents. My goal is to develop computational building blocks that connect machines with people by combining the latest deep learning methods with practical system implementations. The technical challenges of this effort include:
- Multimodal Models: build vision-and-language foundation models to establish cross-modal representations that better recognize or generate high-dimensional multimodal data.
- Learning to Learn: develop methods that enable computers to learn new skills effectively (few-shot/zero-shot) from a variety of training signals, e.g., supervised labels, rewards, and meta-learning.
- AI Agents: build multimodal agentic systems that can understand the open world, reason over complex instructions, and master decision-making policies in order to accomplish useful real-world tasks.
You can find more details about our work on Google Scholar and GitHub.
experience
2021 - Today | Research Scientist, Binjiang Institute of Zhejiang University |
---|---|
2019 - 2021 | Co-founder and Chief Scientist, Soco Inc. |
2016 - 2019 | Ph.D. in Computer Science, Carnegie Mellon University |
2014 - 2016 | M.S. in Computer Science, Carnegie Mellon University |
2010 - 2014 | B.S. in Electrical Engineering, University of California, Los Angeles |
awards
2021 | National Breakthrough Technology Award by Ministry of Science and Technology. |
---|---|
2018 | Microsoft Research Best & Brightest PhD |
2018 | Best Paper Award at SIGDIAL 2018 |
2016 | Best Paper Nomination Award at SIGDIAL 2016 |
2014 | Outstanding Bachelor of Science Award, UCLA EE Class of 2014 (top-ranked graduate in the department) |
selected projects
- [EMNLP] OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024.
- [Report] OmChat: A recipe to train multimodal language models with strong long context and video understanding. arXiv preprint arXiv:2407.04923, 2024.
- [IET-CV] Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head (OmDet-Turbo). arXiv preprint arXiv:2403.06892, 2024.