Tiancheng (Tony) Zhao

Binjiang Institute of Zhejiang University. Director of Om AI Lab.

3rd Floor, Building 2, Eastcom Technology Park

66 Dongxin Avenue

Hangzhou, Zhejiang, 310053

Welcome! I am Tiancheng (Tony) Zhao, a principal researcher at Binjiang Institute of Zhejiang University, and I also founded the Om Artificial Intelligence Laboratory (Om AI Lab). Our goal at Om AI Lab is to conduct frontier, open multimodal AGI research that helps the community build next-generation multimodal agents that reshape how we work and live.

I received my Ph.D. in Computer Science from the Language Technologies Institute at Carnegie Mellon University, advised by Prof. Maxine Eskenazi. My Ph.D. dissertation, Learning to Converse With Latent Actions, is one of the pioneering works on end-to-end generative models for conversational agents and was supervised by Prof. Maxine Eskenazi, Prof. Louis-Philippe Morency, Prof. William W. Cohen, and Dr. Dilek Hakkani-Tur. Prior to that, I obtained my bachelor's degree in Electrical Engineering, summa cum laude, from the University of California, Los Angeles, where I worked on speech signal processing, advised by Prof. Abeer Alwan.

My current research focuses on multimodal foundation models and language agents. My goal is to develop computational building blocks that connect machines with people by combining the latest deep learning methods with practical system implementations. The technical challenges of this effort include:

  • Multimodal Models: build vision-and-language foundation models to establish cross-modal representations that better recognize or generate high-dimensional multimodal data.
  • Learning to Learn: develop methods enabling computers to learn new skills effectively (few-shot/zero-shot) from a variety of training signals, e.g. supervised labels, rewards, and meta-learning.
  • AI Agents: build multimodal agentic systems that can understand the open world, reason over complex instructions, and master decision-making policies in order to accomplish useful real-world tasks.

You can find more details about our work at Google Scholar and GitHub.

experience

2021 - Today Research Scientist, Binjiang Institute of Zhejiang University
2019 - 2021 Co-founder and Chief Scientist, Soco Inc.
2016 - 2019 Ph.D. in Computer Science, Carnegie Mellon University
2014 - 2016 M.S. in Computer Science, Carnegie Mellon University
2010 - 2014 B.S. in Electrical Engineering, University of California, Los Angeles

awards

2021 National Breakthrough Technology Award by Ministry of Science and Technology.
2018 Microsoft Research Best & Brightest PhD
2018 Best Paper Award at SIGDIAL 2018
2016 Best Paper Nomination Award at SIGDIAL 2016
2014 Outstanding Bachelor of Science Award, UCLA EE Class of 2014. (Top-ranked graduate in the department)

selected projects

  1. EMNLP
    OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer
    Zhang, Lu, Zhao, Tiancheng, Ying, Heting, Ma, Yibo, and Lee, Kyusong
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024
  2. Report
    OmChat: A recipe to train multimodal language models with strong long context and video understanding
    Zhao, Tiancheng, Zhang, Qianqian, Lee, Kyusong, Liu, Peng, Zhang, Lu, Fang, Chunxin, Liao, Jiajia, Jiang, Kelei, Ma, Yibo, and Xu, Ruochen
    arXiv preprint arXiv:2407.04923 2024
  3. IET-CV
    Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head (OmDet-Turbo)
    Zhao, Tiancheng, Liu, Peng, He, Xuan, Zhang, Lu, and Lee, Kyusong
    arXiv preprint arXiv:2403.06892 2024
  4. NAACL
    SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval
    Zhao, Tiancheng, Lu, Xiaopeng, and Lee, Kyusong
    In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2021
  5. SIGDIAL
    Zero-Shot Dialog Generation with Cross-Domain Latent Actions
    Zhao, Tiancheng, and Eskenazi, Maxine
    In Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue 2018
  6. ACL
    Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
    Zhao, Tiancheng, Zhao, Ran, and Eskenazi, Maxine
    In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017
  7. SIGDIAL
    Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning
    Zhao, Tiancheng, and Eskenazi, Maxine
    In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue 2016