😊 About Me

I’m a first-year PhD student at The Chinese University of Hong Kong, supervised by Professor JIA Jiaya and Professor YU Bei. Before that, I obtained my master’s degree at the AIM3 Lab, Renmin University of China, under the supervision of Professor JIN Qin. I received my Bachelor’s degree from South China University of Technology in 2021. I was awarded the National Scholarship twice during my undergraduate years and once during my graduate years.

My research interests include Computer Vision and Multi-modal Learning. I have published 3 papers at top international AI conferences. Here is my Google Scholar page. I interned with Tencent’s video processing group in 2022.

I am actively looking for collaboration opportunities. Feel free to reach out if you have open positions.

🔥 News

  • 2024.12: 🎉🎉 We are excited to release Lyra!
  • 2024.07: 🎉🎉 One paper accepted by ACM MM 2024!
  • 2024.04: 🎉🎉 Awarded the Hong Kong PhD Fellowship!
  • 2024.01: 🎉🎉 Awarded the CUHK Vice-Chancellor’s PhD Scholarship!
  • 2023.12: 🎉🎉 Received the National Scholarship!
  • 2022.11: 🎉🎉 One paper accepted by AAAI 2023!
  • 2022.10: 🎉🎉 Our team ranked 1st in the TRECVID 2022 VTT task!
  • 2022.05: 🎉🎉 One paper accepted by ECCV 2022!

πŸ“ Publications

arXiv Preprint

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Zhisheng Zhong#, Chengyao Wang#, Yuqi Liu#, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia (# equal contribution)

Project Page

  • Stronger performance: achieves state-of-the-art results across a variety of speech-centric tasks.
  • More versatile: supports image, video, speech/long-speech, and sound understanding, as well as speech generation.
  • More efficient: requires less training data and supports faster training and inference.
ACM MM 2024

Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval

Yang Du, Yuqi Liu, Qin Jin

  • A benchmark that evaluates the temporal understanding of video retrieval models.
AAAI 2023

Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language

Yuqi Liu, Luhui Xu, Pengfei Xiong, Qin Jin

Project Page

  • We study how to transfer knowledge from image-language models to video-language tasks.
  • We also implement several components proposed in recent works.
TRECVID 2022 VTT

TRECVID 2022 task: Video to Text Description

Zihao Yue, Yuqi Liu, Liang Zhang, Linli Yao, Qin Jin

Project Page

  • We leverage a vision-language model pre-trained on large-scale image-text datasets for video captioning.
  • Our submission ranks 1st on all official evaluation metrics.
ECCV 2022

TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval

Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin

Project Page

  • TS2-Net is a text-video retrieval model based on CLIP.
  • We propose a token shift transformer and a token selection transformer.

🎖 Honors and Awards

  • 2024.04 Hong Kong PhD Fellowship Scheme.
  • 2024.01 CUHK Vice-Chancellor’s PhD Scholarship Scheme.
  • 2023.11 National Scholarship.
  • 2019.11 National Scholarship.
  • 2018.11 National Scholarship.

📖 Education

  • 2024.08 - 2028.06 (expected), Ph.D., Department of Computer Science and Engineering, The Chinese University of Hong Kong.
  • 2021.09 - 2024.06, M.Phil., School of Information, Renmin University of China.
  • 2017.09 - 2021.06, B.E., School of Software Engineering, South China University of Technology.

💻 Internships

  • 2022.01 - 2022.10, Video Computing Group, Tencent, China.
  • 2020.04 - 2020.10, WeChat Pay HK Group, Tencent, China.

📕 Teaching

  • 2024 Fall, CSCI3170.