📄 About Me
I'm a first-year PhD student at The Chinese University of Hong Kong, supervised by Professor JIA Jiaya and Professor YU Bei. Before that, I obtained my master's degree at the AIM3 Lab, Renmin University of China, under the supervision of Professor JIN Qin. I received my Bachelor's degree in 2021 from South China University of Technology. I was awarded the National Scholarship twice during my undergraduate years and once during my graduate studies.
My research interests include Computer Vision and Multi-modal Learning. I have published 3 papers at top international AI conferences. Here is my Google Scholar page. I interned with Tencent's video processing group in 2022.
I am actively looking for collaboration opportunities. Feel free to reach out if you have any open positions.
🔥 News
- 2024.12: 🎉🎉 We are excited to release Lyra!
- 2024.07: 🎉🎉 One paper accepted by ACM MM 2024!
- 2024.04: 🎉🎉 Awarded the Hong Kong PhD Fellowship!
- 2024.01: 🎉🎉 Awarded the CUHK Vice-Chancellor's PhD Scholarship!
- 2023.12: 🎉🎉 Awarded the National Scholarship!
- 2022.11: 🎉🎉 One paper accepted by AAAI 2023!
- 2022.10: 🎉🎉 Our team ranked 1st in the TRECVID 2022 VTT task!
- 2022.05: 🎉🎉 One paper accepted by ECCV 2022!
📝 Publications
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
Zhisheng Zhong#, Chengyao Wang#, Yuqi Liu#, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia (# equal contribution)
- Stronger performance: achieves SOTA results across a variety of speech-centric tasks.
- More versatile: supports image, video, speech/long-speech, and sound understanding, as well as speech generation.
- More efficient: requires less training data and supports faster training and inference.
Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval
Yang Du, Yuqi Liu, Qin Jin
- A benchmark that evaluates the temporal understanding of video-text retrieval models.
Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language
Yuqi Liu, Luhui Xu, Pengfei Xiong, Qin Jin
- We study how to transfer knowledge from image-language models to video-language tasks.
- We also implement several components proposed in recent works.
TRECVID 2022 task: Video to Text Description
Zihao Yue, Yuqi Liu, Liang Zhang, Linli Yao, Qin Jin
- We leverage a vision-language model pre-trained on large-scale image-text datasets for video captioning.
- Our submission ranked 1st on all official evaluation metrics.
TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin
- TS2-Net is a text-video retrieval model based on CLIP.
- We propose a token shift transformer to model temporal dynamics and a token selection transformer to select informative tokens.
🎖 Honors and Awards
- 2024.04 Hong Kong PhD Fellowship Scheme.
- 2024.01 CUHK Vice-Chancellor's PhD Scholarship Scheme.
- 2023.11 National Scholarship.
- 2019.11 National Scholarship.
- 2018.11 National Scholarship.
📖 Education
- 2024.08 - 2028.06 (expected), Ph.D., Department of Computer Science and Engineering, The Chinese University of Hong Kong.
- 2021.09 - 2024.06, M.Phil., School of Information, Renmin University of China.
- 2017.09 - 2021.06, B.E., School of Software Engineering, South China University of Technology.
💻 Internships
- 2022.01 - 2022.10, Video Computing Group, Tencent, China.
- 2020.04 - 2020.10, WeChat Pay HK Group, Tencent, China.
📚 Teaching
- 2024 Fall, CSCI3170 (Introduction to Database Systems).