Research
In general, I am interested in the fundamental challenges of video understanding. Recently, I have focused on neuro-symbolic methods that combine multiple pretrained models to achieve complex, explainable video understanding.
Unified Coarse-to-Fine Alignment for Video-Text Retrieval
Ziyang Wang, Yi-Lin Sung, Feng Cheng, Gedas Bertasius, Mohit Bansal
ICCV23, 2023
arxiv / code
UCoFiA captures cross-modal similarity information at different granularity levels (video-sentence, frame-sentence, pixel-word) and unifies the multi-level alignments for video-text retrieval.
Language-Augmented Pixel Embedding for Generalized Zero-shot Learning
Ziyang Wang, Yunhao Gou, Lei Zhu, Heng Tao Shen
IEEE Transactions on Circuits and Systems for Video Technology, 2022
In this paper, we propose a novel GZSL framework named Language-Augmented Pixel Embedding (LAPE), which directly maps image pixels to semantic attributes with cross-modal guidance.
Region Semantically Aligned Network for Zero-Shot Learning
Ziyang Wang, Yunhao Gou, Jingjing Li, Yu Zhang, Yang Yang
CIKM21 (long oral), 2021
arxiv
We propose a novel ZSL framework named Region Semantically Aligned Network (RSAN), which transfers region-attribute alignment from seen classes to unseen classes.
Activity
I like sports, especially soccer. I won a couple of trophies as team captain during my undergraduate years.