Publications

(2024). Parameter-Inverted Image Pyramid Networks. Preprint.

(2024). Synergizing Spatial Optimization with Large Language Models for Open-Domain Urban Itinerary Planning. Preprint.

(2023). Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft. In CVPR 2024.

(2022). Video Background Music Generation: Dataset, Method and Evaluation. In ICCV 2023.

(2021). Video Background Music Generation with Controllable Music Transformer. In ACM MM 2021 (Best Paper Award).

(2021). Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. In AAAI 2021.