Xiaowei Hu
Verified email at microsoft.com
Title
Cited by
Year
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
X Li, X Yin, C Li, P Zhang, X Hu, L Zhang, L Wang, H Hu, L Dong, F Wei, ...
European Conference on Computer Vision, 121-137, 2020
Cited by: 650; Year: 2020
VinVL: Revisiting Visual Representations in Vision-Language Models
P Zhang, X Li, X Hu, J Yang, L Zhang, L Wang, Y Choi, J Gao
arXiv preprint arXiv:2101.00529, 2021
Cited by: 211; Year: 2021
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning
X Hu, X Yin, K Lin, L Zhang, J Gao, L Wang, Z Liu
Proceedings of the AAAI Conference on Artificial Intelligence 35 (2), 1575-1583, 2021
Cited by: 46*; Year: 2021
An empirical study of GPT-3 for few-shot knowledge-based VQA
Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang
Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022
Cited by: 23; Year: 2022
MiniVLM: A smaller and faster vision-language model
J Wang, X Hu, P Zhang, X Li, L Wang, L Zhang, J Gao, Z Liu
arXiv preprint arXiv:2012.06946, 2020
Cited by: 17; Year: 2020
Scaling up vision-language pre-training for image captioning
X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 15; Year: 2022
Compressing visual-linguistic model via knowledge distillation
Z Fang, J Wang, X Hu, L Wang, Y Yang, Z Liu
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 15; Year: 2021
(Bandit) convex optimization with biased noisy gradient oracles
X Hu, LA Prashanth, A György, C Szepesvari
Artificial Intelligence and Statistics, 819-828, 2016
Cited by: 14; Year: 2016
UFO: A unified transformer for vision-language representation learning
J Wang, X Hu, Z Gan, Z Yang, X Dai, Z Liu, Y Lu, L Wang
arXiv preprint arXiv:2111.10023, 2021
Cited by: 11; Year: 2021
Crossing the format boundary of text and boxes: Towards unified vision-language modeling
Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang
arXiv preprint arXiv:2111.12085, 2021
Cited by: 7; Year: 2021
Injecting semantic concepts into end-to-end image captioning
Z Fang, J Wang, X Hu, L Liang, Z Gan, L Wang, Y Yang, Z Liu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 4; Year: 2022
K-LITE: Learning transferable visual models with external knowledge
S Shen, C Li, X Hu, Y Xie, J Yang, P Zhang, A Rohrbach, Z Gan, L Wang, ...
arXiv preprint arXiv:2204.09222, 2022
Cited by: 2; Year: 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang
arXiv preprint arXiv:2205.14100, 2022
Year: 2022