Feng Li
Title
Cited by
Year
DINO: Detr with improved denoising anchor boxes for end-to-end object detection
H Zhang*, F Li*, S Liu*, L Zhang, H Su, J Zhu, LM Ni, HY Shum
International Conference on Learning Representations (ICLR), 2023, 2022
Cited by: 1365 | Year: 2022
Grounding dino: Marrying dino with grounded pre-training for open-set object detection
S Liu, Z Zeng, T Ren, F Li, H Zhang, J Yang, C Li, J Yang, H Su, J Zhu, ...
arXiv preprint arXiv:2303.05499, 2023
Cited by: 1341 | Year: 2023
Dab-detr: Dynamic anchor boxes are better queries for detr
S Liu, F Li, H Zhang, X Yang, X Qi, H Su, J Zhu, L Zhang
International Conference on Learning Representations (ICLR), 2022, 2022
Cited by: 784 | Year: 2022
Dn-detr: Accelerate detr training by introducing query denoising
F Li*, H Zhang*, S Liu, J Guo, LM Ni, L Zhang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 699 | Year: 2022
Segment everything everywhere all at once
X Zou*, J Yang*, H Zhang*, F Li*, L Li, J Wang, L Wang, J Gao, YJ Lee
Advances in Neural Information Processing Systems 36, 2023
Cited by: 448 | Year: 2023
Mask dino: Towards a unified transformer-based framework for object detection and segmentation
F Li*, H Zhang*, H Xu, S Liu, L Zhang, LM Ni, HY Shum
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 359 | Year: 2023
Grounded sam: Assembling open-world models for diverse visual tasks
T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ...
arXiv preprint arXiv:2401.14159, 2024
Cited by: 182 | Year: 2024
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
J Yang*, H Zhang*, F Li*, X Zou*, C Li, J Gao
arXiv preprint arXiv:2310.11441, 2023
Cited by: 168 | Year: 2023
Segment and Recognize Anything at Any Granularity
F Li, H Zhang, P Sun, X Zou, S Liu, C Li, J Yang, L Zhang, J Gao
European Conference on Computer Vision, 467-484, 2025
Cited by: 142* | Year: 2025
A Simple Framework for Open-Vocabulary Segmentation and Detection
H Zhang*, F Li*, X Zou, S Liu, C Li, J Gao, J Yang, L Zhang
International Conference on Computer Vision (ICCV), 2023, 2023
Cited by: 137 | Year: 2023
LLaVA-OneVision: Easy Visual Task Transfer
B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang, K Zhang, Y Li, Z Liu, C Li
arXiv preprint arXiv:2408.03326, 2024
Cited by: 100 | Year: 2024
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
S Liu, H Cheng, H Liu, H Zhang, F Li, T Ren, X Zou, J Yang, H Su, J Zhu, ...
arXiv preprint arXiv:2311.05437, 2023
Cited by: 83 | Year: 2023
Lite DETR: An interleaved multi-scale encoder for efficient detr
F Li, A Zeng, S Liu, H Zhang, H Li, L Zhang, LM Ni
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 67 | Year: 2023
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
J Yang, A Zeng, S Liu, F Li, R Zhang, L Zhang
International Conference on Learning Representations (ICLR), 2023, 2023
Cited by: 61 | Year: 2023
MP-Former: Mask-Piloted Transformer for Image Segmentation
H Zhang*, F Li*, H Xu, S Huang, S Liu, LM Ni, L Zhang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 59 | Year: 2023
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
F Li, R Zhang, H Zhang, Y Zhang, B Li, W Li, Z Ma, C Li
arXiv preprint arXiv:2407.07895, 2024
Cited by: 57 | Year: 2024
Llava-grounding: Grounded visual chat with large multimodal models
H Zhang, H Li, F Li, T Ren, X Zou, S Liu, S Huang, J Gao, L Zhang, C Li, ...
arXiv preprint arXiv:2312.02949, 2023
Cited by: 42 | Year: 2023
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
F Li*, H Zhang*, YF Zhang, S Liu, J Guo, LM Ni, PC Zhang, L Zhang
arXiv preprint arXiv:2203.01922, 2022
Cited by: 42 | Year: 2022
Llava-next: Stronger llms supercharge multimodal capabilities in the wild
B Li, K Zhang, H Zhang, D Guo, R Zhang, F Li, Y Zhang, Z Liu, C Li
May, 2024
Cited by: 39 | Year: 2024
Detection Transformer with Stable Matching
S Liu, T Ren, J Chen, Z Zeng, H Zhang, F Li, H Li, J Huang, H Su, J Zhu, ...
International Conference on Computer Vision (ICCV), 2023, 2023
Cited by: 31 | Year: 2023