Agrim Gupta
PhD Student, Stanford University
Verified email at stanford.edu
Title
Cited by
Year
Social GAN: Socially acceptable trajectories with generative adversarial networks
A Gupta, J Johnson, L Fei-Fei, S Savarese, A Alahi
Proceedings of the IEEE conference on computer vision and pattern …, 2018
2418
2018
LVIS: A dataset for large vocabulary instance segmentation
A Gupta, P Dollar, R Girshick
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019
1425
2019
Image generation from scene graphs
J Johnson, A Gupta, L Fei-Fei
Proceedings of the IEEE conference on computer vision and pattern …, 2018
944
2018
Open X-Embodiment: Robotic learning datasets and RT-X models
A O'Neill, A Rehman, A Gupta, A Maddukuri, A Gupta, A Padalkar, A Lee, ...
arXiv preprint arXiv:2310.08864, 2023
363*
2023
VIMA: General robot manipulation with multimodal prompts
Y Jiang, A Gupta, Z Zhang, G Wang, Y Dou, Y Chen, L Fei-Fei, ...
arXiv preprint arXiv:2210.03094 2 (3), 6, 2022
301*
2022
Embodied Intelligence via Learning and Evolution
A Gupta, S Savarese, S Ganguli, L Fei-Fei
Nature Communications 12, 5721, 2021
255
2021
Characterizing and improving stability in neural style transfer
A Gupta, J Johnson, A Alahi, L Fei-Fei
Proceedings of the IEEE International Conference on Computer Vision, 4067-4076, 2017
146
2017
VideoPoet: A large language model for zero-shot video generation
D Kondratyuk, L Yu, X Gu, J Lezama, J Huang, G Schindler, R Hornung, ...
arXiv preprint arXiv:2312.14125, 2023
143
2023
Language Model Beats Diffusion--Tokenizer is Key to Visual Generation
L Yu, J Lezama, NB Gundavarapu, L Versari, K Sohn, D Minnen, Y Cheng, ...
arXiv preprint arXiv:2310.05737, 2023
141
2023
MaskViT: Masked visual pre-training for video prediction
A Gupta, S Tian, Y Zhang, J Wu, R Martín-Martín, L Fei-Fei
arXiv preprint arXiv:2206.11894, 2022
122
2022
Photorealistic video generation with diffusion models
A Gupta, L Yu, K Sohn, X Gu, M Hahn, FF Li, I Essa, L Jiang, J Lezama
European Conference on Computer Vision, 393-411, 2025
104
2025
RoboCat: A self-improving foundation agent for robotic manipulation
K Bousmalis, G Vezzani, D Rao, C Devin, AX Lee, M Bauza, T Davchev, ...
arXiv preprint arXiv:2306.11706, 2023
104*
2023
Holistic evaluation of text-to-image models
T Lee, M Yasunaga, C Meng, Y Mai, JS Park, A Gupta, Y Zhang, ...
Advances in Neural Information Processing Systems 36, 2024
100
2024
MetaMorph: Learning universal controllers with transformers
A Gupta, L Fan, S Ganguli, L Fei-Fei
arXiv preprint arXiv:2203.11931, 2022
84
2022
TrajNet: Towards a Benchmark for Human Trajectory Prediction
A Sadeghian, V Kosaraju, A Gupta, S Savarese, A Alahi
http://trajnet.stanford.edu/, 2018
77
2018
Siamese masked autoencoders
A Gupta, J Wu, J Deng, FF Li
Advances in Neural Information Processing Systems 36, 40676-40693, 2023
53
2023
HourVideo: 1-hour video-language understanding
K Chandrasegaran, A Gupta, LM Hadzic, T Kota, J He, C Eyzaguirre, ...
arXiv preprint arXiv:2411.04998, 2024
1
2024
Generative Models of Vision and Action
A Gupta
Stanford University, 2024
2024