Zhenmei Shi
Senior Research Scientist at MongoDB + Voyage AI; PhD from University of Wisconsin–Madison
Verified email at cs.wisc.edu
Title
Cited by
Year
Why Larger Language Models Do In-context Learning Differently?
Z Shi, J Wei, Z Xu, Y Liang
ICML 2024: International Conference on Machine Learning, 2024
Cited by 334 · 2024
SF-Net: Structured feature network for continuous sign language recognition
Z Yang*, Z Shi*, X Shen, YW Tai
arXiv preprint arXiv:1908.01341, 2019
Cited by 84 · 2019
A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features
Z Shi*, J Wei*, Y Liang
ICLR 2022: International Conference on Learning Representations, 2022
Cited by 70 · 2022
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability
Z Xu*, Z Shi*, Y Liang
COLM 2024: Conference on Language Modeling, 2024
Cited by 40 · 2024
Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers
Y Liang*, Z Shi*, Z Song*, Y Zhou*
AFM Workshop @ NeurIPS 2024, 2024
Cited by 37 · 2024
Deep Online Fused Video Stabilization
Z Shi, F Shi, WS Lai, CK Liang, Y Liang
WACV 2022: Winter Conference on Applications of Computer Vision, 2022
Cited by 36 · 2022
Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers
Y Liang*, H Liu*, Z Shi*, Z Song*, Z Xu*, J Yin*
arXiv preprint arXiv:2405.05219, 2024
Cited by 32 · 2024
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Z Shi*, J Chen*, K Li, J Raghuram, X Wu, Y Liang, S Jha
ICLR 2023 (Spotlight): International Conference on Learning Representations, 2023
Cited by 31 · 2023
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time
Y Liang*, Z Sha*, Z Shi*, Z Song*, Y Zhou*
OPT Workshop @ NeurIPS 2024, 2024
Cited by 30 · 2024
Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs
C Li*, Y Liang*, Z Shi*, Z Song*, T Zhou*
AISTATS 2025: International Conference on Artificial Intelligence and Statistics, 2025
Cited by 29* · 2025
Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning
Z Xu, Z Shi, J Wei, F Mu, Y Li, Y Liang
ICLR 2024: International Conference on Learning Representations, 2024
Cited by 29 · 2024
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
Z Shi, Y Ming, XP Nguyen, Y Liang, S Joty
arXiv preprint arXiv:2409.17422, 2024
Cited by 28 · 2024
When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis
Y Sun, Z Shi, Y Liang, Y Li
ICML 2023: International Conference on Machine Learning, 2023
Cited by 27 · 2023
A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning
Y Sun, Z Shi, Y Li
NeurIPS 2023 (Spotlight): Neural Information Processing Systems, 2023
Cited by 25 · 2023
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Y Liang*, J Long*, Z Shi*, Z Song*, Y Zhou*
ICLR 2025: International Conference on Learning Representations, 2025
Cited by 24* · 2025
A Tighter Complexity Analysis of SparseGPT
X Li*, Y Liang*, Z Shi*, Z Song*
Compression Workshop @ NeurIPS 2024, 2024
Cited by 23 · 2024
HSR-Enhanced Sparse Attention Acceleration
B Chen*, Y Liang*, Z Sha*, Z Shi*, Z Song*
CPAL 2025: Conference on Parsimony and Learning, 2025
Cited by 21 · 2025
Towards Infinite-Long Prefix in Transformer
Y Liang*, Z Shi*, Z Song*, C Yang*
SCOPE Workshop @ ICLR 2025, 2025
Cited by 20 · 2025
Looped ReLU MLPs May Be All You Need as Programmable Computers
Y Liang*, Z Sha*, Z Shi*, Z Song*, Y Zhou*
AISTATS 2025: International Conference on Artificial Intelligence and Statistics, 2025
Cited by 20* · 2025
Is a Picture Worth a Thousand Words? Delving into Spatial Reasoning for Vision Language Models
J Wang, Y Ming, Z Shi, V Vineet, X Wang, Y Li, N Joshi
NeurIPS 2024: Neural Information Processing Systems, 2024
Cited by 20 · 2024
Articles 1–20