OPT: Open pre-trained transformer language models S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... arXiv preprint arXiv:2205.01068, 2022 | 3425* | 2022 |
Dota 2 with large scale deep reinforcement learning C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ... arXiv preprint arXiv:1912.06680, 2019 | 2175 | 2019 |
LIMA: Less is more for alignment C Zhou, P Liu, P Xu, S Iyer, J Sun, Y Mao, X Ma, A Efrat, P Yu, L Yu, ... Advances in Neural Information Processing Systems 36, 2024 | 809 | 2024 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Gemini Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 724 | 2024 |
xFormers: A modular and hackable transformer modelling library B Lefaudeux, F Massa, D Liskovich, W Xiong, V Caggiano, S Naren, M Xu, ... | 133 | 2022 |
Scaling autoregressive multi-modal models: Pretraining and instruction tuning L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ... arXiv preprint arXiv:2309.02591, 2023 | 116 | 2023 |
Scaling laws for generative mixed-modal language models A Aghajanyan, L Yu, A Conneau, WN Hsu, K Hambardzumyan, S Zhang, ... International Conference on Machine Learning, 265-279, 2023 | 75 | 2023 |
OpenAI Five, 2018 J Pachocki, G Brockman, J Raiman, S Zhang, H Pondé, J Tang, F Wolski, ... URL https://blog.openai.com/openai-five, 2018 | 29* | 2018 |
A theory on Adam instability in large-scale machine learning I Molybog, P Albert, M Chen, Z DeVito, D Esiobu, N Goyal, PS Koura, ... arXiv preprint arXiv:2304.09871, 2023 | 22 | 2023 |
Long-term planning and situational awareness in OpenAI Five J Raiman, S Zhang, F Wolski arXiv preprint arXiv:1912.06721, 2019 | 16 | 2019 |
Neural network surgery with sets J Raiman, S Zhang, C Dennison arXiv preprint arXiv:1912.06719, 2019 | 9 | 2019 |
Hirsch index and a co-authorship network S Zhang | 1 | 2011 |
The PlaceIQ Analytic Platform: Location Oriented Approaches to Mobile Audiences JM Huerta, J Lenaghan, S Milton, K Brackney, A Kapila, R Shraga, ... Proceedings of the Eighth International Workshop on Data Mining for Online …, 2014 | | 2014 |
Scale-Free Networks and Proximity Measures S Zhang | | 2011 |
Decomposition of Mathematical Models of Quantum Information Transference Using a Simulated Annealing Technique S Zhang | | 2010 |