DINOv2: Learning robust visual features without supervision | M Oquab, T Darcet, T Moutakanni, H Vo, M Szafraniec, V Khalidov, ... | arXiv preprint arXiv:2304.07193, 2023 | Cited by 2484*
The Llama 3 herd of models | A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... | arXiv preprint arXiv:2407.21783, 2024 | Cited by 1551*
SUPERB: Speech processing Universal PERformance Benchmark | S Yang, PH Chi, YS Chuang, CIJ Lai, K Lakhotia, YY Lin, AT Liu, J Shi, ... | Interspeech 2021 | Cited by 935
TERA: Self-supervised learning of transformer encoder representation for speech | AT Liu, SW Li, H Lee | IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 2351-2366, 2021 | Cited by 401
Self-supervised speech representation learning: A review | A Mohamed, H Lee, L Borgholt, JD Havtorn, J Edin, C Igel, K Kirchhoff, ... | IEEE Journal of Selected Topics in Signal Processing 16 (6), 1179-1210, 2022 | Cited by 378
Introducing Meta Llama 3: The most capable openly available LLM to date | Meta AI | Meta AI, 2024 | Cited by 285*
Data-driven interaction techniques for improving navigation of educational videos | J Kim, PJ Guo, CJ Cai, SW Li, KZ Gajos, RC Miller | Proceedings of the 27th annual ACM symposium on User interface software and …, 2014 | Cited by 227*
Supporting Clustering with Contrastive Learning | D Zhang, F Nan, X Wei, S Li, H Zhu, K McKeown, R Nallapati, A Arnold, ... | NAACL 2021 | Cited by 222
DiffCSE: Difference-based contrastive learning for sentence embeddings | YS Chuang, R Dangovski, H Luo, Y Zhang, S Chang, M Soljačić, SW Li, ... | arXiv preprint arXiv:2204.10298, 2022 | Cited by 218
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation | PH Chi, PH Chung, TH Wu, CC Hsieh, SW Li, H Lee | SLT 2021, 2020 | Cited by 189
Scaling autoregressive multi-modal models: Pretraining and instruction tuning | L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ... | arXiv preprint arXiv:2309.02591, 2023 | Cited by 122
Lifelong pretraining: Continually adapting language models to emerging corpora | X Jin, D Zhang, H Zhu, W Xiao, SW Li, X Wei, A Arnold, X Ren | arXiv preprint arXiv:2110.08534, 2021 | Cited by 120
Demystifying CLIP data | H Xu, S Xie, XE Tan, PY Huang, R Howes, V Sharma, SW Li, G Ghosh, ... | arXiv preprint arXiv:2309.16671, 2023 | Cited by 118
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities | HS Tsai, HJ Chang, WC Huang, Z Huang, K Lakhotia, S Yang, S Dong, ... | ACL 2022 | Cited by 102
Chameleon: Mixed-modal early-fusion foundation models | Chameleon Team | arXiv preprint arXiv:2405.09818, 2024 | Cited by 97
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation | L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, PA Duquenne, ... | arXiv preprint arXiv:2308.11596, 2023 | Cited by 90
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining | CI Lai, YS Chuang, HY Lee, SW Li, J Glass | ICASSP 2021, 2020 | Cited by 64
MAViL: Masked audio-video learners | PY Huang, V Sharma, H Xu, C Ryali, Y Li, SW Li, G Ghosh, J Malik, ... | Advances in Neural Information Processing Systems 36, 2024 | Cited by 61
Pairwise supervised contrastive learning of sentence representations | D Zhang, SW Li, W Xiao, H Zhu, R Nallapati, AO Arnold, B Xiang | EMNLP 2021 | Cited by 56
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark | J Shi, D Berrebbi, W Chen, HL Chung, EP Hu, WP Huang, X Chang, ... | arXiv preprint arXiv:2305.10615, 2023 | Cited by 55