Yi-Jen Shih

Cited by

	All	Since 2020
Citations	177	177
h-index	5	5
i10-index	4	4

Citations per year
Year	2022	2023	2024	2025
Citations	12	50	87	28

Co-authors

David Harwath - The University of Texas at Austin - Verified email at utexas.edu
Hung-yi Lee - National Taiwan University - Verified email at ntu.edu.tw
Layne Berry - PhD Student, University of Texas at Austin - Verified email at utexas.edu
Heng-Jui Chang - Massachusetts Institute of Technology - Verified email at mit.edu
Yi-Hsuan Yang - National Taiwan University - Verified email at ntu.edu.tw
Meinard Müller - International Audio Laboratories Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) - Verified email at audiolabs-erlangen.de
Shih-Lun Wu - PhD Student, Dept of EECS, Massachusetts Institute of Technology - Verified email at mit.edu
Frank Zalkow - Fraunhofer Institute for Integrated Circuits IIS - Verified email at iis.fraunhofer.de
Alexandros G Dimakis - Professor, EECS, UC Berkeley - Verified email at berkeley.edu
Zoi Gkalitsiou, PhD, CCC-SLP - Cal State University East Bay - Verified email at csueastbay.edu
Raymond Mooney - Professor of Computer Science, University of Texas at Austin - Verified email at cs.utexas.edu

Yi-Jen Shih

Other names: Ian Shih

Ph.D. Student at University of Texas at Austin

Verified email at cs.utexas.edu

Research interests: speech and language processing, music information retrieval, deep learning


Title	Authors	Venue	Cited by	Year
Theme transformer: Symbolic music generation with theme-conditioned transformer	YJ Shih, SL Wu, F Zalkow, M Müller, YH Yang	IEEE Transactions on Multimedia 25, 3495-3508, 2022	102	2022
SpeechCLIP: Integrating speech with pre-trained vision and language model	YJ Shih, HF Wang, HJ Chang, L Berry, H Lee, D Harwath	2022 IEEE Spoken Language Technology Workshop (SLT), 715-722, 2023	40	2023
Av-superb: A multi-task evaluation benchmark for audio-visual representation models	Y Tseng, L Berry, YT Chen, IH Chiu, HH Lin, M Liu, P Peng, YJ Shih, ...	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	11	2024
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval	L Berry, YJ Shih, HF Wang, HJ Chang, H Lee, D Harwath	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	11	2023
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks	C Huang, WC Chen, S Yang, AT Liu, CA Li, YX Lin, WC Tseng, A Diwan, ...	arXiv preprint arXiv:2411.05361, 2024	7	2024
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data	HF Wang, YJ Shih, HJ Chang, L Berry, P Peng, H Lee, HM Wang, ...	2024 IEEE International Conference on Acoustics, Speech, and Signal …, 2024	3	2024
Self-Supervised Speech Models For Word-Level Stuttered Speech Detection	YJ Shih, Z Gkalitsiou, AG Dimakis, D Harwath	2024 IEEE Spoken Language Technology Workshop (SLT), 937-944, 2024	1	2024
Interface Design for Self-Supervised Speech Models	YJ Shih, D Harwath	arXiv preprint arXiv:2406.12209, 2024	1	2024
Integrating self-supervised speech model with pseudo word-level targets from visually-grounded speech model	HC Fang, NX Ye, YJ Shih, P Peng, HF Wang, L Berry, H Lee, D Harwath	2024 IEEE International Conference on Acoustics, Speech, and Signal …, 2024	1	2024
Measuring Sound Symbolism In Audio-Visual Models	WC Tseng, YJ Shih, D Harwath, R Mooney	2024 IEEE Spoken Language Technology Workshop (SLT), 1165-1172, 2024		2024
