Yi-Jen Shih
Title
Cited by
Year
Theme transformer: Symbolic music generation with theme-conditioned transformer
YJ Shih, SL Wu, F Zalkow, M Müller, YH Yang
IEEE Transactions on Multimedia 25, 3495-3508, 2022
86
2022
SpeechCLIP: Integrating speech with pre-trained vision and language model
YJ Shih, HF Wang, HJ Chang, L Berry, H Lee, D Harwath
2022 IEEE Spoken Language Technology Workshop (SLT), 715-722, 2023
33
2023
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval
L Berry, YJ Shih, HF Wang, HJ Chang, H Lee, D Harwath
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
10
2023
AV-SUPERB: A multi-task evaluation benchmark for audio-visual representation models
Y Tseng, L Berry, YT Chen, IH Chiu, HH Lin, M Liu, P Peng, YJ Shih, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
8
2024
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data
HF Wang, YJ Shih, HJ Chang, L Berry, P Peng, H Lee, HM Wang, ...
arXiv preprint arXiv:2402.06959, 2024
2
2024
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
C Huang, WC Chen, S Yang, AT Liu, CA Li, YX Lin, WC Tseng, A Diwan, ...
arXiv preprint arXiv:2411.05361, 2024
1
2024
Interface Design for Self-Supervised Speech Models
YJ Shih, D Harwath
arXiv preprint arXiv:2406.12209, 2024
1
2024
Measuring Sound Symbolism in Audio-visual Models
WC Tseng, YJ Shih, D Harwath, R Mooney
arXiv preprint arXiv:2409.12306, 2024
2024
Self-supervised Speech Models for Word-Level Stuttered Speech Detection
YJ Shih, Z Gkalitsiou, AG Dimakis, D Harwath
arXiv preprint arXiv:2409.10704, 2024
2024
Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model
HC Fang, NX Ye, YJ Shih, P Peng, HF Wang, L Berry, H Lee, D Harwath
arXiv preprint arXiv:2402.05819, 2024
2024