Shuo-Yiin Chang
2020 – today
- 2024
- [c38] W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath:
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. ICASSP 2024: 13306-13310
- [i22] W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath:
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. CoRR abs/2401.12789 (2024)
- 2023
- [c37] Guru Prakash Arumugam, Shuo-Yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia:
Improved Long-Form Speech Recognition By Jointly Modeling The Primary And Non-Primary Speakers. ASRU 2023: 1-8
- [c36] Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-Yi Lee:
Towards General-Purpose Text-Instruction-Guided Voice Conversion. ASRU 2023: 1-8
- [c35] Shuo-Yiin Chang, Chao Zhang, Tara N. Sainath, Bo Li, Trevor Strohman:
Context-Aware end-to-end ASR Using Self-Attentive Embedding and Tensor Fusion. ICASSP 2023: 1-5
- [c34] W. Ronny Huang, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Cal Peyser, Trevor D. Strohman:
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model. ICASSP 2023: 1-5
- [c33] Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-Yiin Chang, David Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar:
Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks. ICASSP 2023: 1-5
- [c32] Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang:
UML: A Universal Monolingual Output Layer For Multilingual ASR. ICASSP 2023: 1-5
- [c31] Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023: 456-460
- [c30] Shaan Bijwadia, Shuo-Yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang:
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models. INTERSPEECH 2023: 1409-1413
- [c29] W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath:
Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR. INTERSPEECH 2023: 2778-2782
- [i21] Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang:
UML: A Universal Monolingual Output Layer for Multilingual ASR. CoRR abs/2302.11186 (2023)
- [i20] W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath:
Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR. CoRR abs/2305.18419 (2023)
- [i19] Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? CoRR abs/2306.01015 (2023)
- [i18] Shaan Bijwadia, Shuo-Yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath:
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models. CoRR abs/2308.07395 (2023)
- [i17] Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-yi Lee:
Towards General-Purpose Text-Instruction-Guided Voice Conversion. CoRR abs/2309.14324 (2023)
- [i16] Guru Prakash Arumugam, Shuo-Yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia:
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers. CoRR abs/2312.11123 (2023)
- 2022
- [c28] Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. ICASSP 2022: 8112-8116
- [c27] Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-Yiin Chang:
Improving the Fusion of Acoustic and Text Representations in RNN-T. ICASSP 2022: 8117-8121
- [c26] Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Trevor Strohman, Qiao Liang, Yanzhang He:
Turn-Taking Prediction for Natural Conversational Speech. INTERSPEECH 2022: 1821-1825
- [c25] Shuo-Yiin Chang, Guru Prakash, Zelin Wu, Tara N. Sainath, Bo Li, Qiao Liang, Adam Stambler, Shyam Upadhyay, Manaal Faruqui, Trevor Strohman:
Streaming Intended Query Detection using E2E Modeling for Continued Conversation. INTERSPEECH 2022: 1826-1830
- [c24] Bo Li, Tara N. Sainath, Ruoming Pang, Shuo-Yiin Chang, Qiumin Xu, Trevor Strohman, Vince Chen, Qiao Liang, Heguang Liu, Yanzhang He, Parisa Haghani, Sameer Bidichandani:
A Language Agnostic Multilingual Streaming On-Device ASR System. INTERSPEECH 2022: 3188-3192
- [c23] Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-Yiin Chang, Parisa Haghani:
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification. INTERSPEECH 2022: 3223-3227
- [c22] W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Tara N. Sainath, Rohit Prabhavalkar, Cal Peyser, Zhiyun Lu, Cyril Allauzen:
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR. INTERSPEECH 2022: 4995-4999
- [c21] Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Yanzhang He:
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems. SLT 2022: 310-316
- [i15] Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-Yiin Chang:
Improving the fusion of acoustic and text representations in RNN-T. CoRR abs/2201.10240 (2022)
- [i14] W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Rohit Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Cal Peyser, Zhiyun Lu:
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR. CoRR abs/2204.10749 (2022)
- [i13] Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Trevor Strohman, Qiao Liang, Yanzhang He:
Turn-Taking Prediction for Natural Conversational Speech. CoRR abs/2208.13321 (2022)
- [i12] Shuo-Yiin Chang, Guru Prakash, Zelin Wu, Qiao Liang, Tara N. Sainath, Bo Li, Adam Stambler, Shyam Upadhyay, Manaal Faruqui, Trevor Strohman:
Streaming Intended Query Detection using E2E Modeling for Continued Conversation. CoRR abs/2208.13322 (2022)
- [i11] Bo Li, Tara N. Sainath, Ruoming Pang, Shuo-Yiin Chang, Qiumin Xu, Trevor Strohman, Vince Chen, Qiao Liang, Heguang Liu, Yanzhang He, Parisa Haghani, Sameer Bidichandani:
A Language Agnostic Multilingual Streaming On-Device ASR System. CoRR abs/2208.13916 (2022)
- [i10] Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-Yiin Chang, Parisa Haghani:
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification. CoRR abs/2209.06058 (2022)
- [i9] Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Yanzhang He:
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems. CoRR abs/2211.00786 (2022)
- [i8] W. Ronny Huang, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Cal Peyser, Trevor D. Strohman:
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model. CoRR abs/2211.15432 (2022)
- 2021
- [c20] Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster end-to-end Model for Streaming ASR. ICASSP 2021: 5634-5638
- [c19] Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang:
FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization. ICASSP 2021: 6004-6008
- [c18] Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Ruoming Pang, David Rybach, Cyril Allauzen, Ehsan Variani, James Qin, Quoc-Nam Le-The, Shuo-Yiin Chang, Bo Li, Anmol Gulati, Jiahui Yu, Chung-Cheng Chiu, Diamantino Caseiro, Wei Li, Qiao Liang, Pat Rondon:
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling. Interspeech 2021: 1777-1781
- 2020
- [c17] Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063
- [c16] Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Ruoming Pang, Yanzhang He, Trevor Strohman, Yonghui Wu:
Towards Fast and Accurate Streaming End-To-End ASR. ICASSP 2020: 6069-6073
- [c15] Shuo-Yiin Chang, Bo Li, David Rybach, Yanzhang He, Wei Li, Tara N. Sainath, Trevor Strohman:
Low Latency Speech Recognition Using End-to-End Prefetching. INTERSPEECH 2020: 1962-1966
- [c14] Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio López-Moreno:
Personal VAD: Speaker-Conditioned Voice Activity Detection. Odyssey 2020: 433-439
- [i7] Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency. CoRR abs/2003.12710 (2020)
- [i6] Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang:
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization. CoRR abs/2010.11148 (2020)
- [i5] Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster End-to-End Model for Streaming ASR. CoRR abs/2011.10798 (2020)
2010 – 2019
- 2019
- [j1] Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath:
Deep Learning for Audio Signal Processing. IEEE J. Sel. Top. Signal Process. 13(2): 206-219 (2019)
- [c13] Shuo-Yiin Chang, Bo Li, Gabor Simko:
A Unified Endpointer Using Multitask and Multidomain Training. ASRU 2019: 100-106
- [c12] Shuo-Yiin Chang, Rohit Prabhavalkar, Yanzhang He, Tara N. Sainath, Gabor Simko:
Joint Endpointing and Decoding with End-to-end Models. ICASSP 2019: 5626-5630
- [c11] Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander Gruenstein:
Streaming End-to-end Speech Recognition for Mobile Devices. ICASSP 2019: 6381-6385
- [i4] Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath:
Deep Learning for Audio Signal Processing. CoRR abs/1905.00078 (2019)
- [i3] Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio López-Moreno:
Personal VAD: Speaker-Conditioned Voice Activity Detection. CoRR abs/1908.04284 (2019)
- [i2] Ahmed Hussen Abdelaziz, Shuo-Yiin Chang, Nelson Morgan, Erik Edwards, Dorothea Kolossa, Dan Ellis, David A. Moses, Edward F. Chang:
On Neural Phone Recognition of Mixed-Source ECoG Signals. CoRR abs/1912.05869 (2019)
- 2018
- [c10] Shuo-Yiin Chang, Bo Li, Gabor Simko, Tara N. Sainath, Anshuman Tripathi, Aäron van den Oord, Oriol Vinyals:
Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection. ICASSP 2018: 5549-5553
- [i1] Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-Yiin Chang, Kanishka Rao, Alexander Gruenstein:
Streaming End-to-end Speech Recognition For Mobile Devices. CoRR abs/1811.06621 (2018)
- 2017
- [c9] Matt Shannon, Gabor Simko, Shuo-Yiin Chang, Carolina Parada:
Improved End-of-Query Detection for Streaming Speech Recognition. INTERSPEECH 2017: 1909-1913
- [c8] Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Gabor Simko, Carolina Parada:
Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition. INTERSPEECH 2017: 3812-3816
- 2016
- [b1] Shuo-Yiin Chang:
Feature Design for Robust Speech Recognition: Nurture and Nature. University of California, Berkeley, USA, 2016
- 2015
- [c7] Shuo-Yiin Chang, Steven Wegmann:
On the importance of modeling and robustness for deep neural network feature. ICASSP 2015: 4530-4534
- 2014
- [c6] Shuo-Yiin Chang, Nelson Morgan:
Robust CNN-based speech recognition with Gabor filter kernels. INTERSPEECH 2014: 905-909
- 2013
- [c5] Sree Hari Krishnan Parthasarathi, Shuo-Yiin Chang, Jordan Cohen, Nelson Morgan, Steven Wegmann:
The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions. ICASSP 2013: 6758-6762
- [c4] Shuo-Yiin Chang, Bernd T. Meyer, Nelson Morgan:
Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction. ICASSP 2013: 7063-7067
- [c3] Shuo-Yiin Chang, Nelson Morgan:
Informative spectro-temporal bottleneck features for noise-robust speech recognition. INTERSPEECH 2013: 99-103
2000 – 2009
- 2009
- [c2] Shuo-Yiin Chang, Lin-Shan Lee:
Improved clustered hierarchical tandem system with bottom-up processing. ICASSP 2009: 4441-4444
- 2008
- [c1] Shuo-Yiin Chang, Lin-Shan Lee:
Data-driven clustered hierarchical tandem system for LVCSR. INTERSPEECH 2008: 2250-2253
last updated on 2024-08-25 20:06 CEST by the dblp team
all metadata released as open data under CC0 1.0 license