Speech Corpora as Facilities of Creation and Storage of Exemplary Speech Signals
DOI:
https://doi.org/10.20535/1810-0546.2013.1.90709
Abstract
Speech corpora are an important part of the modern researcher's toolkit in areas such as speech correction, the design and testing of telecommunication system components, and automatic speech recognition. In this paper, we look for the elements of a technology for constructing the audio part of noisy Ukrainian speech corpora. To this end, we examine the characteristics of the most widely used modern noisy speech corpora, which allows us to formulate principles for designing such corpora. The validity of these principles is illustrated with the well-known software toolkit FaNT, which makes it possible to quickly build speech corpora with the required properties. Guidelines for building a similar toolkit in the Matlab environment are developed. Such a toolkit will make it possible not only to work out, through joint effort, the best version of a Ukrainian noisy speech corpus, but also, in the future, to compare noise reduction and automatic speech recognition algorithms developed by different researchers.
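The core operation behind a noisy speech corpus, adding noise to a clean recording at a required signal-to-noise ratio (SNR), can be sketched as follows. This is a minimal illustration in Python using only the standard library; it is not the FaNT implementation or the toolkit described in the paper. It assumes that SNR is defined from plain mean power over the whole signal, whereas practical tools may measure an active speech level and apply filtering first. The function names `mean_power`, `mix_at_snr`, and `measure_snr_db` are hypothetical, and the sine wave merely stands in for a speech recording.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale noise and add it to speech so the mixture has the target SNR.

    Assumption: SNR is computed from plain mean power over the whole signal.
    Returns the noisy signal and the scaled noise that was added.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # Choose gain g so that 10*log10(p_speech / (g^2 * p_noise)) == target_snr_db.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Toy data: a 200 Hz tone sampled at 8 kHz stands in for speech.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
noisy, scaled_noise = mix_at_snr(speech, noise, 10.0)
```

Calling `measure_snr_db(speech, scaled_noise)` afterwards is a simple way to check that the mixture meets the requested SNR. Repeating the call over a list of recordings, noise types, and SNR values would yield a set of noisy conditions derived from the same clean material.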
License
Copyright (c) 2013 NTUU KPI
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under CC BY 4.0 that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.