Speech Corpora as Facilities of Creation and Storage of Exemplary Speech Signals

Authors

  • Arkadij Prodeus

DOI:

https://doi.org/10.20535/1810-0546.2013.1.90709

Abstract

Speech corpora are an important part of the modern researcher's toolkit in areas such as speech correction, the design and testing of telecommunication system components, and automatic speech recognition. This paper seeks the elements of a technology for constructing the audio part of noisy Ukrainian speech corpora. To this end, we examine the characteristics of the most widely used modern noisy speech corpora and, from them, formulate principles for designing such corpora. The validity of these principles is illustrated by the well-known FaNT software toolkit, which makes it possible to construct speech corpora with required properties quickly. Guidelines for building a similar toolkit in the Matlab environment are developed. Such a toolkit would not only allow the best version of a Ukrainian noisy speech corpus to be developed through joint effort, but would also allow noise reduction and automatic speech recognition algorithms developed by different researchers to be compared with one another in the future.
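As a rough illustration of what constructing a noisy corpus "with required properties" can involve, the sketch below scales a noise signal so that it mixes with a clean signal at a requested signal-to-noise ratio. It is a minimal sketch under simplifying assumptions, not the procedure used by FaNT or by the Matlab toolkit the paper proposes: power is taken as the plain mean square over the whole signal, with no filtering and no active speech level measurement. The function names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, and a sine tone stands in for a speech recording.

```python
import math
import random


def mean_power(signal):
    """Return the mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` to the requested SNR (in dB) relative to `speech` and add it.

    Returns the noisy signal and the scaled noise.
    Assumption: SNR is defined on whole-signal mean power, a simplification.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # The gain g satisfies 10*log10(p_speech / (g**2 * p_noise)) == snr_db.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


def measured_snr_db(speech, scaled_noise):
    """Recompute the SNR in dB from the clean signal and the scaled noise."""
    return 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))


# Illustrative inputs: one second at 8 kHz, a tone in place of real speech.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

noisy, scaled = mix_at_snr(speech, noise, snr_db=10.0)
print(round(measured_snr_db(speech, scaled), 3))
```

Repeating the call over a set of noise recordings and target SNR values would yield a family of noisy versions of the same clean utterance, which is the kind of controlled material needed to compare noise reduction or recognition algorithms on equal terms.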

Published

2013-02-28

Issue

Section

Art