Linear Predictive Codes for Speech Recognition System at 121bps

AUTHORS

Pramod B. Patil, Principal, Dr Rajendra Gode Institute of Technology & Research, Amravati, India

ABSTRACT

This paper describes the recognition of the phonetics of numerals in the Indian regional languages Marathi and Hindi using the Nearest Neighbour rule. Segmentation is based on locating the start and end points of the speech. Once the exact speech boundaries are located, the bounded speech is evaluated for linear predictive codes. The linear predictive codes of the phonetics of the numerals in Marathi and Hindi form the codebook. The optimum distance between the test linear predictive codes and those in the codebook is determined by the Dynamic Time Warping technique. Based on this distance, the word is recognized by the Nearest Neighbour rule. An accuracy of 88% is achieved, with a large reduction in memory requirements and good SNR.
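The pipeline outlined in the abstract (linear predictive codes per frame, Dynamic Time Warping against a codebook, Nearest Neighbour decision) can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names, the frame length, hop, and LPC order, the Hamming window, and the Euclidean distance between coefficient vectors are all assumptions chosen for illustration, and endpoint detection is assumed to have been done already.

```python
import numpy as np

def lpc_frame(frame, order=10):
    """LPC coefficients of one frame via the autocorrelation method.
    The order and the Hamming window are illustrative assumptions."""
    frame = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * (r[0] + 1.0) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lpc_sequence(signal, frame_len=240, hop=120, order=10):
    """Sequence of LPC vectors over an already endpoint-trimmed signal.
    frame_len and hop are assumed values, not taken from the paper."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([lpc_frame(signal[s:s + frame_len], order) for s in starts])

def dtw_distance(a, b):
    """Accumulated cost of the best time alignment between two sequences
    of feature vectors (Euclidean local distance is an assumption)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # stay on template frame
                                 cost[i, j - 1],      # stay on test frame
                                 cost[i - 1, j - 1])  # advance both
    return cost[n, m]

def nearest_neighbour(test_seq, codebook):
    """Label of the codebook entry with the smallest DTW distance.
    codebook is a hypothetical dict mapping word label -> LPC sequence."""
    return min(codebook, key=lambda label: dtw_distance(test_seq, codebook[label]))
```

A recognizer built this way would store one `lpc_sequence` result per numeral in the codebook and call `nearest_neighbour` on the LPC sequence of each test utterance. Storing only the coefficient vectors rather than the waveform is what keeps the memory footprint small.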

 

KEYWORDS

Pitch, Linear predictive codes, Dynamic time warping, Nearest neighbour rule

CITATION

  • APA:
    Patil, P. B. (2020). Linear Predictive Codes for Speech Recognition System at 121bps. International Journal of Hybrid Information Technology, 13(1), 33-40. doi:10.21742/IJHIT.2020.13.1.04
  • Harvard:
    Patil, P. B. (2020). "Linear Predictive Codes for Speech Recognition System at 121bps". International Journal of Hybrid Information Technology, 13(1), pp.33-40. doi:10.21742/IJHIT.2020.13.1.04
  • IEEE:
    [1] P. B. Patil, "Linear Predictive Codes for Speech Recognition System at 121bps". International Journal of Hybrid Information Technology, vol.13, no.1, pp.33-40, Mar. 2020
  • MLA:
    Patil, Pramod B. "Linear Predictive Codes for Speech Recognition System at 121bps". International Journal of Hybrid Information Technology, vol.13, no.1, Mar. 2020, pp.33-40, doi:10.21742/IJHIT.2020.13.1.04
 

COPYRIGHT

© 2020 Pramod B. Patil. Published by Global Vision Press. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ISSUE INFO

  • Volume 13, No. 1, 2020
  • ISSN(p): 1738-9968
  • ISSN(e): 2652-2233
  • Published: Mar. 2020
