Post by account_disabled on Mar 7, 2024 4:57:55 GMT -5
widely used in speech recognition, handwriting recognition, weather forecast and other fields since the 1990s. It is still the mainstream technology in speech recognition. Although the acoustic model based on NN-H has the ability to fit any complex distribution, it also has a serious flaw, that is, it is inefficient in modeling nonlinear data. Therefore, a long time ago, relevant researchers proposed to use artificial neural networks to replace the modeling of H state posterior probability. However, due to the limited computing power at that time,
it was difficult to train a neural network model with more than two layers, so the performance improvement it brought was very weak. The development of machine learning algorithms and computer hardware over the past Rich People Phone Number List century has made it possible to train multi-hidden layer neural networks. Practice shows that NN has achieved far superior recognition performance on various large data sets. Therefore, NN-H replaces -H and becomes the current mainstream acoustic modeling framework. End-to-end model Acoustic modeling of traditional speech
recognition systems generally establishes the connection from the acoustic observation sequence to words through information sources such as articulatory units, H acoustic models, and dictionaries. Each part requires separate learning and training steps, which are cumbersome. The end-to-end En--EnEE structure uses a model to include these three information sources to achieve direct conversion from observation sequences to text. Some recent advances even include language model information to achieve better performance. Since this year, end-to-end models have increasingly become a research hotspot in speech recognition. 2. Language model