Speech Recognition: Revisiting HTK

程序员文章站 2022-07-13 14:41:05


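Before the new term starts, I spent today re-running the hidden Markov model (HMM) training for speech recognition. Drawing on tips from experienced people online, I worked through the HTK tools again to understand them more deeply, retrained on fresh data, and updated the workflow to match my actual setup.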

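I will skip environment setup and assume HTK is already installed and configured.
The task is again isolated-word recognition, with the vocabulary
one, two, three; you can extend it freely later.
First, collect the data: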

rec -b 8 data/train/speech/01.wav
rec -b 8 data/train/speech/02.wav.....


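I recorded ten utterances each of one, two, and three, and saved them in the speech folder under train.
Next, update the training configuration (read this together with the previous posts in this series):
Edit grammer so that it lists the word classes to recognize.
Edit codetrain.scp so that it gives the training recording paths and the paths of the mfc files to generate.
Edit train.scp so that it lists the mfc paths.
Edit wordlist so that it contains the list of training words.
Edit trainprompts so that it gives the text for each training recording; this serves as the annotation.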
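
Editing the two .scp files by hand gets tedious with thirty recordings, so they can be generated from the recording folder instead. The following is a minimal shell sketch, not the setup used in this post: the feature folder, the .mfc extension, and the helper name make_scp are assumptions made for illustration. It relies on the HCopy script format of one "source target" pair per line, with the training script listing only the feature files.

```shell
#!/bin/sh
# Sketch: build the HCopy script and the training script from a folder of
# recordings. The .mfc extension and the feature folder are assumptions.

# make_scp WAV_DIR MFC_DIR CODE_SCP TRAIN_SCP
make_scp() {
    wav_dir=$1; mfc_dir=$2; code_scp=$3; train_scp=$4
    mkdir -p "$mfc_dir"          # make sure the feature folder exists
    : > "$code_scp"              # start both script files empty
    : > "$train_scp"
    for wav in "$wav_dir"/*.wav; do
        [ -e "$wav" ] || continue                       # skip an unmatched glob
        base=$(basename "$wav" .wav)
        mfc="$mfc_dir/$base.mfc"
        printf '%s %s\n' "$wav" "$mfc" >> "$code_scp"   # HCopy: source target
        printf '%s\n' "$mfc" >> "$train_scp"            # training: features only
    done
}

# Example call (the feature folder name is hypothetical):
# make_scp data/train/speech data/train/feature config/codetrain.scp config/train.scp
```

The same helper could produce codetest.scp and test.scp for the test recording by pointing it at the test folder.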

Once that is done, run all of the following commands in order:

HParse ./config/grammer ./config/wordnet
HDMan -m -w ./lists/wordlist -n ./lists/monophones -g ./config/global.ded ./dict/dict_color ./dict/beep ./dict/otherDict
perl ./scripts/prompts2mlf ./labels/trainwords.mlf ./labels/trainprompts
HLEd -l '*' -d ./dict/dict_color -i ./labels/phones_color.mlf ./config/mkphones_color.led ./labels/trainwords.mlf 
HCopy -T 1 -C ./config/config_HCopy -S ./config/codetrain.scp
HCompV -C ./config/config_color -f 0.01 -m -S ./config/train.scp -M ./hmm0 ./config/proto
perl scripts/makeMacros hmm0/vFloors hmm0/macros
perl scripts/makeHmmdefs hmm0/proto lists/monophones hmm0/hmmdefs
perl scripts/makeMonoOffsp ./lists/monophones ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm0/macros -H ./hmm0/hmmdefs -M ./hmm1/ ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm1/macros -H ./hmm1/hmmdefs -M ./hmm2/ ./lists/monoOffSP
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm2/macros -H ./hmm2/hmmdefs -M ./hmm3/ ./lists/monoOffSP
perl ./scripts/fixSil hmm3/hmmdefs hmm4/hmmdefs
cp hmm3/macros ./hmm4/macros
HHEd -H ./hmm4/macros -H ./hmm4/hmmdefs -M hmm5/ config/sil.hed ./lists/monophones
HLEd -l '*' -d ./dict/dict_color -i ./labels/phones_color.mlf ./config/mkphones_color_HLEd.led ./labels/trainwords.mlf
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm5/macros -H ./hmm5/hmmdefs -M ./hmm6/ ./lists/monophones
HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./hmm6/macros -H ./hmm6/hmmdefs -M ./hmm7/ ./lists/monophones

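HParse builds a word network that describes the allowed transitions between words; grammer is the edited grammar and wordnet is the generated network.
HDMan builds the pronunciation dictionary: from the beep and otherDict source dictionaries it generates dict_color.
HLEd converts the word-level transcriptions (trainwords.mlf) into a phone-level MLF (phones_color.mlf).
HCopy extracts the feature parameters.
HCompV scans all of the training data to compute the global mean and variance.
Training proceeds through the model sets hmm0 to hmm7.
HERest re-estimates the model parameters.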
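
The HERest calls in the command list differ only in the input folder, the output folder, and the phone list, so they can be generated rather than typed out. Here is a minimal shell sketch that reuses the config, MLF, and pruning values from the commands in this post. The function names herest_cmd and herest_range are made up for illustration, and the functions only print commands so that they can be reviewed before anything runs.

```shell
#!/bin/sh
# Sketch: print the HERest command lines for a run of re-estimation passes.

# herest_cmd SRC_DIR DST_DIR PHONE_LIST : print one re-estimation command
herest_cmd() {
    printf 'HERest -C ./config/config_color -I ./labels/phones_color.mlf -t 250.0 150.0 1000.0 -S ./config/train.scp -H ./%s/macros -H ./%s/hmmdefs -M ./%s/ %s\n' \
        "$1" "$1" "$2" "$3"
}

# herest_range FIRST LAST PHONE_LIST : print the passes hmmFIRST -> ... -> hmmLAST
herest_range() {
    i=$1
    while [ "$i" -lt "$2" ]; do
        herest_cmd "hmm$i" "hmm$((i + 1))" "$3"
        i=$((i + 1))
    done
}

# Example: review the commands, then pipe them to sh to run them.
# herest_range 0 3 ./lists/monoOffSP
# herest_range 5 7 ./lists/monophones | sh
```

As far as I know, HTK tools expect the folder given to -M to exist already, so it is safest to create hmm0 through hmm7 before running the printed commands.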

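When the commands finish, the newly generated files appear in the corresponding folders.
Next comes testing.
Here I changed the flow to: record first, then convert to mfc, then recognize, then display the result.
Record: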

rec -b 8 data/test/speech/test.wav


Convert:

HCopy -T 1 -C ./config/config_HCopy -S ./config/codetest.scp

Recognize:

HVite -H ./hmm7/macros -H ./hmm7/hmmdefs -C ./config/config_color -S ./config/test.scp -l '*' -i ./results/recout.txt -w ./config/wordnet -p 0.0 -s 5.0 ./dict/dict_color ./lists/monophones

Display:

tail -n +3 ./results/recout.txt | head -n 3
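
The tail/head pipeline depends on the result sitting at fixed line positions. A slightly more robust sketch filters the label file by content instead. It rests on two assumptions about this setup: that recout.txt is a standard HTK master label file whose label lines read start time, end time, word, score, and that the grammar wraps each utterance in SENT-START and SENT-END. The name rec_words is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print only the recognized words from an HVite output MLF.

# rec_words MLF_FILE : print the recognized words, one per line
rec_words() {
    awk '
        $1 == "#!MLF!#" { next }   # MLF header line
        /^"/            { next }   # quoted label-file name
        $0 == "."       { next }   # end-of-utterance marker
        # keep the word column, dropping the sentence boundary labels
        NF >= 3 && $3 != "SENT-START" && $3 != "SENT-END" { print $3 }
    ' "$1"
}

# Example:
# rec_words ./results/recout.txt
```

If the output was written without time stamps, the word would not be in the third column and the filter would need adjusting.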


In the end, the displayed recognition result is two, which is correct.
