Offline Speech Recognition on Android

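Online speech recognition on Android is available from Google (which requires a VPN to reach from mainland China), Baidu, iFlytek, Microsoft, and others, and most of these online services are free. Offline recognition is a different matter. Google's offline recognition requires downloading the Google offline speech pack in the system settings, and Google does not currently provide an offline API, while the offline recognition offered by Baidu, iFlytek, Microsoft, and others costs money. This article uses the open-source project PocketSphinx to implement offline recognition of Chinese speech.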

Overview and comparison of several common voice interaction platforms

See:

Background on speech recognition

See:

Offline Chinese recognition on Android with the open-source PocketSphinx

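First, download the PocketSphinx-Android demo from GitHub. The demo is an offline speech recognition example for English and runs as-is; see the official tutorial.
The library used by the project, pocketsphinx-android-5prealpha-nolib.jar, can be built from the sphinxbase, pocketsphinx, and pocketsphinx-android sources obtained via svn checkout; see the tutorial.
Use the following models from the official website:
Acoustic model: zh_broadcastnews_16k_ptm256_8000.tar.bz2
Language model: zh_broadcastnews_64000_utf8.DMP
Dictionary file: zh_broadcastnews_utf8.dic
Write your own command set and save it as command.txt: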
你好
打开
播放
浏览器
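In the online model generation tool, click Browse, upload command.txt, and generate the language model (.lm) file online.
The dictionary (.dic) file produced by the online tool cannot handle Chinese, so discard it. Instead, take the dictionary file obtained from the official website (zh_broadcastnews_utf8.dic), look up each word from command.txt in it, and copy over the matching entries. The result looks like this: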

你好 n i h ao
打开 d a k ai
播放 b o f ang
浏览器 l iu l an q i
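
The manual lookup described above can also be scripted. The following is a minimal sketch, not part of PocketSphinx: the class `DictFilter` and its methods are hypothetical names introduced here for illustration, and it assumes every dictionary line has the form "word phone phone ..." in UTF-8, as in the entries shown above. It keeps one entry per command word, in the order the commands are listed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: picks the dictionary entries needed for a command list.
final class DictFilter {

    /**
     * Returns one dictionary line per command that has an entry,
     * in command order. Commands without an entry are skipped.
     */
    static List<String> select(List<String> dictLines, List<String> commands) {
        // Index the dictionary by headword (the text before the first whitespace).
        Map<String, String> byWord = new HashMap<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String headword = trimmed.split("\\s+", 2)[0];
            byWord.putIfAbsent(headword, trimmed); // keep the first pronunciation
        }
        List<String> selected = new ArrayList<>();
        for (String command : commands) {
            String entry = byWord.get(command.trim());
            if (entry != null) {
                selected.add(entry);
            }
        }
        return selected;
    }

    /** Reads the full dictionary and the command list, writes the reduced dictionary. */
    static void writeFiltered(Path fullDict, Path commandFile, Path output) throws IOException {
        List<String> dict = Files.readAllLines(fullDict, StandardCharsets.UTF_8);
        List<String> commands = Files.readAllLines(commandFile, StandardCharsets.UTF_8);
        Files.write(output, select(dict, commands), StandardCharsets.UTF_8);
    }
}
```

Calling `DictFilter.writeFiltered` with the paths of zh_broadcastnews_utf8.dic, command.txt, and test.dic would produce the reduced dictionary. A command that is missing from the full dictionary is silently dropped in this sketch, so it is worth comparing the output against command.txt.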

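At this point the acoustic model, language model, and dictionary file are all in hand. Add them to the corresponding locations under assets. Once the Ant script (assets.xml) is hooked into build.gradle, the md5 files are generated automatically and the model files under assets are copied to the phone automatically.
Change the code to load the acoustic model, language model, and dictionary: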

private void setupRecognizer(File assetsDir) throws IOException {
    // The recognizer can be configured to perform multiple searches
    // of different kind and switch between them
    recognizer = SpeechRecognizerSetup.defaultSetup()
            .setAcousticModel(new File(assetsDir, "zh-ptm")) // acoustic model directory
            .setDictionary(new File(assetsDir, "test.dic")) // pronunciation dictionary
            .setRawLogDir(assetsDir) // To disable logging of raw audio comment out this call (takes a lot of space on the device)
            // .setKeywordThreshold(1e-45f) // Threshold to tune for keyphrase to balance between false alarms and misses
            // .setBoolean("-allphone_ci", true)  // Use context-independent phonetic search, context-dependent is too slow for mobile
            .setBoolean("-remove_noise", true)
            .setKeywordThreshold(1e-5f)
            .getRecognizer();
    recognizer.addListener(this);
    /** In your application you might not need to add all those searches.
     * They are added here for demonstration. You can leave just one.
     */
    // Create keyword-activation search.
    recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);
    // Create grammar-based search for selection between demos
    File menuGrammar = new File(assetsDir, "menu.gram");
    recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);
    File languageModel = new File(assetsDir, "test.lm"); // language model
    recognizer.addNgramSearch("test", languageModel);
/*  // Create grammar-based search for digit recognition
    File digitsGrammar = new File(assetsDir, "digits.gram");
    recognizer.addGrammarSearch(DIGITS_SEARCH, digitsGrammar);
    // Create language model search
    File languageModel = new File(assetsDir, "weather.dmp");
    recognizer.addNgramSearch(FORECAST_SEARCH, languageModel);
    // Phonetic search
    File phoneticModel = new File(assetsDir, "en-phone.dmp");
    recognizer.addAllphoneSearch(PHONE_SEARCH, phoneticModel);*/
}
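
Once the "test" n-gram search is running, the recognizer reports its result as a hypothesis string (in the demo this is read with `hypothesis.getHypstr()`), and the app still has to decide which commands were spoken. The following is a minimal sketch of that step in plain Java, independent of the PocketSphinx API. The class `CommandMatcher` is a hypothetical name introduced for illustration, and it assumes the hypothesis is a whitespace-separated sequence of dictionary words.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: extracts known commands from a recognition result.
final class CommandMatcher {
    private final Set<String> commands;

    CommandMatcher(List<String> commands) {
        this.commands = new HashSet<>(commands);
    }

    /**
     * Splits the hypothesis on whitespace and returns, in spoken order,
     * only the tokens that are known commands.
     */
    List<String> match(String hypothesis) {
        List<String> hits = new ArrayList<>();
        if (hypothesis == null) {
            return hits; // no result from the recognizer
        }
        for (String token : hypothesis.trim().split("\\s+")) {
            if (commands.contains(token)) {
                hits.add(token);
            }
        }
        return hits;
    }
}
```

Filtering against the command list this way discards any word outside the command set, which is one simple guard against spurious recognitions; it does not improve the recognizer's accuracy itself.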

Conclusion

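After testing PocketSphinx, I found that neither the original English offline recognition demo from GitHub nor the modified Chinese version gives a very good experience. One problem is recognition accuracy; the other is that the recognizer is overly sensitive. If you have time, you can train your own language model with the language model training toolkit CMUCLMTK and your own acoustic model with the acoustic model training tool sphinxtrain to improve the recognition rate.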