Getting Started with Speech Recognition

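While exploring text-to-speech, I ran into the opposite need: converting speech to text. Plenty of articles cover the former, but far fewer cover the latter, so I decided to write a basic article sharing my experience with it.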

Solution Overview

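First, the application needs a reference to the System.Speech assembly, which is located in the GAC. This assembly contains all the namespaces and classes needed for speech recognition; the ones used here live in System.Speech.Recognition.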

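Before the SpeechRecognitionEngine can be used, it has to be created, wired to its event handlers, given a grammar, and connected to an audio input. The following C# code shows this setup: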


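        // Create the engine for the preferred culture (German in this example).
        SpeechRecognitionEngine speechRecognitionEngine = null;
        speechRecognitionEngine = createSpeechEngine("de-DE");
        // Both events use the generic EventHandler<T> delegate.
        speechRecognitionEngine.AudioLevelUpdated += new EventHandler<AudioLevelUpdatedEventArgs>(engine_AudioLevelUpdated);
        speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(engine_SpeechRecognized);
        // Load the grammar, select the default microphone, then listen continuously.
        loadGrammarAndCommands();
        speechRecognitionEngine.SetInputToDefaultAudioDevice();
        speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);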

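Next, a closer look at the createSpeechEngine method. It selects the language the speech engine uses. If no recognizer for the requested language is installed, the default language (the Windows desktop language) is used instead.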


        private SpeechRecognitionEngine createSpeechEngine(string preferredCulture)
        {
            // Look for an installed recognizer whose culture matches, e.g. "de-DE".
            foreach (RecognizerInfo config in SpeechRecognitionEngine.InstalledRecognizers())
            {
                if (config.Culture.ToString() == preferredCulture)
                {
                    speechRecognitionEngine = new SpeechRecognitionEngine(config);
                    break;
                }
            }
            // No match: fall back to the system default recognizer and report its actual culture.
            if (speechRecognitionEngine == null)
            {
                speechRecognitionEngine = new SpeechRecognitionEngine();
                MessageBox.Show("The desired culture is not installed on this machine, the speech-engine will continue using " + speechRecognitionEngine.RecognizerInfo.Culture.ToString() + " as the default culture.", "Culture " + preferredCulture + " not found!");
            }
            return speechRecognitionEngine;
        }
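
To see which culture strings createSpeechEngine can actually match on a given machine, it may help to list the installed recognizers first. The following is a minimal sketch, assuming a console project that references System.Speech; the class name ListRecognizers is only illustrative and not part of the original sample.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical helper program, not part of the original article's code.
class ListRecognizers
{
    static void Main()
    {
        // Each RecognizerInfo describes one installed speech recognizer.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            // Culture.Name is the string to pass as preferredCulture, e.g. "de-DE".
            Console.WriteLine("{0}  ({1})", info.Culture.Name, info.Description);
        }
    }
}
```

If the list does not contain the culture you want, createSpeechEngine takes its fallback path.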

Next, the grammar for the SpeechRecognitionEngine has to be set up. This example reads a custom text file of key-value pairs, and each entry is wrapped in the custom class SpeechToText.Word.


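        namespace SpeechToText
        {
            public class Word
            {
                // The phrase the recognizer listens for.
                public string Text { get; set; }
                // The text to return, or the command to run, when the phrase is recognized.
                public string AttachedText { get; set; }
                // True if AttachedText should be started as a process.
                public bool IsShellCommand { get; set; }
            }
        }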

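The following method builds the Choices used by the Grammar. Inside the foreach loop, a Word is created for each line and stored in a lookup List<Word>, and the parsed phrase is added to the Choices object. Finally, a GrammarBuilder builds the Grammar from the Choices, and the SpeechRecognitionEngine loads it synchronously.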


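        private void loadGrammarAndCommands()
        {
            Choices texts = new Choices();
            // example.txt is expected in the current working directory.
            string[] lines = File.ReadAllLines(Path.Combine(Environment.CurrentDirectory, "example.txt"));
            foreach (string line in lines)
            {
                // Skip comment lines (starting with "--") and empty lines.
                if (line.StartsWith("--") || line == String.Empty)
                    continue;
                // Each entry has the form Text|AttachedText|IsShellCommand.
                var parts = line.Split(new char[] { '|' });
                words.Add(new Word() { Text = parts[0], AttachedText = parts[1], IsShellCommand = (parts[2] == "true") });
                texts.Add(parts[0]);
            }
            // Build the grammar from the collected phrases and load it synchronously.
            // Exceptions propagate to the caller; the old "throw ex" only reset the stack trace.
            Grammar wordsList = new Grammar(new GrammarBuilder(texts));
            speechRecognitionEngine.LoadGrammar(wordsList);
        }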
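
The article does not show example.txt itself, so the following is only a guess at what a matching file could look like, inferred from the parsing code: three fields separated by `|`, comment lines starting with `--`, and the third field compared against the exact string "true". The phrases and commands are hypothetical.

```text
-- Lines starting with "--" are skipped
-- Format: Text|AttachedText|IsShellCommand
hello|Hello, nice to meet you!|false
notepad|notepad.exe|true
```

Since the engine in this article is created for "de-DE", the phrases in the first column would presumably need to be in the recognizer's language to be recognized reliably.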

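To start the SpeechRecognitionEngine, call RecognizeAsync(RecognizeMode.Multiple). In this mode the recognizer keeps performing asynchronous recognition operations until RecognizeAsyncCancel() or RecognizeAsyncStop() is called. To retrieve the results of the asynchronous recognition operations, attach an event handler to the recognizer's SpeechRecognized event.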


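        // SpeechRecognized is raised each time a phrase from the loaded grammar is recognized.
        speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(engine_SpeechRecognized);
        speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);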
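
The handlers engine_SpeechRecognized and engine_AudioLevelUpdated are referenced but not shown in this article. The following stand-alone console sketch shows one way such handlers could look. It assumes a reference to System.Speech, a working default microphone, and an installed recognizer for the current system culture. The class name, handler names, and two-word vocabulary are illustrative, not the author's code.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical demo program; names and vocabulary are assumptions.
class RecognitionDemo
{
    static void Main()
    {
        // The parameterless constructor uses the system default recognizer.
        using (var engine = new SpeechRecognitionEngine())
        {
            // A tiny fixed vocabulary instead of the example.txt file.
            var grammar = new Grammar(new GrammarBuilder(new Choices("hello", "notepad")));
            engine.LoadGrammar(grammar);

            engine.SpeechRecognized += OnSpeechRecognized;
            engine.AudioLevelUpdated += OnAudioLevelUpdated;

            engine.SetInputToDefaultAudioDevice();
            // Multiple: keep recognizing until explicitly stopped.
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to stop.");
            Console.ReadLine();

            // Cancel immediately, without waiting for a pending recognition.
            engine.RecognizeAsyncCancel();
        }
    }

    // Called once per recognized phrase.
    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Console.WriteLine("Recognized: {0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);
    }

    // Called whenever the input level changes; useful for driving a level meter.
    static void OnAudioLevelUpdated(object sender, AudioLevelUpdatedEventArgs e)
    {
        Console.WriteLine("Audio level: {0}", e.AudioLevel);
    }
}
```

In the article's application, the SpeechRecognized handler would presumably pass e.Result.Text to getKnownTextOrExecute rather than printing it.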

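When the recognizer recognizes one of the predefined words, the application decides whether to return the associated text or to execute a shell command. The following function does this: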


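        private string getKnownTextOrExecute(string command)
        {
            try
            {
                // Look up the recognized phrase; First throws if it is not in the list.
                var cmd = words.First(c => c.Text == command);
                if (cmd.IsShellCommand)
                {
                    // Start the program or document named by the attached text.
                    Process proc = new Process();
                    proc.EnableRaisingEvents = false;
                    proc.StartInfo.FileName = cmd.AttachedText;
                    proc.Start();
                    return "you just started : " + cmd.AttachedText;
                }
                else
                {
                    return cmd.AttachedText;
                }
            }
            catch (Exception)
            {
                // Unknown phrase or failed process start: return the recognized text unchanged.
                return command;
            }
        }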
