Speech Recognition: javax.speech.recognition

Hello World!


The following example shows a simple application that uses speech recognition. For this application we need to define a grammar of everything the user can say, and we need to write the Java software that performs the recognition task.
A grammar is provided by an application to a speech recognizer to define the words that a user can say, and the patterns in which those words can be spoken. In this example, we define a grammar that allows a user to say "Hello World" or a variant. The grammar is defined using the Java Speech Grammar Format. This format is documented in the Java Speech Grammar Format Specification.


Place this grammar into a file.


grammar javax.speech.helloworld;

public <sentence> = hello world | good morning |
                    hello mighty computer;
This trivial grammar has a single public rule called "sentence". A rule defines what may be spoken by a user. A public rule is one that may be activated for recognition.
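
To make concrete what this rule accepts, here is a minimal sketch in plain Java. It is not part of javax.speech and is not how a recognizer matches audio; the class name SentenceRuleSketch and the method matches are hypothetical, introduced only for illustration. It lists the three alternatives of the <sentence> rule and checks whether a typed word sequence equals one of them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, NOT a javax.speech class: it only enumerates the
// alternatives of the <sentence> rule to show which phrases the grammar allows.
public class SentenceRuleSketch {

    // The three alternatives separated by '|' in the grammar above.
    private static final Set<String> ALTERNATIVES = new HashSet<>(Arrays.asList(
            "hello world",
            "good morning",
            "hello mighty computer"));

    // Returns true if the text is exactly one alternative of the rule,
    // ignoring case and extra whitespace.
    public static boolean matches(String utterance) {
        String normalized = utterance.trim()
                .toLowerCase(Locale.ENGLISH)
                .replaceAll("\\s+", " ");
        return ALTERNATIVES.contains(normalized);
    }

    public static void main(String[] args) {
        System.out.println(matches("hello world"));     // prints true
        System.out.println(matches("Good   morning"));  // prints true
        System.out.println(matches("hello computer"));  // prints false
    }
}
```

Note that "hello computer" is rejected: the rule lists complete phrases, so only the exact word sequences written in the grammar can be recognized.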

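The following code shows how to create a recognizer, load the grammar, and then wait for the user to say something that matches the grammar. The path to the grammar file is passed as the first command-line argument. When the recognizer gets a match, the application deallocates the engine and exits.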


import javax.speech.*;
import javax.speech.recognition.*;
import java.io.FileReader;
import java.util.Locale;

public class HelloWorld extends ResultAdapter {
 static Recognizer rec;

 // Receives RESULT_ACCEPTED event: print it, clean up, exit
 public void resultAccepted(ResultEvent e) {
  Result r = (Result)(e.getSource());
  ResultToken tokens[] = r.getBestTokens();

  for (int i = 0; i < tokens.length; i++)
   System.out.print(tokens[i].getSpokenText() + " ");
  System.out.println();

  // Deallocate the recognizer and exit
  rec.deallocate();
  System.exit(0);
 }

 public static void main(String args[]) {
  try {
   // Create a recognizer that supports English.
   rec = Central.createRecognizer(
       new EngineModeDesc(Locale.ENGLISH));
 
   // Start up the recognizer
   rec.allocate();
 
   // Load the grammar from a file, and enable it
   FileReader reader = new FileReader(args[0]);
   RuleGrammar gram = rec.loadJSGF(reader);
   gram.setEnabled(true);

   // Add the listener to get results
   rec.addResultListener(new HelloWorld());

   // Commit the grammar
   rec.commitChanges();

   // Request focus and start listening
   rec.requestFocus();
   rec.resume();
  } catch (Exception e) {
   e.printStackTrace();
  }
 }
}



This example illustrates the basic steps which all speech recognition applications must perform. Let's examine each step in detail.

Create: The Central class of the javax.speech package is used to obtain a speech recognizer by calling the createRecognizer method. The EngineModeDesc argument provides the information needed to locate an appropriate recognizer. In this example we request a recognizer that understands English (since the grammar is written for English).

Allocate: The allocate method requests that the Recognizer allocate all necessary resources.

Load and enable grammars: The loadJSGF method reads in a JSGF document from a reader created for the file that contains the javax.speech.helloworld grammar. (Alternatively, the loadJSGF method can load a grammar from a URL.) Next, the grammar is enabled. Once the recognizer receives focus (see below), an enabled grammar is activated for recognition: that is, the recognizer compares incoming audio to the active grammars and listens for speech that matches those grammars.

Attach a ResultListener: The HelloWorld class extends the ResultAdapter class, which is a trivial implementation of the ResultListener interface. An instance of the HelloWorld class is attached to the Recognizer to receive result events. These events indicate progress as the recognition of speech takes place. In this implementation, we process the RESULT_ACCEPTED event, which is issued when the recognizer completes recognition of input speech that matches an active grammar.

Commit changes: Any changes to grammars or to their enabled status must be committed to take effect (this includes the creation of a new grammar).

Request focus and resume: For recognition of the grammar to occur, the recognizer must be in the RESUMED state and must have the speech focus. The requestFocus and resume methods achieve this.

Process result: Once the main method completes, the application waits until the user speaks. When the user says something that matches the loaded grammar, the recognizer issues a RESULT_ACCEPTED event to the listener we attached to the recognizer. The source of this event is a Result object that contains information about what the recognizer heard. The getBestTokens method returns an array of ResultTokens, each of which represents a single spoken word. These words are printed.

Deallocate: Before exiting we call deallocate to free up the recognizer's resources.
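
The "Attach a ResultListener" step relies on the adapter idiom: a listener interface declares several callbacks, an adapter implements all of them with empty bodies, and the application overrides only the one it cares about. The sketch below shows that idiom in plain Java so it can run without a speech engine. All the types in it (DemoResultListener, DemoResultAdapter, CollectingListener) are hypothetical stand-ins, not the JSAPI classes, and the callbacks take a plain String array instead of a ResultEvent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the JSAPI types.
public class AdapterSketch {

    // Plays the role of ResultListener: one callback per kind of result event.
    public interface DemoResultListener {
        void resultCreated(String[] tokens);
        void resultAccepted(String[] tokens);
        void resultRejected(String[] tokens);
    }

    // Plays the role of ResultAdapter: every callback is implemented as a no-op.
    public static class DemoResultAdapter implements DemoResultListener {
        public void resultCreated(String[] tokens) { }
        public void resultAccepted(String[] tokens) { }
        public void resultRejected(String[] tokens) { }
    }

    // Plays the role of HelloWorld: overrides only the "accepted" callback
    // and joins the tokens into one phrase.
    public static class CollectingListener extends DemoResultAdapter {
        public final List<String> accepted = new ArrayList<>();

        @Override
        public void resultAccepted(String[] tokens) {
            accepted.add(String.join(" ", tokens));
        }
    }

    public static void main(String[] args) {
        CollectingListener listener = new CollectingListener();
        listener.resultCreated(new String[] {"hello"});           // ignored by the adapter
        listener.resultAccepted(new String[] {"hello", "world"}); // handled by the override
        System.out.println(listener.accepted);                    // prints [hello world]
    }
}
```

Extending an adapter keeps the application class short, at the cost of using up its single superclass; implementing the listener interface directly is the alternative when the class already extends something else.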


Credits : www.ling.helsinki.fi
