Wednesday, April 9, 2014

[Android] Continuous Speech Recognition

This is a tutorial on implementing continuous speech recognition on Android. I am writing it because I had a lot of trouble developing it myself.

What you should know
  • The basics of Java including interfaces and inheritance
  • The basics of Android development
Preface

I am taking a course called "Software Engineering" in which all students have to develop applications for wearable devices. We have a pretty cool idea for an application, but I cannot describe it yet because we are not sure whether we will publish it. Our application should run on smartphones with Android 4.0.4 or higher (the speech recognition itself does not need such a high API level) and on smart glasses as well. For this reason my university ordered a device called the Vuzix M100. I am going to write a separate post about my impressions of this device.

Motivation

Imagine this scenario:
You are leaving the grocery store, wearing your pretty cool smart glasses. The problem is that you want to interact with the device but cannot, because on each finger you carry a bag full of, let's say, beer. What to do now? The answer is continuous speech recognition, meaning a speech recognition service that runs all the time so you do not have to press any button or touch the device. You simply say "Call my wife" and the smart glasses connect directly to the smartphone and call her. Motivated enough?

And now to the actual topic

The implementation we are about to use consists of three types: the interface IVoiceControl, the VoiceRecognitionListener class, and the abstract ListeningActivity class, which implements IVoiceControl. The ListeningActivity contains the main functionality, but we will create IVoiceControl and the VoiceRecognitionListener first.

The VoiceRecognitionListener holds a reference to one IVoiceControl (the running activity) and informs it when a speech command has been recognized. IVoiceControl is defined as follows:
package de.lorisbachert.continousspeechrecognizer;

/**
 * Created by Loris on 26.02.14.
 */
public interface IVoiceControl {
    void processVoiceCommands(String... voiceCommands); // Executed when voice commands were recognized

    void restartListeningService(); // Executed after a voice command was processed to keep the recognition service running
}
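Before moving on to the Android classes, here is a rough plain-Java sketch of how the callbacks are meant to flow: the recognition listener forwards results to the receiver, and the receiver re-arms the recognition after every result or error. None of these names are part of the tutorial project. VoiceControl, RecordingControl and FakeRecognitionListener are hypothetical stand-ins so that the sketch runs without Android.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the IVoiceControl interface (hypothetical, no Android needed).
interface VoiceControl {
    void processVoiceCommands(String... voiceCommands);
    void restartListeningService();
}

// Hypothetical receiver that records what it hears and how often it re-arms.
final class RecordingControl implements VoiceControl {
    final List<String> heard = new ArrayList<>();
    int restarts = 0;

    @Override
    public void processVoiceCommands(String... voiceCommands) {
        if (voiceCommands.length > 0) {
            heard.add(voiceCommands[0]); // keep only the first candidate
        }
        restartListeningService(); // re-arm after every result
    }

    @Override
    public void restartListeningService() {
        restarts++;
    }
}

// Hypothetical stand-in for the recognition listener: it forwards results
// and asks the receiver to restart when a session ends with an error.
final class FakeRecognitionListener {
    private VoiceControl listener;

    void setListener(VoiceControl listener) {
        this.listener = listener;
    }

    void onResults(List<String> matches) {
        if (listener != null) {
            listener.processVoiceCommands(matches.toArray(new String[0]));
        }
    }

    void onError() {
        if (listener != null) {
            listener.restartListeningService();
        }
    }
}
```

Feeding one result and one error into FakeRecognitionListener leaves the receiver with one recorded command and two restarts, which is the "never stop listening" loop the real classes are built around.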
The VoiceRecognitionListener contains the following code:
package de.lorisbachert.continousspeechrecognizer;

import java.util.ArrayList;

import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.SpeechRecognizer;

public class VoiceRecognitionListener implements RecognitionListener {

    private static VoiceRecognitionListener instance = null;

    // The currently running activity; set it via setListener() before listening starts
    private IVoiceControl listener;

    public static VoiceRecognitionListener getInstance() {
        if (instance == null) {
            instance = new VoiceRecognitionListener();
        }
        return instance;
    }

    private VoiceRecognitionListener() { }

    public void setListener(IVoiceControl listener) {
        this.listener = listener;
    }

    public void processVoiceCommands(String... voiceCommands) {
        if (listener != null) {
            listener.processVoiceCommands(voiceCommands);
        }
    }

    // Executed when the recognizer delivers its result candidates
    @Override
    public void onResults(Bundle data) {
        ArrayList<String> matches = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        if (matches == null) {
            matches = new ArrayList<String>(); // still notify the listener so it can restart
        }
        for (String command : matches) {
            System.out.println(command);
        }
        processVoiceCommands(matches.toArray(new String[matches.size()]));
    }

    // User starts speaking
    @Override
    public void onBeginningOfSpeech() {
        System.out.println("Starting to listen");
    }

    @Override
    public void onBufferReceived(byte[] buffer) { }

    // User finished speaking
    @Override
    public void onEndOfSpeech() {
        System.out.println("Waiting for result...");
    }

    // On any error (for example, the user said nothing) the service is restarted
    @Override
    public void onError(int error) {
        if (listener != null) {
            listener.restartListeningService();
        }
    }

    @Override
    public void onEvent(int eventType, Bundle params) { }

    @Override
    public void onPartialResults(Bundle partialResults) { }

    @Override
    public void onReadyForSpeech(Bundle params) { }

    @Override
    public void onRmsChanged(float rmsdB) { }
}
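The onError method above restarts the service no matter which error occurred. If you want to be more selective, a small decision helper could look like the following sketch. RestartPolicy and its Action enum are hypothetical names that are not part of the project. The numeric values are copied from the ERROR_* constants of android.speech.SpeechRecognizer as I know them, so that the sketch runs without Android; please verify them against your SDK and use the real constants in an app. Which errors deserve an immediate restart is my assumption, not something the API prescribes.

```java
// Hypothetical helper: decides what to do after SpeechRecognizer reports an error.
final class RestartPolicy {
    // Values mirror android.speech.SpeechRecognizer.ERROR_* (copied, assumption: verify in your SDK).
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_SERVER = 4;
    static final int ERROR_CLIENT = 5;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    enum Action { RESTART_NOW, RESTART_LATER, GIVE_UP }

    static Action decide(int error) {
        switch (error) {
            case ERROR_SPEECH_TIMEOUT: // nobody spoke
            case ERROR_NO_MATCH:       // speech was not understood
                return Action.RESTART_NOW;
            case ERROR_INSUFFICIENT_PERMISSIONS:
                return Action.GIVE_UP; // restarting cannot fix a missing permission
            default:
                return Action.RESTART_LATER; // network, server, busy: wait a moment first
        }
    }
}
```

In onError you could then call RestartPolicy.decide(error) and only invoke listener.restartListeningService() directly for RESTART_NOW, delaying or skipping the restart otherwise.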
Now that we have implemented the VoiceRecognitionListener, we finish the service by implementing the ListeningActivity:
package de.lorisbachert.continousspeechrecognizer;

import android.app.Activity;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.util.Log;
import android.widget.Toast;

public abstract class ListeningActivity extends Activity implements IVoiceControl {

    protected SpeechRecognizer sr;
    protected Context context; // must be set by the subclass before startListening() is called

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
    }

    // Creates the recognizer if necessary and starts a recognition session
    protected void startListening() {
        try {
            initSpeech();
            if (sr == null) {
                return; // speech recognition is not available on this device
            }
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);
            intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getPackageName());
            sr.startListening(intent);
        } catch (Exception ex) {
            Log.d("SpeechRecognitionService",
                    "An error occurred while initializing the SpeechRecognizer", ex);
        }
    }

    // Stops the current session and releases the recognizer
    protected void stopListening() {
        if (sr != null) {
            sr.stopListening();
            sr.cancel();
            sr.destroy();
        }
        sr = null;
    }

    protected void initSpeech() {
        if (sr == null) {
            // Check availability first; finish() releases the recognizer, so return right after it
            if (!SpeechRecognizer.isRecognitionAvailable(context)) {
                Toast.makeText(context, "Speech Recognition is not available",
                        Toast.LENGTH_LONG).show();
                finish();
                return;
            }
            sr = SpeechRecognizer.createSpeechRecognizer(this);
            sr.setRecognitionListener(VoiceRecognitionListener.getInstance());
        }
    }

    @Override
    public void finish() {
        stopListening();
        super.finish();
    }

    @Override
    protected void onStop() {
        stopListening();
        super.onStop();
    }

    @Override
    protected void onDestroy() {
        stopListening();
        super.onDestroy();
    }

    @Override
    protected void onPause() {
        stopListening();
        super.onPause();
    }

    // Abstract, so inheriting classes must implement it. Put the code here that should run once a command was recognized
    @Override
    public abstract void processVoiceCommands(String... voiceCommands);

    @Override
    public void restartListeningService() {
        stopListening();
        startListening();
    }
}

That is all the magic. Now we just add these two permissions to our AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />
To sum everything up, here is a short usage example:
package de.lorisbachert.continousspeechrecognizer;

import android.graphics.Color;
import android.os.Bundle;
import android.view.Gravity;
import android.widget.LinearLayout;
import android.widget.TextView;

public class MainActivity extends ListeningActivity {

    private LinearLayout content;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        content = (LinearLayout) findViewById(R.id.commands);

        // The following 3 lines are needed in every onCreate method of a ListeningActivity
        context = getApplicationContext(); // Needs to be set
        VoiceRecognitionListener.getInstance().setListener(this); // Here we set the current listener
        startListening(); // starts listening
    }

    // Here is where the magic happens
    @Override
    public void processVoiceCommands(String... voiceCommands) {
        content.removeAllViews();
        for (String command : voiceCommands) {
            TextView txt = new TextView(getApplicationContext());
            txt.setText(command);
            txt.setTextSize(20);
            txt.setTextColor(Color.BLACK);
            txt.setGravity(Gravity.CENTER);
            content.addView(txt);
        }
        restartListeningService();
    }
}
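The example above only displays the recognized candidates. To actually react to something like "Call my wife" from the motivation, processVoiceCommands could hand the candidates to a small matcher. The following is a minimal plain-Java sketch under my own assumptions: CommandMatcher, register and match are hypothetical names, and matching by lower-cased substring is just one simple strategy, not something the speech API provides.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps spoken phrases to action names.
final class CommandMatcher {
    // Insertion order is kept, so earlier registrations win on ties.
    private final Map<String, String> phraseToAction = new LinkedHashMap<>();

    void register(String phrase, String action) {
        phraseToAction.put(phrase.toLowerCase(Locale.ROOT), action);
    }

    // Returns the action of the first candidate that contains a registered
    // phrase, or null if no candidate matches.
    String match(String... candidates) {
        for (String candidate : candidates) {
            String normalized = candidate.toLowerCase(Locale.ROOT).trim();
            for (Map.Entry<String, String> entry : phraseToAction.entrySet()) {
                if (normalized.contains(entry.getKey())) {
                    return entry.getValue();
                }
            }
        }
        return null;
    }
}
```

Inside processVoiceCommands you would call match(voiceCommands), act on the returned action name if it is not null, and then call restartListeningService() as before.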
That is it. Now we can test our little speech recognizer and have fun using it.
If you have any trouble following my tutorial, or have any suggestions, please let me know in the comments :)

You can download a demo project which can be executed using this link.

Comments:

  1. Did you get the continuous recognition working on the Vuzix M100? I managed to implement continuous speech recognition on normal smartphones, but it seems like it isn't supported yet on the M100...

    1. Hi Michiel, I am also having the same problem. I am not able to implement the voice recognition functionality for continuous speech with the Vuzix M100 wra app.
      If you find a working solution, please share it with me.
      Thanks

    2. Hi Michiel, can you share your Android code?

  2. Same here, I was able to get it to work on a smartphone, but the M100 gives an error: could not connect with the Google speech recognition service. Michiel, were you able to get this to work?

  3. Hi Abhay,

    if you send me an email at development@loris-bachert.com explaining your problem in detail, I will try to help you get it solved.

    1. Hi Loris,
      Actually, voice recognition works on a smartphone but it is not working on the M100. What could be the problem?

  4. Wouldn't it be better to make a ListeningService instead of a ListeningActivity?

  5. Works on my Motorola XT912 Droid Razr api 4.12
    BUT VERY SLOWLY

  6. Thank you, this code is very useful to me. Thank you again.