Just another blog about programming, devices and technology-related stuff: [Android] Continuous Speech Recognition

This is a tutorial on implementing continuous speech recognition in Android. I am writing it because I had a lot of trouble developing it myself.

What you should know

The basics of Java, including interfaces and inheritance
The basics of Android development, in particular what Activities are

Preface

I am taking a course called "Software Engineering" in which all students have to develop applications for wearable devices. We have a pretty cool idea for our application, but I cannot describe it yet because we have not decided whether to publish it. Our application should run on smartphones with Android 4.0.4 or higher (the speech recognition itself does not need such a high API level) and on smart glasses as well. For this reason my university ordered a device called the Vuzix M100. I am going to write a separate post about my impressions of this device.

Motivation

Imagine this scenario:

You are leaving the grocery store, wearing your pretty cool smart glasses. The problem is that you want to interact with the device but cannot, because on each finger you carry a bag full of, let's say, beer. What to do now? The answer is continuous speech recognition, meaning a speech recognition service that runs all the time, so you do not have to press any button or touch the device. You simply say "Call my wife" and the smart glasses connect directly to the smartphone and call her. Motivated enough?

And now to the actual topic

The implementation we are about to use consists of three types: the interface IVoiceControl, the class VoiceRecognitionListener, and the abstract class ListeningActivity, which implements IVoiceControl. ListeningActivity contains most of the functionality, but we will create IVoiceControl and VoiceRecognitionListener first.

The VoiceRecognitionListener holds a reference to a single IVoiceControl (the currently running activity) and notifies it whenever a speech command has been recognized. The IVoiceControl interface looks like this:

package de.lorisbachert.continousspeechrecognizer;

/**
 * Created by Loris on 26.02.14.
 */
public interface IVoiceControl {
    // Called when one or more voice commands were recognized
    void processVoiceCommands(String... voiceCommands);

    // Called after a voice command was processed to keep the recognition service running
    void restartListeningService();
}

The VoiceRecognitionListener contains the following code:

package de.lorisbachert.continousspeechrecognizer;

import java.util.ArrayList;

import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.SpeechRecognizer;

public class VoiceRecognitionListener implements RecognitionListener {

    private static VoiceRecognitionListener instance = null;

    private IVoiceControl listener; // This is your running activity (we will initialize it later)

    public static VoiceRecognitionListener getInstance() {
        if (instance == null) {
            instance = new VoiceRecognitionListener();
        }
        return instance;
    }

    private VoiceRecognitionListener() { }

    public void setListener(IVoiceControl listener) {
        this.listener = listener;
    }

    public void processVoiceCommands(String... voiceCommands) {
        if (listener != null) {
            listener.processVoiceCommands(voiceCommands);
        }
    }

    // This method will be executed when voice commands were found
    @Override
    public void onResults(Bundle data) {
        ArrayList<String> matches = data.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        if (matches == null || matches.isEmpty()) {
            // Nothing usable was recognized, so just keep listening
            if (listener != null) {
                listener.restartListeningService();
            }
            return;
        }
        for (String command : matches) {
            System.out.println(command);
        }
        processVoiceCommands(matches.toArray(new String[matches.size()]));
    }

    // User starts speaking
    @Override
    public void onBeginningOfSpeech() {
        System.out.println("Starting to listen");
    }

    @Override
    public void onBufferReceived(byte[] buffer) { }

    // User finished speaking
    @Override
    public void onEndOfSpeech() {
        System.out.println("Waiting for result...");
    }

    // On any error (for example when the user said nothing) the service will be restarted
    @Override
    public void onError(int error) {
        if (listener != null) {
            listener.restartListeningService();
        }
    }

    @Override
    public void onEvent(int eventType, Bundle params) { }

    @Override
    public void onPartialResults(Bundle partialResults) { }

    @Override
    public void onReadyForSpeech(Bundle params) { }

    @Override
    public void onRmsChanged(float rmsdB) { }
}
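
The onError callback above restarts recognition for every error code. That keeps the recognizer alive after silence, but a permanent failure such as a missing RECORD_AUDIO permission would then restart in an endless loop. A minimal plain-Java sketch of a restart policy could look like the following. RecognitionErrorPolicy and shouldRestart are hypothetical names, and the constants are copied here only so the snippet runs outside Android; in a real project you would use the SpeechRecognizer.ERROR_* constants directly.

```java
// Hypothetical helper, not part of the Android SDK or of the original tutorial.
final class RecognitionErrorPolicy {

    // Assumption: these values mirror SpeechRecognizer.ERROR_SPEECH_TIMEOUT,
    // ERROR_NO_MATCH and ERROR_INSUFFICIENT_PERMISSIONS. They are duplicated
    // only to keep this sketch runnable without the Android framework.
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    private RecognitionErrorPolicy() { }

    // Returns true when restarting the recognizer is worth trying.
    static boolean shouldRestart(int error) {
        switch (error) {
            case ERROR_INSUFFICIENT_PERMISSIONS:
                // Restarting cannot succeed until the user grants the permission.
                return false;
            default:
                // Silence, no match, busy recognizer, network hiccups: try again.
                return true;
        }
    }
}
```

Inside onError you could then call listener.restartListeningService() only when shouldRestart(error) returns true, and show a message to the user otherwise.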

Now that we implemented the VoiceRecognitionListener we will finish the service by implementing the ListeningActivity:

package de.lorisbachert.continousspeechrecognizer;

import android.app.Activity;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.util.Log;
import android.widget.Toast;

public abstract class ListeningActivity extends Activity implements IVoiceControl {

    private static final String TAG = "SpeechRecognitionService";

    protected SpeechRecognizer sr;
    protected Context context;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
    }

    // Starts the recognizer
    protected void startListening() {
        try {
            initSpeech();
            if (sr == null) {
                return; // speech recognition is not available on this device
            }
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);
            intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getPackageName());
            sr.startListening(intent);
        } catch (Exception ex) {
            Log.d(TAG, "An error occurred while initializing the SpeechRecognizer", ex);
        }
    }

    // Stops the recognizer and releases its resources
    protected void stopListening() {
        if (sr != null) {
            sr.stopListening();
            sr.cancel();
            sr.destroy();
        }
        sr = null;
    }

    // Creates the recognizer once, or closes the activity if recognition is unavailable
    protected void initSpeech() {
        if (sr != null) {
            return;
        }
        if (!SpeechRecognizer.isRecognitionAvailable(this)) {
            Toast.makeText(this, "Speech Recognition is not available",
                    Toast.LENGTH_LONG).show();
            finish();
            return;
        }
        sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(VoiceRecognitionListener.getInstance());
    }

    @Override
    public void finish() {
        stopListening();
        super.finish();
    }

    @Override
    protected void onPause() {
        stopListening();
        super.onPause();
    }

    @Override
    protected void onStop() {
        stopListening();
        super.onStop();
    }

    @Override
    protected void onDestroy() {
        stopListening();
        super.onDestroy();
    }

    // Abstract, so inheriting classes must implement it. Put the code here
    // that should be executed once a command was found.
    @Override
    public abstract void processVoiceCommands(String... voiceCommands);

    @Override
    public void restartListeningService() {
        stopListening();
        startListening();
    }
}

That is all the magic. Now we just add these two permissions to our AndroidManifest.xml (on Android 6.0 and higher, RECORD_AUDIO must additionally be granted by the user at runtime):

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />

To sum everything up I wrote a short example usage:

package de.lorisbachert.continousspeechrecognizer;

import android.graphics.Color;
import android.os.Bundle;
import android.view.Gravity;
import android.widget.LinearLayout;
import android.widget.TextView;

public class MainActivity extends ListeningActivity {

    private LinearLayout content;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        content = (LinearLayout) findViewById(R.id.commands);

        // The following 3 lines are needed in every onCreate method of a ListeningActivity
        context = getApplicationContext(); // Needs to be set
        VoiceRecognitionListener.getInstance().setListener(this); // Here we set the current listener
        startListening(); // starts listening
    }

    // Here is where the magic happens
    @Override
    public void processVoiceCommands(String... voiceCommands) {
        content.removeAllViews();
        for (String command : voiceCommands) {
            TextView txt = new TextView(this);
            txt.setText(command);
            txt.setTextSize(20);
            txt.setTextColor(Color.BLACK);
            txt.setGravity(Gravity.CENTER);
            content.addView(txt);
        }
        restartListeningService();
    }
}
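
The example above only displays what was recognized. processVoiceCommands receives several candidate transcriptions of the same utterance, so to react to a command like "Call my wife" you still have to decide which candidate, if any, matches something your app understands. A minimal plain-Java sketch of that step follows; CommandMatcher and match are hypothetical names that are not part of the tutorial code, and simple substring matching is just one possible strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: maps recognizer candidates to a known command.
final class CommandMatcher {

    private final List<String> knownCommands = new ArrayList<>();

    CommandMatcher(String... commands) {
        for (String command : commands) {
            // Normalize once so matching is case-insensitive
            knownCommands.add(command.toLowerCase(Locale.ROOT).trim());
        }
    }

    // Returns the first known command contained in any candidate,
    // checking candidates in the order given, or null if none matches.
    String match(String... candidates) {
        for (String candidate : candidates) {
            if (candidate == null) {
                continue;
            }
            String normalized = candidate.toLowerCase(Locale.ROOT).trim();
            for (String command : knownCommands) {
                if (normalized.contains(command)) {
                    return command;
                }
            }
        }
        return null;
    }
}
```

Inside processVoiceCommands you could create the matcher once with your command phrases, call match(voiceCommands), and trigger the corresponding action when the result is not null.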

That is it. Now we can test our little speech recognizer and have fun using it.
If you have any trouble following my tutorial or any suggestions please let me know in the comments :)

You can download a demo project which can be executed using this link.

Comments:

Unknown, 30 July 2014 at 00:15
Did you get the continuous recognition working on the Vuzix M100? I managed to implement continuous speech recognition on normal smartphones, but it seems like it isn't supported yet on the M100...

Abhay, 10 June 2015 at 08:47
Same here, I was able to get it to work on a smartphone, but the M100 gives an error: could not connect with Google speech recognition service. Michiel, were you able to get this to work?

Unknown, 14 June 2015 at 09:02
Hi Abhay,

if you send me an email at development@loris-bachert.com explaining your problem in detail, I will try to help you get that problem solved.

Unknown, 29 July 2015 at 07:54
Wouldn't it be better to make a ListeningService instead of a ListeningActivity?

JD from Oaktown, 8 May 2016 at 17:59
works on my motorola XT912 Droid Razr api 4.12
BUT VERY SLOWLY

Unknown, 22 May 2016 at 07:57
Thank you, this code is very useful to me. Thank you again.

Anonymous, 5 August 2016 at 09:11
Good job man!

Wednesday, 9 April 2014
