Speech to Text API

Microsoft Speech Platform's Speech Recognition API is built into Ozeki 10 and is available as a connection. It lets you control other Ozeki 10 connections by voice: recognised speech is converted into text, which can be routed to other connections.

Figure 1 - How speech to text works with the Speech Recognition API

The Speech Recognition API is integrated into the Ozeki 10 browser GUI. All modern web browsers can run JavaScript code, and this code can connect to your microphone and detect your voice. Ozeki 10 distinguishes between browser clients, so you can use multiple microphones from multiple locations if you wish.
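As a rough illustration of how browser-side JavaScript can listen to the microphone, here is a minimal sketch built on the standard Web Speech API. It is not Ozeki's own code. The function names `findSpeechRecognition` and `startListening`, and the `lang` and `onText` parameters, are assumptions made for this example.

```javascript
// Returns the SpeechRecognition constructor exposed by the given global
// object, or null when the browser does not provide one.
// Chrome exposes it under the prefixed name webkitSpeechRecognition.
function findSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null;
}

// Starts continuous recognition and passes each final transcript to onText.
// `lang` and `onText` are illustrative parameters, not Ozeki settings.
function startListening(globalObject, lang, onText) {
  const Recognition = findSpeechRecognition(globalObject);
  if (Recognition === null) {
    return null; // this browser has no speech recognition support
  }
  const recognizer = new Recognition();
  recognizer.lang = lang;
  recognizer.continuous = true;      // keep listening after each phrase
  recognizer.interimResults = false; // report only final results
  recognizer.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) {
        onText(event.results[i][0].transcript.trim());
      }
    }
  };
  recognizer.start(); // the browser asks for microphone permission here
  return recognizer;
}

// In a browser page:
// startListening(window, 'en-US', (text) => console.log(text));
```

Passing the global object in as a parameter keeps the feature check separate from the page, so the same function can report "not supported" in browsers that lack the API.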

Follow these steps

STEP 1: Start Ozeki 10's Control Panel
STEP 2: Create a Speech to Text connection
STEP 3: Add a word set to the dictionary

Prerequisites

  • A microphone to detect voice
  • Ozeki 10 installed on your computer
  • A web browser (Google Chrome works best)

When a JavaScript API is integrated, an icon for it appears in the bottom right corner of the screen. Figure 2 shows some JavaScript-compatible devices. Use Google Chrome for the best functionality and experience.

Figure 2 - This is how the JavaScript APIs are shown on the GUI

STEP 1: Start Ozeki 10's Control Panel

First, open your device's web browser. Google Chrome is strongly recommended for this tutorial.
Log in to your Ozeki browser GUI and start 'Control Panel' from the start menu (Figure 3).

The 'Control Panel' is the most important application in Ozeki 10. It manages connections and routes messages between them when needed. Many connections are detected automatically, and others can be added manually.

Figure 3 - Select 'Control Panel' from the desktop

STEP 2: Create a Speech to Text connection

This step shows how to add a Speech to Text browser connection.

In the 'Control Panel', click 'Create New Connection' and select the 'Audio' icon (Figure 4). This option contains all audio connections (e.g. sound recorder, speaker, speech to text, text to speech).

Figure 4 - Click 'Create New Connection' and select the 'Audio' icon

On the next selection window, click 'Speech to Text' (Figure 5).

Figure 5 - Select the icon called 'Speech to Text'

Now choose between the browser and server options. Select 'Browser' (Figure 6), since the recognition will run as JavaScript code in your browser. Google Chrome works best with the Speech Recognition API.

Figure 6 - Select the 'Browser' option

Disable system tray notifications if you do not want popups. Otherwise, every detected word will appear in a yellow popup in the bottom right corner. You can also disable the notifications later by reconfiguring the Speech to Text connection details. When you have finished the configuration, click 'Ok' (Figure 7).

Figure 7 - Set up the Speech to Text connection and click 'Ok'

STEP 3: Add a word set to the dictionary

Before you can turn on the connection, you must add some words or default word sets.

You can add detectable words by clicking 'Details' in the 'Control Panel' (Figure 8).

Figure 8 - Click on the 'Details' of the connection

Now you will see how to add a word set of numbers.

Click on the big blue 'Create new Detactable word' button, then on the 'Word set' icon (Figure 9).

Figure 9 - Click 'Create new Detactable word' and select 'Word set'

Click on any desired set of words. If you would like to follow this example, select 'Numbers' (Figure 10).

Figure 10 - Select 'Numbers' from the available default word sets
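
To make the idea of a word set concrete, the sketch below filters a recognised sentence down to the words that belong to a set. It is only a conceptual model, not Ozeki's implementation. The contents of `NUMBERS` and the function name `detectWords` are assumptions; the real 'Numbers' set may contain different entries.

```javascript
// Hypothetical contents of a "Numbers" word set.
// The actual Ozeki word set may differ.
const NUMBERS = new Set([
  'zero', 'one', 'two', 'three', 'four', 'five',
  'six', 'seven', 'eight', 'nine', 'ten',
]);

// Returns the words of the transcript that appear in the word set,
// in the order they were spoken. Matching is case-insensitive and
// punctuation is ignored.
function detectWords(transcript, wordSet) {
  return transcript
    .toLowerCase()
    .split(/[^a-z0-9']+/)
    .filter((word) => wordSet.has(word));
}

// Only the four number words survive the filter: five, five, one, two.
console.log(detectWords('Please dial five, five, one two', NUMBERS));
```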

In the left side panel, activate the connection with the slider (Figure 11). You can find a small dot next to the microphone icon in the system tray. When the connection is activated, the dot changes from red to green. You can then close the 'Control Panel', since the browser API is working correctly.

Figure 11 - Activate the connection. Now you can close the 'Control Panel'

Finally, you can test the connection in one of two ways. The first is to create a route in the 'Control Panel'. The second is to write a simple C# program in Ozeki Robot Controller. Either way, you can watch the text messages travel through the connections.
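
The notion of a route can be pictured as forwarding every text message from one connection to another. The sketch below models that idea in a few lines. It does not use Ozeki's routing engine or API; `createRouter`, `addRoute`, `send` and the connection name `speech_to_text` are all hypothetical names chosen for this example.

```javascript
// Conceptual model of routing: each source connection name maps to a
// list of destination handlers that receive its text messages.
function createRouter() {
  const routes = new Map();
  return {
    // Registers a destination handler for messages from `source`.
    addRoute(source, deliver) {
      if (!routes.has(source)) {
        routes.set(source, []);
      }
      routes.get(source).push(deliver);
    },
    // Forwards `text` to every destination registered for `source`
    // and returns how many destinations received it.
    send(source, text) {
      const destinations = routes.get(source) || [];
      destinations.forEach((deliver) => deliver(text));
      return destinations.length;
    },
  };
}

// Route recognised words from a hypothetical speech connection into a
// list that stands in for the receiving connection.
const received = [];
const router = createRouter();
router.addRoute('speech_to_text', (text) => received.push(text));
router.send('speech_to_text', 'five');
```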
