How to create a voice to text app

As technology advances, new features keep arriving on your devices, and one of them is speech recognition. Speech recognition converts your voice into text in real time. In this post, you will learn how to make a voice note app with the help of the Web Speech API. The Web Speech API has two parts: speech recognition and speech synthesis.


Speech recognition converts your voice into text in real time, while speech synthesis converts text into voice. In this post we only cover speech recognition, to make a small voice to text app. The code below uses the prefixed webkitSpeechRecognition object, so it works in Google Chrome and other browsers that expose that object; Firefox does not enable speech recognition by default.


Create a Voice to text app (Speech recognition).

Let’s set up our project folder on our local drive with three files: index.html, style.css, and main.js.

Check the Demo here. Click on the microphone icon and talk. It will convert your voice into text.

First, we create our index.html file, which holds the HTML code.

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <title>Speech Recognition</title>
  <link rel="stylesheet" href="style.css">
  <link href="https://fonts.googleapis.com/css?family=Shadows+Into+Light" rel="stylesheet">
  <!-- load font awesome here for icon used on the page -->
</head>
<body>
  <div class="textbox">
    <textarea id="textarea" rows="10" cols="60"></textarea>
    <img class="microphone" onclick="startVoicetotext(event)" src="https://i.imgur.com/cHidSVu.gif" alt="Start or stop voice input" />
  </div>
  <script src="main.js"></script>
</body>
</html>

The style.css file holds our style code.

body {
  font-family: "Roboto", "Helvetica", "Arial", sans-serif !important;
  margin: 0;
  height: 100%;
  background: #5f67d6;
}

.textbox {
  position: relative;
}

.microphone {
  position: absolute;
  padding: 0px 4px;
  left: 0;
  bottom: 0px;
  border: 1px;
  background-color: transparent;
  box-shadow: none;
}

Last, the main.js file is the core of the project and powers our voice to text app. It contains the JavaScript code.

// Shared state, kept outside the click handler so it survives between clicks.
var final_transcript = '';
var recognizing = false;
var recognition = null;
var textarea = document.getElementById('textarea');

if ('webkitSpeechRecognition' in window) {
  recognition = new webkitSpeechRecognition();

  recognition.continuous = true;      // keep listening across pauses in speech
  recognition.interimResults = true;  // also report results that are not final yet

  recognition.onstart = function() {
    recognizing = true;
  };

  recognition.onerror = function(event) {
    console.log(event.error);
  };

  recognition.onend = function() {
    recognizing = false;
  };

  recognition.onresult = function(event) {
    var interim_transcript = '';
    for (var i = event.resultIndex; i < event.results.length; ++i) {
      if (event.results[i].isFinal) {
        final_transcript += event.results[i][0].transcript;
      } else {
        interim_transcript += event.results[i][0].transcript;
      }
    }
    final_transcript = capitalize(final_transcript);
    // Show the confirmed text followed by the provisional text.
    textarea.value = final_transcript + interim_transcript;
  };
}

// Upper-cases the first character of the transcript.
function capitalize(s) {
  return s.charAt(0).toUpperCase() + s.slice(1);
}

// Called from the onclick attribute of the microphone image.
function startVoicetotext(event) {
  if (!recognition) {
    console.log('Speech recognition is not supported in this browser.');
    return;
  }
  // A click while listening stops the recognizer instead of restarting it.
  if (recognizing) {
    recognition.stop();
    return;
  }
  final_transcript = '';
  recognition.lang = 'en-US';
  recognition.start();
}

First, start with the HTML file. Think of it as the skeleton of the project.

Second, add the CSS style code. It styles the HTML and acts as the flesh and skin on the skeleton.

Last, add the JavaScript code. It works as the brain of that skeleton.

Add the Web Speech API to the JavaScript file to power our app.

The webkitSpeechRecognition object used here is available in Google Chrome and other browsers that expose it; Firefox does not enable speech recognition by default.
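Because support differs between browsers, it can help to check for the API before using it. The following is only a minimal sketch and not part of the original project: the helper name getSpeechRecognition is hypothetical, and it takes the global object as a parameter so that it can also be tried outside a browser.

```javascript
// Hypothetical helper: returns the speech recognition constructor
// exposed by the given global object, or null if there is none.
function getSpeechRecognition(globalObject) {
  if (!globalObject) {
    return null;
  }
  // Prefer the unprefixed name, then fall back to the webkit-prefixed one.
  return globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null;
}

// In a browser you would pass `window`; elsewhere there is nothing to find.
var Recognition = getSpeechRecognition(
  typeof window !== 'undefined' ? window : null
);
if (!Recognition) {
  console.log('Speech recognition is not available in this environment.');
}
```

When the helper returns null, the page could hide the microphone icon or show a short message instead of failing on the first click.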

First, we handle the onclick event that starts speech recognition in our voice to text app. To do that, we create a function called startVoicetotext. Its core looks like this.

function startVoicetotext(event) {
  // A click while listening stops the recognizer instead of restarting it.
  if (recognizing) {
    recognition.stop();
    return;
  }
  final_transcript = '';
  recognition.lang = 'en-US';
  recognition.start();
}
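The start and stop behavior of the click handler can be tried without a microphone. The sketch below is an illustration under assumptions, not the project's code: createToggle and fakeRecognition are hypothetical names, and the fake recognizer fires its events immediately, whereas a real one fires them asynchronously.

```javascript
// Hypothetical factory: wraps any recognizer-like object (something with
// start() and stop() plus onstart/onend hooks) in a click handler.
function createToggle(recognition) {
  var recognizing = false;

  // Track the state from the recognizer's own events, as main.js does.
  recognition.onstart = function() { recognizing = true; };
  recognition.onend = function() { recognizing = false; };

  return function toggle() {
    if (recognizing) {
      recognition.stop();
      return 'stopping';
    }
    recognition.start();
    return 'starting';
  };
}

// Stand-in recognizer used only for illustration.
var fakeRecognition = {
  start: function() { this.onstart(); },
  stop: function() { this.onend(); }
};

var toggle = createToggle(fakeRecognition);
console.log(toggle()); // first click starts listening
console.log(toggle()); // second click stops it
```

Keeping the recognizing flag outside the handler is what lets the second click see that recognition is already running.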

Next, we set the recognition properties continuous and interimResults.

recognition.continuous = true;
recognition.interimResults = true;

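Setting recognition.continuous = true; means the Speech API keeps listening and returning results across pauses, rather than ending after the first phrase. Setting recognition.interimResults = true; means it also reports provisional results while you are still speaking.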
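If you want to experiment with these settings, one option is to keep them in a single place. This is a small sketch under assumptions: configureRecognition is a hypothetical helper, and a plain object stands in for the recognizer so the example runs anywhere.

```javascript
// Hypothetical helper: copies settings onto a recognizer-like object.
function configureRecognition(recognition, overrides) {
  var defaults = {
    continuous: true,      // keep listening across pauses
    interimResults: true,  // report provisional text while the user speaks
    lang: 'en-US'          // language tag used in this post
  };
  // Later arguments win, so overrides replace the defaults.
  return Object.assign(recognition, defaults, overrides || {});
}

// A plain object stands in for the recognizer here.
var settings = configureRecognition({}, { interimResults: false });
console.log(settings.continuous, settings.interimResults, settings.lang);
```

With interimResults turned off, the onresult handler would only receive final results, so the text would appear phrase by phrase instead of word by word.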

Next, we add the onstart, onerror, and onend handlers like this.

recognition.onstart = function() {
  recognizing = true;
};

recognition.onerror = function(event) {
  console.log(event.error);
};

recognition.onend = function() {
  recognizing = false;
};
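The onerror handler above only logs the error code. As a possible extension, the code could be turned into a message for the user. The sketch below is an assumption-laden illustration: describeError and ERROR_MESSAGES are hypothetical names, the message wording is made up, and only a few of the error codes defined by the Web Speech API are covered.

```javascript
// Hypothetical lookup from some Web Speech API error codes to
// user-facing messages. The wording of the messages is invented here.
var ERROR_MESSAGES = {
  'no-speech': 'No speech was detected. Try again.',
  'audio-capture': 'No microphone was found.',
  'not-allowed': 'Permission to use the microphone was denied.',
  'network': 'A network error interrupted recognition.'
};

function describeError(code) {
  if (Object.prototype.hasOwnProperty.call(ERROR_MESSAGES, code)) {
    return ERROR_MESSAGES[code];
  }
  // Fall back to the raw code for anything not listed above.
  return 'Speech recognition error: ' + code;
}

console.log(describeError('not-allowed'));
```

Inside the handler this would be used as console.log(describeError(event.error)), or the message could be written into the page.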

After that, we take the words spoken into the microphone and display them in the textarea of our voice to text app. The code looks like this.

recognition.onresult = function(event) {
  var interim_transcript = '';
  for (var i = event.resultIndex; i < event.results.length; ++i) {
    if (event.results[i].isFinal) {
      // Final results are kept for the rest of the session.
      final_transcript += event.results[i][0].transcript;
    } else {
      // Interim results may still change, so they are rebuilt on every event.
      interim_transcript += event.results[i][0].transcript;
    }
  }
  final_transcript = capitalize(final_transcript);
  // Show the confirmed text followed by the provisional text.
  textarea.value = final_transcript + interim_transcript;
};
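To see how the loop separates final and interim results, the same logic can be run against a hand-made event. This is a sketch under assumptions: collectTranscripts and fakeResult are hypothetical names, and the sample event only imitates the parts of a real result event that the loop reads (resultIndex, results, isFinal, and the first alternative's transcript).

```javascript
// Hypothetical helper: walks the results the same way the onresult
// handler does, but returns the text instead of touching the page.
function collectTranscripts(event) {
  var finalText = '';
  var interimText = '';
  for (var i = event.resultIndex; i < event.results.length; ++i) {
    var transcript = event.results[i][0].transcript;
    if (event.results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Builds a stand-in result: an array of alternatives plus an isFinal flag.
function fakeResult(transcript, isFinal) {
  var result = [{ transcript: transcript }];
  result.isFinal = isFinal;
  return result;
}

var sampleEvent = {
  resultIndex: 0,
  results: [fakeResult('hello world', true), fakeResult(' how are', false)]
};
console.log(collectTranscripts(sampleEvent));
```

Starting the loop at event.resultIndex rather than 0 means results that were already handled in an earlier event are not added a second time.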

Conclusion

In this post, you learned how to build a small voice to text app. You can add more styling in your style.css file to make it look more attractive. You can even add more functionality to your app. Check the website voicetotext.org for more ideas.

