Speech Notes to Text Mobile App

You can make mobile work easier with a Canvas App that records your voice, converts the audio to text, and extracts tasks from it. You can then save the text for later editing and create Outlook tasks from the tasks recognized in the speech. This is just one example of how to extend your Microsoft 365, Dynamics 365, or Power Platform solution with voice control.

I wanted to create an independent demo, so there is no integration with other systems. The best results come from attaching this to a system where users find it hard to navigate and enter information in the correct place, or when an organization wants to help employees enter information into a system. I mainly use this while driving a car or walking the dog, when my hands are not free for writing. I do not wish for people to fill their breaks with work. On the contrary, I want to offer the choice to spend half the day outside the office and combine planning and note-taking with a walk, or to use a long drive for defining a work task.

The contents of the solution lie here. Besides this, you need an Azure subscription in which you create Cognitive Services and a Function App. Please check the instructions at the following link.

The main logic is copied from this Power Apps Community blog. I built this same solution a year ago using other instructions. This three-part set of instructions is the easiest way to get audio to text in a Power Apps Canvas App.

Canvas App – The Power App

My mobile app has two views, the start screen and the recording screen.

The start screen lets the user fetch their meetings for the current day. Attaching notes to a meeting happens only at text level in your notes; it does not actually edit the meeting in any way.

There is also a button for adding notes without relating them to any meeting, which takes the user to the recording screen.

The recording screen is divided into two views. The first is the recording view, where you can select the spoken language and record your voice.

While recording, you can create tasks by saying the voice command “Create task” at the beginning of a sentence. When I get the text back from Azure Cognitive Services, I split it at each period, check whether each sentence starts with the string “create task”, and add the matching sentences to another array for later processing.
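The splitting step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Power Automate logic: the function name split_notes is hypothetical, and stripping the command words from the task text and matching them case-insensitively are my assumptions.

```python
# Hypothetical sketch of the "create task" extraction; not the real flow.
TASK_PREFIX = "create task"


def split_notes(text):
    """Return (sentences, tasks) for the recognized text.

    Sentences are split at periods. A sentence that starts with
    "create task" (case-insensitive, an assumption) is also added
    to the task list with the command words removed.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tasks = []
    for sentence in sentences:
        if sentence.lower().startswith(TASK_PREFIX):
            # Drop the command and any punctuation directly after it.
            task = sentence[len(TASK_PREFIX):].strip(" ,:")
            if task:
                tasks.append(task)
    return sentences, tasks


sentences, tasks = split_notes(
    "Met with Anna today. Create task send the offer to Anna. All good."
)
print(tasks)
```

Splitting at every period is simple but fragile: abbreviations or decimal numbers in the recognized text would be cut into separate sentences.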

Once the processing is done, the app hides the microphone and shows the text and tasks in a list view. The user can Cancel or Save. On Save, the user gets the text version as a Teams message from the Flow bot, and the tasks are converted to Outlook tasks with the TODO action.

SpeechToText – Power Automate flow

This flow contains the main logic. It receives the audio as JSON and text, and then I convert the audio from webm format to wav. I send the wav to Azure Cognitive Services Speech Recognition and get the text back. Then I call a child flow to extract the tasks and finally return everything to the calling Canvas App.
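For readers who want to see what the speech recognition call looks like outside Power Automate, here is a rough Python sketch based on the Azure Speech-to-text REST API for short audio. It only builds the request and parses a response; it does not send anything. The function names, the region, and the sample rate are my assumptions, and the flow in this solution may call the service differently.

```python
import json
from urllib.parse import urlencode


def build_recognition_request(region, subscription_key, language="en-US"):
    """Build the URL and headers for a short-audio recognition request.

    region and subscription_key come from your own Azure resource.
    The WAV body itself would be sent as the POST payload.
    """
    query = urlencode({"language": language, "format": "simple"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        # Assumes 16 kHz PCM audio inside the wav container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return url, headers


def extract_text(response_body):
    """Return the recognized text, or an empty string if nothing matched."""
    result = json.loads(response_body)
    if result.get("RecognitionStatus") != "Success":
        return ""
    return result.get("DisplayText", "")
```

The selected language from the recording view would be passed in as the language parameter, for example "fi-FI" or "en-US".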

Get TODOs from Text – Power Automate flow

In this flow I split the text into sentences, pick out the sentences that start with “Create task”, and add them to another array. Finally I combine everything and return it to the parent flow.

Save Notification – Power Automate flow

This flow sends the captured text from the app to the user for later use. It could save the information anywhere Power Automate can reach, add metadata, and do magic. My demo is quite boring: it just sends me the text as a Teams message, which I see once I open my laptop after getting from the car to the office.

Create TODO Tasks – Power Automate flow

Creating Outlook tasks is a very straightforward flow. It gets a string of tasks separated by semicolons. Then it splits them, adds each one as a to-do, and schedules it for 1 minute from now.
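The same steps can be written out as a short Python sketch. It only prepares the task data and does not call Outlook. The function name build_todo_items and the dictionary keys are hypothetical, and the real flow uses a Power Automate action instead.

```python
from datetime import datetime, timedelta, timezone


def build_todo_items(task_string, now=None):
    """Split a semicolon-separated task string into to-do items.

    Each item is due one minute after `now`, mirroring the flow.
    The "title" and "due" keys are illustrative names only.
    """
    now = now or datetime.now(timezone.utc)
    due = now + timedelta(minutes=1)
    # Ignore empty entries, for example after a trailing semicolon.
    titles = [t.strip() for t in task_string.split(";") if t.strip()]
    return [{"title": title, "due": due.isoformat()} for title in titles]


for item in build_todo_items("send the offer to Anna; book a room"):
    print(item["title"], item["due"])
```

A semicolon works as a separator here because spoken sentences are unlikely to contain one, so the task texts themselves stay intact.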
