Tag: Keyword Spotting

  • Controlling a power strip with a keyword spotting model and the Nicla Voice

    As Jallson Suryo discusses in his project, adding voice controls to our appliances typically involves an internet connection and a smart assistant device such as Amazon Alexa or Google Assistant. This means extra latency, security concerns, and increased expenses due to the additional hardware and bandwidth requirements. This is why he created a prototype based on an Arduino Nicla Voice that can provide power for up to four outlets using just a voice command.

    Suryo gathered a dataset by repeating the words “one,” “two,” “three,” “four,” “on,” and “off” into his phone and then uploaded the recordings to an Edge Impulse project. From here, he split the files into individual words before rebalancing his dataset to ensure each label was equally represented. The classifier model was then trained for keyword spotting using audio settings optimized for the Syntiant NDP120, yielding an accuracy of around 80%.

    Apart from the Nicla Voice, Suryo incorporated a Pro Micro board to handle switching the bank of relays on or off. When the Nicla Voice detects a relay number keyword, such as “one” or “three,” it waits until a follow-up “on” or “off” keyword is detected. With both the number and state known, it sends an I2C transmission to the accompanying Pro Micro, which decodes the command and switches the correct relay.
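
    To make the division of labor concrete, here is a minimal sketch of what the Pro Micro’s receiving side could look like. The single-byte command encoding, I2C address, and relay pin assignments are assumptions for illustration, not Suryo’s actual firmware:

    ```cpp
    #include <Wire.h>

    const uint8_t I2C_ADDRESS = 0x08;            // assumed peripheral address
    const uint8_t RELAY_PINS[4] = {4, 5, 6, 7};  // assumed relay driver pins

    // Decode one command byte: high nibble = relay number (1-4),
    // low nibble = state (1 = on, 0 = off). This encoding is hypothetical.
    void onCommand(int numBytes) {
      while (Wire.available()) {
        uint8_t cmd = Wire.read();
        uint8_t relay = (cmd >> 4) & 0x0F;
        uint8_t state = cmd & 0x0F;
        if (relay >= 1 && relay <= 4) {
          digitalWrite(RELAY_PINS[relay - 1], state ? HIGH : LOW);
        }
      }
    }

    void setup() {
      for (uint8_t i = 0; i < 4; i++) {
        pinMode(RELAY_PINS[i], OUTPUT);
        digitalWrite(RELAY_PINS[i], LOW);  // all outlets off at boot
      }
      Wire.begin(I2C_ADDRESS);    // join the bus as a peripheral
      Wire.onReceive(onCommand);  // runs after each transmission from the Nicla
    }

    void loop() {
      // All work happens in the onCommand() callback.
    }
    ```

    On the Nicla Voice side, the matching transmitter would simply pack the detected number and state into that same byte between Wire.beginTransmission() and Wire.endTransmission().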

    To see more about this voice-controlled power strip, be sure to check out Suryo’s Edge Impulse tutorial.

    Video: https://www.youtube.com/watch?v=9PRjhA38jBE

  • Small-footprint keyword spotting for low-resource languages with the Nicla Voice

    Speech recognition is everywhere these days, yet some languages, such as Shakhizat Nurgaliyev and Askat Kuzdeuov’s native Kazakh, lack public datasets large enough for training keyword spotting models. To make up for this shortage, the duo explored generating a synthetic dataset with a neural text-to-speech system called Piper, and then extracting speech commands from the audio with the Vosk Speech Recognition Toolkit.

    Beyond simply building a model to recognize keywords from audio samples, Nurgaliyev and Kuzdeuov’s primary goal was to also deploy it onto an embedded target, such as a single-board computer or microcontroller. Ultimately, they went with the Arduino Nicla Voice development board since it contains not just an nRF52832 SoC, a microphone, and an IMU, but an NDP120 from Syntiant as well. This specialized Neural Decision Processor helps to greatly speed up inferencing times thanks to dedicated hardware accelerators while simultaneously reducing power consumption. 

    With the hardware selected, the team trained their model on a total of 20.25 hours of generated speech data spanning 28 distinct output classes. After 100 training epochs, it achieved an accuracy of 95.5% while consuming only about 540KB of memory on the NDP120, making it quite efficient.
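
    As a rough idea of what deployment looks like in firmware, here is a minimal sketch modeled on the Nicla Voice’s stock NDP examples. The .synpkg file names mirror those examples (the Edge Impulse export would replace the model package), and the exact API may vary with the installed board core:

    ```cpp
    #include "Nicla_System.h"
    #include "NDP.h"

    // Called only when the NDP120 recognizes one of the trained keywords;
    // the host nRF52832 can otherwise stay idle, which is where the power
    // savings come from.
    void onMatch(char* label) {
      Serial.print("Keyword detected: ");
      Serial.println(label);
    }

    void setup() {
      Serial.begin(115200);
      nicla::begin();                       // board power and peripheral init
      NDP.onMatch(onMatch);                 // register the match callback
      NDP.begin("mcu_fw_120_v91.synpkg");   // NDP120 MCU firmware
      NDP.load("dsp_firmware_v91.synpkg");  // DSP firmware
      NDP.load("ei_model.synpkg");          // the trained keyword model
      NDP.turnOnMicrophone();               // start feeding audio to the NDP120
    }

    void loop() {
      // Nothing to poll: detections arrive asynchronously via onMatch().
    }
    ```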

    Video: https://www.youtube.com/watch?v=1E0Ff0ds160

    To read more about Nurgaliyev and Kuzdeuov’s project and how they deployed an embedded ML model that was trained solely on generated speech data, check out their write-up here on Hackster.io.

  • Controlling a bionic hand with tinyML keyword spotting

    Arduino Team, August 31st, 2022

    Traditional methods of sending movement commands to prosthetic devices often include electromyography (reading electrical signals from muscles) or simple Bluetooth modules. But in this project, Ex Machina has developed an alternative strategy that lets users trigger various hand gestures with spoken commands.

    The hand itself was made from five SG90 servo motors, with each one moving an individual finger of the larger 3D-printed hand assembly. They are all controlled by a single Arduino Nano 33 BLE Sense, which collects voice data, interprets the gesture, and sends signals to both the servo motors and an RGB LED for communicating the current action.

    In order to recognize certain keywords, Ex Machina collected 3.5 hours of audio data split among six labels covering the words “one,” “two,” “OK,” “rock,” “thumbs up,” and “nothing,” all in Portuguese. From here, the samples were added to a project in the Edge Impulse Studio and run through an MFCC processing block to extract the voice features. A Keras model was then trained on the resulting features and yielded an accuracy of 95%.

    Once deployed to the Arduino, the model is continuously fed new audio data from the built-in microphone so that it can infer the correct label. A switch statement then sets each servo to the correct angle for the detected gesture. For more details on the voice-controlled bionic hand, you can read Ex Machina’s Hackster.io write-up here.
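
    As an illustration of that last step, here is a simplified sketch of the gesture mapping. The label order, servo pins, and angle values are assumptions for the example rather than Ex Machina’s exact code:

    ```cpp
    #include <Servo.h>

    Servo fingers[5];                                   // one SG90 per finger
    const uint8_t SERVO_PINS[5] = {9, 10, 11, 12, 13};  // assumed wiring

    // Curl or extend all five fingers at once; by assumption here,
    // 0 degrees = extended and 180 degrees = fully curled.
    void setHand(uint8_t thumbA, uint8_t indexA, uint8_t middleA,
                 uint8_t ringA, uint8_t pinkyA) {
      const uint8_t angles[5] = {thumbA, indexA, middleA, ringA, pinkyA};
      for (uint8_t i = 0; i < 5; i++) {
        fingers[i].write(angles[i]);
      }
    }

    void setup() {
      for (uint8_t i = 0; i < 5; i++) {
        fingers[i].attach(SERVO_PINS[i]);
      }
      setHand(0, 0, 0, 0, 0);  // start with the hand open
    }

    // Map the classifier's winning label index to a gesture; the label
    // order and angles are illustrative assumptions.
    void performGesture(size_t labelIndex) {
      switch (labelIndex) {
        case 0: setHand(180, 0, 180, 180, 180); break;  // "one": index extended
        case 1: setHand(180, 0, 0, 180, 180);   break;  // "two": index and middle
        case 2: setHand(90, 90, 0, 0, 0);       break;  // "OK": thumb meets index
        case 3: setHand(180, 0, 180, 180, 0);   break;  // "rock": index and pinky
        case 4: setHand(0, 180, 180, 180, 180); break;  // "thumbs up"
        default: setHand(0, 0, 0, 0, 0);        break;  // "nothing": relax open
      }
    }

    void loop() {
      // In the real project, audio from the onboard microphone is classified
      // here and the winning label index is passed to performGesture().
    }
    ```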

    Video: https://www.youtube.com/watch?v=0mc9VOxiwgo
