RISC-V-Based ASSP EASY for Voice HMI - The Journey Continues
Renesas Blog - Giancarlo Parodi, Renesas
Mar. 30, 2023
Human-machine interfaces based on voice commands provide a convenient way to interact with systems like appliances, displays, home accessories, and other gadgets, and can replace or augment traditional means of control like buttons, knobs, or sliders. The recent worldwide pandemic increased attention to such touch-free interaction methods, but even without that urgency, advanced controls of this kind are highly desirable in modern systems because they improve the end-user experience.
Implementing such technologies, however, requires specialized expertise in many complementary fields of embedded systems development. In these applications, the environmental audio is acquired in real time, and the stream of samples is processed to extract and filter the relevant information and recognize specific patterns, for example the pattern associated with a certain spoken phoneme in a specific language. The further composition of those patterns and sequences can be associated with the likelihood that a specific keyword is present in the acquired audio stream. At that point, the system designer can choose to treat the keyword as detected and have the application perform further actions or give the user feedback. What those actions are depends entirely on the specific appliance.
Implementing the application layer that feeds in the audio samples and manages the results is something an embedded software engineer is comfortable developing. The method used to detect the keywords within the audio stream, on the other hand, is hard to implement without specific expertise. One approach is to use neural network-based software models that are trained to recognize specific phonemes with a desired accuracy; the application can then use those models to perform the keyword detection. Although a lot of technical literature is available on the subject, it is not trivial to develop, train, and deploy such e-AI models to implement keyword spotting. Fine-tuning a model to reach the desired accuracy, and taking a prototype to mass production, can require a significant investment.
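The decision step described above, turning a stream of per-frame likelihoods into a single "keyword detected" event, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in any Renesas product: the function name detect_keyword, the window and threshold parameters, and the example scores are all hypothetical, and the scores are assumed to come from some already trained model that outputs one keyword likelihood between 0 and 1 per audio frame.

```python
from collections import deque
from typing import Iterable, Optional


def detect_keyword(frame_scores: Iterable[float],
                   window: int = 5,
                   threshold: float = 0.8) -> Optional[int]:
    """Return the index of the first frame at which the moving average of
    the last `window` scores reaches `threshold`, or None if it never does.

    `frame_scores` is assumed to be the per-frame keyword likelihood
    produced by a (hypothetical) trained model.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` scores
    for index, score in enumerate(frame_scores):
        recent.append(score)
        # Decide only once a full window of scores is available, so that
        # a single noisy frame cannot trigger a detection on its own.
        if len(recent) == window and sum(recent) / window >= threshold:
            return index
    return None


# Hypothetical scores: a one-frame spike, then a sustained match.
scores = [0.1, 0.9, 0.1, 0.2, 0.7, 0.9, 0.95, 0.9, 0.9, 0.85]
first_hit = detect_keyword(scores, window=4, threshold=0.8)
print(first_hit)
```

Averaging over a short window is one simple way to trade reaction time against false triggers; what the appliance does once a detection is reported remains up to the application layer.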
For most users, it would be ideal if this technology could be easily retrofitted into existing applications, to simply enjoy the benefits of this emerging HMI technology without prohibitive levels of development effort.