January 17th, 2022
Last July, the Fraunhofer Institute for Microelectronic Circuits and Systems (IMS) launched AIfES (Artificial Intelligence for Embedded Systems). This open source solution makes it possible to run, and even train, artificial neural networks (ANNs) on almost any hardware, including the Arduino UNO.
The team hasn’t stopped work on this exciting machine learning platform, and an update just landed that you’ll definitely want to check out.
The new AIfES-Express API
AIfES-Express is an alternative, simplified API that’s integrated directly into the library. The new features allow you to run and train a feed-forward neural network (FNN) with only a few lines of code.
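For readers new to the term, here is a rough idea of what "running" a feed-forward network involves. This is a plain Python sketch, not the AIfES-Express API: the function names (`dense`, `forward`) and the hand-picked XOR weights are assumptions made purely for illustration.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def identity(x):
    """Linear output activation."""
    return x


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W * inputs + b), one row per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each layer in order (hence 'feed-forward')."""
    out = inputs
    for weights, biases, activation in layers:
        out = dense(out, weights, biases, activation)
    return out


# Hypothetical 2-2-1 network with hand-picked weights that computes XOR.
XOR_NET = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),  # hidden layer
    ([[1.0, -2.0]], [0.0], identity),               # output layer
]

if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, forward([a, b], XOR_NET)[0])
```

Training means finding such weights automatically from example data instead of picking them by hand, which is the part a library like AIfES takes care of.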
Q7 weight quantization
This update enables simple Q7 (8-bit) quantization of the weights of a trained FNN. Storing each weight in 8 bits instead of as a 32-bit float cuts the memory needed for the weights to roughly a quarter. Depending on where the network is deployed, it can also bring a significant increase in speed.
This is especially true for controllers without an FPU (Floating Point Unit). The quantization can be handled directly in AIfES® (and AIfES-Express) on the controller, on a PC, or wherever you’re using it. There are even example Python scripts to perform the quantization directly in Keras or PyTorch. The quantized weights can then be used in AIfES®.
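To get a feel for what 8-bit weight quantization does, here is a minimal Python sketch of a generic fixed-point scheme with a power-of-two scale. It is not the AIfES® implementation or its example scripts, and the library's own Q7 format may differ in detail. The helper names (`choose_shift`, `quantize`, `dequantize`) and the sample weights are assumptions for illustration only.

```python
from array import array

Q7_MIN, Q7_MAX = -128, 127  # range of a signed 8-bit integer


def choose_shift(weights):
    """Pick the most fractional bits for which the largest weight still fits in int8."""
    max_abs = max(abs(w) for w in weights)
    for shift in range(15, -16, -1):
        if round(max_abs * 2.0 ** shift) <= Q7_MAX:
            return shift
    raise ValueError("weights too large for this sketch")


def quantize(weights, shift):
    """Scale by 2**shift, round to the nearest integer, and clamp to int8."""
    return [
        max(Q7_MIN, min(Q7_MAX, round(w * 2.0 ** shift)))
        for w in weights
    ]


def dequantize(q_weights, shift):
    """Recover approximate float values from the 8-bit integers."""
    return [q / 2.0 ** shift for q in q_weights]


# Hypothetical weights of a trained layer.
weights = [0.75, -1.5, 0.1, 2.0]
shift = choose_shift(weights)
q_weights = quantize(weights, shift)
restored = dequantize(q_weights, shift)

# Storage: 32-bit floats versus signed 8-bit integers.
float_bytes = len(array("f", weights).tobytes())
q7_bytes = len(array("b", q_weights).tobytes())

if __name__ == "__main__":
    print("shift:", shift)
    print("quantized:", q_weights)
    print("restored:", restored)
    print("bytes:", float_bytes, "->", q7_bytes)
```

The rounding step is where precision is lost: each restored weight can differ from the original by up to half of one quantization step. On a controller without an FPU, the benefit is that the network's arithmetic can then run on integers.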
Advanced Arm CMSIS integration
AIfES® now provides the option to use the Arm CMSIS (DSP and NN) library for a faster runtime.
New examples to help you get building
A simple gesture recognition application can be trained on-device for different Arduino boards, including:
You can play tic-tac-toe against a microcontroller, with a pre-trained net that’s practically impossible to defeat. There are F32 and quantized Q7 versions to try. The Q7 version even runs on the Arduino UNO. The AIfES® team do issue a warning that it can be demoralizing to repeatedly lose against an 8-bit controller!
This Portenta H7 example is particularly impressive. It shows you how to train in the background on one core while using the other to run a completely different task. In the example, the M7 core of the Portenta H7 can even give the M4 core the task of training an FNN. The M7 can then use the optimized weights to run the FNN without any delay caused by the training.
Here’s a link to the GitHub repository so you can give this a go yourself.