What is the best board for AI ML?

12 Apr.,2024

 

Machine learning (ML) and artificial intelligence (AI) are no longer limited to high-end servers or cloud platforms. Thanks to new developments in integrated circuits (ICs) and software technology, it’s possible to implement ML algorithms and deep-learning neural networks on tiny controllers and microcomputers. These embedded devices, installed at the edge, no longer need to rely on a remote server or the cloud for insight from sensor data or user inputs.

TinyML software frameworks are evolving as microcontroller-specific ML solutions, and conventional deep-learning frameworks can be implemented on more powerful microcomputers. This offers several advantages.

For one, the devices do not depend on network connectivity or the availability of a cloud service to add AI to the system. Secondly, microcontrollers and microcomputers are far more power-efficient than the servers behind a web-based AI service.

Microcontrollers can also now carry out specific AI tasks while running on a coin cell. Compared to a high-end workstation that typically costs thousands of dollars, an ML-capable microcontroller costs less than a few hundred dollars. Another benefit of implementing AI on the edge device is securing user data privacy and reducing the chances of cyber attacks and hacking. Microcontrollers are ubiquitous and can be deployed on a large scale for machine learning tasks.

There are several single-board microcontrollers and microcomputers now available that can be used for developing AI-enabled embedded applications. Let’s explore the top platforms.

The NVIDIA Jetson Nano Developer Kit is one of the most flexible platforms for deploying AI software on autonomous machines at the edge. This microcomputer can run multiple neural networks in parallel.

Based on a quad-core ARM Cortex-A57 processor and a 128-core NVIDIA Maxwell GPU, the Jetson Nano can deliver 472 GFLOPS of compute performance. It also houses 4GB of onboard 64-bit LPDDR4 RAM @ 1600 MHz. There are two power modes, 5W and 10W, with a 5V DC input, and the board only costs $99.

It’s easy to get started by simply inserting a MicroSD card containing the system image. The board is programmed for AI applications using the NVIDIA JetPack SDK. The latest SDK provides a complete Linux environment based on Ubuntu 18.04. The SDK also provides accelerated graphics supporting the NVIDIA CUDA Toolkit 10.0 and GPU-accelerated libraries like TensorRT 5 and cuDNN 7.3.

The supported ML frameworks include PyTorch, Keras, TensorFlow, Caffe/Caffe2, MXNet, and others. The SDK also supports OpenCV for computer vision and ROS for robotic applications.
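A quick way to confirm that a framework actually sees the GPU after setting up the SDK is to ask it directly. The snippet below is a minimal sketch that assumes PyTorch is the framework in use; the helper name `describe_accelerator` is hypothetical, and the function falls back gracefully when PyTorch is missing or no CUDA device is visible.

```python
def describe_accelerator() -> str:
    """Report which compute device PyTorch would use on this machine."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # Index 0 is the first (and on a Jetson Nano, the only) CUDA device.
        return f"CUDA device: {torch.cuda.get_device_name(0)}"
    return "PyTorch is running on the CPU only"


print(describe_accelerator())
```

If the result reports CPU only on a board with a GPU, the usual suspects are a framework build without CUDA support or a mismatch with the installed CUDA libraries.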

What’s more, this developer kit includes the DeepStream SDK, which delivers a complete streaming analytics toolkit for AI-based video and image processing. The Jetson Nano can process up to 8 full-motion HD video streams in real time. It is an excellent platform to deploy AI-based inference workloads for applications such as image classification, segmentation, object detection and localization, video enhancement, pose estimation, and speech processing.

The board features Gigabit Ethernet, HDMI 2.0, DisplayPort 1.3, a MIPI CSI-2 camera interface, four USB 3.0 ports, a MicroSD card slot, and a 40-pin GPIO header. The board can be powered through a 5V DC barrel jack adapter or the Micro USB port. The power consumption can be set to as little as 5 Watts. The camera connector is compatible with 8MP IMX219 cameras such as the Raspberry Pi Camera Module V2.

Jetson Nano is capable of inferencing several Deep Neural Network (DNN) models with real-time computer vision. The neural networks can even be retrained locally through transfer learning. This board is an excellent choice for various applications, including IoT with intelligent edge analytics, multi-sensor autonomous robots, video analytics, image recognition, and gesture recognition.

Google Coral Dev Board is currently the most power-efficient development board for edge ML inferencing. Based on the NXP i.MX 8M SoC (including a quad-core Cortex-A53 and a Cortex-M4F), with an integrated GC7000 Lite GPU and the Google Edge TPU, Coral Dev can deliver 4 TOPS of computing performance. Plus, the board only consumes 0.5 Watt per TOPS. The RAM is 1 or 4 GB LPDDR4.

This development board requires a 5V DC supply for operation, and its GPIO is 3V3 compatible. There is also onboard 8GB eMMC memory and a MicroSD card slot. The board costs $149.99.  

One advantage Coral Dev offers is that it combines ML with wireless connectivity. Along with Gigabit Ethernet, Coral Dev includes WiFi (802.11b/g/n/ac 2.4/5GHz) and Bluetooth 4.2. Another advantage is its removable system-on-module (SoM) that efficiently scales up between prototyping and production. The biggest drawback is that it only supports TensorFlow Lite and no other deep-learning frameworks. However, this board runs a derivative of Debian Linux and supports several popular Linux tools, which is useful. 

Coral Dev is ideal if you plan to deploy ML inference using TensorFlow Lite with minimum time to market. The board offers impressive computing performance and can execute mobile vision models such as MobileNet v2 at 400 FPS. Users can do almost anything on Coral Dev that is within the scope of the TensorFlow Lite framework.
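The quoted Coral figures can be combined into a rough per-frame budget. The sketch below is only back-of-the-envelope arithmetic: it assumes the Edge TPU draws its full peak power (4 TOPS at 0.5 W per TOPS) while running MobileNet v2 at 400 FPS, and it ignores the rest of the board, so the energy figure is an upper bound for the accelerator alone rather than a measurement.

```python
# Figures quoted in the text for the Coral Dev Board's Edge TPU.
PEAK_TOPS = 4.0           # peak throughput in tera-operations per second
WATTS_PER_TOPS = 0.5      # quoted power cost per TOPS
MOBILENET_V2_FPS = 400.0  # quoted MobileNet v2 frame rate


def frame_time_ms(fps: float) -> float:
    """Time budget per frame in milliseconds."""
    return 1000.0 / fps


def peak_power_watts(tops: float, watts_per_tops: float) -> float:
    """Accelerator power at peak throughput."""
    return tops * watts_per_tops


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Energy per frame in millijoules: watts / (frames per second) * 1000."""
    return power_w / fps * 1000.0


power_w = peak_power_watts(PEAK_TOPS, WATTS_PER_TOPS)
print(f"Peak accelerator power: {power_w:.1f} W")
print(f"Time per frame: {frame_time_ms(MOBILENET_V2_FPS):.1f} ms")
print(f"Energy per frame (upper bound): {energy_per_frame_mj(power_w, MOBILENET_V2_FPS):.1f} mJ")
```

Under these assumptions the accelerator peaks at 2 W, each frame takes 2.5 ms, and one inference costs at most about 5 mJ.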

Raspberry Pi was such a popular single-board computer that it went out of stock in 2022 due to its massive demand in the maker industry. The latest Raspberry Pi 4 Model B is based on the powerful Broadcom BCM2711 Quad-core Cortex A72 64-bit SoC that clocks at 1.5 GHz. This board also features Broadcom VideoCore VI GPU and 1/2/4 GB LPDDR4 RAM and can deliver 13.5 to 32 GFLOPS computing performance.

The cost of the board starts from $35 for a 1GB RAM model. The 2GB and 4GB RAM models cost $45 and $55, respectively. The Coral USB Accelerator costs $59.99 and can be added to the board to improve its ML capabilities.
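The figures quoted so far allow a rough performance-per-dollar comparison. The sketch below is only an illustration: it uses the peak GFLOPS and prices cited in this article, which are not necessarily measured at the same numeric precision. It also pairs the Raspberry Pi’s GFLOPS range with its price range as best-case and worst-case bounds, which is an assumption, since the text does not say which figure belongs to which RAM variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    name: str
    gflops: float      # peak compute figure quoted in the article
    price_usd: float   # price quoted in the article

    @property
    def gflops_per_dollar(self) -> float:
        return self.gflops / self.price_usd


boards = [
    Board("Jetson Nano", 472.0, 99.0),
    # Bounds only: highest GFLOPS at the lowest price, and vice versa.
    Board("Raspberry Pi 4B (best case)", 32.0, 35.0),
    Board("Raspberry Pi 4B (worst case)", 13.5, 55.0),
]

ranked = sorted(boards, key=lambda b: b.gflops_per_dollar, reverse=True)
for board in ranked:
    print(f"{board.name:<30} {board.gflops_per_dollar:5.2f} GFLOPS per dollar")
```

Even in the best case for the Raspberry Pi, the Jetson Nano comes out several times ahead on this metric, which is consistent with the remark that the Raspberry Pi has a weaker GPU.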

Raspberry Pi is compatible with TensorFlow and TensorFlow Lite. It can deploy various ML applications like image classification, voice recognition, object detection and localization, face recognition, and gesture recognition. The board is also capable of implementing several deep-learning networks.

Adding the Coral USB accelerator to improve the ML capabilities is possible, but the cost increases. Compared to other development boards, Raspberry Pi has a weaker GPU. However, RPi is a single-board computer nearly all makers have.
Raspberry Pi 4B is an excellent board for learning and experimenting with ML and deep-learning networks.

Rock Pi 4 is often considered an upgrade to Raspberry Pi 4B and is ideal for more complex projects. Launched in 2021, Rock Pi 4 has a better processor and GPU for implementing ML tasks. The board officially runs Android OS (Android 12) and supports the mainstream AI stack with GPU acceleration.

A dual-core Cortex-A72 @ 1.8 GHz paired with a quad-core Cortex-A53 @ 1.4 GHz sits at the heart of the board. The onboard Mali T860MP4 GPU supports OpenGL ES up to v3.2, OpenCL, DX11, and Vulkan 1.0.

The Rock Pi features 64-bit dual-channel LPDDR4 RAM with options for 1/2/4 GB variants. The board has multiple storage options, including MicroSD, eMMC, PCIe, and USB3. A MicroSD card of up to 128 GB can be used. The eMMC module is optional, with 8/16/32/64/128 GB storage. The M.2 connector supports an M.2 NVMe SSD of up to 2 TB.

The cost of the Rock 4A/4B starts from $49 for the 1GB RAM variant. The 2GB and 4GB RAM models cost $59 and $75, respectively. There are also other models of Rock 4 available, including Rock 4A/4B Plus, Rock 4 SE, and Rock 4C Plus. Most models come with onboard WiFi 5 and Bluetooth 5.

The Rock 4A/4B Plus is superior to Rock 4A/4B in many specifications. The 4A models lack onboard WiFi and Bluetooth. The other notable features of the board include a 40-pin GPIO header, onboard RTC, four USB 3.0 ports, a MIPI CSI connector, a MIPI DSI connector, and a full-sized HDMI 2.0 port. A dedicated hardware NPU accelerator is expected to be added to the board soon.

Rock 4 is a better alternative for TensorFlow and TensorFlow Lite-based AI applications, but it is a newer option with only limited community support at this time. Many accessories are still incompatible with the Rock 4.

The Sipeed MAIX GO is a good budget option for beginners interested in AI and ML. The board includes a 2.8-inch LCD and an OV2640 camera with an M12 lens. Based on a dual-core 64-bit RISC-V CPU that clocks at 400 MHz, MAIX GO features a 64 KPU (Neural Network Processor) with a 576-bit width and an APU (Audio Processing Unit). It can support up to eight microphones at a 192 kHz sampling rate.

The onboard RAM is 8MB high-speed SRAM that can be over-clocked up to 800 MHz. The board also features JTAG and UART and a lithium battery manager chip.

Sipeed MAIX GO has a MicroPython port and supports FreeRTOS, along with TensorFlow, Tiny-YOLO, MobileNet v1, and other frameworks and models. MAIX GO is an excellent development board to experiment with basic AI stacks without purchasing additional components and accessories. Users can try image classification, speech processing, and smart analytics for sensor data with the board.

One drawback of MAIX GO is the lack of proper documentation and library support compared to other development boards. But at only $40, including an onboard camera and LCD, MAIX GO is still worth considering.

Final thoughts
In previous years, the Intel Neural Compute Stick 2 and BeagleBone AI would have been on this list, but they have been discontinued. Choosing a board all depends on your needs.

Beginners who already possess a Raspberry Pi 4B might consider the AI stack based on the TensorFlow and TensorFlow Lite frameworks. But if you’re looking for performance and higher ML capabilities, the Rock 4 is an alternative to RPi. For a tight budget, MAIX GO is an affordable option. Coral Dev is the fastest development solution.

For versatility in relation to ML, AI, and deep learning, the NVIDIA Jetson Nano is ideal. For high production, try the TensorFlow AI inferencing edge devices. Or, consider more than one board to add to your experience.

 



 

Choosing Computer Vision board in 2022/2023

Anton Maltsev

Sep 11, 2022

Choosing a platform to work with Computer Vision on the Edge is difficult. There are dozens of boards on the market. If you read about one of them, you want to use it. But when you try it, it turns out to be not so good.

Image by the Author

I tried to compare a lot of the cheap boards on the market, and not only in terms of speed. I also tried to compare the platforms by their “usability”: how easy it is to export networks, how good the support is, and how easy the board is to work with.

This article is the result of the comparison. But if you want to see more about the boards, I made a separate video about each board (with a complete comparison).

I hope this is not all, and I will supplement this article. As of right now, I have:

  1. K210 board
  2. MAIX-II board

And I think that I will append them to this guide. I already have a video about them, but it’s pretty short, and I didn’t check all the criteria in it.

Here is the video about UnitV2 from m5stack (with the SigmaStar SSD202D processor). At this time, it’s pretty hard to put the camera through the full comparison, but I hope I will do this in the future.

Plus, I have a list that I plan to order and test sooner or later and add to this article or the next one:

I know that there are also Beaglebone and JeVois. But they seemed a bit outdated to me. I also don’t have enough energy to test boards without a complete system, such as Arduino Portenta H7, Sony Spresense, Nordic Semi, Raspberry Pi RP2040, etc. But in some cases, you should also consider them!

Let’s go!

Here is the final table with all the boards :

https://docs.google.com/spreadsheets/d/1BMj8WImysOSuiT-6O3g15gqHnYF-pUGUhi8VmhhAat4/edit?usp=sharing

But let me explain all the criteria first.

How easy to work

How easy is it to flash? It took half a day to flash the Jetson TK1 and half an hour for the RPi. Firmware is the point where your communication with the board begins after unboxing.

Easy to work with. When I was working with DaVinci, debugging took ages. Today all processes are usually much easier. Let’s talk about them.

Conventional Linux. I like it when you can work with regular Ubuntu. And it makes me sad when there is no regular Linux on the board. Let’s check this.

Community support. A big community means fewer problems and more solutions. Let’s check it.

Image by the Author

In my opinion, the best boards are RPi and NCS. But they are not fully Computer Vision boards. Coral and Jetsons are good but not excellent.

Models support

Usually, NPUs are not very user-friendly in terms of model conversion. Let’s talk about models.

Official model zoo. What models are supported?

Unofficial model zoo. What does the community provide for this board?

How easy is it to convert a random model? Why do I need the first two points if I can export anything?!

Easy to debug problems with the conversion. What happens if the export does not go as planned?

Image by the Author

As you can see, there are three good boards and one that is almost good.

Production readiness / Hobby projects readiness / Board Construction

Some additional information can help you decide whether to choose a board.

Processor speed? A lot of computer vision systems require good processors. Let’s check them. To test it, I will use the stress-ng tool (sudo apt-get install stress-ng), with a Linux PC for comparison.
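A CPU stress run of this kind can be scripted so that every board gets exactly the same command. The sketch below is one possible way to do it, not the setup used for the results in this article: the helper names are hypothetical, the worker count and duration are arbitrary defaults, and the only stress-ng options used are `--cpu`, `--timeout`, and `--metrics-brief`.

```python
import shutil
import subprocess


def build_stress_command(workers: int = 0, seconds: int = 60) -> list:
    """Build a stress-ng CPU test command; workers=0 means one per online CPU."""
    return [
        "stress-ng",
        "--cpu", str(workers),
        "--timeout", f"{seconds}s",
        "--metrics-brief",  # print a short summary including bogo ops per second
    ]


def run_stress_test(workers: int = 0, seconds: int = 60) -> str:
    """Run the test and return stress-ng's log output as text."""
    if shutil.which("stress-ng") is None:
        raise RuntimeError("stress-ng not found; try: sudo apt-get install stress-ng")
    result = subprocess.run(
        build_stress_command(workers, seconds),
        capture_output=True, text=True, check=True,
    )
    # stress-ng logs to stderr by default, so both streams are returned.
    return result.stderr + result.stdout


# Show the command that would be run; call run_stress_test() on the board itself.
print(" ".join(build_stress_command(workers=4, seconds=60)))
```

Comparing the "bogo ops/s" column of the summary between a board and a reference PC gives a rough relative CPU score. It says nothing about NPU or GPU performance.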

Mechanical parts, construction, temperature stability.

Easy to buy. Should I press the “Contact to request the price” button?… Or wait in line for a few months?

Pins for external connection. Will I be able to manipulate reality?

Image by the Author

As you can see, all the boards look almost the same, except for the boards without Linux.

Speed Test

It’s hard to get a complete understanding of how fast a board is from 2–3 data points in a performance comparison. It’s better to look at the “Speed test” parts of the videos and check the information here. Different boards have different inference frameworks, different parameters, and different quantization.

I use batch size = 1 everywhere. And this is not the best strategy. For example, for Jetson, a larger batch size will increase performance.
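The effect of batch size on such a test can be illustrated with a small timing harness. The sketch below is not the benchmark behind the results in this article: `dummy_model` is a hypothetical stand-in that simulates a fixed per-call overhead plus a per-item cost, and a real test would replace it with the board's own inference call.

```python
import time
from typing import Callable, List, Sequence, Tuple


def benchmark(infer: Callable[[Sequence[float]], object],
              batch: Sequence[float],
              warmup: int = 2,
              runs: int = 10) -> Tuple[float, float]:
    """Return (mean latency per call in ms, throughput in items per second)."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        infer(batch)
    start = time.perf_counter()
    for _ in range(runs):
        infer(batch)
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / runs * 1000.0
    throughput = len(batch) * runs / elapsed
    return latency_ms, throughput


def dummy_model(batch: Sequence[float]) -> List[float]:
    """Hypothetical model: 2 ms fixed overhead per call plus 0.5 ms per item."""
    time.sleep(0.002 + 0.0005 * len(batch))
    return [x * 2.0 for x in batch]


results = {}
for batch_size in (1, 4, 8):
    results[batch_size] = benchmark(dummy_model, [0.0] * batch_size)
    latency_ms, throughput = results[batch_size]
    print(f"batch={batch_size}: {latency_ms:.2f} ms per call, {throughput:.0f} items/s")
```

With a fixed overhead per call, larger batches raise throughput while also raising the latency of each call, which is why batch size 1 understates what a GPU-based board can do but is still the relevant setting for single-frame, low-latency use.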

Image by the Author

But in my opinion, these tests can answer a few questions:

  1. How fast is the board for small neural networks?
  2. How fast is the board for the big neural networks?
  3. What is the optimal framework to run a neural net?

I will not comment on the speed test; in my opinion, there is no “bad” board.

Price

For big projects, the price is critical. But you can hardly estimate the actual cost. For example:

  1. The Jetson’s price was about $99, but with the current chip shortage, you can barely buy it for $250.
  2. A big consignment of boards costs less per board than a small one.
  3. For some chips, you can design your own board, which will cost less.
  4. Additional peripherals will increase the cost. And it will be different for the different boards.

Here is a small price table:

Image by the Author

Power consumption

Also, I tried to measure power consumption.
A few important notes:

  1. I can’t measure power consumption for every board in consideration (some boards I gave to friends, some boards don’t have USB, etc.)
  2. I try to measure only two modes: “idle” and “running NN”. But some boards have an internal camera, some boards use WiFi, some boards have additional peripherals, etc. I don’t connect any additional parts, but the built-in ones still affect the result.
  3. It’s the “mean” power consumption. I didn’t try to measure the maximum consumption.
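Mean readings for the two modes can be turned into an estimate of what the network itself costs. The sketch below only shows the arithmetic: the sample values and the frame rate are hypothetical placeholders, not measurements from this article, and the function names are made up for illustration.

```python
from statistics import mean
from typing import Sequence


def mean_power_w(samples_w: Sequence[float]) -> float:
    """Mean of power readings in watts taken over one mode."""
    return mean(samples_w)


def energy_per_inference_mj(idle_w: float, running_w: float, fps: float) -> float:
    """Extra energy per inference in millijoules, attributing the idle-to-running delta to the NN."""
    return (running_w - idle_w) / fps * 1000.0


# Hypothetical readings in watts, e.g. from a USB power meter.
idle_samples = [2.1, 2.0, 2.2, 2.1]
running_samples = [4.9, 5.1, 5.0, 5.0]
fps = 20.0  # hypothetical inference rate while the samples were taken

idle_w = mean_power_w(idle_samples)
running_w = mean_power_w(running_samples)
print(f"idle: {idle_w:.2f} W, running NN: {running_w:.2f} W")
print(f"NN overhead: {running_w - idle_w:.2f} W")
print(f"energy per inference: {energy_per_inference_mj(idle_w, running_w, fps):.0f} mJ")
```

Because these are means, short peaks are averaged out, so a power supply should still be sized with some margin above the "running NN" figure.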

Here is the table:

Image by the Author

Summary

I hope this helps you choose your board. But it’s a pretty short article, so let me recommend a few more.

  1. A good article on what NPUs and TPUs are, how they differ, and how the math is optimized: https://blog.inten.to/hardware-for-deep-learning-part-4-asic-96a542fe6a81
  2. Good article on comparing platforms. There are some platforms I haven’t reviewed + examples for networks I don’t have — https://qengineering.eu/deep-learning-with-raspberry-pi-and-alternatives.html
  3. Not a very detailed comparison, but some exciting platforms I haven’t reviewed yet — https://jfrog.com/connect/post/comparison-of-the-top-5-single-board-computers/
  4. An excellent and detailed article, but not many boards — https://arxiv.org/pdf/2108.09457.pdf
  5. ncnn performance test for a bunch of boards — https://github.com/nihui/ncnn-small-board

And, of course, if you want to follow my articles about Computer Vision boards, subscribe to my LinkedIn and YouTube! If you have questions, ask them in the comments or via e-mail (or we can discuss your case in a consultation).
