Edge AI
Team
tfc.ai
Industry
Manufacturing, Logistics and others
Place
Switzerland
In the past we have been involved in several Industry 4.0 projects in both the Automotive and Manufacturing industries. From these projects several questions arose regarding the use of AI and Edge Computing:

- What performance can be achieved at the Edge?
- What are the challenges?
- Which neural network architectures are best suited?
- What's on the market in terms of edge devices today?

In this research project we investigated these questions and gained valuable insights for future projects.
Takeaways
We were impressed by the compute power that's available at the edge: Nvidia's Jetson architecture, for example, provides object detection and classification for up to 50 simultaneous video streams at 1080p and 30 fps. In our experiments we measured a throughput of up to 600 images/s. That's more than enough for most of the use cases we're currently looking into.
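As a back-of-the-envelope check, measured throughput translates directly into stream capacity. A minimal sketch (the helper function is ours, not part of any vendor SDK):

```python
def max_streams(images_per_second: float, fps: float) -> int:
    """Number of full-frame-rate video streams a given inference throughput can serve."""
    return int(images_per_second // fps)

# With the ~600 images/s we measured, at 30 fps per stream:
capacity = max_streams(600, 30)
print(capacity)  # 20 streams at full frame rate
```

Sampling only every n-th frame, or batching inference across streams as Nvidia's DeepStream pipeline does, pushes the number of supported streams well beyond that.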

At the same time the energy footprint of these solutions is very low. Even a high-powered device like the Nvidia Jetson Xavier draws only a fraction of the power of a modern desktop GPU (30 watts in this case). And Google's Edge TPU performs 4 trillion operations per second (4 TOPS) using only 2 watts.
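A handy way to compare such devices is operations per watt. A quick sketch using the Edge TPU figures above:

```python
def tops_per_watt(tera_ops: float, watts: float) -> float:
    """Inference efficiency in tera-operations per second per watt."""
    return tera_ops / watts

# Google Edge TPU: 4 TOPS at 2 watts
print(tops_per_watt(4, 2))  # 2.0 TOPS/W
```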

That being said, there are a couple of issues when you start deploying AI models at the edge: models need to be specially compiled and optimized for maximum performance. In Nvidia's case, the TensorRT (TRT) high-performance inference platform yields significant performance gains over un-optimized network deployments. In some situations only a subset of the machine learning framework's features is available, as in the case of TensorFlow Lite. We hope that the difference between desktop/server network architectures and edge architectures will shrink in the future; for now, you should check which network architectures your edge platform supports before training for deployment at the edge.
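The compilation step involved can be illustrated with TensorFlow Lite. A minimal sketch, assuming a tiny Keras model stands in for your real trained network (TensorRT has an analogous optimization step through its own builder API):

```python
import tensorflow as tf

# A tiny stand-in model; in practice this would be your trained network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Convert the model to the TensorFlow Lite format used on edge devices.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optionally let the converter quantize weights for a smaller, faster model.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()  # a bytes buffer ready to ship to the device
```

The resulting `.tflite` byte buffer is what actually gets deployed; the converter rejects architectures that use operations outside the TFLite-supported subset, which is exactly the framework-feature limitation mentioned above.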

In many cases the state of edge deployments at the beginning of 2019 still seemed a little rough around the edges: the network models readily available are often only a fraction of what you can find for popular server frameworks. Manufacturers and solution providers are just getting started with their edge efforts, and their offerings are at varying stages of maturity that still often require substantial manual effort to get a solution to production. On the Nvidia Jetson architecture, for example, we ran into Linux package management issues, with pre-compiled machine learning packages not being available for the ARM processor architecture. This leads to prolonged setup times and manual fiddling with software dependencies.
Some of the advanced deep learning features, like the Tensor Cores on the Jetson Xavier module, are not yet supported by the drivers. This leaves room for future improvements and performance gains.

In summary, we see huge value in the deep learning edge devices on the market today, and we find promising efforts by vendors and the machine learning community that should make edge deployments more straightforward and less cumbersome in the near future. Some vendors, like Google and Amazon AWS, are taking a holistic approach and providing full-stack solutions that span all the way from the Cloud to the edge. We're eager to give these new approaches a try and look forward to using this kind of specialized deep learning edge hardware in our projects.
Project Details
Industry: Diverse Industries
Time Frame: June 2018 - March 2019
Products / Technologies:
  • Edge Hardware
    • Nvidia Jetson Xavier (ARM Architecture)
    • Raspberry Pi 3 Model B+
    • Amazon AWS DeepLens
  • Nvidia DIGITS for training
  • Nvidia Jetson JetPack | TensorRT for deployment
  • Python, PyTorch and TensorFlow Deep Learning frameworks
  • Nvidia CUDA, CuDNN