Inference occurs when a compute system makes predictions based on trained machine-learning algorithms. While the concept of inferencing is not new, the ability to perform these operations at the edge is relatively new.
At its core, an edge-based inference engine is an embedded computer, but it goes well beyond that, with amplified compute power, ample storage, and the I/O needed to handle a significant amount of data in real time. The goal is to perform the operations as close as possible to where the data is generated, achieving the most accurate results in the shortest amount of time. That location is typically very near the sensors, where external data enters the system. Once decisions are made, they are carried out locally, driving real-time decision-making at the edge.
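To make "inference" concrete, here's a minimal sketch in Python: a forward pass through a trained model's parameters produces a prediction that can drive a local decision. The weights, bias, and 0.5 threshold are hypothetical, purely for illustration:

```python
import math

# Hypothetical parameters from a previously trained logistic-regression model.
WEIGHTS = [0.8, -0.4, 0.15]
BIAS = -0.2

def infer(features):
    """Run inference: compute a prediction from trained parameters."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

# A sensor reading arriving at the edge; the decision is made locally.
reading = [1.2, 0.5, 3.0]
if infer(reading) > 0.5:
    print("actuate")
```

Training happens elsewhere (often in the cloud); only the learned parameters and this forward pass need to live on the edge device.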
A key concern for an edge-based inference engine is the environment in which it's deployed. For example, does it have to be designed to handle shock and vibration? Will it encounter extremely high or low temperatures? What balance of performance and acceleration does it require? The answer to each of these questions could result in a different design, or at least a different approach to the design.
Some vendors maintain a full complement of in-house test equipment for environmental stresses, including thermal simulation and, of course, shock and vibration. In most applications, a system designed for rugged, thermally demanding applications can operate in temperatures ranging from -40 to +70°C and withstand shocks up to 20 G and vibration of 3 Grms.
Amping Up For AI
There is a clear distinction between a general-purpose embedded computer and one designed to handle inferencing algorithms. First, inference engines require the utmost in compute performance. Any designer can pull a high-end x86 processor off the shelf and incorporate it into a system, even one with the features of a data-center platform. However, it takes deep expertise and experience in artificial-intelligence systems, on both the hardware and software sides, to design a system for maximum throughput. The experts at Premio fit that bill, with robust hardware engineering and design behind the company's industrial-grade computer platforms.
Premio has come up with a modular technology called Edge Boost Nodes that maximizes system performance at the edge. The hardware nodes physically attach to the lower portion of the platform and provide hardware acceleration for edge-level workloads that require data acquisition for real-time insights. This two-piece modular design helps maintain the platform's ruggedness while providing performance acceleration through Non-Volatile Memory Express (NVMe) solid-state drives (SSDs) in innovative canister bricks and GPUs for parallel-computing performance. Each Edge Boost Node uses high-RPM active cooling to ensure the reliability of those components.
(Image credit: Premio Inc.)
A host of different Edge Boost Nodes are available from Premio. For example, one option, the RCO-6000-CFL-2N2060S, adds a hot-swappable NVMe SSD canister, able to hold up to two 15-mm U.2 SSDs, plus a PCIe GPU. A second option, the RCO-6000-CFL-4NH, boosts storage capability, supporting two hot-swappable NVMe SSD canisters that house two 15-mm U.2 SSDs for high-capacity NVMe storage with both hardware and software RAID. A third option, the RCO-6000-CFL-8NS, focuses on even more high-speed NVMe storage, giving system integrators the ability to add up to eight 7-mm, 2.5-in. U.2 NVMe SSDs; it's coming soon to Premio's Edge Boost Node portfolio.
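To put the RAID options in perspective, a quick back-of-the-envelope calculation shows how usable capacity changes with RAID level. The 4-TB drive size is a hypothetical example, not a Premio specification:

```python
def usable_capacity_tb(drive_tb, count, level):
    """Approximate usable capacity for common RAID levels."""
    if level == 0:    # striping: full capacity, no redundancy
        return drive_tb * count
    if level == 1:    # mirroring: half the raw capacity
        return drive_tb * count / 2
    raise ValueError("unsupported RAID level")

# Two hypothetical 4-TB U.2 NVMe SSDs in one canister:
print(usable_capacity_tb(4, 2, 0))  # RAID 0: speed and capacity
print(usable_capacity_tb(4, 2, 1))  # RAID 1: redundancy
```

RAID 0 doubles throughput and capacity; RAID 1 trades half the capacity for a mirror that survives a single drive failure.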
This topology is significant because, in edge-based inferencing systems, storage is separated from the I/O that resides on the backplane, thereby maximizing performance. The "secret sauce" comes in balancing the available PCIe lanes to deliver the best performance, a design technique Premio draws from its portfolio of embedded and data-center computer architectures.
Other I/O that must be considered includes USB, COM interfaces, and even 5G. One good way to handle high-throughput I/O, both for the board supplier and the OEM, is through modular I/O daughterboards, which add flexibility. With this approach, a system can offer exactly the I/O needed and eliminate unnecessary I/O for application-specific workloads.
AI for ADAS Applications
One popular application today is ADAS, or advanced driver-assistance systems. These sophisticated systems power autonomous-vehicle applications through effective data collection and sharing, aiming to fuel ever more intelligent algorithms on the road to Level 5 autonomous driving.
Because this is clearly an edge-based application, the Edge Boost Nodes design team incorporated appropriate ruggedization features and thermal regulation. For example, system operators need to know the temperatures inside the box at all times. Here, the likely scenario includes a fan. As a power-consuming component, that fan should be powered on only when necessary.
Within the software development kit that Premio provides its customers, there's an app that lets them manage those fans, determining when they should be on, what speed they should run at, and so on. The software also provides a safety valve: It can suspend all I/O read operations from the various peripherals back to the CPU. This operation can also be carried out with a physical button and an LED indicator.
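Premio's SDK internals aren't detailed here, but the fan logic described above can be sketched as a simple hysteresis controller that keeps the fan off until it's actually needed. The thresholds and temperature profile below are illustrative assumptions, not Premio's API:

```python
# Hypothetical hysteresis fan control: the fan powers on only when
# necessary, then stays on until the enclosure has cooled back down.
FAN_ON_C = 70.0    # turn the fan on at or above this temperature
FAN_OFF_C = 60.0   # turn it back off once cooled to this point

def next_fan_state(temp_c, fan_on):
    """Return the fan's next on/off state for the current temperature."""
    if not fan_on and temp_c >= FAN_ON_C:
        return True
    if fan_on and temp_c <= FAN_OFF_C:
        return False
    return fan_on  # inside the hysteresis band: keep the current state

# Walk a rising-then-falling temperature profile (°C):
state = False
for t in (55, 65, 72, 68, 59):
    state = next_fan_state(t, state)
    print(t, state)
```

The gap between the two thresholds prevents the fan from rapidly toggling when the temperature hovers near a single set point.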
Maximizing Power Efficiency
Power efficiency is a top priority for edge-based inference engines. System designers recognize the need to place processing power closer to the IoT sensors. An immediate solution is to add performance accelerators, typically in the form of GPUs, NVMe storage, and M.2 accelerators. The tradeoff in this design strategy is that each is a power-hungry component that must fit within the power-versus-performance budget. Offloading these tasks to the Edge Boost Nodes increases processing throughput and reduces the load on the host processor, which is isolated behind a ruggedized wide-range power input of 9 to 48 V DC. A unique feature of the modular Edge Boost Node is that it provides power stability for the performance-acceleration modules (NVMe SSDs, GPUs, or M.2 accelerators) in the most demanding edge workloads, where reliability is of utmost importance.
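The power-versus-performance budgeting described above amounts to simple arithmetic: sum the component draws and keep the total under a derated fraction of the available supply. The wattage figures here are hypothetical, not Premio specifications:

```python
# Hypothetical power-budget check; component draws (watts) are
# illustrative examples only.
COMPONENT_DRAW_W = {
    "host_cpu": 35,
    "gpu": 75,
    "nvme_ssds": 16,
    "m2_accelerator": 15,
}

def within_budget(budget_w, margin=0.8):
    """Keep total draw under a derated fraction of the supply budget."""
    total = sum(COMPONENT_DRAW_W.values())
    return total, total <= budget_w * margin

total, ok = within_budget(180)
print(total, ok)  # 141 W against a derated 144-W ceiling
```

The 20% derating margin is a common engineering convention that leaves headroom for transient spikes and supply tolerance.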
Because edge-based inference engines generate immense amounts of data, storage is key. The Edge Boost Nodes include a 6-Gbit/s SATA interface that can connect four drives (two internal and two external). Yet incorporating NVMe drives is a potential game-changer for this application. In this case, it's handled through a maximum of four 2.5-in., 15-mm drives, with another option of eight 2.5-in., 7-mm drives.
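The arithmetic behind NVMe being a "game-changer" is straightforward. SATA III tops out at 6 Gbit/s with 8b/10b encoding, while a U.2 NVMe drive on an assumed PCIe Gen3 x4 link (typical for these drives) uses 128b/130b encoding across four lanes:

```python
# Rough peak line-rate comparison, before protocol overhead.
SATA3_GBITS = 6          # SATA III: 6 Gbit/s with 8b/10b encoding
PCIE3_GTS_PER_LANE = 8   # PCIe Gen3: 8 GT/s per lane, 128b/130b encoding
LANES = 4                # assumed U.2 NVMe link width (x4)

# 8b/10b means 10 line bits carry one data byte.
sata_mb_s = SATA3_GBITS * 1000 / 10
# 128b/130b: 130 line bits carry 128 data bits.
nvme_mb_s = PCIE3_GTS_PER_LANE * 1000 * LANES * (128 / 130) / 8

print(round(sata_mb_s), round(nvme_mb_s))  # roughly 600 vs. 3938 MB/s
```

Even before queue-depth advantages, a single NVMe drive offers more than six times the raw bandwidth of the SATA interface.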
(Image credit: Premio Inc.)
And while on-board storage is vital, so is the ability to reconcile data with the cloud. Here, that process is handled through standard Gigabit Ethernet or a 10-Gbit/s module. Wi-Fi and cellular LTE are also options, depending on the application and the environment. Because the design offers flexible I/O daughterboards, users can even integrate a 5G daughterboard module to enable ultra-low-latency connectivity as 5G rolls out.
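Reconciling edge data with the cloud over an intermittent link is commonly built as store-and-forward: buffer records locally, then drain the buffer whenever the uplink is available. A minimal sketch; the `record`/`sync` functions and the `upload` stub are hypothetical, not Premio's API:

```python
import json
from collections import deque

# Store-and-forward buffer: records accumulate locally until the uplink
# (Ethernet, Wi-Fi, LTE, or 5G) is available, then drain to the cloud.
buffer = deque()

def record(sample):
    """Queue a sensor sample locally as a serialized record."""
    buffer.append(json.dumps(sample))

def sync(upload, link_up):
    """Drain the buffer through `upload` while the link is up."""
    sent = 0
    while buffer and link_up():
        upload(buffer.popleft())
        sent += 1
    return sent

record({"temp_c": 41.5})
record({"temp_c": 42.1})
sent = sync(upload=lambda msg: None, link_up=lambda: True)
print(sent)  # both buffered records were forwarded
```

In a production system the buffer would live on the NVMe storage rather than in memory, so data survives a power cycle while the link is down.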
Security and Next-Gen Upgradability
While any industrial platform must contain proper security measures, pushing system performance out to the edge makes security even more critical. Premio builds on accepted industry standards like TPM 2.0 to encrypt data. Then there's the physical side to address: somebody literally stealing the system. To combat this, the NVMe drives on the Edge Boost Nodes sit in locking drive cages.
The modularity of the Edge Boost Nodes makes them inherently upgradable: simply swap out a module for a higher-performing version when it becomes available. While this feature may slightly increase the bill of materials (BOM), it protects long-term investments by extending the system's active life. On the software front, remote upgrades are possible in the field over the LAN, and can be done with confidence thanks to built-in security features. As the industry moves to a cloud-native upgrade path, this has become the method of choice. As long as the system remains "containerized," security concerns are addressed and well-managed, both for the life of the system and in a constantly evolving landscape of digital security threats.
Premio recognizes that its localized manufacturing gives the company a leg up on the competition. All assembly takes place at its facility in Los Angeles, regardless of order size. That removes the potential burden of an overseas supply chain, accelerating time to market and streamlining deployment so customers can get up and running quickly at enterprise scale.