Hardware Edition: Ins and Outs of Designing an AI Edge Inference Computer

 Watch on Premio's Rugged Edge Media Hub

Key Takeaways from the Podcast:

  • AI Edge Inference Computers provide a new level of computing capabilities
  • AI Edge Inference Computers deliver real-time inference through hardware acceleration that enables machine learning and intelligence
  • The future of computing lies in intelligent computers being able to adapt to changing environments. New demands are being shaped by four key technology superpowers:
    • Ubiquitous Compute
    • Cloud-to-edge infrastructure
    • Pervasive connectivity
    • Artificial intelligence

Q: What is an AI Edge Inference Computer?

A: The reason I wanted to use Pat Gelsinger's (Intel) four technology superpowers as a starting point is that these are the key principles he mentions that shape the overall design of an AI inference computer. The four technology superpowers are important because they are at the forefront of industries going through a digital transformation, and a lot of that goes into designing for better performance and better power efficiency. Looking at the overall thermal mechanics to develop a system that is power efficient is the reason we support rugged edge computing in remote and mobile environments. So in our definition, it's very fitting to call it an AI edge inference computer, because a lot of future workloads really come down to developing a hardware solution that leverages many different technologies to interact with these new AI workloads.

Essentially, what we're doing, and why it's so important, is that these AI edge inference computers are moving into a market we define as rugged edge computing. In this market there's a convergence of technologies: compute processing, high-performance storage, and low-latency wireless connectivity. But none of that works in an environment that is harsh and unstable, so you really need to understand how to ruggedize and harden the product for it to be reliable and survive in harsh environments. How we do that is by looking at the overall system-level architecture, and that's where our engineering comes into play, because we're able to balance the necessary compute power against the overall power efficiency in order to deploy for a lot of these newer applications that sit close to IoT sensors. When you have this type of technology in close proximity to where data is being generated, you can now start to do the decision making there. You have the ability to interact with and manipulate that data, with efficient intelligence and the cognitive ability that's written into these new software algorithms.

Q: How does an AI Edge Inference Computer differ from other computer platforms?

A: Yeah, so there are a few key differentiators I'll go into, but I really want to explain the design realization behind this product that we're pushing into the market. The reason we're able to do it is Premio's value in the engineering and design of compute architectures for the past 30+ years. We have two different design teams: one is strictly focused on embedded/edge solutions, and the other is dedicated to high-performance data center solutions. For over 20 years, the embedded design team has been putting together hardware solutions that are low power, usually passively cooled, fit for very harsh environments, and that offer reliability, endurance, and I/O variety. On the flip side, you have the high-performance data center type of design, where you control maximum resources in a central location, so the ultimate goal is to maximize performance by utilizing high-performance hardware. In summary, when you take those two design principles together, you get a middle ground, and that balance is what we decided on in our product design. The AI Edge Inference Computer is a balance of two design principles, a bit of embedded expertise combined with high performance, consolidated into a single product that we believe fits well for a lot of the data being processed at the edge in real time.

To dive into some of the key differences, what makes our edge inference computer different is how we made it modular. A lot of the solutions out there are a single-piece design where, in order to make any changes, you actually have to wait for the next generation. One key way we offer flexibility to our customers is by making certain elements of the computer modular. The top portion, which we're calling our top node, uses our traditional x86 fanless industrial-grade design. That gives you the processing and all the I/O connectivity to interact with these IoT sensors. The key differentiator in this specific AI Edge Inference Computer is in the bottom nodes, which we call our EDGEBoost nodes. These EDGEBoost nodes are scalable in the sense that they offer customers flexibility, based on their application workload, to deliver performance acceleration. Performance acceleration is extremely important when it comes to inference, because that is where you leverage the hardware with the software you've written to deliver real-time inference based on machine intelligence or, you know, artificial intelligence.

Q: What applications is an AI Edge Inference Computer used for?

A: Yeah so, the main benefit of having an AI Edge Inference Computer is exactly that paradigm shift of being able to interact with data, the shift away from centralized computing to more localized compute, where data can be acted on in real time. So really, I believe the best applications for an AI Edge Inference Computer are focused on that keyword of inference. Inference is different from training a deep learning model: with inference, the AI has already been trained to a level of intelligence and cognitive ability, so it's able to look at the data, interact with it, and decipher it very quickly, rather than using that data to learn anything. It's a decision-making principle, whereas if you're using the data to train a deep learning model, the datasets have to be quite large and you need a lot of horsepower to actually train the machine to reach some level of intelligence (a short code sketch after the application list below illustrates the inference step). So when I talk about inference, some of the most interesting use cases include:

  • Computer vision, industrial automation, and robotics

With robotics, we have very interesting customers that are leveraging our AI Edge Inference Computer for their computer vision applications. Essentially, they are connecting all these IoT sensors, which can be cameras on some type of automation line. And you can imagine, with the current labor shortage, a lot of the pain points involve using automation to solve short-staffing issues in places like logistics warehouses, and a lot of these newer applications are really trying to use computer vision to address that. So that means recognizing different objects coming off the line with computer vision, object detection, and image detection. We even have customers using AI Edge Inference Computers to navigate autonomous vehicles inside these logistics facilities, whether that be autonomous forklifts or other types of autonomous guided vehicles that help move things throughout the automation facility.

  • Security and Surveillance

A second application would be security and surveillance. That is actually a very good example of edge-level data interaction, right? The inference is able to do object detection or pose detection directly on the feeds coming in from the cameras.

  • Advanced Driver Assistance Systems (ADAS & Autonomous Vehicles)

Another major application where we're starting to see growth is ADAS and autonomous vehicles. In this environment, you can imagine you need extremely fast decision making, because vehicles on the road need to be able to not only detect and decipher their environment, but also recognize and pivot based on where the car is going. In addition, these AI inference computers pull the data being generated in these test vehicles and move it back into the cloud to do a bit more deep learning, because we're definitely not at the level of full autonomous driving. There are a lot of applications still trying to pull data to train and make these models a lot smarter, and these edge inference computers are able to leverage a lot of the technologies we're putting into the product itself to do that.
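To make that inference-versus-training distinction concrete, here is a minimal sketch of what the inference step can look like on an edge device. It assumes ONNX Runtime and OpenCV as one possible software stack; the model file, input size, and camera index are hypothetical placeholders and are not tied to any specific Premio product or SDK.

```python
# Minimal edge-inference sketch (hypothetical model and preprocessing; assumes the
# onnxruntime, opencv-python, and numpy packages are installed). Training happens
# elsewhere; the edge device only loads an already-trained model and runs predictions.
import cv2
import numpy as np
import onnxruntime as ort

# Load a pre-trained model exported to ONNX (placeholder file name).
session = ort.InferenceSession("trained_model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def infer(frame_bgr: np.ndarray) -> int:
    """Preprocess one camera frame and return the predicted class index."""
    resized = cv2.resize(frame_bgr, (224, 224))            # match the model's input size
    chw = resized.astype(np.float32).transpose(2, 0, 1)    # HWC -> CHW
    batch = np.expand_dims(chw / 255.0, axis=0)            # add batch dim, scale to [0, 1]
    logits = session.run(None, {input_name: batch})[0]
    return int(np.argmax(logits))

# Pull a frame from a camera (e.g., an IoT sensor on an automation line) and act on it.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print("predicted class index:", infer(frame))
cap.release()
```

The point is that no training happens on the device: the heavy lifting was done ahead of time, and the edge computer only executes the trained model against live sensor data.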

Q: How do you see the rugged edge being part of what is shaping AI edge inference computing? Do we need to strategize around the edge versus the rugged edge differently in this scenario?

A: Yeah so, the ruggedized version of the product is very important for AI workloads that are usually in areas that are remote and mobile. When you look at the overall system, reliability and durability are extremely important. Being able to deploy products across wide temperature ranges, or in environments dealing with water or power instability, those defining environmental challenges are things we need to design for in the overall system, because we understand some of our customers' applications are in these extreme environments. For example, an underground mining environment where safety is a major factor and where autonomous applications are used to navigate or map out the whole environment. Another would be oil and gas, which is oftentimes quite dirty, rugged, and harsh, and you need something that's going to be continuously reliable as well. So really, this question dives into how we ensure the product maintains its reliability 24/7 in these environments, and how we do that is by testing the product to its extremes with our test equipment.

Learn more about the Rugged Edge and download our ebook

Q: What storage options are available today? What's sort of the standard that you recommend? And are there trade-offs associated with each kind of storage option?

A: Yeah so, this is a good question, specifically as a differentiator for our AI inference computer. If you look at storage as a whole, it's very important because overall capacity is key, but it's quite evident that as you move further away from a data center, you have fewer and fewer resources for large amounts of storage, so you need to be able to deliver different storage protocols, or different elements of storage, in a localized box that can still deliver the required performance. If you look at the storage media itself, the growth in technology is quite evident because you're starting to see more NAND flash layers, whether in MLC or TLC capacities, and by adding these layers you're increasing the overall capacity. But there is also a level of endurance required. As you move out to the edge, one of the major bottlenecks traditionally in a lot of this embedded compute is that you were stuck with the SATA protocol, and SATA itself is limited to 6 Gbit/s read and write speeds. What we're doing differently, and what we're actually bringing to the market first in our AI Edge Inference Computer, is the ability to have high-capacity NVMe storage in our canister bricks. It's not that we're introducing NVMe into embedded computers first; what we're doing differently is putting NVMe into hot-swappable 2.5-inch drive trays, whether it be M.2 or U.2, so you're now able to deliver NVMe with incredibly high read and write speeds as well as high IOPS. That now allows the computer to interact with that data extremely fast. You're able to store that data, but most importantly, you can take a lot of the data stored on the computer and move it off that device very quickly, into another environment, say the cloud, where you can do a bit more of that machine learning.
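As a rough way to see the SATA-versus-NVMe difference in practice, the sketch below measures approximate sequential write throughput from Python. It is only an illustration with a hypothetical file path; a real benchmark would use a dedicated tool such as fio with direct I/O and varied queue depths.

```python
# Rough sequential-write throughput check (illustrative only).
import os
import time

TARGET = "testfile.bin"          # hypothetical path on the drive under test
CHUNK = 4 * 1024 * 1024          # write in 4 MiB chunks
TOTAL = 1024 * 1024 * 1024       # write 1 GiB in total

buf = os.urandom(CHUNK)
start = time.perf_counter()
with open(TARGET, "wb") as f:
    for _ in range(TOTAL // CHUNK):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())         # force the data out to the device before timing stops
elapsed = time.perf_counter() - start

print(f"~{TOTAL / (1024 * 1024) / elapsed:.0f} MB/s sequential write")
# A SATA III (6 Gbit/s) drive tops out around ~550 MB/s after protocol overhead,
# while NVMe drives over PCIe typically report several times that figure.
os.remove(TARGET)
```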

Learn more about NVMe in AI Edge Inference computers and download our whitepaper for benchmark tests

Q: Why is power efficiency a key specification in an AI Edge Inference Computer?

A: Yeah so, it's all about being able to deliver the right amount of power within the performance budget. On one side of the design, it's really easy to put together all these different powerful components, but if the deployed system uses a lot of power, it's not efficient, and it's actually not going to be the best solution for the customer or the application. So what we need to do is be very selective and evaluate components that really meet the performance benchmark but are still power efficient. And why it's so important to be power efficient, again, is that when you're at the edge you don't have the resources, you don't have reliable power constantly there, so you need to find ways to manage power efficiency as a whole. One major way we're able to improve power efficiency is through the cooling method. A major element we try to put into a lot of our ruggedized computing designs is passive cooling, which often results in a fanless design that uses the thermal mechanics of heatsinks to dissipate heat away from the critical components. By removing the active fan, you remove a power-drawing element from the design.

What's also important is understanding the environment the product is going into once you have that balance of performance and power. You also need to be able to support the product in an environment where the power input can fluctuate quite widely. So what we've done is maintain our rugged, wide power inputs. Like I mentioned, in the top node you're still able to take a wide power input from 9 to 50 volts DC. What's interestingly different in the inference computer is that even on the bottom node, when you have high-performance GPUs, storage, and M.2 accelerators that naturally use a lot of power, we're still able to deliver a secondary power source with a wide range from 12 to 48 volts DC. Additionally, when using high-performance accelerators like GPUs and NVMe storage, the bottom node includes a hot-swappable, high-RPM fan dedicated solely to the EDGEBoost nodes and their accelerators, because when you include these types of performance accelerators, you need the ability to cool them. So it's that balance I mentioned earlier: understanding low-power efficiency, but still incorporating the high-performance data center type of technologies, and meeting in the middle to provide a pretty robust AI Edge Inference Computer.
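As a simple illustration of the performance-to-power balance described above, performance per watt is the basic ratio that guides component selection. All numbers in this sketch are hypothetical and are not measured benchmarks of any particular accelerator.

```python
# Hypothetical performance-per-watt comparison (illustrative numbers only).
candidates = {
    "low-power M.2 accelerator": {"tops": 8.0,  "watts": 5.0},
    "mid-range GPU module":      {"tops": 40.0, "watts": 60.0},
}

for name, spec in candidates.items():
    tops_per_watt = spec["tops"] / spec["watts"]
    print(f"{name}: {tops_per_watt:.2f} TOPS/W")

# Whether the extra absolute performance justifies the extra power draw depends on the
# inference workload and on how much power is actually available at the edge site.
```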

Q: But what should our audience be asking themselves as they maneuver decisions around the right power specs and power efficiency specs to meet their needs?

A: Yeah, so it's about understanding, I think, their application and what they're going to be using it for. Again, the computer itself is extremely rugged, to where it can support a lot of major applications. Because of our knowledge of hardware engineering, we've done our best, based on our benchmarks and tests, to deliver a product and solution with that performance-to-power ratio. So you can almost guarantee that we've selected the best type of component, the best type of processing, and the best type of performance accelerator in a low-power, efficient model, to where it's actually going to be the best bang for the total cost of the overall solution.

Discover 6 Steps in Building a Fanless Industrial PC

Q: Now let's talk upgrades. This is critically important because, as you laid out in your introduction, this industry is evolving constantly. How can upgrades occur in an AI Edge Inference Computer, and do you have any key strategies?

A: Yeah, good question. From the embedded side of the market, how we've looked at next-generation upgrades has always been very strategic. Unlike the consumer side, which chases high performance and the latest and greatest, the embedded market is actually a few generations behind. The reason is to ensure the customer can leverage the necessary compute power without moving too quickly to a new solution or generation where there could potentially be bugs early in deployment. With that challenge on the embedded side, what we're always trying to look at is how to come up with a timely solution that allows the customer to scale and stay competitive in their application, whether that be in processing, memory, storage, or connectivity. But like you mentioned, technology continues to move so fast, and that is a major challenge our customers are dealing with; they're always looking for the latest and greatest in their deployments. One of the major elements in embedded computing design that has always been helpful is finding a way to make it modular. The idea of making things modular and allowing the customer to upgrade very quickly is key. So what we did differently in this AI Edge Inference Computer is we made the top compute node modular, and we have performance acceleration modules, our EDGEBoost nodes, that can mix and match. A very simple example: the current generation of the computer is on an Intel 9th Gen Core CPU. With the next generation we release, it really is a quick swap of the top node, and you can still leverage all the performance acceleration through the EDGEBoost nodes. That is going to continue as we roll out next-generation designs. The benefit is that the customer in the field can now make that swap extremely efficiently thanks to the modularity, rather than replacing the whole entire unit.

Q: Are there any key challenges that you think an OEM may come across that they should keep in mind when integrating an AI Edge Inference Computer? And how do you get past those challenges? Why does that matter in our larger conversation?

A: So, strictly speaking from a hardware perspective, I think one of the greatest challenges for a lot of the OEM customers we're dealing with is making sure their software application is ported onto the hardware in a very streamlined manner. I would say it's a challenge for us because we don't always know exactly what the customer requires for their software. Essentially, our goal is to put together a robust, reliable hardware solution that supports a baseline of Windows or Linux, so the customer can take it and integrate it quickly and seamlessly into their deployment use cases. Another interesting feature of this product is that we really try to do everything possible to make integrating the customer's software application very seamless. We've also developed a software development kit that interacts with the parts of the hardware that are very important: everything from fan speed to ambient temperature is programmable through an API in the software development kit we offer customers. Within the development kit, one thing that's also very important is that we've included programmable logic that allows the NVMe canister bricks to suspend their I/O operations at the click of a button, and this can be programmed based on the customer's application. The overall function of this, and why it's so important, is that while there's a lot of input and output flowing through the computer, if you need to eject the physical NVMe canister brick, we can suspend any I/O operations with the click of a button. This feature retains the stored data and also prevents the data corruption that can occur during reads and writes.
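As a purely hypothetical illustration of how an application might consume that kind of hardware SDK, the pattern usually comes down to reading sensors, adjusting fan behavior, and quiescing storage before removal. The module, function, and parameter names below are invented for the sketch and are not Premio's actual API.

```python
# Hypothetical usage pattern for a hardware-control SDK (module name, functions,
# and parameters are invented placeholders, not an actual vendor API).
import time

import edge_sdk  # placeholder for a vendor-provided hardware SDK module

ctrl = edge_sdk.open_device()                 # acquire a handle to the system controller

# Read ambient temperature and adjust the accelerator-bay fan accordingly.
temp_c = ctrl.read_ambient_temperature()
if temp_c > 45.0:
    ctrl.set_fan_duty_cycle(percent=100)      # run the hot-swap fan at full speed
else:
    ctrl.set_fan_duty_cycle(percent=60)

# Before physically ejecting an NVMe canister brick, suspend its I/O so that
# in-flight reads and writes complete and no data corruption occurs.
ctrl.suspend_canister_io(canister_id=0)
time.sleep(1.0)                               # allow outstanding operations to drain
print("canister 0 is safe to remove")

ctrl.close()
```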

Q: How do Premio’s offerings in the AI edge compute world differ from the rest of the market and what is particularly practical and applicable about those differences based on everything we just laid out?

A: I think one of the defining features of our competitive difference as a manufacturer is that we're always trying to find innovative ways to help our customers be as flexible and agile as possible in their overall deployments. I think it's been quite evident throughout this conversation that in our new AI Edge Inference Computer we've made a pivot toward overall modularity of the design, and we've introduced our EDGEBoost nodes that are purpose-built for the edge in terms of performance acceleration, whether that's GPU acceleration, NVMe storage, or M.2 accelerators. That really allows our customers to leverage a purpose-built solution directly fit for these newer applications at the edge. Ultimately, what we've done is looked at some of the biggest challenges from the hardware side and solved them in an off-the-shelf solution. From a compute standpoint, we've really balanced the performance, looked at the overall reliability, employed data security, and leveraged an overall power budget in a new, sophisticated industrial design that now allows these applications to be successful in the environments they are deployed in.

Q: In summary, any actionable tips for our audience to begin to make sense of everything we laid out and turn that into some immediate steps for either upgrading or beginning to decide their design specs for an AI Edge Inference Computer?

A: I think the key takeaway is that as this market continues to grow, we need to be able to leverage an ecosystem for success and growth. That ecosystem relies heavily on many different types of innovators, from semiconductor vendors to ecosystem technology partners, who really push the overall solution through to full deployment for the end user. So really understanding the engineering capabilities, the manufacturing of the product, and pushing the computer to full deployment is what Premio is constantly trying to solve for a lot of our customers and their applications.