At the heart of any electronic device is a processor (or processing unit) that controls it. There are several flavors of processors, each designed with different purposes in mind. The following are the most prevalent types:
- CPU (central processing unit)
- GPU (graphics processing unit)
- FPGA (field-programmable gate array)
- ASIC (application-specific integrated circuit)
A CPU is the standard, general-purpose processor. It has one or more cores, each of which processes information sequentially. CPUs are used in ordinary computers and are the default option for any application that doesn’t require high bandwidth or very efficient use of resources.
GPUs were created to handle large amounts of graphical throughput, although they are used for other applications with a similar need to process a lot of data simultaneously. A GPU consists of thousands of processor cores to enable significant parallel processing. However, in applications that cannot provide a near-constant stream of high-bandwidth data, too many cores sit idle, resulting in an inefficient solution that wastes power and can cause high latency.
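The idle-core effect can be illustrated with a deliberately simple model. The sketch below assumes that work is spread evenly across cores and that every core has the same fixed throughput; the function name, core count, and rates are hypothetical and are not taken from any real GPU.

```python
def idle_core_fraction(offered_items_per_s: float,
                       cores: int,
                       items_per_core_per_s: float) -> float:
    """Fraction of cores with no work in a simple evenly-spread-load model."""
    # Total work the device could absorb if every core were busy.
    capacity = cores * items_per_core_per_s
    # Utilization cannot exceed 100% even if more work is offered.
    utilization = min(1.0, offered_items_per_s / capacity)
    return 1.0 - utilization


# Hypothetical device: 4,096 cores, each handling 1 million items per second.
CORES = 4096
PER_CORE_RATE = 1_000_000.0

for offered in (4.096e9, 1.024e9, 4.096e7):
    idle = idle_core_fraction(offered, CORES, PER_CORE_RATE)
    print(f"offered {offered:.3e} items/s -> {idle:.0%} of cores idle")
```

In this model the device is fully busy only when the input stream matches its total capacity. At a quarter of that rate, three quarters of the cores have nothing to do.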
FPGAs are the most versatile of these processors. At their simplest, they are processors that can be programmed at the hardware level to fit the application’s needs. In fact, FPGAs can be reprogrammed in the field as well (hence the “field-programmable” part of their name). This is a big advantage in any application where the need for changes is anticipated – for example, in new applications where there are evolving standards and requirements. Like GPUs, FPGAs also employ parallel processing, but without the penalty for periods of low throughput.
ASICs are the opposite of FPGAs in terms of programmability. They are fully optimized for a particular application and cannot be changed once produced. For applications with large enough scale and relatively unchanging requirements, this lack of flexibility and the high cost of designing and producing ASICs are deemed worthwhile because the end result is a processor that is tailor-made for the intended use.
Processor Components
When it comes to processors of any kind – whether CPUs, GPUs, FPGAs, or ASICs – there are several ways to quantify their performance. At the risk of oversimplification, let’s examine just three:
- Memory capacity – how much information can be stored on the chip
- Compute – how well (fast) information can be processed within the chip
- Bandwidth:
  - I/O bandwidth – how quickly information can be ported in and out of the chip
  - Memory bandwidth – how quickly the chip can read and write memory
While all of these features are fundamental and exist in every type of processor, different applications have different priorities, and a processor serves an application best when its design emphasizes the aspect that matters most to that application.
In applications that require a lot of compute power, such as weather modeling, simulations, and semiconductor design, high-compute CPUs or GPUs work best. In other cases, there may be minimal computational requirements, but large memory capacity is needed.
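To make the trade-off between these metrics concrete, here is a minimal sketch that estimates which resource limits a given workload. It assumes a simplified model in which the slowest of compute, memory traffic, and I/O traffic sets the pace, and in which a working set larger than the on-chip memory is a hard limit. All names and numbers (`Chip`, `Workload`, `bottleneck`, and the example figures) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    compute_ops_per_s: float      # peak operations per second
    memory_bytes_per_s: float     # memory bandwidth
    io_bytes_per_s: float         # I/O bandwidth
    memory_capacity_bytes: float  # on-chip memory capacity


@dataclass
class Workload:
    ops: float                # total operations to perform
    memory_bytes: float       # bytes read from or written to memory
    io_bytes: float           # bytes moved on and off the chip
    working_set_bytes: float  # data that must be resident at once


def bottleneck(chip: Chip, work: Workload) -> str:
    """Name the resource that limits the workload under the simple model."""
    # A working set that does not fit is treated as a hard capacity limit.
    if work.working_set_bytes > chip.memory_capacity_bytes:
        return "memory capacity"
    # Otherwise, the resource that needs the most time is the bottleneck.
    times = {
        "compute": work.ops / chip.compute_ops_per_s,
        "memory bandwidth": work.memory_bytes / chip.memory_bytes_per_s,
        "I/O bandwidth": work.io_bytes / chip.io_bytes_per_s,
    }
    return max(times, key=times.get)


# Hypothetical chip and workloads (illustrative numbers only).
CHIP = Chip(compute_ops_per_s=1e12, memory_bytes_per_s=100e9,
            io_bytes_per_s=10e9, memory_capacity_bytes=16e6)
SIMULATION = Workload(ops=1e13, memory_bytes=1e10, io_bytes=1e8,
                      working_set_bytes=1e6)
PACKET_FORWARDING = Workload(ops=1e9, memory_bytes=1e10, io_bytes=1e11,
                             working_set_bytes=1e6)
LARGE_TABLE_LOOKUP = Workload(ops=1e9, memory_bytes=1e9, io_bytes=1e8,
                              working_set_bytes=1e9)

for name, work in [("simulation", SIMULATION),
                   ("packet forwarding", PACKET_FORWARDING),
                   ("large table lookup", LARGE_TABLE_LOOKUP)]:
    print(f"{name}: limited by {bottleneck(CHIP, work)}")
```

With these made-up figures, the simulation is limited by compute, the packet-forwarding workload by I/O bandwidth, and the table lookup by memory capacity. The same chip can therefore be a good or poor fit depending on what the application stresses.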
FPGAs in Networking Applications
When it comes to edge networking and other low-latency applications, FPGAs are the most attractive option. The parallel processing trait of FPGAs enables them to handle the complex networking functionalities of a telecom network with competitive performance, while allowing for future changes since FPGAs can be reprogrammed in the field.
Timothy Prickett Morgan made a similar argument on The Next Platform in his in-depth overview of the upcoming Xilinx Versal high-bandwidth memory (HBM) device. He explained that “many latency sensitive workloads in the networking, aerospace and defense, telecom, and financial services industries simply cannot get the job done [without HBM devices].” He also quoted a Xilinx senior product line manager who pointed out that CPU-based HBM devices do not include a hardware switch, which means they are obligated to cannibalize some of the internal software to achieve this. (A switch is necessary to connect all ports to all sections of the internal memory.) With an FPGA, some of the hardware logic can be designated as a complete working switch without relying on software, which lowers the power consumption and latency of the device.
At Ethernity, we are stalwart proponents of the idea that FPGAs are the perfect building block for networking cards and appliances. For over 18 years, we have developed and improved upon our proprietary FPGA flow processor technology to fully harness the power of FPGAs for networking. In tests we ran comparing software running on white-box servers (CPUs) to our FPGA-based accelerated solutions such as the ACE-NIC100 SmartNIC, we found that the FPGA-based solution takes up less overall space because it requires fewer cores, uses significantly less power, leading to much lower operational costs, and provides deterministic 100 Gbps performance with less than 3 microseconds of latency.
Thus, it remains true that FPGAs offer the performance of ASICs with the flexibility of software running on CPUs.