Highly anticipated: NextSilicon's hardware is beginning to move from laboratory testing to deployment in real-world systems, starting with installations at Sandia National Laboratories through the federal Vanguard-2 supercomputing program. Validation of its performance and efficiency claims beyond the controlled environments could position the company as a major new competitor in a field where energy use and scalability increasingly determine computing progress.
NextSilicon, an Israeli startup backed by more than $300 million in funding, is advancing its challenge to the dominance of Nvidia, Intel, and AMD in high-performance computing. The company announced that its new line of chips, including the Maverick-2 Intelligent Compute Accelerator and a RISC-V-based CPU code-named Arbel, is being tested by US national laboratories such as Sandia.
Engineers familiar with the program say the company's processors are under evaluation for use in systems performing highly demanding scientific computations, including nuclear weapons modeling.
The Maverick-2, now shipping to select customers, is built on TSMC's 5-nanometer process and includes either 96 or 192 gigabytes of HBM3E memory, depending on configuration. The single-die version operates at 400 watts, while the dual-die model draws 750 watts and supports air or liquid cooling.
According to internal company benchmarks, the chip delivers up to 4x the FP64 performance per watt of Nvidia's HGX B200 and 20x the efficiency of Intel's Xeon Sapphire Rapids CPUs.
Unlike traditional Von Neumann processors, the Maverick-2 uses a dataflow architecture to eliminate control overhead. Each arithmetic logic unit is interconnected in a graph, automatically triggering computation as data moves through the network.
Company founder Elad Raz said the chip reconfigures itself in real time, identifying frequently executed code paths and optimizing hardware on the fly. The firm claims this adaptability provides roughly ten times the computational output of Nvidia's latest GPU series while consuming about 60 percent of the power.
The Maverick-2 employs what NextSilicon calls Intelligent Compute Architecture, a blend of adaptive software algorithms and reconfigurable hardware. By moving many control tasks traditionally handled by hardware to an intelligent software layer, the design frees more silicon for computation. This model allows the chip to run high-performance computing and AI code written in C, C++, Python, and Fortran without rewriting existing software.
In tests shared with partners, Maverick-2 achieved 32.6 giga-updates per second – 22 times faster than CPUs and roughly six times faster than GPUs – while consuming 460 watts. In High-Performance Conjugate Gradients workloads, it produced 600 gigaflops at 750 watts, which the company says is comparable to top GPUs but at roughly half the energy draw.
Sandia National Laboratories, which has worked with NextSilicon for more than four years, said preliminary test results "show real promise for advancing computational capabilities without the overhead of extensive code modifications," according to James H. Laros III, a senior scientist and Vanguard program lead.
Alongside the accelerator, NextSilicon detailed its first central processing unit, the Arbel. Built on the same 5-nanometer node, it employs the open RISC-V instruction set rivaling architectures from Intel and AMD.
Arbel features a 10-wide instruction pipeline running at 2.5 gigahertz, a 480-entry reorder buffer, and four 128-bit vector units designed for parallel data processing. The company said it can execute up to sixteen scalar instructions per cycle and integrates a shared L3 cache to reduce latency. While still in testing, people familiar with development said Arbel was designed to pair tightly with Maverick-2 accelerators in future systems.
Industry analysts point out that the RISC-V and dataflow combination could give researchers more flexibility than proprietary CPU-GPU pairings, such as those common in Nvidia's high-end systems. Steve Conway, an analyst with Intersect360 Research, described the approach as addressing persistent inefficiencies in high-performance computing caused by high-latency pipelines and power waste.

