RecAccel™ N3000 DLRM Inference Accelerator

The Neuchips RecAccel™ N3000 accelerates deep learning recommendation models (DLRM) by combining patented FFP8 quantization with a purpose-built hardware architecture. This PCIe Gen5 solution delivers over 20M inferences per second at 20 watts and maintains >99.95% of FP32 accuracy, with 1.7x better performance per watt in MLPerf™ v3.0 testing. Built on the TSMC 7nm process and supported by a comprehensive software ecosystem, the N3000 targets AI acceleration in modern data centers.

Overview

Revolutionary AI Acceleration for DLRM

The Neuchips RecAccel N3000 is a specialized AI accelerator for deep learning recommendation models (DLRM). Its core advantage is patented FFP8 quantization, which maintains >99.95% of FP32 accuracy while improving efficiency. Built on the TSMC 7nm process, the chip delivers over 20M inferences per second at just 20 watts.

From a hardware perspective, the RecAccel N3000 comes as a dual-slot PCI Express Gen5 card with an architecture custom-designed for recommendation inference workloads. It shows near-linear performance scaling across multiple cards and delivers 1.7x better performance per watt than alternatives.

The software ecosystem supporting the N3000 is equally robust, featuring a comprehensive SDK that includes automated tooling for network deployment. Developers benefit from a suite of development tools including an FFP Quantizer, bit-true simulator, and performance estimator. Integration support is streamlined through included debugger and calibration tools, while proprietary recommender operators and configurable resource allocation ensure optimal performance.

Performance validation has been thoroughly demonstrated through MLPerf v3.0 benchmarks, where the system showed industry-leading results. The solution maintains 99.9% accuracy using its proprietary INT8 calibrator, and notably achieved nearly 100% linear scaling across multiple cards during testing.

In terms of sustainability and business impact, the N3000's optimized architecture keeps power consumption low, which reduces total cost of ownership. The system is designed for elastic data center environments and supports the sustainability initiatives of hyperscalers and cloud service providers (CSPs).

As organizations increasingly rely on recommendation systems for their operations, the RecAccel N3000 represents a significant advancement in AI recommendation acceleration. It successfully combines high accuracy, exceptional performance, and energy efficiency in a scalable solution that meets the demands of modern data center deployments.

Specification

RecAccel N3000 Specification
Embedded ARC HS48 Processors	CPU0 - Dual Core with MM Function; CPU1 - Dual Core
Embedded ARC EV72 Processors	Vector DSP FPU
Communication Interface	PCIe Gen 5 x8
Memory	16 LPDDR5-6400 Controllers; 512KB Shared Memory
AI Accelerators	Matrix Engines x 10 (Dynamic MLP Engine, DME); Vector Engines x 2 (Feature Cross, FX); Embedding Engine

Product Features

State-of-the-art Hardware Solution for AI Recommendation System Acceleration

Boost the power of your AI recommendation systems with the RecAccel™ N3000 PCIe card. This dual-slot PCI Express Gen5 card delivers the performance and reliability that demanding elastic data centers require. It is powered by the NEUCHIPS RecAccel™ N3000 series AI chip, which provides the acceleration needed to take your AI recommendation systems to the next level. Experience speed, performance, and reliability, and unlock the full potential of accelerated AI for your business.

Driven by Deep Understanding of Cloud Recommendation

The domain-specific architecture for cloud recommendation grew out of a close study of how compute-bound, latency-bound, memory-bound, energy-bound, and accuracy-bound requirements interact. The design combines algorithmic optimizations, hardware acceleration, data caching, and power management to deliver high performance and efficiency for cloud recommendation systems.

Industry Leading Results for MLPerf™ DLRM Inference Benchmarking

The RecAccel™ N3000 has proven itself an industry leader in both performance and power efficiency. In MLPerf™ v3.0, the RecAccel™ N3000 system delivered 1.7 times better performance per watt for DLRM inference while maintaining 99.9% accuracy with the help of its proprietary INT8 calibrator.

SW-HW Co-design Achieves Linear Multicard Scalability

During system testing in MLPerf™ v3.0, the RecAccel™ N3000 showed nearly 100% scaling efficiency across multiple cards. In other words, system throughput increased almost linearly as cards were added, so capacity can be expanded by adding cards.
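Scaling efficiency can be checked with simple arithmetic: divide the measured multi-card throughput by the single-card throughput multiplied by the number of cards. The sketch below is illustrative only. The function name and the throughput figures are hypothetical placeholders, not MLPerf™ results or part of the Neuchips SDK.

```python
def scaling_efficiency(single_card_qps: float, multi_card_qps: float, num_cards: int) -> float:
    """Measured multi-card throughput divided by ideal (linear) throughput."""
    if num_cards < 1 or single_card_qps <= 0:
        raise ValueError("need at least one card and a positive single-card throughput")
    return multi_card_qps / (single_card_qps * num_cards)


# Hypothetical measurements (queries per second), chosen only to illustrate the formula.
single = 1_000.0
measured = {1: 1_000.0, 2: 1_990.0, 4: 3_970.0, 8: 7_920.0}

for n, qps in measured.items():
    print(f"{n} card(s): {scaling_efficiency(single, qps, n):.1%} of linear")
```

A value close to 100% for every card count is what "near-linear scaling" means in practice.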

Key Benefits

Superior Energy Efficiency

Achieves 20 million inferences per second at just 20 watts
50% reduction in off-chip memory access
30% improvement in bandwidth utilization
Industry-leading power efficiency for DLRM workloads
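
The headline figure converts directly into energy per inference, because one watt is one joule per second. The short sketch below works through that arithmetic using the 20 million inferences per second and 20 watts quoted above. The function names are illustrative, not part of any Neuchips tool.

```python
def inferences_per_joule(inferences_per_second: float, watts: float) -> float:
    """Throughput divided by power. Since 1 W = 1 J/s, the result is inferences per joule."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return inferences_per_second / watts


def microjoules_per_inference(inferences_per_second: float, watts: float) -> float:
    """Energy spent on a single inference, in microjoules."""
    if inferences_per_second <= 0:
        raise ValueError("inferences_per_second must be positive")
    return 1e6 * watts / inferences_per_second


ips = 20_000_000   # quoted throughput
power = 20.0       # quoted power in watts

print(inferences_per_joule(ips, power))       # 1000000.0
print(microjoules_per_inference(ips, power))  # 1.0
```

At the quoted figures, this works out to one million inferences per joule, or about one microjoule per inference.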

Exceptional Accuracy

Achieves 99.97% of FP32 accuracy with INT8
Improves to 99.996% accuracy with FFP8
Maintains high accuracy while reducing power consumption
Optimized for production-grade recommendation models
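
For readers unfamiliar with quantization, the sketch below shows a generic symmetric INT8 scheme with max calibration, plus the ratio used to express a quantized model's accuracy as a percentage of FP32 accuracy. It is not the proprietary Neuchips INT8 calibrator or the FFP8 format, whose details are not described here. All function names and sample values are assumptions made for illustration.

```python
def calibrate_scale(samples, num_bits=8):
    """Max calibration: map the largest observed magnitude to the top of the integer range."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(x, scale, num_bits=8):
    """Round to the nearest integer step and clamp to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    return max(-qmax, min(qmax, round(x / scale)))


def dequantize(q, scale):
    """Map an integer code back to an approximate real value."""
    return q * scale


def relative_accuracy(quantized_metric, fp32_metric):
    """Quantized model metric as a percentage of the FP32 baseline metric."""
    return 100.0 * quantized_metric / fp32_metric


# Hypothetical calibration samples (for example, activations from a few batches).
samples = [-1.27, 0.5, 0.02, 1.0]
scale = calibrate_scale(samples)

for x in samples:
    q = quantize(x, scale)
    print(x, q, dequantize(q, scale))

# Hypothetical model metrics, only to show how a "% of FP32 accuracy" figure is formed.
print(relative_accuracy(quantized_metric=0.7997, fp32_metric=0.8000))
```

Within the calibrated range, the round-trip error of each value is at most half a quantization step, which is why the choice of scale during calibration matters for preserving accuracy.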

Flexible Deployment Options

Available as dual M.2 modules
Available as PCIe Gen 5 cards
Scalable for various data center configurations

Simplified Implementation

Automated optimization path
Easy integration with existing systems
Comprehensive development tools and support
