NVIDIA L4 Tensor Core GPU

The breakthrough universal accelerator for efficient video, AI, and graphics.

Accelerate Video, AI, and Graphics Workloads

The NVIDIA L4 Tensor Core GPU, powered by the NVIDIA Ada Lovelace architecture, delivers universal, energy-efficient acceleration for video, AI, visual computing, graphics, virtualization, and more. Packaged in a low-profile form factor, L4 is a cost-effective, energy-efficient solution for high throughput and low latency in every server, from the edge to the data center to the cloud.

Experience Real-Time AI Video Pipeline Performance

Transform video applications with the power of NVIDIA L4. Whether streaming live to millions of viewers, enabling users to build creative stories, or delivering immersive augmented and virtual reality (AR/VR) experiences, servers equipped with L4 allow hosting up to 1,040 concurrent AV1 video streams at 720p30 for mobile users.¹

With fourth-generation Tensor Cores and 1.5X larger GPU memory, NVIDIA L4 GPUs paired with the CV-CUDA® library take video content-understanding to a new level. L4 delivers 120X higher AI video performance than CPU-based solutions, letting enterprises gain real-time insights to personalize content, improve search relevance, detect objectionable content, and implement smart-space solutions.

1. Measured performance: 8x L4 AV1 low-latency P1 preset encode at 720p30.
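As a rough sanity check on footnote 1, the short Python sketch below divides the 1,040-stream total across the eight GPUs in the measured server. It assumes the streams are spread evenly across the GPUs, which the source does not state, and the function name is purely illustrative.

```python
def streams_per_gpu(total_streams: int, gpu_count: int) -> float:
    """Average concurrent streams per GPU, assuming an even split (an assumption)."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return total_streams / gpu_count


# Figures from the text: 1,040 AV1 streams at 720p30 on a server with 8x L4.
per_gpu = streams_per_gpu(1040, 8)
print(per_gpu)  # 130.0
```

Under that assumption, each L4 handles about 130 concurrent 720p30 AV1 streams in the measured configuration.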

Consume Less Energy and Space With L4

Generative AI for images and text makes customers' lives more convenient and experiences more immersive across all industries. NVIDIA L4 supercharges compute-intensive generative AI inference by delivering up to 2.5X higher performance compared to the previous GPU generation. And with 50 percent more memory capacity, L4 enables larger image generation, up to 1024x768, which wasn’t possible on the previous GPU generation.

Optimize Graphics Performance

Over 4X Higher Real-Time Rendering and Over 3X Higher Ray-Tracing Performance

Measured performance:
Real-time rendering: NVIDIA Omniverse™ performance for real-time rendering at 1080p and 4K with NVIDIA Deep Learning Super Sampling (DLSS) 3.
Ray tracing: Gaming performance geomean for AAA titles supporting ray tracing and DLSS 3.

With third-generation RT Cores and AI-powered NVIDIA Deep Learning Super Sampling 3 (DLSS 3), NVIDIA L4 delivers over 4X higher performance for AI-based avatars, NVIDIA Omniverse™ virtual worlds, cloud gaming, and virtual workstations. These capabilities enable creators to build real-time, cinematic-quality graphics and scenes for immersive visual experiences not possible with CPUs.

Accelerate Workloads Efficiently and Sustainably

NVIDIA L4 is an integral part of the NVIDIA data center platform. Built for video, AI, NVIDIA RTX™ virtual workstation (vWS), graphics, simulation, data science, and data analytics, the platform accelerates over 3,000 applications and is available everywhere at scale, from data center to edge to cloud, delivering both dramatic performance gains and energy-efficiency opportunities.

Optimized for mainstream deployments, L4 delivers a low-profile form factor operating in a 72W low-power envelope, making it an efficient, cost-effective solution for any server or cloud instance from NVIDIA’s partner ecosystem.
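To put the 72W envelope in context, the sketch below estimates the worst-case GPU power budget for the server sizes listed in the product specifications (1 to 8 GPUs). It counts GPUs only; CPU, memory, storage, and cooling overhead are not given in the source and are left out. The constant and function names are illustrative.

```python
L4_TDP_WATTS = 72  # max thermal design power per GPU, from the text


def gpu_power_budget(gpu_count: int, tdp_watts: int = L4_TDP_WATTS) -> int:
    """Upper bound on combined GPU power draw in watts (GPUs only)."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return gpu_count * tdp_watts


for n in (1, 2, 4, 8):
    print(f"{n} GPU(s): {gpu_power_budget(n)} W")
```

A fully populated 8-GPU server therefore budgets at most 576W for its L4 cards.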

Streamline Development and Deployment With Enterprise-Ready AI Software

Optimized to streamline AI development and deployment, the NVIDIA AI Enterprise software suite includes AI solution workflows, frameworks, pretrained models, and infrastructure optimization that are certified to run on common data center platforms and mainstream NVIDIA-Certified Systems™ with NVIDIA L4 GPUs.

NVIDIA AI Enterprise is a license addition for NVIDIA L4 GPUs, making AI accessible to nearly every organization with the highest performance in training, inference, and data science. NVIDIA AI Enterprise, together with NVIDIA L4, simplifies the building of an AI-ready platform, accelerates AI development and deployment, and delivers performance, security, and scalability to gather insights faster and achieve business value sooner.

Product Specifications

Specification	L4
FP32	30.3 teraFLOPS
TF32 Tensor Core	120 teraFLOPS*
FP16 Tensor Core	242 teraFLOPS*
BFLOAT16 Tensor Core	242 teraFLOPS*
FP8 Tensor Core	485 teraFLOPS*
INT8 Tensor Core	485 TOPS*
GPU memory	24GB
GPU memory bandwidth	300GB/s
NVENC | NVDEC | JPEG decoders	2 | 4 | 4
Max thermal design power (TDP)	72W
Form factor	1-slot low-profile, PCIe
Interconnect	PCIe Gen4 x16 64GB/s
Server options	Partner and NVIDIA-Certified Systems with 1–8 GPUs

* Shown with sparsity. Without sparsity, values are one-half of those shown.
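The footnote implies a simple conversion from the listed Tensor Core figures to their values without sparsity. The Python sketch below applies it to the starred rows of the table; the dictionary and function names are illustrative and not part of any NVIDIA tooling.

```python
# Tensor Core figures from the specification table, shown with sparsity.
# Units: teraFLOPS for the floating-point formats, TOPS for INT8.
SPARSE_PEAK = {
    "TF32": 120,
    "FP16": 242,
    "BFLOAT16": 242,
    "FP8": 485,
    "INT8": 485,
}


def dense_peak(sparse_value: float) -> float:
    """Per the table footnote, the value without sparsity is half the listed one."""
    return sparse_value / 2


DENSE_PEAK = {fmt: dense_peak(value) for fmt, value in SPARSE_PEAK.items()}

for fmt, value in DENSE_PEAK.items():
    print(f"{fmt}: {value:g}")
```

This gives 60 teraFLOPS for TF32, 121 teraFLOPS for FP16 and BFLOAT16, 242.5 teraFLOPS for FP8, and 242.5 TOPS for INT8. The FP32 row carries no asterisk, so it is left unchanged.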

