GitHub - NVIDIA-RTX/RTXNTC: NVIDIA Neural Texture Compression SDK
Key Points
- NVIDIA's RTX Neural Texture Compression (NTC) SDK introduces a novel lossy compression method that encodes PBR texture bundles into a small neural network decoder and latent tensors, allowing for efficient reconstruction directly in shaders.
- NTC significantly reduces texture memory footprint (e.g., 5 bits/texel for typical PBR bundles), offering flexible usage modes such as "Inference on Load" for BCn transcoding or "Inference on Sample" for direct, real-time decompression.
- The SDK leverages Cooperative Vector extensions on modern NVIDIA GPUs for optimized inference performance, provides tools for content pipeline integration, and supports DirectX 12 and Vulkan APIs across Windows and Linux.
NVIDIA RTX Neural Texture Compression (NTC) SDK v0.9.2 BETA introduces a lossy compression algorithm for PBR texture sets, designed to compress multiple correlated texture channels (up to 16, typically 9-10 for PBR materials like albedo, normal, metalness, roughness, ambient occlusion, opacity) into a single NTC texture set. The core methodology involves transforming original texture data into two components during compression: weights for a small neural network (decoder) and a tensor of latent features.
For decompression, these latents are sampled based on texture coordinates and then passed through the decoder, which is implemented as a Multi-Layer Perceptron (MLP), to reconstruct the texture colors. This sampling and decoding process is designed to be fast enough for direct integration into real-time rendering shaders, such as base pass pixel shaders or ray tracing hit shaders. Because the decoder produces unfiltered data for a single texel, combining NTC with Stochastic Texture Filtering (STF) is suggested when filtered texture results are needed.
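The sample-then-decode flow described above can be sketched in a few lines. This is a simplified illustration, not the SDK's shader code: the function names, the nearest-neighbour latent fetch, the ReLU activation, and the tiny layer sizes are all assumptions chosen to keep the example readable.

```python
def sample_latents(grid, u, v):
    """Fetch the latent vector nearest to normalized coordinates (u, v).

    `grid` is a hypothetical height x width x channels nested list.
    """
    height, width = len(grid), len(grid[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return grid[y][x]


def dense(inputs, weights, biases):
    """One fully connected layer: each output is dot(row, inputs) + bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def decode_texel(high_res, low_res, layers, u, v):
    """Concatenate both latent samples and run them through a small MLP."""
    features = sample_latents(high_res, u, v) + sample_latents(low_res, u, v)
    for index, (weights, biases) in enumerate(layers):
        features = dense(features, weights, biases)
        if index < len(layers) - 1:
            # Assumed activation (ReLU) on hidden layers only.
            features = [max(0.0, value) for value in features]
    return features


# Toy data: a 2x2 high-resolution grid (2 channels) and a 1x1 low-resolution grid (1 channel).
high_res = [[[1.0, 0.0], [0.0, 1.0]],
            [[0.5, 0.5], [1.0, 1.0]]]
low_res = [[[2.0]]]
# Two layers: 3 inputs -> 2 hidden units -> 3 output channels.
layers = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.5]),
]

texel = decode_texel(high_res, low_res, layers, u=0.9, v=0.9)
```

Each call reconstructs one unfiltered texel, which is why a filtering technique such as STF is layered on top in practice.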
NTC offers several operational modes:
- Inference on Load: Targets lower-end hardware. NTC textures are fully decompressed and transcoded into standard block-compressed formats (BCn) upon game or map loading.
- Inference on Sample: NTC textures are sampled and decoded directly within pixel shaders during rendering.
- Inference on Feedback: An advanced mode utilizing Sampler Feedback to identify only the texture tiles required for the current view, which are then decompressed and stored in a sparse tiled BCn texture.
The compression scheme is an adjustable quality/constant bitrate model, where the fixed data amount is determined by a specified latent shape. This latent shape, defined by the number and bit-depth of high- and low-resolution latent channels and their scale factor, directly corresponds to a specific per-texel bitrate. While NTC primarily uses a constant bitrate approach, it can approximate a constant quality/variable bitrate model by performing pre-analysis to select the optimal latent shape for a target quality. For a typical 64 bits/texel PBR material (e.g., Albedo RGB, Normal XY, Roughness, Metalness, AO), NTC can achieve quality comparable to BCn (40-50 dB PSNR) at approximately 5 bits/texel. This translates to significant memory footprint reductions: a 2k x 2k texture bundle, which might be 32 MB raw or 12 MB BCn-compressed, can be reduced to 2.5 MB when NTC-compressed. In "on-sample" mode, the 2.5 MB figure applies to disk, PCI-E traffic, and VRAM alike. In "on-load" mode, disk and PCI-E traffic remain 2.5 MB, but VRAM usage is 12 MB after transcoding to BCn.
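The footprint figures above follow from simple per-texel arithmetic. The sketch below reproduces them for a single mip level; the helper name is hypothetical, and the 24 bits/texel BCn rate is an assumption inferred from the quoted 12 MB figure rather than a number stated in the source.

```python
def bundle_size_mib(width, height, bits_per_texel):
    """Size in MiB of one mip level at the given per-texel bitrate."""
    total_bits = width * height * bits_per_texel
    return total_bits / 8 / (1024 * 1024)


# 2k x 2k PBR bundle at the bitrates discussed in the text.
raw_mib = bundle_size_mib(2048, 2048, 64)  # uncompressed, 64 bits/texel
bcn_mib = bundle_size_mib(2048, 2048, 24)  # assumed BCn rate implied by 12 MB
ntc_mib = bundle_size_mib(2048, 2048, 5)   # NTC at roughly 5 bits/texel

print(raw_mib, bcn_mib, ntc_mib)
```

Under these assumptions the NTC bundle is about 12.8 times smaller than the raw data and 4.8 times smaller than the BCn version.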
The computational cost of NTC decompression, while modest compared to large deep learning models, is still significant for real-time rendering. NTC leverages Cooperative Vector extensions available in Vulkan and Direct3D 12 (specifically D3D12 Agility SDK 1.717.x-preview, requiring D3D12ExperimentalShaderModels and D3D12CooperativeVectorExperiment features) for hardware acceleration on Ada- and Blackwell-class GPUs, providing a 2-4x improvement in inference throughput. Fallback implementations using DP4a instructions or regular integer math ensure compatibility with Direct3D 12 Shader Model 6 on older hardware, albeit with substantially lower performance. An NVIDIA GPU driver version 590.26 or later (for DX12 Cooperative Vector) or 570+ (for Vulkan Cooperative Vector) is required for optimal performance on newer GPUs.
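The fallback order described above (Cooperative Vector first, then DP4a, then regular integer math) can be expressed as a small selection routine. This is an illustrative sketch only: the function name, the boolean capability flags, and the returned labels are hypothetical and do not correspond to LibNTC's API.

```python
def pick_inference_path(has_cooperative_vector, has_dp4a):
    """Choose the fastest inference path the device reports support for."""
    if has_cooperative_vector:
        return "cooperative_vector"  # hardware-accelerated path
    if has_dp4a:
        return "dp4a"                # packed 8-bit dot-product fallback
    return "integer_math"            # slowest, most compatible fallback


# Example: a device without Cooperative Vector support but with DP4a.
path = pick_inference_path(has_cooperative_vector=False, has_dp4a=True)
```

In a real application the flags would come from querying the graphics API for the relevant features, which is outside the scope of this sketch.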
The SDK components include:
- LibNTC: The core library for compression, decompression, serialization/deserialization, and GPU-based BCn encoding.
- ntc-cli: A command-line tool built on LibNTC for automated texture processing.
- NTC Explorer: An interactive application for experimenting with NTC and viewing compressed files.
- NTC Renderer: A sample application demonstrating GLTF model rendering with NTC materials.
- BCTest: A test application for BCn encoder evaluation.
- ntc.py and test.py: Python scripts for automation and basic functional testing.
System requirements include Windows 10/11 x64 or Linux x64, with DirectX 12 (preview Agility SDK for Cooperative Vector) or Vulkan 1.3. GPU requirements depend on the feature used:
- NTC decompression on load and transcoding: requires Shader Model 6 (minimum GTX 1000 series, AMD RX 6000 series, Intel Arc A series), with NVIDIA Turing (RTX 2000 series) and newer recommended.
- NTC inference on sample: also requires Shader Model 6 (functional but slow on older GPUs), with NVIDIA Ada (RTX 4000 series) and newer recommended for performance.
- NTC compression: requires NVIDIA Turing (RTX 2000 series) at minimum, with Ada (RTX 4000 series) and newer recommended.