AWS adds support for NIXL with EFA to accelerate LLM inference at scale

Published: March 19, 2026
https://aws.amazon.com/about-aws/whats-new/2026/03/aws-support-nixl-with-efa/

AWS Support for NIXL with EFA

AWS now supports the NVIDIA Inference Xfer Library (NIXL) with Elastic Fabric Adapter (EFA) to accelerate disaggregated large language model (LLM) inference on Amazon EC2. This integration enhances performance through:

  • Increased KV-cache throughput
  • Reduced inter-token latency
  • Optimized KV-cache memory utilization

NIXL with EFA enables high-throughput, low-latency KV-cache transfer between nodes and efficient KV-cache movement between storage layers. It is interoperable with all EFA-enabled EC2 instances and integrates with frameworks such as NVIDIA Dynamo, SGLang, and vLLM.
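For a concrete sense of the framework integration, the following is a minimal sketch of enabling NIXL as the KV-cache transfer connector through vLLM's Python API. The `KVTransferConfig` fields and the `NixlConnector` name follow vLLM's disaggregated-serving interface, but exact names can shift between versions, and the model ID is a placeholder; treat this as an assumption-laden illustration rather than a verified recipe.

```python
# Hedged sketch: enabling NIXL as the KV-cache transfer backend in vLLM.
# Assumes a recent vLLM with disaggregated-prefill support; field names
# (kv_connector, kv_role) follow vLLM's KVTransferConfig but may vary
# by version. The model ID below is only a placeholder.
from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

# "kv_both" lets one process act as both KV producer (prefill) and
# consumer (decode); in a real disaggregated deployment, prefill nodes
# would use "kv_producer" and decode nodes "kv_consumer".
kv_config = KVTransferConfig(
    kv_connector="NixlConnector",  # NIXL handles the inter-node transfer
    kv_role="kv_both",
)

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    kv_transfer_config=kv_config,
)

outputs = llm.generate(
    ["Explain what a KV cache is in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

In a true disaggregated deployment, prefill and decode would run in separate processes with `kv_role` set to `kv_producer` and `kv_consumer` respectively, and NIXL would move the KV cache between them over EFA.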

What to do

  • Upgrade to NIXL version 1.0.0 or higher
  • Use EFA installer version 1.47.0 or higher
  • Deploy on EFA-enabled EC2 instances; support is available in all AWS Regions (a quick prerequisite check is sketched after this list)
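To sanity-check those prerequisites on a running instance, here is a minimal sketch. It assumes NIXL's Python package is pip-installed (so its version is readable from package metadata) and that the EFA installer puts libfabric's `fi_info` tool on the PATH; both are assumptions about your environment rather than details from the announcement.

```python
# Hedged sketch: verify NIXL and EFA prerequisites on an EC2 instance.
# Assumes the NIXL Python package is installed via pip and that the EFA
# installer's libfabric tools (fi_info) are on PATH; adjust for your setup.
import shutil
import subprocess
from importlib.metadata import version, PackageNotFoundError

MIN_NIXL = (1, 0, 0)  # from the announcement: NIXL 1.0.0 or higher

def check_nixl() -> bool:
    try:
        # Simple "X.Y.Z" parsing; a sketch, not a full version comparator.
        v = tuple(int(x) for x in version("nixl").split(".")[:3])
    except PackageNotFoundError:
        print("NIXL is not installed")
        return False
    ok = v >= MIN_NIXL
    print(f"NIXL {'.'.join(map(str, v))} {'OK' if ok else 'is too old'}")
    return ok

def check_efa() -> bool:
    # fi_info ships with libfabric (installed by the EFA installer);
    # listing the 'efa' provider confirms the adapter is visible.
    if shutil.which("fi_info") is None:
        print("fi_info not found; is the EFA installer (>= 1.47.0) present?")
        return False
    result = subprocess.run(["fi_info", "-p", "efa"],
                            capture_output=True, text=True)
    ok = result.returncode == 0 and "efa" in result.stdout
    print("EFA provider detected" if ok else "EFA provider not found")
    return ok

if __name__ == "__main__":
    ready = check_nixl() and check_efa()
    print("Ready for NIXL over EFA" if ready else "Prerequisites missing")
```

Listing the `efa` provider with `fi_info -p efa` is a common way to confirm the adapter is usable from libfabric before layering NIXL on top.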

Source: AWS release notes

If you need further guidance on AWS, our experts are available at AWS@westloop.io. You may also reach us by submitting the Contact Us form.
