Announcing Amazon EC2 Trn3 UltraServers for faster, lower-cost generative AI training

Amazon EC2 Trn3 UltraServers Now Available
AWS has announced the general availability of Amazon EC2 Trn3 UltraServers, powered by its fourth-generation AI chip, Trainium3. It is AWS's first AI chip built on a 3-nanometer process, designed for next-generation agentic, reasoning, and video-generation workloads.
Key Features
- Performance: Each Trainium3 chip delivers 2.52 petaflops (PFLOPs) of FP8 compute.
- Memory: 144 GB of HBM3e memory per chip, a 1.5x increase over Trainium2.
- Bandwidth: 4.9 TB/s of memory bandwidth per chip, a 1.7x increase over Trainium2.
- Scalability: Up to 144 Trainium3 chips per UltraServer (362 FP8 PFLOPs total), deployable in EC2 UltraClusters 3.0.
- Aggregate capacity: A fully configured Trn3 UltraServer delivers up to 20.7 TB of HBM3e and 706 TB/s of aggregate memory bandwidth.
- Interconnect: NeuronSwitch-v1 doubles inter-chip interconnect bandwidth over Trn2 UltraServers.
- Performance gains: Up to 4.4x higher performance, 3.9x higher memory bandwidth, and 4x better performance per watt compared with Trn2 UltraServers.
- Amazon Bedrock: Trainium3 delivers up to 3x faster performance than Trainium2, with over 5x higher output tokens per megawatt at similar per-user latency.
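As a quick sanity check, the aggregate UltraServer figures above follow directly from the per-chip specs and the 144-chip configuration. A minimal sketch (variable names are illustrative, not AWS terminology):

```python
# Derive the fully configured Trn3 UltraServer aggregates from per-chip specs.
chips = 144               # Trainium3 chips per fully configured UltraServer
pflops_per_chip = 2.52    # FP8 PFLOPs per chip
hbm_gb_per_chip = 144     # GB of HBM3e per chip
bw_tbps_per_chip = 4.9    # TB/s of memory bandwidth per chip

total_pflops = chips * pflops_per_chip          # ~362.9 FP8 PFLOPs
total_hbm_tb = chips * hbm_gb_per_chip / 1000   # ~20.7 TB of HBM3e
total_bw_tbps = chips * bw_tbps_per_chip        # ~705.6 TB/s aggregate bandwidth

print(total_pflops, total_hbm_tb, total_bw_tbps)
```

The quoted figures (362 PFLOPs, 20.7 TB, 706 TB/s) are simply these products, rounded.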
What to Do
- Use the AWS Neuron SDK to get the full performance of Trainium3.
- Use the native PyTorch integration to train and deploy without changing model code.
- Use lower-level Trainium3 features to fine-tune performance and write custom kernels.
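The PyTorch point above can be sketched as follows. This is a hypothetical minimal sketch, not AWS's documented setup: the Neuron SDK exposes Trainium chips to PyTorch as XLA devices (via the torch_xla package), so an ordinary training step changes only where tensors are placed. The CPU fallback here is just so the sketch runs anywhere.

```python
import torch
import torch.nn as nn

try:
    # On a Trn3 instance with the Neuron SDK installed, torch_xla exposes
    # the Trainium chips as XLA devices (assumption: standard torch_xla API).
    import torch_xla.core.xla_model as xm
    device = xm.xla_device()
except ImportError:
    device = torch.device("cpu")  # fallback so the sketch runs without Neuron

# An unmodified PyTorch model and training step; only the device changes.
model = nn.Linear(16, 4).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(8, 16, device=device)
y = torch.randn(8, 4, device=device)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
print(loss.item())
```

The design point is that the loop body is identical on CPU, GPU, or Trainium; the hardware choice is confined to the device selection at the top.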
Source: AWS release notes
If you need further guidance on AWS, our experts are available at AWS@westloop.io. You can also reach us through the Contact Us form.
