Updated list of models supported for service tiers: Updated list of models supported for priority and flex service tiers

Published
January 2, 2026
https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html

Amazon Bedrock Service Tiers Update

Amazon Bedrock now offers four service tiers for model inference: Reserved, Priority, Standard, and Flex. These tiers allow you to optimize for availability, cost, and performance.

Reserved Tier

The Reserved tier guarantees prioritized compute capacity for mission-critical applications with no downtime tolerance. You can allocate specific input and output tokens-per-minute capacities to match your workload requirements and control costs. If your application needs more capacity than reserved, it automatically overflows to the Standard tier, ensuring uninterrupted operations. The Reserved tier targets 99.5% uptime for model response. Capacity can be reserved for 1 or 3 months. Customers pay a fixed price per 1K tokens-per-minute and are billed monthly.

Priority Tier

The Priority tier offers the fastest response times at a premium over standard on-demand pricing. It is ideal for mission-critical applications with customer-facing workflows that do not require 24X7 capacity reservation. No prior reservation is needed. Simply set the "service_tier" parameter to "priority" to avail request-level prioritization. Priority tier requests are prioritized over Standard and Flex tier requests.

Standard Tier

The Standard tier provides consistent performance for everyday AI tasks such as content generation, text analysis, and routine document processing. By default, all inference requests are routed to the Standard tier when the "service_tier" parameter is missing. You can also set the "service_tier" parameter to "default" for your inference request to be served with the Standard tier.

Flex Tier

The Flex tier offers cost-effective processing for workloads that can handle longer processing times, such as model evaluations, content summarization, and agentic workflows. You can set the "service_tier" parameter to "flex" for your inference request to be served with the Flex tier and avail the pricing discount.

Using the Service Tier Capability

To access the service tier capability, set the "service_tier" parameter to "reserved", "priority", "default", or "flex" while calling the Amazon Bedrock runtime API.

Models and Regions Supported by Service Tiers

For detailed information on models and regions supported by each service tier, refer to the AWS documentation.

What to do

  • Set the "service_tier" parameter in your API calls to optimize for your specific needs.
  • Contact your AWS account team to access the Reserved tier.
  • Review the pricing details on the pricing page.

Source: AWS release notes




If you need further guidance on AWS, our experts are available at AWS@westloop.io. You may also reach us by submitting the Contact Us form.

Follow our blog

Get the latest insights and advice on AWS services from our experts.

By clicking Sign Up you're confirming that you agree with our Terms and Conditions.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.