Amazon SageMaker AI now supports OpenAI-compatible APIs for inference endpoints

Published

May 21, 2026

Amazon SageMaker Inference

Amazon SageMaker Inference now supports OpenAI-compatible APIs, allowing you to use familiar tools and frameworks like the OpenAI SDK, LangChain, and Strands Agents directly with your SageMaker endpoints. Simply update your endpoint URL to leverage this feature without any custom integration code, SDK wrappers, or rewrites.

With this launch, you can maintain your existing API format and authentication approach. You can choose your own GPU instances, keep data in your VPC, run any open source or fine-tuned model, and scale with auto-scaling policies tailored to your workload. Authentication uses existing AWS credentials with automatic token refresh, requiring no additional management in production.

Available Regions

US East (N. Virginia)
US West (Oregon)
US East (Ohio)
Asia Pacific (Mumbai)
Asia Pacific (Jakarta)
Europe (Ireland)
Europe (Frankfurt)
South America (São Paulo)
Asia Pacific (Tokyo)
Asia Pacific (Seoul)
Europe (London)
Asia Pacific (Singapore)
Asia Pacific (Sydney)
Canada (Central)

What to do

Update your endpoint URL to start using the new feature.
Continue using your existing SDK calls, streaming logic, and framework integrations.
Choose your preferred GPU instances and VPC settings.
Run any open source or fine-tuned model.
Scale with auto-scaling policies tailored to your workload.

Source: AWS release notes

If you need further guidance on AWS, our experts are available at AWS@westloop.io. You may also reach us by submitting the Contact Us form.