Amazon SageMaker AI now supports serverless reinforcement fine-tuning for 12 additional models

Amazon SageMaker AI Updates
Amazon SageMaker AI now supports serverless model customization and reinforcement fine-tuning for 12 additional open-weight models. You can fine-tune and evaluate these models without managing infrastructure, paying only for what you use.
Newly supported models include: gpt-oss-120b, Qwen2.5 72B Instruct, DeepSeek-R1-Distill-Llama-70B, Qwen3 14B, DeepSeek-R1-Distill-Qwen-14B, Qwen2.5 14B Instruct, DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Qwen-7B, Qwen3 4B, Meta Llama 3.2 3B Instruct, Qwen3 1.7B, and DeepSeek-R1-Distill-Qwen-1.5B.
Reinforcement fine-tuning techniques such as Reinforcement Learning from Verifiable Rewards (RLVR) and Reinforcement Learning from AI Feedback (RLAIF) are now available for these models, enabling alignment to complex reasoning tasks. RLVR improves accuracy on tasks with programmatically checkable answers, such as code generation and math, while RLAIF uses AI-generated feedback to steer model behavior.
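The key idea behind RLVR is that the reward comes from a deterministic check rather than a learned preference model. A minimal illustrative sketch of such a verifiable reward function (not a SageMaker API; the function name and the answer-extraction heuristic are assumptions for illustration):

```python
# Illustrative sketch of an RLVR-style reward: the model's output is scored
# by a programmatic check against a known-correct answer, yielding 1.0 or 0.0.
def verifiable_reward(model_output: str, expected_answer: str) -> float:
    """Return 1.0 if the model's final answer matches the expected one, else 0.0."""
    lines = [ln.strip() for ln in model_output.strip().splitlines() if ln.strip()]
    if not lines:
        return 0.0
    # Heuristic (an assumption): treat the last non-empty line as the final answer.
    return 1.0 if lines[-1] == expected_answer.strip() else 0.0

# Example: score two math completions against the known answer "42".
print(verifiable_reward("Step 1: compute 6 * 7\n42", "42"))   # 1.0
print(verifiable_reward("The answer is 41", "42"))            # 0.0
```

In practice the check would be task-specific (unit tests for code, symbolic equality for math), but the principle is the same: a reward the training loop can verify exactly.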
What to do
- Visit the Amazon SageMaker AI model customization product page to learn more.
- Check the Amazon SageMaker AI pricing page for the full list of supported models, techniques, and pricing.
Source: AWS release notes
If you need further guidance on AWS, our experts are available at AWS@westloop.io, or you can reach us via the Contact Us form.
