Converse API support for batch inference: You can now use the Converse API format for batch inference input data. When creating a batch inference job, set the model invocation type to Converse to use a consistent request format across models.

Published
February 28, 2026
https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-create.html

Create a batch inference job

After setting up an Amazon S3 bucket with files for model inference, you can create a batch inference job. Ensure the files are formatted according to the instructions in Format and upload your batch inference data.
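As a sketch of what a correctly formatted input file can look like, the snippet below writes one JSON Lines record using the Converse request shape. The field names (`recordId`, `modelInput`, `messages`, `inferenceConfig`) follow the batch-input and Converse schemas as I understand them; the record ID, prompt text, and parameter values are placeholders to adapt.

```python
import json

# One batch-inference input record in the Converse request format.
# "recordId" identifies the record in the output; "modelInput" holds
# the Converse-style request body. Values here are illustrative only.
record = {
    "recordId": "RECORD-0001",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [{"text": "Summarize the attached meeting notes."}],
            }
        ],
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.5},
    },
}

# Batch input files are JSON Lines: one record object per line.
with open("batch_input.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```

The resulting `batch_input.jsonl` would then be uploaded to the S3 input location you select when creating the job.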

What to do

  • Sign in to the AWS Management Console with the appropriate IAM permissions.
  • Navigate to the Amazon Bedrock console and select Batch inference.
  • Choose Create job and fill in the required details:
    • Job name: Provide a name for the job.
    • Model: Select a model for the batch inference job.
    • Model invocation type: Choose the API format for your input data.
    • Input data: Select an S3 location for your batch inference job.
    • Output data: Choose an S3 location to store the output files.
    • Service access: Select an existing service role or create a new one.
  • Optionally, add tags to the job.
  • Choose Create batch inference job.

API Method

To create a batch inference job via API, send a CreateModelInvocationJob request with the required fields:

  • jobName: Specify a name for the job.
  • roleArn: Provide the ARN of the service role with permissions to create and manage the job.
  • modelId: Specify the ID or ARN of the model to use in inference.
  • inputDataConfig: Specify the S3 location containing the input data.
  • outputDataConfig: Specify the S3 location to write the model responses to.
  • Optional fields:
    • modelInvocationType: Specify the API format of the input data.
    • timeoutDurationInHours: Specify the duration in hours after which the job will time out.
    • tags: Specify any tags to associate with the job.
    • vpcConfig: Specify the VPC configuration to use to protect your data during the job.
    • clientRequestToken: Specify a unique token to ensure the API request completes only once.
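The request described above can be sketched with boto3. This is an illustrative assembly of the request parameters, not a definitive recipe: the ARNs, bucket names, and model ID are placeholders, and the real call (commented out) requires valid AWS credentials and permissions.

```python
# import boto3  # required only for the real call at the bottom

# CreateModelInvocationJob request parameters, matching the fields listed
# above. All ARNs, bucket names, and the model ID are placeholder values.
params = {
    "jobName": "my-batch-job",
    "roleArn": "arn:aws:iam::111122223333:role/MyBatchInferenceRole",
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "inputDataConfig": {
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/input/"}
    },
    "outputDataConfig": {
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/output/"}
    },
    # Optional fields:
    "modelInvocationType": "Converse",  # Converse API format, per the release note
    "timeoutDurationInHours": 24,
}

def create_job(client, request):
    """Submit the batch inference job and return its ARN."""
    response = client.create_model_invocation_job(**request)
    return response["jobArn"]

# client = boto3.client("bedrock", region_name="us-east-1")
# job_arn = create_job(client, params)
```

Note that the job is submitted through the `bedrock` control-plane client, not the `bedrock-runtime` client used for synchronous invocation.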

The response returns a jobArn that you can use to refer to the job in other batch inference API calls.
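One common use of the returned jobArn is polling the job until it finishes. The sketch below assumes a Bedrock control-plane client exposing GetModelInvocationJob; the set of terminal statuses is my reading of the documented job lifecycle and may need adjusting.

```python
import time

def wait_for_job(client, job_arn, poll_seconds=60):
    """Poll a batch inference job by ARN until it reaches a terminal status.

    Assumes the client exposes get_model_invocation_job and that the
    listed statuses are terminal; both are assumptions to verify against
    the current API reference.
    """
    terminal = {"Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"}
    while True:
        status = client.get_model_invocation_job(jobIdentifier=job_arn)["status"]
        if status in terminal:
            return status
        time.sleep(poll_seconds)
```

For example, `wait_for_job(client, job_arn)` would block until the job completes, after which the model responses can be read from the configured S3 output location.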

Source: AWS release notes
