Converse API support for batch inference: You can now use the Converse API format for batch inference input data. When creating a batch inference job, set the model invocation type to Converse to use a consistent request format across models.
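With the Converse invocation type, each line of the input JSONL file pairs a record identifier with a model input that follows the Converse API request schema. A minimal sketch of building one such record (the record ID, prompt text, and inference settings are placeholder values, not working data):

```python
import json

# One batch inference record in Converse format: a recordId plus a
# modelInput shaped like a Converse API request body.
record = {
    "recordId": "RECORD-0001",  # placeholder; must be unique per line
    "modelInput": {
        "messages": [
            {"role": "user", "content": [{"text": "Summarize this ticket: ..."}]}
        ],
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.5},
    },
}

# Each record is serialized as one line of the JSONL input file.
line = json.dumps(record)
print(line)
```

Because every model accepts the same Converse request shape, the same input file structure works across models; only the modelId on the job changes.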

Create a batch inference job
After setting up an Amazon S3 bucket with the files for model inference, you can create a batch inference job. Make sure the files are formatted correctly according to the instructions in Format and upload your batch inference data.
Console method
- Sign in to the AWS Management Console with the appropriate IAM permissions.
- Navigate to the Amazon Bedrock console and select Batch inference.
- Choose Create job and fill in the required details:
- Job name: Provide a name for the job.
- Model: Select a model for the batch inference job.
- Model invocation type: Choose the API format for your input data.
- Input data: Select the S3 location that contains your input data.
- Output data: Choose an S3 location to store the output files.
- Service access: Select an existing service role or create a new one.
- Optionally, add tags to the job.
- Choose Create batch inference job.
API method
To create a batch inference job via API, send a CreateModelInvocationJob request with the required fields:
- jobName: Specify a name for the job.
- roleArn: Provide the ARN of the service role with permissions to create and manage the job.
- modelId: Specify the ID or ARN of the model to use in inference.
- inputDataConfig: Specify the S3 location containing the input data.
- outputDataConfig: Specify the S3 location to write the model responses to.
- Optional fields:
- modelInvocationType: Specify the API format of the input data.
- timeoutDurationInHours: Specify the duration in hours after which the job will time out.
- tags: Specify any tags to associate with the job.
- vpcConfig: Specify the VPC configuration to use to protect your data during the job.
- clientRequestToken: Provide a unique token to ensure the API request completes no more than once.
The response returns a jobArn that you can use to reference the job in other batch inference API calls.
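The fields above map directly onto the CreateModelInvocationJob request body. A minimal sketch of assembling it for the boto3 Bedrock client, assuming the job name, role ARN, model ID, and bucket URIs below are placeholders you replace with your own:

```python
# Request parameters mirror the required fields described above.
# All ARNs, bucket names, and IDs here are placeholder values.
params = {
    "jobName": "my-batch-job",
    "roleArn": "arn:aws:iam::111122223333:role/BatchInferenceRole",
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "inputDataConfig": {
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/input/"}
    },
    "outputDataConfig": {
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/output/"}
    },
}

# With valid AWS credentials, the job is created through the Bedrock
# control-plane client (bedrock, not bedrock-runtime), and the response
# carries the jobArn used by later calls such as get_model_invocation_job:
#
#   import boto3
#   bedrock = boto3.client("bedrock")
#   job_arn = bedrock.create_model_invocation_job(**params)["jobArn"]
```

Optional fields such as timeoutDurationInHours, tags, and vpcConfig can be added to the same parameter dictionary as needed.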
Source: AWS release notes
If you need further guidance on AWS, our experts are available at AWS@westloop.io. You may also reach us by submitting the Contact Us form.



