Six GPU instances, configured via AWS CDK, batch-generate 100+ images in under an hour and then shut themselves off. That's the core promise of a new reference architecture from AWS that couples ComfyUI with SageMaker AI processing jobs. No idle compute, no manual babysitting.
Why ComfyUI on SageMaker Kills Idle GPU Costs
ComfyUI is a node-based visual workflow builder for generative AI. Think drag-and-drop pipelines for Stable Diffusion, audio synthesis, or video generation. AWS's team wired it to SageMaker processing jobs so you define the workflow as JSON, package it in a Docker container, and fire off a batch with a Lambda trigger.
The key financial lever: SageMaker processing jobs bill per second and auto-terminate when the queue empties. No lingering GPU instances burning through your budget while you wait for designers to review a batch.
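To make the per-second billing argument concrete, here is a minimal arithmetic sketch. The hourly rate, the 50-minute job length, and the 8-hour always-on window are placeholder assumptions for illustration, not quoted AWS prices or figures from the article; check current SageMaker pricing for your region before relying on any number.

```python
def gpu_cost(hourly_rate_usd: float, instances: int, billed_seconds: int) -> float:
    """Cost of `instances` machines, each billed per second for `billed_seconds`."""
    return round(hourly_rate_usd * billed_seconds * instances / 3600, 2)


# Placeholder rate for illustration only, NOT a real ml.g5.xlarge price.
HYPOTHETICAL_RATE_USD_PER_HOUR = 1.50

# Six instances that run a 50-minute batch and then terminate (assumed duration).
batch_job = gpu_cost(HYPOTHETICAL_RATE_USD_PER_HOUR, instances=6, billed_seconds=50 * 60)

# The same six instances left running for an 8-hour workday (assumed window).
always_on = gpu_cost(HYPOTHETICAL_RATE_USD_PER_HOUR, instances=6, billed_seconds=8 * 3600)

print(f"batch job: ${batch_job:.2f}, always on: ${always_on:.2f}")
```

Under these assumed numbers the gap comes entirely from billed seconds: the rate and instance count are identical in both cases.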
The example uses Z-Image Turbo, a 6B-parameter text-to-image diffusion model built on a Scalable Single-Stream Diffusion Transformer (S3-DiT). It processes text and image tokens together as one sequence in every layer, maximizing cross-modal interaction. The architecture delivers photorealistic outputs while fitting within the VRAM of an ml.g5.xlarge (24 GB GPU memory).
How the Pipeline Actually Works
Three AWS CDK stacks build the infrastructure:
- DataStack: an S3 bucket with server-side encryption for outputs.
- SecurityStack: a VPC with private subnets, NAT gateway, KMS key with auto-rotation, and VPC Flow Logs.
- ComfyUISmStack: the Lambda trigger, ECR container definition, and SageMaker processing job specification.
The container downloads model weights from Hugging Face on startup, then reads prompts from a file. It loops through each prompt, populates the ComfyUI workflow template, and submits it for GPU execution. Generated images stream to S3 continuously as each batch finishes. A polling loop checks every 15 seconds until the queue is empty, then shuts the instance down.
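The loop described above can be sketched roughly as follows. This is not the repository's code: it assumes ComfyUI is listening on its default local port, that the workflow is in ComfyUI's API JSON format (node id mapped to `class_type` and `inputs`), and that the prompt text lives in a node with the hypothetical id `"6"`. The `/prompt` and `/queue` endpoints are ComfyUI's standard HTTP API; the helper names are made up for illustration.

```python
import copy
import json
import time
import urllib.request
from typing import Callable

COMFY_URL = "http://127.0.0.1:8188"  # assumed: ComfyUI's default local address
PROMPT_NODE_ID = "6"                 # hypothetical id of the text-prompt node


def fill_workflow(template: dict, prompt_text: str, node_id: str = PROMPT_NODE_ID) -> dict:
    """Return a copy of the workflow template with the prompt text filled in."""
    workflow = copy.deepcopy(template)  # leave the template reusable for the next prompt
    workflow[node_id]["inputs"]["text"] = prompt_text
    return workflow


def submit(workflow: dict) -> None:
    """Queue one workflow via ComfyUI's POST /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{COMFY_URL}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def fetch_queue() -> dict:
    """Read the running and pending queue entries from GET /queue."""
    with urllib.request.urlopen(f"{COMFY_URL}/queue") as response:
        return json.load(response)


def wait_until_idle(
    fetch: Callable[[], dict] = fetch_queue,
    interval: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll every `interval` seconds until nothing is running or pending.

    Returns the number of polls made, which is handy for logging.
    """
    polls = 0
    while True:
        polls += 1
        state = fetch()
        if not state.get("queue_running") and not state.get("queue_pending"):
            return polls
        sleep(interval)
```

In a container entrypoint you would call `submit(fill_workflow(template, line))` for each line of the prompt file, then `wait_until_idle()` before letting the process exit so the processing job can end.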
The authors deployed with 6 ml.g5.xlarge instances in parallel, each with 125 GB of storage. Total batch time: under an hour for hundreds of outputs. You can swap in your own ComfyUI JSON workflows as long as the required models and custom nodes are in the container.
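For orientation, the deployment figures above map onto the request shape of the SageMaker `CreateProcessingJob` API roughly like this. It is a sketch under assumptions, not the repository's Lambda code: the function name, the output name `images`, and every argument value you pass in are illustrative, and the sample may configure the job differently.

```python
def build_processing_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    output_s3_uri: str,
    instance_count: int = 6,
    instance_type: str = "ml.g5.xlarge",
    volume_gb: int = 125,
) -> dict:
    """Build keyword arguments for SageMaker's create_processing_job call."""
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        # The container image (for example, one pushed to ECR) that runs ComfyUI.
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": instance_count,
                "InstanceType": instance_type,
                "VolumeSizeInGB": volume_gb,
            }
        },
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "images",  # illustrative name
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        # Upload files while the job runs instead of at the end.
                        "S3UploadMode": "Continuous",
                    },
                }
            ]
        },
    }
```

A Lambda handler could then pass the result to `boto3.client("sagemaker").create_processing_job(**request)`. A production request would also carry the VPC settings (`NetworkConfig`) and a KMS key for the outputs, which are omitted here for brevity.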
Beyond Still Images: What This Architecture Unlocks
The same pattern scales to audio synthesis, 3D asset rendering, and dynamic video. AWS calls out three specific production use cases: A/B testing ad creative at speed, generating locale-specific packaging designs for global product launches, and building interactive video narratives for gaming where AI-generated cutscenes adapt to user choices.
Because the processing job definition is reusable and the infrastructure is defined in CDK, you can plug any ComfyUI workflow into the same pipeline. The VPC with private subnets and KMS encryption keeps the whole process secure enough for enterprise brand assets.
The GitHub repository (aws-samples/sample-comfy-to-sagemaker-processing-job) includes the full CDK code, Dockerfile, and configuration YAML. Prerequisites are Python 3.13+, AWS CLI, Docker, and a bootstrapped CDK environment. You also need a service quota increase for 6 ml.g5.xlarge instances.
For enterprises that produce thousands of marketing assets monthly, cutting generation time from weeks to hours with zero idle GPU cost changes the economics of AI content pipelines. The template is there. Clone it, swap in your workflow, and fire off a batch.
Source: Running ComfyUI workflows on Amazon SageMaker AI processing jobs
Domain: aws.amazon.com