About ACE-Step 1.5 in ComfyUI
ACE-Step 1.5 is a major update to the open-source music generation model, now natively supported in ComfyUI. It brings commercial-grade quality to your local machine with a novel hybrid architecture in which a Language Model acts as an omni-capable planner, transforming simple user queries into comprehensive song blueprints.
ACE-Step 1.5 Model highlights:
- Commercial-grade quality: Achieves quality beyond most commercial music models, scoring 4.72 on musical coherence
- Blazing fast generation: Generate a full 4-minute song in ~1 second on an RTX 5090, or in under 10 seconds on an RTX 3090, with ComfyUI
- 50+ language support: Strong support for English, Chinese, Japanese, Korean, Spanish, German, French, Portuguese, Italian, and Russian
- LoRA fine-tuning: Supports lightweight personalization through LoRA training in ComfyUI
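To put the generation-speed figures above in perspective, the short sketch below converts them into a real-time factor (seconds of audio produced per second of compute). It only restates the numbers quoted in the highlights; the function name `realtime_factor` is illustrative, and the RTX 3090 value is a lower bound because the quoted time is "under 10 seconds".

```python
# Real-time factor: seconds of audio produced per second of generation time.
SONG_SECONDS = 4 * 60  # the "full 4-minute song" quoted in the highlights


def realtime_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Return how many times faster than real time the generation runs."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


rtx_5090 = realtime_factor(SONG_SECONDS, 1.0)   # "~1 second" figure
rtx_3090 = realtime_factor(SONG_SECONDS, 10.0)  # "under 10 seconds": lower bound

print(f"RTX 5090: about {rtx_5090:.0f}x faster than real time")
print(f"RTX 3090: at least {rtx_3090:.0f}x faster than real time")
```

Actual speed on your machine will likely vary with song length, sampler settings, and available VRAM.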
Option 1: All-in-One Checkpoint (Recommended)
The AIO version packages all models into a single checkpoint file, making it easier to download and manage.
AIO Workflow
Run on Comfy Cloud
Run the AIO workflow directly on Comfy Cloud.
Download Workflow
Download the all-in-one checkpoint workflow for local use.
AIO Model Download
AIO Model Storage Location
Option 2: Split Model Files
The split version allows you to download individual model components separately.
Split Workflow
Run on Comfy Cloud
Run the split models workflow directly on Comfy Cloud.
Download Workflow
Download the split models workflow for local use.
Split Model Downloads
acestep_v1.5_turbo.safetensors
Diffusion model.
qwen_0.6b_ace15.safetensors
Text encoder (0.6B).
qwen_1.7b_ace15.safetensors
Text encoder (1.7B).
ace_1.5_vae.safetensors
VAE model.
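Before loading the split workflow, it can help to confirm that all four files are in place. The sketch below is a minimal check, not an official tool. It assumes the files live in the usual ComfyUI model subfolders (`diffusion_models`, `text_encoders`, `vae`); this page does not state the storage location for the split files, so treat that mapping and the `ComfyUI/models` path as assumptions and adjust them to your installation.

```python
from pathlib import Path

# ASSUMPTION: subfolder names follow the usual ComfyUI models layout.
# The file names are the ones listed above; the folders are not stated there.
EXPECTED_FILES = {
    "diffusion_models": ["acestep_v1.5_turbo.safetensors"],
    "text_encoders": [
        "qwen_0.6b_ace15.safetensors",
        "qwen_1.7b_ace15.safetensors",
    ],
    "vae": ["ace_1.5_vae.safetensors"],
}


def find_missing(models_dir: Path) -> list[Path]:
    """Return the expected model paths that do not exist under models_dir."""
    missing = []
    for subfolder, names in EXPECTED_FILES.items():
        for name in names:
            path = models_dir / subfolder / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical install location; change this to your ComfyUI folder.
    models_dir = Path("ComfyUI") / "models"
    missing = find_missing(models_dir)
    if missing:
        print("Missing model files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All split model files found.")
```

Run it from the directory that contains your ComfyUI folder, or pass a different `models_dir` to `find_missing`.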
ACE-Step 1.5 Key Features in ComfyUI
Chain-of-Thought Planning
The ACE-Step 1.5 model synthesizes metadata, lyrics, and captions via Chain-of-Thought reasoning to guide the diffusion process, resulting in more coherent long-form compositions.
Hybrid LM + DiT Architecture
ACE-Step 1.5 combines a Language Model that plans the song structure with a Diffusion Transformer (DiT) that handles audio synthesis, all running natively in ComfyUI.
LoRA Fine-Tuning in ComfyUI
With just a few songs, you can train a LoRA that captures a specific style. Because you run ACE-Step 1.5 locally in ComfyUI, you own the LoRA and don’t have to worry about data leakage.
Coming Soon to ComfyUI
These features are available in ACE-Step 1.5 but not yet supported in ComfyUI:
- Cover: Give the model any song as input along with a new prompt and lyrics, and it will reimagine the track in a completely different style
- Repaint: Select a segment, regenerate just that section, and the model stitches it back in while keeping everything else untouched