# ace-step/ace-step-1.5

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/ace-step-ace-step-1-5).**

6,002 stars · 675 forks · Python · MIT

## Links

- GitHub: https://github.com/ace-step/ACE-Step-1.5
- Homepage: https://acemusic.ai/
- awesome-repositories: https://awesome-repositories.com/repository/ace-step-ace-step-1-5.md

## Description

ACE Step 1.5 is a local text-to-music generation and audio editing system that runs on consumer hardware. It transforms plain-language descriptions into full-length songs with lyrics, and can edit existing audio through cover generation, vocal removal, track separation, and selective repainting. The system supports multilingual prompts and lyrics in over 50 languages, and provides precise control over musical structure including duration, BPM, key, and time signature.

The project distinguishes itself through a dual-stream diffusion architecture that processes separate latent streams for vocals and instruments, synchronized through cross-attention layers during denoising. It enables style personalization through lightweight LoRA adapters that can be trained from a few songs in about one hour, and supports batch generation of up to eight songs simultaneously. The system can generate complete songs in under ten seconds on a standard consumer GPU while using less than four gigabytes of video memory.
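
To make the idea of two latent streams synchronized through cross-attention concrete, here is a minimal, standard-library-only sketch of scaled dot-product cross-attention. It is illustrative only and is not taken from the ACE-Step codebase: the function names, the tiny 2-dimensional latents, and the use of identity projections in place of learned query/key/value weights are all assumptions made for brevity.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    """Let each query frame attend over every context frame.

    Assumption: identity projections stand in for learned Q/K/V weights.
    """
    dim = len(queries[0])
    scores = matmul(queries, transpose(context))
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, context)

# Hypothetical latents: 2 vocal frames and 3 instrument frames, 2 channels each.
vocal_latents = [[1.0, 0.0], [0.0, 1.0]]
instrument_latents = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Each stream reads from the other, then adds the result as a residual update.
vocal_update = cross_attention(vocal_latents, instrument_latents)
instrument_update = cross_attention(instrument_latents, vocal_latents)
vocal_latents = [[v + u for v, u in zip(vr, ur)]
                 for vr, ur in zip(vocal_latents, vocal_update)]
```

In a real denoiser this exchange would happen inside every transformer block at every denoising step, with learned projections and multiple attention heads.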

The software is accessible through multiple interfaces including a Gradio web UI, a REST API, a CLI wizard, and a VST3 plugin for direct integration into digital audio workstations. It also includes a pre-trained source separation pipeline for isolating vocal and instrumental stems from mixed audio.

## Tags

### Artificial Intelligence & ML

- [Text-to-Music Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-music-composition/text-to-music-engines.md) — An engine that transforms text prompts into full-length songs with lyrics, supporting multilingual input and precise control over musical structure.
- [Local Generative Music Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-music-agents/local-generative-music-systems.md) — Generating full songs from text prompts on consumer hardware without cloud dependency, supporting multilingual lyrics and style control.
- [Compositional Parameter Controllers](https://awesome-repositories.com/f/artificial-intelligence-ml/algorithmic-music-composition/compositional-parameter-controllers.md) — ACE-Step accepts user-specified duration, BPM, key, time signature, and lyrics in 50+ languages to guide the generated composition. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))
- [Audio Source Separation Models](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-source-separation-models.md) — Removing vocals from songs or isolating instrumental tracks from uploaded audio files for remixing and editing.
- [Source Separation Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-source-separation-models/source-separation-tools.md) — Separates mixed audio into vocal and instrumental stems using a pre-trained source separation model before applying targeted editing or conversion operations.
- [Dual-Stream Diffusion Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/dual-generator-architectures/dual-stream-diffusion-architectures.md) — Generates music by processing separate latent streams for vocals and instruments that are synchronised through cross-attention layers during the denoising process.
- [Latent Diffusion Models](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-models/latent-diffusion-models.md) — Encodes text prompts into a compressed latent space where a diffusion model iteratively denoises random noise into structured audio guided by cross-attention to the text embedding.
- [Music Generation Batch Processors](https://awesome-repositories.com/f/artificial-intelligence-ml/batch-inference-engines/music-generation-batch-processors.md) — An engine that produces up to eight songs simultaneously from text prompts to accelerate creative workflows.
- [Batch Generation Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-music-agents/batch-generation-pipelines.md) — ACE-Step produces up to eight songs simultaneously to accelerate creative workflows and experimentation. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))
- [Multilingual Prompt Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-music-agents/local-generative-music-systems/multilingual-prompt-systems.md) — A system that follows prompts in over 50 languages to generate and edit music, including lyrics and structural parameters.
- [LoRA Style Adapters](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-music-agents/style-based-music-generation/lora-style-adapters.md) — Training a lightweight LoRA adapter from a few songs to capture and reproduce a user's unique musical style.
- [Diffusion Model Adaptations](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-fine-tuning/partial-layer-fine-tunings/lora-fine-tuning-pipelines/diffusion-model-adaptations.md) — Applies low-rank adaptation matrices to the diffusion model's cross-attention layers, enabling personalised style transfer with minimal parameter updates and fast training.
- [LoRA Training](https://awesome-repositories.com/f/artificial-intelligence-ml/lora-training.md) — ACE-Step trains a lightweight adapter from a few songs to capture a user's unique style, completing training in about one hour. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))
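
The LoRA entries above describe low-rank adaptation of the diffusion model's cross-attention layers. As a rough illustration of the general technique (not ACE-Step's training code), the sketch below merges a low-rank update into a frozen weight matrix and compares trainable parameter counts. The function names, the matrix sizes, and the `alpha / rank` scaling convention are assumptions chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, lora_b, lora_a, alpha):
    """Return W + (alpha / r) * B @ A, the usual LoRA merged weight.

    w: frozen d_out x d_in weight; lora_b: d_out x r; lora_a: r x d_in.
    """
    rank = len(lora_a)
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, rank):
    """Parameters trained by LoRA versus full fine-tuning of one layer."""
    return rank * (d_out + d_in), d_out * d_in

# Hypothetical sizes: a 1024 x 1024 projection adapted at rank 8.
lora_count, full_count = trainable_params(1024, 1024, 8)
print(lora_count, full_count)  # 16384 1048576
```

With these assumed sizes the adapter trains about 1.6% of the layer's parameters, which is why adapters of this kind are small to store and comparatively quick to fit.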

### Part of an Awesome List

- [Text-to-Music Generators](https://awesome-repositories.com/f/awesome-lists/media/music-and-audio-generation/text-to-sound-effect-generation/text-to-music-generators.md) — ACE-Step transforms a plain-language description into a full-length song, handling composition, lyrics, and style from a single user query. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))
- [Audio Editing](https://awesome-repositories.com/f/awesome-lists/devtools/audio-editing.md) — ACE-Step performs cover generation, selective repainting, vocal-to-BGM conversion, and track separation on uploaded audio files. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))
- [AI-Assisted Audio Editors](https://awesome-repositories.com/f/awesome-lists/devtools/audio-editing/ai-assisted-audio-editors.md) — A platform that performs cover generation, vocal removal, track separation, and selective repainting on uploaded audio files.

### Content Management & Publishing

- [Multilingual Music Generation](https://awesome-repositories.com/f/content-management-publishing/content-management-systems/content-architecture-modeling/documentation-tooling/generation-publishing/documentation-generators/multilingual-generation/multilingual-audio-generators/multilingual-music-generation.md) — ACE-Step follows a text description accurately regardless of the language used, supporting multilingual music generation and editing. ([source](https://ace-step.github.io/ace-step-v1.5.github.io/))

### Graphics & Multimedia

- [Programmatic Audio Editing](https://awesome-repositories.com/f/graphics-multimedia/programmatic-audio-editing.md) — Editing and remixing existing audio through cover generation, selective repainting, and track separation via API or CLI.
- [Audio Track Repainting Tools](https://awesome-repositories.com/f/graphics-multimedia/audio-track-repainting-tools.md) — ACE-Step replaces the vocal, instrumental, or background elements of an existing track while preserving the original structure and timing. ([source](https://ace-step.github.io/ace-step-v1.5.github.io/))
- [Batch Generation Workflows](https://awesome-repositories.com/f/graphics-multimedia/musical-composition-workflows/batch-generation-workflows.md) — Producing multiple song variations simultaneously from text prompts to accelerate creative experimentation and workflow.
- [Vocal-to-Instrumental Converters](https://awesome-repositories.com/f/graphics-multimedia/vocal-artifact-removal/vocal-to-instrumental-converters.md) — ACE-Step removes the vocal track from a song and replaces it with a purely instrumental arrangement derived from the original. ([source](https://ace-step.github.io/ace-step-v1.5.github.io/))

### Networking & Communication

- [Local Web Interfaces](https://awesome-repositories.com/f/networking-communication/local-web-interfaces.md) — ACE-Step exposes a Gradio web UI, REST API, CLI wizard, and VST3 plugin for interactive or programmatic music generation. ([source](https://cdn.jsdelivr.net/gh/ace-step/ace-step-1.5@main/README.md))

### Data & Databases

- [Parallel Diffusion Generation](https://awesome-repositories.com/f/data-databases/parallel-batch-processing/parallel-diffusion-generation.md) — Generates multiple songs simultaneously by running independent diffusion processes in parallel on the GPU, maximising throughput for batch workflows.

### DevOps & Infrastructure

- [Low-VRAM Music Generation](https://awesome-repositories.com/f/devops-infrastructure/intel-hardware-acceleration/low-bit-weight-quantization/consumer-gpu-optimizations/low-vram-music-generation.md) — ACE-Step generates complete songs in under ten seconds on a standard consumer GPU while using less than four gigabytes of video memory. ([source](https://ace-step.github.io/ace-step-v1.5.github.io/))

### Software Engineering & Architecture

- [VST3 Plugin Packaging](https://awesome-repositories.com/f/software-engineering-architecture/plugin-execution-engines/audio-plugin-hosting/vst3-plugin-packaging.md) — Packages the generation engine as a VST3 audio plugin, enabling direct integration into digital audio workstations for real-time music production workflows.

### User Interface & Experience

- [Gradio Interfaces](https://awesome-repositories.com/f/user-interface-experience/text-editors/graphical-frontends/web-based-debugger-frontends/gradio-interfaces.md) — Serves the model through a Gradio-based interactive UI and a RESTful API endpoint, allowing both manual and programmatic access to generation and editing features.
