# lllyasviel/omost

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/lllyasviel-omost).**

7,613 stars · 436 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/lllyasviel/Omost
- awesome-repositories: https://awesome-repositories.com/repository/lllyasviel-omost.md

## Description

Omost is a set of components for controlling image composition in diffusion models. It pairs a layout controller with an engine in which a large language model writes executable code that specifies what appears where in the image. The components support iterative image refinement, regional layout control, and optimization of text embeddings for text-to-image generation.

The project includes a conversational image editor: users refine an image through natural language instructions, and the generated code is executed automatically. Its text embedding optimizer organizes sub-prompts into a tree graph and merges them so that text encoding does not cut a description off partway (semantic truncation).
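
As a rough illustration of the greedy merging idea, the sketch below packs sub-prompts into chunks that each fit a token budget, so that no sub-prompt is split across an encoder window. It is not taken from the Omost source: `count_tokens`, `greedy_merge`, and the whitespace tokenizer are hypothetical stand-ins, and the default budget of 75 is an assumption modeled on CLIP-style encoders, whose 77-token context leaves 75 positions after the start and end tokens.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Hypothetical stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def greedy_merge(sub_prompts: List[str], budget: int = 75) -> List[str]:
    """Pack sub-prompts, in order, into chunks that never exceed `budget` tokens.

    A sub-prompt is always kept whole, so no description is cut off mid-way.
    """
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for sub_prompt in sub_prompts:
        n = count_tokens(sub_prompt)
        if n > budget:
            raise ValueError(f"sub-prompt exceeds the budget of {budget} tokens")
        # Start a new chunk when the next sub-prompt would overflow this one.
        if current and used + n > budget:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sub_prompt)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk can then be encoded separately. A real pipeline would count tokens with the text encoder's own tokenizer rather than by whitespace.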

The toolset supports regional image generation and complex image composition through grid-based layouts and regional attention guidance. It can also compose visual elements by depth and color, and accepts stylistic metadata that adjusts the overall atmosphere of the output.
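
To make the grid-based layout concrete, here is a minimal sketch that maps a grid cell to a pixel bounding box and then to a boolean mask marking the positions a regional prompt may influence. The function names, the `(left, top, right, bottom)` box convention, and the uniform grid are assumptions for illustration, not Omost's actual API.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def cell_to_box(row: int, col: int, rows: int, cols: int,
                width: int, height: int) -> Box:
    """Return the pixel bounding box of one cell in a rows x cols grid."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("cell index outside the grid")
    left = col * width // cols
    right = (col + 1) * width // cols
    top = row * height // rows
    bottom = (row + 1) * height // rows
    return (left, top, right, bottom)


def box_to_mask(box: Box, width: int, height: int) -> List[List[bool]]:
    """Return a height x width mask that is True inside the box."""
    left, top, right, bottom = box
    return [
        [left <= x < right and top <= y < bottom for x in range(width)]
        for y in range(height)
    ]
```

For example, `cell_to_box(0, 2, 3, 3, 90, 90)` selects the top-right cell of a 3x3 grid on a 90x90 canvas. In a regional attention scheme, a mask like this would typically be resized to the attention map's resolution and used to suppress a prompt's attention scores outside its region.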

## Tags

### Artificial Intelligence & ML

- [Code-Driven Layout Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-image-models/autoregressive-image-generation/llm-based-generators/code-driven-layout-generation.md) — Uses an LLM to generate executable code for precise control over image bounding boxes and composition.
- [Regional](https://awesome-repositories.com/f/artificial-intelligence-ml/attention-masking/regional.md) — Manipulates attention scores to ensure specific text prompts only affect designated image areas.
- [Diffusion Layout Controllers](https://awesome-repositories.com/f/artificial-intelligence-ml/diffusion-layout-controllers.md) — Defines regional bounding boxes and attention scores to guide the generation process of diffusion models.
- [Graph-Based Prompt Organization](https://awesome-repositories.com/f/artificial-intelligence-ml/graph-based-prompt-organization.md) — Structures independent descriptive concepts into a graph to merge them into cohesive prompts.
- [Image Composition Controls](https://awesome-repositories.com/f/artificial-intelligence-ml/image-composition-controls.md) — Uses large language models to generate executable code for precise control over image layout and composition.
- [Conversational Editing Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/image-generation/image-editing/conversational-editing-interfaces.md) — Refines image composition and details through a chat interface instead of rewriting prompts.
- [Regional Prompting](https://awesome-repositories.com/f/artificial-intelligence-ml/image-region-reconstruction/regional-prompting.md) — Controls the placement and appearance of specific objects using bounding boxes and regional attention.
- [Semantic Embedding Merging](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/language-tools/tokenization-algorithms/byte-pair-encodings/greedy-merge-encoding/semantic-embedding-merging.md) — Implements a greedy merging strategy for sub-prompts to prevent semantic truncation during text encoding.
- [Precise Visual Layout Control](https://awesome-repositories.com/f/artificial-intelligence-ml/precise-visual-layout-control.md) — Ensures visual elements appear in exact intended positions using a grid system of global and local descriptions.
- [Embedding Optimization Processes](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/text-embeddings/embedding-optimization-processes.md) — Organizes sub-prompts into tree graphs to prevent semantic truncation during text encoding.
- [Embedding Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/text-embeddings/embedding-optimizers.md) — Provides a greedy merging strategy for sub-prompts to ensure coherent text encoding and prevent semantic truncation. ([source](https://github.com/lllyasviel/omost#readme))
- [Latent Layout Mappings](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/latent-space-generative-models/latent-space-projections/image-to-latent-projections/generative-latent-mappings/latent-layout-mappings.md) — Organizes image components by depth and color to create initial latent maps for diffusion models.
- [Image Editing](https://awesome-repositories.com/f/artificial-intelligence-ml/image-generation/image-editing.md) — Allows users to refine generated visual content through iterative, chat-based adjustments to the image composition. ([source](https://github.com/lllyasviel/omost#readme))
- [Prompt Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/profiling-and-benchmarking/model-performance-optimization/prompt-optimizers.md) — Optimizes descriptive concepts using structured graphs and embedding merges to prevent semantic truncation.
- [Code-to-Image Composition](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-to-code-generators/code-to-image-composition.md) — Converts natural language prompts into executable code for a virtual canvas agent to arrange complex visual content. ([source](https://github.com/lllyasviel/omost#readme))
- [Prompt Graph Organizers](https://awesome-repositories.com/f/artificial-intelligence-ml/prompt-graph-organizers.md) — Implements a prefix tree system to organize independent descriptive concepts into cohesive prompts via specific traversal paths. ([source](https://github.com/lllyasviel/omost#readme))
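
The prefix tree idea in the tags above can be sketched as follows: each node holds one descriptive concept, and every root-to-leaf path is joined into one cohesive prompt. `PromptNode` and `paths_to_prompts` are hypothetical names chosen for this illustration, and the comma separator is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptNode:
    text: str
    children: List["PromptNode"] = field(default_factory=list)


def paths_to_prompts(root: PromptNode, sep: str = ", ") -> List[str]:
    """Join the concepts along every root-to-leaf path into one prompt."""
    prompts: List[str] = []

    def walk(node: PromptNode, prefix: List[str]) -> None:
        path = prefix + [node.text]
        if not node.children:
            # A leaf closes one traversal path, which becomes one prompt.
            prompts.append(sep.join(path))
            return
        for child in node.children:
            walk(child, path)

    walk(root, [])
    return prompts


tree = PromptNode("a cat", [
    PromptNode("fluffy fur", [PromptNode("orange stripes")]),
    PromptNode("green eyes"),
])
prompts = paths_to_prompts(tree)
```

Because every path starts at the root, all resulting prompts share the same leading description while each one carries a different set of details.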

### Graphics & Multimedia

- [Grid-Based Image Layouts](https://awesome-repositories.com/f/graphics-multimedia/grid-based-image-layouts.md) — Uses a discretized grid system to assign global and local descriptions to specific bounding boxes. ([source](https://github.com/lllyasviel/omost#readme))
- [Conversational Image Editors](https://awesome-repositories.com/f/graphics-multimedia/image-processing-and-manipulation/conversational-image-editors.md) — Provides a chat interface for iteratively refining visual content through natural language and automated code execution.
- [Latent Element Composition](https://awesome-repositories.com/f/graphics-multimedia/composite-visual-overlays/latent-element-composition.md) — Organizes image components by relative depth and color to create layout maps for use as initial latents. ([source](https://github.com/lllyasviel/omost#readme))
- [Image Composition](https://awesome-repositories.com/f/graphics-multimedia/image-composition.md) — Arranges multiple visual elements on a virtual canvas using code and layout maps to create structured scenes.
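
One way to picture composition by depth and color is a painter's-algorithm pass: elements are filled with a flat color from farthest to nearest, so nearer elements cover farther ones. The sketch below is an assumption-laden illustration rather than Omost's code. `Element`, `render_layout`, and the convention that a larger `depth` means farther from the viewer are all hypothetical, and turning the resulting color map into initial latents (for instance through an image encoder) is out of scope here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Color = Tuple[int, int, int]      # RGB, 0-255
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


@dataclass
class Element:
    box: Box
    color: Color
    depth: float  # assumed convention: larger means farther from the viewer


def render_layout(elements: List[Element], width: int, height: int,
                  background: Color = (0, 0, 0)) -> List[List[Color]]:
    """Paint elements far-to-near so nearer elements overwrite farther ones."""
    canvas = [[background for _ in range(width)] for _ in range(height)]
    for el in sorted(elements, key=lambda e: e.depth, reverse=True):
        left, top, right, bottom = el.box
        # Clip the box to the canvas before filling it.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                canvas[y][x] = el.color
    return canvas
```

The output is a coarse color layout in which overlap order follows depth, which is the kind of map the tags above describe as a starting point for diffusion.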

### Scientific & Mathematical Computing

- [Generative](https://awesome-repositories.com/f/scientific-mathematical-computing/data-discretization/spatial-discretization/generative.md) — Provides a grid-based coordinate system to map global and local descriptions to specific image areas.
