

Step1X-Edit is designed for general-purpose, instruction-driven image editing. It takes a reference image and a text prompt describing the desired change (e.g., “remove the person”, “change the background to a beach”, “make it look like a watercolor painting”). A multimodal language model interprets the image and the instruction together, and its output guides a diffusion process that generates the edited image while preserving the parts of the original that the instruction does not target.
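The data flow described above can be illustrated with a toy sketch. This is not Step1X-Edit's code or API: every name here (`EditRequest`, `toy_understand`, `toy_refine_step`, `edit_image`) is hypothetical, images are plain nested lists, and the two stand-in functions only mimic the roles of the multimodal language model and the diffusion process. For simplicity the loop starts from the reference image rather than from noise, as a real diffusion sampler typically would.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]


@dataclass
class EditRequest:
    reference: Image   # the image to edit
    instruction: str   # e.g. "brighten the top row"


def toy_understand(reference: Image, instruction: str) -> Dict:
    """Hypothetical stand-in for the multimodal language model.

    A real model would infer the edit from the image and the text; this toy
    always selects the top row and reads the target value from a keyword.
    """
    mask = [[r == 0 for _ in row] for r, row in enumerate(reference)]
    target = 1.0 if "brighten" in instruction else 0.0
    return {"mask": mask, "target": target}


def toy_refine_step(current: Image, condition: Dict, step: int, num_steps: int) -> Image:
    """Hypothetical stand-in for one conditioned refinement step.

    Masked pixels close 1/(remaining steps) of the gap to the target, so they
    reach it on the final step; unmasked pixels are copied through unchanged.
    """
    frac = 1.0 / (num_steps - step)
    return [
        [px + frac * (condition["target"] - px) if m else px
         for px, m in zip(row, mask_row)]
        for row, mask_row in zip(current, condition["mask"])
    ]


def edit_image(
    request: EditRequest,
    understand: Callable[[Image, str], Dict] = toy_understand,
    refine_step: Callable[[Image, Dict, int, int], Image] = toy_refine_step,
    num_steps: int = 4,
) -> Image:
    # The conditioning signal is computed once from the image and instruction.
    condition = understand(request.reference, request.instruction)
    # Work on a copy so the caller's reference image is never modified.
    current = [row[:] for row in request.reference]
    for step in range(num_steps):
        current = refine_step(current, condition, step, num_steps)
    return current


reference = [[0.2, 0.4], [0.6, 0.8]]
edited = edit_image(EditRequest(reference, "brighten the top row"))
```

In this sketch the top row of `edited` is driven to the target brightness while the bottom row stays identical to the reference, which mirrors the idea of changing only what the instruction asks for. Passing the two stages in as callables is a design choice of the sketch, made so that either stand-in can be swapped out independently.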