v0.8.0: Versatile Diffusion - Text, Images and Variations All in One Diffusion Model
New Models
VersatileDiffusion
VersatileDiffusion, released by SHI-Labs, is a unified multi-flow multimodal diffusion model capable of multiple tasks: text-to-image, image variations, dual-guided (text + image) image generation, and image-to-text.
- [Versatile Diffusion] Add versatile diffusion model by @patrickvonplaten @anton-l #1283
Make sure to install `transformers` from `main`:

```
pip install git+https://github.com/huggingface/transformers
```
Then you can run:
```python
import torch
import requests
from io import BytesIO
from PIL import Image
from diffusers import VersatileDiffusionPipeline

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(init_image).images[0]

# dual-guided (text + image) generation
image = pipe.dual_guided(prompt, init_image).images[0]
```
More in-depth details can be found in the pipeline documentation and on the model card.
AltDiffusion
AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.
- Add AltDiffusion by @patrickvonplaten @patil-suraj #1299
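As a quick sketch (assuming the `AltDiffusionPipeline` class added in the PR above and the `BAAI/AltDiffusion-m9` checkpoint), prompting in one of the supported languages works just like the regular Stable Diffusion pipeline:

```python
import torch
from diffusers import AltDiffusionPipeline

# Checkpoint name is an assumption; use the multilingual AltDiffusion weights you have access to.
pipe = AltDiffusionPipeline.from_pretrained("BAAI/AltDiffusion-m9", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Chinese prompt: "a cat wearing sunglasses, digital painting"
prompt = "一只戴着墨镜的猫，数字绘画"
image = pipe(prompt).images[0]
image.save("alt_diffusion.png")
```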
Stable Diffusion Image Variations
`StableDiffusionImageVariationPipeline` by @justinpinkney is a Stable Diffusion model that takes an image as input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.
- StableDiffusionImageVariationPipeline by @patil-suraj #1365
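A minimal usage sketch, assuming the `lambdalabs/sd-image-variations-diffusers` checkpoint and a local `input.jpg`:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

# Checkpoint name and input path are assumptions for illustration.
pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

init_image = Image.open("input.jpg").convert("RGB")

# The variation is conditioned on the CLIP image embedding of the input image, not on text.
variation = pipe(init_image, guidance_scale=3.0).images[0]
variation.save("variation.png")
```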
Safe Latent Diffusion
Safe Latent Diffusion (SLD), released by the ml-research@TUDarmstadt group, is a new practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to `diffusers`.
- Add Safe Stable Diffusion Pipeline by @manuelbrack #1244
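A rough usage sketch; the pipeline class (`StableDiffusionPipelineSafe`) and the `AIML-TUDA/stable-diffusion-safe` checkpoint are assumptions based on the PR, so double-check the released API:

```python
import torch
from diffusers import StableDiffusionPipelineSafe

# Class and checkpoint names are assumptions based on the Safe Latent Diffusion PR.
pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# SLD steers the sampling trajectory away from its configured safety concept at inference time.
image = pipe("portrait of a person at a carnival, oil painting").images[0]
image.save("sld.png")
```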
VQ-Diffusion with classifier-free sampling
- vq diffusion classifier free sampling by @williamberman #1294
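A short sketch of classifier-free sampling with VQ-Diffusion, assuming the `microsoft/vq-diffusion-ithq` checkpoint and a `guidance_scale` argument on the pipeline call:

```python
from diffusers import VQDiffusionPipeline

# Checkpoint name is an assumption (the ITHQ checkpoint released by Microsoft).
pipe = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq")
pipe = pipe.to("cuda")

# guidance_scale > 1 enables the classifier-free sampling added in the PR above.
image = pipe("teddy bear playing in the pool", guidance_scale=5.0).images[0]
image.save("vq_diffusion.png")
```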
LDM super resolution
LDM super resolution is a latent 4x super-resolution diffusion model released by CompVis.
- Add LDM Super Resolution pipeline by @duongna21 #1116
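A minimal sketch, assuming the `CompVis/ldm-super-resolution-4x-openimages` checkpoint and a local low-resolution input:

```python
from PIL import Image
from diffusers import LDMSuperResolutionPipeline

# Checkpoint name and input path are assumptions for illustration.
pipe = LDMSuperResolutionPipeline.from_pretrained("CompVis/ldm-super-resolution-4x-openimages")
pipe = pipe.to("cuda")

low_res = Image.open("low_res.png").convert("RGB").resize((128, 128))

# 4x upscaling: a 128x128 input becomes a 512x512 output.
upscaled = pipe(low_res, num_inference_steps=100, eta=1.0).images[0]
upscaled.save("upscaled.png")
```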
CycleDiffusion
CycleDiffusion is a method that uses text-to-image diffusion models for image-to-image editing. It is capable of:
- Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
- Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
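A zero-shot editing sketch; argument names such as `source_prompt` and the explicit DDIM scheduler follow the pipeline as documented, but treat the details (and the input path) as assumptions:

```python
import torch
from PIL import Image
from diffusers import CycleDiffusionPipeline, DDIMScheduler

# CycleDiffusion relies on DDIM inversion, so a DDIMScheduler is used explicitly.
model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = CycleDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

init_image = Image.open("green_apple.jpg").convert("RGB").resize((512, 512))

image = pipe(
    prompt="a red apple on a table",          # what the image should become
    source_prompt="a green apple on a table", # what the image currently shows
    image=init_image,
    strength=0.8,
    guidance_scale=2.0,
    eta=0.1,
).images[0]
image.save("cycle_diffusion.png")
```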
CLIPSeg + StableDiffusionInpainting
Uses CLIPSeg to automatically generate a segmentation mask from a text prompt, and then applies Stable Diffusion in-painting inside that mask.
- [Community Pipeline] CLIPSeg + StableDiffusionInpainting by @unography #1250
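The community pipeline wraps this combination for you; the sketch below wires the same idea up manually with the CLIPSeg model from `transformers` (checkpoint, input path, and threshold are illustrative assumptions):

```python
import numpy as np
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation
from diffusers import StableDiffusionInpaintPipeline

# 1. Segment the region described by a text prompt with CLIPSeg.
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
segmenter = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

image = Image.open("room.jpg").convert("RGB").resize((512, 512))  # illustrative input path
inputs = processor(text=["the sofa"], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = segmenter(**inputs).logits

heat = torch.sigmoid(logits).squeeze().numpy()
mask_image = Image.fromarray(((heat > 0.4) * 255).astype(np.uint8)).resize((512, 512))

# 2. In-paint inside the mask with Stable Diffusion.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
result = pipe(prompt="a red leather armchair", image=image, mask_image=mask_image).images[0]
result.save("inpainted.png")
```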
K-Diffusion wrapper
The K-Diffusion pipeline is a community pipeline that allows using any sampler from k-diffusion with `diffusers` models.
- [Community Pipelines] K-Diffusion Pipeline by @patrickvonplaten #1360
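A sketch of how the wrapper is meant to be used; the `custom_pipeline` id and the sampler name are taken from the community example and should be treated as assumptions, and the `k-diffusion` package needs to be installed separately:

```python
import torch
from diffusers import DiffusionPipeline

# Load the community pipeline wrapper (requires `pip install k-diffusion`).
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="sd_text2img_k_diffusion",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Pick any sampler exposed by k-diffusion, e.g. Heun or DPM++ variants.
pipe.set_scheduler("sample_heun")
image = pipe("an astronaut riding a horse on mars", num_inference_steps=25).images[0]
image.save("k_diffusion.png")
```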
New SOTA Scheduler
`DPMSolverMultistepScheduler` is the 🧨 diffusers implementation of DPM-Solver++, a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially halve generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.
You can use it like this:
```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
```
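Continuing from the snippet above, the speed-up comes from simply requesting fewer steps at call time:

```python
stable_diffusion = stable_diffusion.to("cuda")

# DPM-Solver++ reaches good quality in ~20 steps, versus ~50 with the default scheduler.
image = stable_diffusion("a photo of an astronaut riding a horse", num_inference_steps=20).images[0]
image.save("dpm_solver.png")
```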
Better scheduler API
The example above also demonstrates how to load schedulers using a new API that is coherent with model loading and therefore more natural and intuitive.
You can load a scheduler using `from_pretrained`, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:
```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
```
Read more about these changes in the documentation. See also the community pipeline that allows using any of the K-diffusion samplers with `diffusers`, as mentioned above!
Performance
We work relentlessly to incorporate performance optimizations and memory-reduction techniques into 🧨 diffusers. These are two of the most noteworthy additions in this release:
- Enable memory-efficient attention by default if xFormers is installed.
- Use batched matmuls when possible (see the toy illustration below).
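To illustrate the batched-matmul change (a toy comparison, not the library's actual attention code), computing all attention score matrices in one `torch.bmm` call is equivalent to, and much faster than, looping over the batch-and-head slices:

```python
import torch

batch, heads, tokens, dim = 2, 8, 77, 64
q = torch.randn(batch * heads, tokens, dim)
k = torch.randn(batch * heads, tokens, dim)

# Looped version: one matmul per (batch, head) slice.
looped = torch.stack([q[i] @ k[i].transpose(0, 1) for i in range(batch * heads)])

# Batched version: a single fused call over all slices.
batched = torch.bmm(q, k.transpose(1, 2))

assert torch.allclose(looped, batched, atol=1e-4)
```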
Quality of Life improvements
- Fix/Enable all schedulers for in-painting
- Easier loading of local pipelines
- CPU offloading: multi-GPU support (see the sketch below)
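A sketch of CPU offloading onto a specific GPU; the `enable_sequential_cpu_offload` method and its `gpu_id` argument are assumptions about how the feature is exposed, and `accelerate` must be installed:

```python
import torch
from diffusers import StableDiffusionPipeline

# Requires `accelerate`; gpu_id selects which device submodules are moved to on demand.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload(gpu_id=1)  # keep weights on CPU, run on cuda:1

image = pipe("a photo of a corgi wearing a top hat").images[0]
image.save("offloaded.png")
```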
Changelog
- Add multistep DPM-Solver discrete scheduler by @LuChengTHU in #1132
- Remove warning about half precision on MPS by @pcuenca in #1163
- Fix typo latens -> latents by @duongna21 in #1171
- Fix community pipeline links by @pcuenca in #1162
- [Docs] Add loading script by @patrickvonplaten in #1174
- Fix dtype safety checker inpaint legacy by @patrickvonplaten in #1137
- Community pipeline img2img inpainting by @vvvm23 in #1114
- [Community Pipeline] Add multilingual stable diffusion to community pipelines by @juancopi81 in #1142
- [Flax examples] Load text encoder from subfolder by @duongna21 in #1147
- Link to Dreambooth blog post instead of W&B report by @pcuenca in #1180
- Fix small typo by @pcuenca in #1178
- [DDIMScheduler] fix noise device in ddim step by @patil-suraj in #1189
- MPS schedulers: don't use float64 by @pcuenca in #1169
- Warning for invalid options without "--with_prior_preservation" by @shirayu in #1065
- [ONNX] Improve ONNXPipeline scheduler compatibility, fix safety_checker by @anton-l in #1173
- Restore compatibility with deprecated `StableDiffusionOnnxPipeline` by @pcuenca in #1191
- Update pr docs actions by @mishig25 in #1194
- handle dtype xformers attention by @patil-suraj in #1196
- [Scheduler] Move predict epsilon to init by @patrickvonplaten in #1155
- add licenses to pipelines by @natolambert in #1201
- Fix cpu offloading by @anton-l in #1177
- Fix slow tests by @patrickvonplaten in #1210
- [Flax] fix extra copy pasta by @camenduru in #1187
- [CLIPGuidedStableDiffusion] support DDIM scheduler by @patil-suraj in #1190
- Fix layer names convert LDM script by @duongna21 in #1206
- [Loading] Make sure loading edge cases work by @patrickvonplaten in #1192
- Add LDM Super Resolution pipeline by @duongna21 in #1116
- [Conversion] Improve conversion script by @patrickvonplaten in #1218
- DDIM docs by @patrickvonplaten in #1219
- apply `repeat_interleave` fix for mps to stable diffusion image2image pipeline by @jncasey in #1135
- Flax tests: don't hardcode number of devices by @pcuenca in #1175
- Improve documentation for the LPW pipeline by @exo-pla-net in #1182
- Factor out encode text with Copied from by @patrickvonplaten in #1224
- Match the generator device to the pipeline for DDPM and DDIM by @anton-l in #1222
- [Tests] Fix mps+generator fast tests by @anton-l in #1230
- [Tests] Adjust TPU test values by @anton-l in #1233
- Add a reference to the name 'Sampler' by @apolinario in #1172
- Fix Flax usage comments by @pcuenca in #1211
- [Docs] improve img2img example by @ruanrz in #1193
- [Stable Diffusion] Fix padding / truncation by @patrickvonplaten in #1226
- Finalize stable diffusion refactor by @patrickvonplaten in #1269
- Edited attention.py for older xformers by @Lime-Cakes in #1270
- Fix wrong link in text2img fine-tuning documentation by @daspartho in #1282
- [StableDiffusionInpaintPipeline] fix batch_size for mask and masked latents by @patil-suraj in #1279
- Add UNet 1d for RL model for planning + colab by @natolambert in #105
- Fix documentation typo for `UNet2DModel` and `UNet2DConditionModel` by @xenova in #1275
- add source link to composable diffusion model by @nanliu1 in #1293
- Fix incorrect link to Stable Diffusion notebook by @dhruvrnaik in #1291
- [dreambooth] link to bitsandbytes readme for installation by @0xdevalias in #1229
- Add Scheduler.from_pretrained and better scheduler changing by @patrickvonplaten in #1286
- Add AltDiffusion by @patrickvonplaten in #1299
- Better error message for transformers dummy by @patrickvonplaten in #1306
- Revert "Update pr docs actions" by @mishig25 in #1307
- [AltDiffusion] add tests by @patil-suraj in #1311
- Add improved handling of pil by @patrickvonplaten in #1309
- cpu offloading: mutli GPU support by @dblunk88 in #1143
- vq diffusion classifier free sampling by @williamberman in #1294
- doc string args shape fix by @kamalkraj in #1243
- [Community Pipeline] CLIPSeg + StableDiffusionInpainting by @unography in #1250
- Temporary local test for PIL_INTERPOLATION by @pcuenca in #1317
- Fix gpu_id by @anton-l in #1326
- integrate ort by @prathikr in #1110
- [Custom pipeline] Easier loading of local pipelines by @patrickvonplaten in #1327
- [ONNX] Support Euler schedulers by @anton-l in #1328
- img2text Typo by @patrickvonplaten in #1329
- add docs for multi-modal examples by @natolambert in #1227
- [Flax] Fix loading scheduler from subfolder by @skirsten in #1319
- Fix/Enable all schedulers for in-painting by @patrickvonplaten in #1331
- Correct path to schedlure by @patrickvonplaten in #1322
- Avoid nested fix-copies by @anton-l in #1332
- Fix img2img speed with LMS-Discrete Scheduler by @NotNANtoN in #896
- Fix the order of casts for onnx inpainting by @anton-l in #1338
- Legacy Inpainting Pipeline for Onnx Models by @ctsims in #1237
- Jax infer support negative prompt by @entrpn in #1337
- Update README.md: IMAGIC example code snippet misspelling by @ki-arie in #1346
- Update README.md: Minor change to Imagic code snippet, missing dir error by @ki-arie in #1347
- Handle batches and Tensors in `pipeline_stable_diffusion_inpaint.py:prepare_mask_and_masked_image` by @vict0rsch in #1003
- change the sample model by @shunxing1234 in #1352
- Add bit diffusion [WIP] by @kingstut in #971
- perf: prefer batched matmuls for attention by @Birch-san in #1203
- [Community Pipelines] K-Diffusion Pipeline by @patrickvonplaten in #1360
- Add Safe Stable Diffusion Pipeline by @manuelbrack in #1244
- [examples] fix mixed_precision arg by @patil-suraj in #1359
- use memory_efficient_attention by default by @patil-suraj in #1354
- Replace logger.warn by logger.warning by @regisss in #1366
- Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines by @jenkspt in #1289
- handle fp16 in `UNet2DModel` by @patil-suraj in #1216
- StableDiffusionImageVariationPipeline by @patil-suraj in #1365