[WIP] Add img2img #3426
Conversation
@yiyixuxu the UnCLIP scheduler doesn't have an `add_noise` method
yes yes!! see this comment here #3308 (comment) - let's just use DDPM for img2img. We will need to swap the UnCLIP scheduler for DDPM in the other 2 pipelines too, but we can wait and do that later.
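For context, img2img has to noise the encoded init image to an intermediate timestep before denoising, which is what `add_noise` does. A minimal sketch of that step with `DDPMScheduler` (shapes and the timestep are illustrative, not taken from this PR):

```python
import torch
from diffusers import DDPMScheduler

scheduler = DDPMScheduler(num_train_timesteps=1000)

latents = torch.randn(1, 4, 96, 96)  # stand-in for movq-encoded init-image latents
noise = torch.randn_like(latents)
t = torch.tensor([600])              # intermediate start timestep, chosen via `strength`

# forward-diffuse the init latents to timestep t; denoising then starts from here
noisy_latents = scheduler.add_noise(latents, noise, t)
```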
The documentation is not available anymore as the PR was closed or merged.
Got some initial results working:

```python
import requests
import torch
from io import BytesIO
from PIL import Image

from diffusers import (
    DDPMScheduler,
    KandinskyImg2ImgPipeline,
    KandinskyPipeline,
    KandinskyPriorPipeline,
)

ddpm_config = {
    "clip_sample": True,
    "clip_sample_range": 2.0,
    "sample_max_value": None,
    "num_train_timesteps": 1000,
    "prediction_type": "epsilon",
    "variance_type": "learned_range",
    "thresholding": True,
    "beta_schedule": "linear",
    "beta_start": 0.00085,
    "beta_end": 0.012,
}

url = "https://preview.redd.it/yu4maxz3dxo91.jpg?width=1024&format=pjpg&auto=webp&v=enabled&s=64ebd870f1f0cab6c94b5ee75ca03a53a1070068"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 768))

prompt = "A red cartoon frog, 4k"
device = "cuda"

# create the prior and use it to generate image embeddings from the prompt
pipe_prior = KandinskyPriorPipeline.from_pretrained("YiYiXu/Kandinsky-prior")
pipe_prior.to(device)

generator = torch.Generator(device=device).manual_seed(0)
image_emb = pipe_prior(prompt, generator=generator)
zero_image_emb = pipe_prior("")

# build the img2img pipeline from the text2img components, swapping in DDPM
pipe = KandinskyPipeline.from_pretrained("YiYiXu/Kandinsky")
ddpm = DDPMScheduler(**ddpm_config)
generator = torch.Generator(device=device).manual_seed(0)
pipe_img2img = KandinskyImg2ImgPipeline(
    text_encoder=pipe.text_encoder,
    tokenizer=pipe.tokenizer,
    text_proj=pipe.text_proj,
    unet=pipe.unet,
    scheduler=ddpm,
    movq=pipe.movq,
)
pipe_img2img.to(device)

out = pipe_img2img(
    prompt=prompt,
    image=init_image,
    height=768,
    width=768,
    num_inference_steps=100,
    generator=generator,
    image_embeds=image_emb,
    negative_image_embeds=zero_image_emb,
    strength=0.2,
)
out[0][0]
```
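As an aside, `strength=0.2` means denoising starts late in the schedule. A sketch of the usual diffusers convention for mapping `strength` to a starting point, using a hypothetical helper `get_start_timesteps` (the exact logic in this PR may differ):

```python
from diffusers import DDPMScheduler

def get_start_timesteps(scheduler, num_inference_steps, strength):
    # keep only the final `strength` fraction of the schedule:
    # strength=1.0 denoises from pure noise, strength->0 barely changes the image
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return scheduler.timesteps[t_start:]

scheduler = DDPMScheduler()
scheduler.set_timesteps(100)
timesteps = get_start_timesteps(scheduler, 100, strength=0.2)  # ~20 denoising steps
```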
@ayushtues
They use the DDIM scheduler; this is what I got from the original repo:

```python
import requests
from io import BytesIO
from PIL import Image

from kandinsky2 import get_kandinsky2

model = get_kandinsky2(
    "cuda",
    task_type="text2img",
    cache_dir=".",
    model_version="2.1",
)

url = "https://preview.redd.it/yu4maxz3dxo91.jpg?width=1024&format=pjpg&auto=webp&v=enabled&s=64ebd870f1f0cab6c94b5ee75ca03a53a1070068"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 768))

out = model.generate_img2img(
    prompt="A red cartoon frog, 4k",
    pil_img=init_image,
    strength=0.8,
    h=768,
    w=768,
)
out[0]
```
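To quantify how close the two pipelines' outputs are, a quick pixel-wise comparison of the two PIL images could look like this (`pixel_diff` is just an illustrative helper and assumes both images have the same size):

```python
import numpy as np

def pixel_diff(img_a, img_b):
    # compare two same-size PIL images pixel by pixel
    a = np.asarray(img_a, dtype=np.float32)
    b = np.asarray(img_b, dtype=np.float32)
    return np.abs(a - b).max(), np.abs(a - b).mean()

# e.g. pixel_diff(out_diffusers[0][0], out_original[0])
```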
Would we expect to get exactly the same image if I replace our scheduler with DDIM?
I think we will have to change the code slightly to make it match exactly - I did that for text2img, and I can help run it to make sure once this is merged in anyway. Let me know when this is ready to merge - we can always make changes later from the base PR! Also, do you have a Twitter handle?
Ohh, another thing you should try is replacing the UnCLIP scheduler in the text2img pipeline with DDPM - that should be easy since you already did it for img2img! There we will need to see the same results as with the UnCLIP scheduler.
@yiyixuxu I replaced UnCLIP with DDPM in text2img and found that it produced different images. There seem to be slight implementation differences between the UnCLIP and DDPM schedulers, so getting exactly the same results from them is not possible. Also, my Twitter handle is ayush_tues.
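One way to localize such a divergence (a debugging sketch, not code from this PR) is to run a single step of each scheduler on identical inputs and compare the results:

```python
import torch
from diffusers import DDPMScheduler, UnCLIPScheduler

# both configured with learned-range variance, as the Kandinsky UNet predicts it;
# matching beta schedules so only the step logic differs
ddpm = DDPMScheduler(variance_type="learned_range", clip_sample=True, beta_schedule="squaredcos_cap_v2")
unclip = UnCLIPScheduler(variance_type="learned_range", clip_sample=True)
ddpm.set_timesteps(50)
unclip.set_timesteps(50)

sample = torch.randn(1, 4, 64, 64)
model_output = torch.randn(1, 8, 64, 64)  # predicted mean + variance channels

t = ddpm.timesteps[0]
out_ddpm = ddpm.step(model_output, t, sample, generator=torch.Generator().manual_seed(0)).prev_sample
out_unclip = unclip.step(model_output, t, sample, generator=torch.Generator().manual_seed(0)).prev_sample
print("max abs diff after one step:", (out_ddpm - out_unclip).abs().max().item())
```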
Other than that, if we are fine with not getting exactly the same results as the original repo, I think we can merge this PR and then make changes to the schedulers in the base PR itself. Or I can dig deeper into their DDPM implementation and figure out where the difference is.
@ayushtues I see - I will update our DDPM scheduler to make sure it works for our model; don't worry about that for now. Can you add a test for img2img like the one I added for inpainting here? https://github.com/huggingface/diffusers/pull/3308/files#diff-a94251ed6c7af41b0a066c4bc7cc78cadc0e0fe9c766c51ed5d308a383626cc1
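For reference, a rough skeleton of what such a test could look like. Everything here is hypothetical: `build_dummy_pipeline` stands in for the dummy-component setup used by the existing pipeline tests, and the embedding shapes and output handling are placeholders; the real test should mirror the inpainting test linked above.

```python
import numpy as np
import torch
from PIL import Image

def test_kandinsky_img2img_smoke():
    # `build_dummy_pipeline` is an assumed helper that assembles a
    # KandinskyImg2ImgPipeline from small dummy components
    pipe = build_dummy_pipeline()
    init_image = Image.fromarray(np.zeros((64, 64, 3), dtype=np.uint8))
    generator = torch.Generator().manual_seed(0)
    out = pipe(
        prompt="a frog",
        image=init_image,
        image_embeds=torch.zeros(1, 32),           # placeholder embedding shape
        negative_image_embeds=torch.zeros(1, 32),  # placeholder embedding shape
        height=64,
        width=64,
        strength=0.5,
        num_inference_steps=2,
        generator=generator,
        output_type="np",
    )
    image = out[0][0]
    assert image.shape == (64, 64, 3)
```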
This reverts commit 88efed5.
@ayushtues I merged this in and will make changes from my PR - thanks for the great work!
Adding an img2img pipeline for Kandinsky, part of #3308.