import PIL.Image
import requests
import torch
from io import BytesIO
from diffusers import StableDiffusionInpaintPipeline
def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url)
mask_image = download_image(mask_url)
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, height=1024, width=1024).images[0]
I got this error:
│ D:\App\miniconda\envs\aigc\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stabl │
│ e_diffusion_inpaint.py:871 in __call__ │
│ │
│ 868 │ │ │ │ │
│ 869 │ │ │ │ # concat latents, mask, masked_image_latents in the channel dimension │
│ 870 │ │ │ │ latent_model_input = self.scheduler.scale_model_input(latent_model_input │
│ ❱ 871 │ │ │ │ latent_model_input = torch.cat([latent_model_input, mask, masked_image_l │
│ 872 │ │ │ │ │
│ 873 │ │ │ │ # predict the noise residual │
│ 874 │ │ │ │ noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=prom │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 128 but got size 64 for tensor number 2 in the list.
If I set height=520, width=520 instead, I get this error:
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 65 but got size 64 for tensor number 2 in the list.
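As a rough sanity check on the numbers in both messages, here is a minimal sketch of the arithmetic. It assumes the latents are 1/8 of the pixel size on each side and that the downloaded example image is 512x512; neither is confirmed from the pipeline source, and `ASSUMED_VAE_SCALE_FACTOR`, `ASSUMED_INPUT_SIDE`, and `latent_side` are hypothetical names used only for illustration.

```python
# Assumption: the VAE shrinks each spatial side by a factor of 8.
# This is a guess that happens to match the sizes in the errors above.
ASSUMED_VAE_SCALE_FACTOR = 8

# Assumption: the downloaded example image and mask are 512x512 pixels.
ASSUMED_INPUT_SIDE = 512


def latent_side(pixels: int, factor: int = ASSUMED_VAE_SCALE_FACTOR) -> int:
    """Return the latent side length for a given pixel side length."""
    return pixels // factor


for requested in (1024, 520):
    expected = latent_side(requested)        # size implied by height/width
    got = latent_side(ASSUMED_INPUT_SIDE)    # size implied by the input image
    print(f"height=width={requested}: expected {expected}, input gives {got}")
```

Under these assumptions the requested sizes give 128 and 65, while the unresized input gives 64, which matches both messages. If that reading is right, one thing to try is resizing both inputs to the target size before calling the pipeline, for example `init_image.resize((1024, 1024))` and `mask_image.resize((1024, 1024))`.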
Can I customize the height and width of the output image? If so, what are the requirements for height and width?
Thanks for your help!