Closed
Description
Describe the bug
The behavior of the option output_type = "latent"
is not the same for the inpainting and the txt2image pipeline.
Inpainting pipeline:
If output_type = "latent"
, the value returned is the decoded image (latents
should be returned in this case):
link to file
# 11. Post-processing
image = self.decode_latents(latents)
# 12. Run safety checker
image, has_nsfw_concept = self.run_safety_checker(image, device, prompt_embeds.dtype)
# 13. Convert to PIL
if output_type == "pil":
image = self.numpy_to_pil(image)
# Offload last model to CPU
if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None:
self.final_offload_hook.offload()
if not return_dict:
return (image, has_nsfw_concept)
return StableDiffusionPipelineOutput(images=image, nsfw_content_detected=has_nsfw_concept)
Text2Image pipeline:
Here when output_type == "latent"
is chosen, the latents
value is returned:
if output_type == "latent":
image = latents
has_nsfw_concept = None
elif output_type == "pil":
# 8. Post-processing
image = self.decode_latents(latents)
# 9. Run safety checker
image, has_nsfw_concept = self.run_safety_checker(image, device, prompt_embeds.dtype)
# 10. Convert to PIL
image = self.numpy_to_pil(image)
else:
# 8. Post-processing
image = self.decode_latents(latents)
# 9. Run safety checker
image, has_nsfw_concept = self.run_safety_checker(image, device, prompt_embeds.dtype)
=> I can try opening a PR for this :)
Reproduction
Not relevant here
Logs
No response
System Info
Not relevant here