[Stable Diffusion] Generating an image, then creating views from different angles with the Depth Maps extension

I haven't had time to try this myself yet, but it looks like a technique with real potential.

A user on reddit posted the following!

I found a way to create different consistent angles from the same image. I generated the image with SD then in Blender rotated the angle I desired using the depth map of the image and screen-printed it. The side shot had a lot of distortions so I dropped it back in SD img2img and it is fixed

It came with the video below~

I couldn't download the video, so images only.

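According to this user, they generated the space shown below from a prompt in SD, then used the image's depth map in Blender to rotate it to a different angle and captured the result, which gives a different-angle view of the same image. The side view came out with a lot of distortion, so they sent it back through img2img to clean it up!

Hmm.. thinking it over, it makes sense!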
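
To see why it makes sense, here is a minimal sketch in Python with NumPy. It is not what the reddit user ran (they used Blender) and it is not code from any extension. The function name rotate_view, its parameters, the simple pinhole camera, and the choice to pivot around the mean depth are all assumptions made up for illustration. The idea: lift every pixel to a 3D point using the depth map, rotate the points around a vertical axis, and project them back. It assumes a depth map where larger values mean farther away, so a near-is-bright map would have to be inverted first.

```python
import numpy as np


def rotate_view(image, depth, yaw_deg, focal=1.0):
    """Forward-warp an image to a view rotated by yaw_deg (illustrative only).

    image: (H, W, C) array; depth: (H, W) array, larger = farther (assumption).
    Returns (warped image, holes), where holes is True for pixels that
    received no source pixel.
    """
    h, w = depth.shape
    channels = image.shape[-1]

    # Normalised pixel-centre coordinates: x in [-1, 1], y scaled by aspect ratio.
    xs = (np.arange(w) + 0.5) / w * 2.0 - 1.0
    ys = ((np.arange(h) + 0.5) / h * 2.0 - 1.0) * (h / w)
    x, y = np.meshgrid(xs, ys)
    z = depth.astype(np.float64)

    # Back-project each pixel to a 3D point (pinhole camera).
    X, Y = x * z / focal, y * z / focal

    # Rotate around a vertical axis through the mean depth of the scene.
    pivot = z.mean()
    a = np.radians(yaw_deg)
    Xr = np.cos(a) * X + np.sin(a) * (z - pivot)
    Zr = -np.sin(a) * X + np.cos(a) * (z - pivot) + pivot

    # Project back to integer pixel indices, dropping points behind the camera
    # or outside the frame.
    valid = Zr > 1e-6
    safe_z = np.where(valid, Zr, 1.0)
    u = np.floor((focal * Xr / safe_z + 1.0) / 2.0 * w).astype(int)
    v = np.floor((focal * Y / safe_z * (w / h) + 1.0) / 2.0 * h).astype(int)
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Z-buffer: sort by target pixel, then by depth, and keep the nearest point.
    target = (v * w + u)[valid]
    source = np.flatnonzero(valid)
    order = np.lexsort((Zr[valid], target))
    target, source = target[order], source[order]
    hit, first = np.unique(target, return_index=True)

    out = np.zeros((h * w, channels), dtype=image.dtype)
    out[hit] = image.reshape(-1, channels)[source[first]]
    holes = np.ones(h * w, dtype=bool)
    holes[hit] = False
    return out.reshape(h, w, channels), holes.reshape(h, w)
```

Pixels that nothing lands on come back in the holes mask. In this sketch the gaps show up as soon as the view is turned noticeably, which would fit the distortion the user saw on the side shot and suggests why a pass through img2img is a natural way to patch it.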

First, you need to install the High Resolution Depth Maps for Stable Diffusion WebUI extension from the link below.

https://github.com/thygate/stable-diffusion-webui-depthmap-script.git

This script is an addon for AUTOMATIC1111’s Stable Diffusion WebUI that creates depth maps, and now also 3D stereo image pairs as side-by-side or anaglyph from a single image. The result can be viewed on 3D or holographic devices like VR headsets or Looking Glass displays, used in Render- or Game- Engines on a plane with a displacement modifier, and maybe even 3D printed.

To generate realistic depth maps from a single image, this script uses code and models from the MiDaS repository by Intel ISL, or LeReS from the AdelaiDepth repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by BoostingMonocularDepth is used to generate high resolution depth maps.

3D stereo, and red/cyan anaglyph images are generated using code from the stereo-image-generation repository. Thanks to @sina-masoud-ansari for the tip! Discussion here. Improved techniques for generating stereo images and balancing distortion between eyes by @semjon00, see here and here.

3D Photography using Context-aware Layered Depth Inpainting by Virginia Tech Vision and Learning Lab , or 3D-Photo-Inpainting is used to generate a 3D inpainted mesh and render videos from said mesh.

Rembg by @DanielGatis support added by @graemeniedermayer, using U-2-Net by @xuebinqin to remove backgrounds.

That's what the README says!

In short:

The script is an add-on for AUTOMATIC1111's Stable Diffusion WebUI. From a single image it creates a depth map, and it can now also produce 3D stereo pairs, either side by side or as an anaglyph. The results can be viewed on 3D or holographic devices such as VR headsets and Looking Glass displays, applied in render or game engines to a plane with a displacement modifier, and perhaps even 3D printed.

Depth is estimated with code and models from MiDaS (Intel ISL) or LeReS (from the AdelaiDepth repository), and the multi-resolution merging from BoostingMonocularDepth is what makes the high-resolution maps possible.

Stereo and red/cyan anaglyph images come from the stereo-image-generation repository's code, with later improvements by @semjon00 for generating stereo images and balancing distortion between the eyes.

3D-Photo-Inpainting (Virginia Tech Vision and Learning Lab) is used to build a 3D inpainted mesh and render videos from it.

Background removal is available through Rembg by @DanielGatis, added by @graemeniedermayer and based on U-2-Net by @xuebinqin.
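
To get a feel for the stereo part, here is a rough NumPy sketch of how a depth map can turn one image into a left/right pair and a red/cyan anaglyph. This is not the extension's code or the stereo-image-generation repository's code. The names shift_view, anaglyph and max_shift are hypothetical, and the sketch assumes a depth map where larger values mean nearer. Each pixel is shifted sideways by an amount proportional to how near it is, and gaps are crudely filled with the original pixel.

```python
import numpy as np


def shift_view(image, depth_near, max_shift, direction):
    """Shift pixels horizontally; nearer pixels (larger depth_near) move more.

    direction is +1 for the left-eye view and -1 for the right-eye view.
    """
    d = np.asarray(depth_near, dtype=np.float64)
    h, w = d.shape
    span = d.max() - d.min()
    near = (d - d.min()) / span if span > 0 else np.zeros_like(d)
    mag = np.rint(near * max_shift).astype(int)  # shift in whole pixels

    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    # Paint small shifts (far) first so larger shifts (near) overwrite them.
    for level in np.unique(mag):
        rows, src = np.nonzero(mag == level)
        dst = src + direction * level
        ok = (dst >= 0) & (dst < w)
        out[rows[ok], dst[ok]] = image[rows[ok], src[ok]]
        filled[rows[ok], dst[ok]] = True
    # Crude hole filling: keep the original pixel where nothing landed.
    return np.where(filled[..., None], out, image)


def anaglyph(image, depth_near, max_shift=8):
    """Red channel from the left-eye view, green and blue from the right-eye view."""
    left = shift_view(image, depth_near, max_shift, +1)
    right = shift_view(image, depth_near, max_shift, -1)
    out = right.copy()
    out[..., 0] = left[..., 0]
    return out
```

A side-by-side pair would simply be the two shifted views concatenated horizontally instead of merged by channel. A real implementation would need much better hole filling than this, which is presumably where the distortion-balancing work mentioned in the README comes in.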