
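PhotoMaker: Tencent's newly open-sourced tool that generates images of a person in many styles from their photos, ready for practical use

Published: June 6, 2024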

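Project overview

PhotoMaker is a photo generation tool recently open-sourced by Tencent. It is an efficient method for personalized text-to-image generation. Besides generating realistic photos of people from text descriptions, it offers a "stacked ID embedding" feature, which takes several photos of a person as the identity (ID) reference, extracts that person's features, and uses them to create new, personalized images of that person.

PhotoMaker marks progress in text-to-image generation, particularly in personalization and realism. The technique could be useful in fields such as art creation and media publishing.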


Dependencies and installation

Python >= 3.8 (Anaconda or Miniconda is recommended)

PyTorch >= 2.0.0

conda create --name photomaker python=3.10

conda activate photomaker

pip install -U pip

# Install requirements

pip install -r requirements.txt

# Install photomaker

pip install git+https://github.com/TencentARC/PhotoMaker.git

After installation, the pipeline can be imported as follows:

from photomaker import PhotoMakerStableDiffusionXLPipeline
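Once everything is installed, a quick sanity check against the version requirements listed above (Python >= 3.8, PyTorch >= 2.0.0) could look like the following. This is a minimal sketch; the helper names `parse_version`, `meets_minimum`, and `check_environment` are made up for illustration and are not part of PhotoMaker.

```python
import re
import sys


def parse_version(version: str) -> tuple:
    """Extract the leading numeric parts, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (as in '3.10') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum` (a 3-tuple of ints)."""
    return parse_version(version) >= minimum


def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, (2, 0, 0)):
            problems.append(f"PyTorch >= 2.0.0 is required, found {torch.__version__}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per unmet requirement and prints nothing when the environment satisfies both.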

How to test

Use it the same way you would use diffusers

· Dependencies

import torch

import os

from diffusers.utils import load_image

from diffusers import EulerDiscreteScheduler

from photomaker import PhotoMakerStableDiffusionXLPipeline

### Load base model

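# base_model_path, photomaker_path and device are not defined in this article; set them before running
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16",
).to(device)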

### Load PhotoMaker checkpoint

pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img",  # define the trigger word
)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

### Also can cooperate with other LoRA modules

# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")

# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
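The loading code above passes the checkpoint to `load_photomaker_adapter` in two pieces: the directory of `photomaker_path` and its file name. The sketch below isolates that splitting step so it can be checked without loading any model. The helper `photomaker_adapter_args` and the example path are hypothetical and only serve as an illustration.

```python
import os


def photomaker_adapter_args(photomaker_path: str, trigger_word: str = "img"):
    """Hypothetical helper: split a checkpoint path into the arguments
    that the article passes to pipe.load_photomaker_adapter(...)."""
    directory = os.path.dirname(photomaker_path)
    weight_name = os.path.basename(photomaker_path)
    if not weight_name:
        raise ValueError("photomaker_path must point to a file, not a directory")
    positional = (directory,)
    keyword = {
        "subfolder": "",
        "weight_name": weight_name,
        "trigger_word": trigger_word,
    }
    return positional, keyword


# The path below is a made-up placeholder, not a location the article specifies.
args, kwargs = photomaker_adapter_args("/models/photomaker/photomaker-v1.bin")
print(args, kwargs)

# With a pipeline created as in the article, the call would then be:
# pipe.load_photomaker_adapter(*args, **kwargs)
```

Keeping the split in one place makes it easier to spot a path that accidentally ends in a separator, which would otherwise produce an empty weight name.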

· Input ID images

### define the input ID images

input_folder_name = './examples/newton_man'

image_basename_list = os.listdir(input_folder_name)

image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []

for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
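The snippet above feeds every entry returned by `os.listdir` to `load_image`, so a stray non-image file or a subdirectory in the folder would cause a failure. A more defensive variant might filter by file extension first. This is only a sketch: the helper name `list_id_image_paths` and the set of accepted extensions are assumptions, not something PhotoMaker prescribes.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your images use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def list_id_image_paths(folder: str) -> list:
    """Return the sorted paths of regular files in `folder` whose
    extension looks like an image (case-insensitive)."""
    paths = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(str(p) for p in paths)


# Usage with the article's code (requires diffusers and the example folder):
# input_id_images = [load_image(p) for p in list_id_image_paths('./examples/newton_man')]
```

Sorting keeps the order of ID images deterministic across runs, as the original `sorted(...)` call does.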

· Generation

# Note that the trigger word `img` must follow the class word for personalization

prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"

negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale"

generator = torch.Generator(device=device).manual_seed(42)

# num_steps is not defined in this article; set it to the desired number of sampling steps
image = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=num_steps,
    start_merge_step=10,
    generator=generator,
).images[0]

image.save('out_photomaker.png')
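Because the trigger word `img` has to follow the class word (for example "man img"), it can help to check a prompt before spending time on generation. The following is a rough sketch of such a check. The function name `find_trigger_problems` is hypothetical, and the rule that the trigger word should appear exactly once is an assumption rather than something this article states.

```python
def find_trigger_problems(prompt: str, trigger_word: str = "img") -> list:
    """Return a list of problems with how the trigger word is used.

    An empty list means the prompt passed these simple checks.
    """
    # Treat commas as separators so that "man img," still yields the token "img".
    tokens = prompt.replace(",", " ").split()
    positions = [i for i, token in enumerate(tokens) if token == trigger_word]

    problems = []
    if not positions:
        problems.append(f"trigger word '{trigger_word}' is missing")
    elif len(positions) > 1:
        # Assumption: only a single trigger word per prompt is supported.
        problems.append(f"trigger word '{trigger_word}' appears more than once")
    elif positions[0] == 0:
        problems.append("trigger word has no class word before it")
    return problems


prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
print(find_trigger_problems(prompt))  # prints []
```

The check is purely textual: it cannot tell whether the word before `img` is a sensible class word such as "man" or "woman", only that some word precedes it.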

Launch a local Gradio demo

Run the following command:

python gradio_demo/app.py

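Usage tips

Upload more photos of the person to be customized to improve ID fidelity. If the input is an Asian face, consider adding "asian" before the class word, for example "asian woman img".

When stylizing, if the generated face looks too realistic, adjust the style strength to 30-50. The larger the number, the lower the ID fidelity but the stronger the stylization. You can also try other base models or LoRAs with good stylization effects.

Reducing the number of generated images and the number of sampling steps speeds up generation. However, fewer sampling steps may reduce ID fidelity.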
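The style strength is a setting of the demo, while the script earlier in this article uses `start_merge_step=10`. One plausible way to relate the two is to read the style strength as the percentage of sampling steps that run before the ID embedding is merged in. The sketch below follows that reading. The formula, the cap of 30 steps, and the function name are all assumptions for illustration and are not stated in this article.

```python
def start_merge_step_from_style_strength(
    style_strength: float,
    num_steps: int,
    max_merge_step: int = 30,  # assumed upper bound, not taken from the article
) -> int:
    """Convert a style strength in percent (0-100) into a start_merge_step.

    Assumption: the ID embedding is merged after `style_strength` percent
    of the sampling steps, so a larger value leaves more steps for pure
    text-driven stylization and fewer for identity preservation.
    """
    if not 0 <= style_strength <= 100:
        raise ValueError("style_strength must be between 0 and 100")
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    step = int(style_strength * num_steps / 100)
    return min(step, max_merge_step)


for strength in (20, 30, 50):
    print(strength, start_merge_step_from_style_strength(strength, num_steps=50))
```

Under this reading, a style strength of 20 with 50 sampling steps gives a merge step of 10, and raising the strength to the suggested 30-50 range pushes the merge later, which fits the tip that higher values trade ID fidelity for stylization.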

Project link

https://github.com/TencentARC/PhotoMaker

Source: https://mp.weixin.qq.com/s/SY-JGJfm728Gs6IyWgnGoA
