
Some questions about the model in DDPM, hoping for clarification #31

@tongchangD

Description


Given $x_t=\sqrt{\bar\alpha_t}\,x_0+\sqrt{1-\bar\alpha_t}\,\varepsilon$, where $\bar\alpha_t$ decreases as $t$ grows: the weight on the original image shrinks while the weight on the noise grows.
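The closed-form forward jump above can be sketched as follows (a minimal sketch; the linear beta schedule and the names `n_steps`, `betas`, `alpha_bars` are assumptions for illustration, not taken from this repo):

```python
import torch

# Assumed linear beta schedule; alpha_bars is the cumulative product of (1 - beta_t)
n_steps = 1000
betas = torch.linspace(1e-4, 0.02, n_steps)
alphas = 1 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def sample_forward(x0, t, eps):
    """Jump from x_0 directly to x_t via x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps."""
    ab = alpha_bars[t].reshape(-1, 1, 1, 1)  # broadcast over (N, C, H, W)
    return torch.sqrt(ab) * x0 + torch.sqrt(1 - ab) * eps

x0 = torch.randn(4, 3, 32, 32)            # dummy batch standing in for images
t = torch.randint(0, n_steps, (4,))       # one timestep per sample
eps = torch.randn_like(x0)
x_t = sample_forward(x0, t, eps)
```

Because $\bar\alpha_t$ is a product of factors below 1, it is strictly decreasing, which is exactly the "image weight shrinks, noise weight grows" behavior described above.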

During training, the formula above lets us obtain $x_t$ directly from $x_0$. The UNet then takes $x_t$ and $t$ as input and is trained to predict the noise $\varepsilon$ that was added to $x_0$ to produce $x_t$:

    eps = torch.randn_like(src).to(device)  # noise with the same shape as x, drawn from a standard normal
    x_t = ddpm.sample_forward(src, t, eps)  # original image + timestep t + noise -> noisy image x_t
    eps_theta = net(x_t, t.reshape(current_batch_size, 1))  # predicted noise
    loss = loss_fn(eps_theta, eps)

At inference time, the mean of $x_{t-1}$ is computed as
$$\mu = \frac{1}{\sqrt{\alpha_t}}\left(x_t-\frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}}\,\varepsilon_\theta\right)$$
But in the implementation below, why does the second-to-last line return `mean + noise`? That is, given $x_t$, the function returns $x_{t-1} + \text{noise}$. Why add the noise?

    def sample_backward_step(self, x_t, t, net, simple_var=False):
        n = x_t.shape[0]
        # Broadcast the scalar timestep t to every sample in the batch
        t_tensor = torch.tensor([t] * n, dtype=torch.long).to(x_t.device).unsqueeze(1)
        eps = net(x_t, t_tensor)  # predicted noise eps_theta(x_t, t)
        if t == 0:
            noise = 0  # the final step is deterministic: no noise is added
        else:
            if simple_var:
                var = self.betas[t]  # simplified choice: sigma_t^2 = beta_t
            else:
                # posterior variance of q(x_{t-1} | x_t, x_0)
                var = (1 - self.alpha_bars[t - 1]) / (1 - self.alpha_bars[t]) * self.betas[t]
            noise = torch.randn_like(x_t)
            noise *= torch.sqrt(var)
        # posterior mean, with x_0 expressed through the predicted noise eps
        mean = (x_t - (1 - self.alphas[t]) / torch.sqrt(1 - self.alpha_bars[t]) * eps) / torch.sqrt(self.alphas[t])
        x_t = mean + noise
        return x_t
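For reference: in the DDPM formulation the reverse step draws a sample $x_{t-1}\sim\mathcal N(\mu,\sigma_t^2 I)$ rather than returning the mean alone, so `noise` is the stochastic part of that sample. The two variance choices in the code above can be compared numerically (a sketch using an assumed linear beta schedule; the schedule values are illustrative, not taken from this repo):

```python
import torch

# Assumed linear beta schedule (hypothetical values)
n_steps = 1000
betas = torch.linspace(1e-4, 0.02, n_steps)
alphas = 1 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

# "simple" choice: sigma_t^2 = beta_t (for t >= 1)
simple_var = betas[1:]
# true posterior variance of q(x_{t-1} | x_t, x_0)
posterior_var = (1 - alpha_bars[:-1]) / (1 - alpha_bars[1:]) * betas[1:]

ratio = posterior_var / simple_var  # how the two choices compare per timestep
```

The posterior variance is always at most $\beta_t$, and the two nearly coincide at large $t$; the DDPM paper reports similar sample quality with either choice.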

Looking forward to an explanation.
