Questions about equations in the paper. #4

Open
frankkim1108 opened this issue Sep 18, 2024 · 1 comment


frankkim1108 commented Sep 18, 2024

Hello, @liuff19

I recently found your project very interesting and started reading your paper. However, I have some questions about a few of its equations.

For equation (12)
$$ L_{\text{diffusion}} = \mathbb{E}_{x \sim p,\ \epsilon \sim \mathcal{N}(0, I),\ c_{\text{view}},\ c_{\text{struc}},\ t} \left[ \left\| \epsilon - \epsilon_{\theta} (x_t, t, c_{\text{view}}, c_{\text{struc}}) \right\|^2_2 \right] $$

The paper says that $x_t$ is the noise latent from the ground-truth views of the training data. Which ground-truth views are you referring to?

Section 5.4 says:
'For the generated frames $\{I_i\}_{i=1}^{K'}$ we denote $\hat{C}_i$ and $C_i$ the per-pixel color value for generated and ground-truth view $i$.'

What do you mean by ground-truth view $C_i$?

It also appears in equation (13)

$$ L_{I_i} = - \log \left( \frac{1}{\sqrt{2 \pi \sigma_i^2}} \exp \left( -\frac{\left\| \hat{C}_i - C_i \right\|^2}{2 \sigma_i^2} \right) \right) $$
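
For reference, equation (13) can be expanded by taking the logarithm of each factor. This is a standard identity for the Gaussian negative log-likelihood rather than something quoted from the paper, and it writes $\hat{C}_i$ for the generated color as in the Section 5.4 excerpt:

$$ L_{I_i} = \frac{\left\| \hat{C}_i - C_i \right\|^2}{2 \sigma_i^2} + \frac{1}{2} \log \left( 2 \pi \sigma_i^2 \right) $$

The first term is a squared error between generated and ground-truth colors scaled by the uncertainty $\sigma_i$, and the second term penalizes large $\sigma_i$. In this form it is clearer why the definition of the ground-truth color $C_i$ matters: the loss cannot be evaluated without it.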

For equation (14)
$$ L_{\text{conf}} = \sum_{i=1}^{K'} C_i \left( \lambda_{\text{rgb}} L_1(\hat{I}_i, I_i) + \lambda_{\text{ssim}} L_{\text{ssim}}(\hat{I}_i, I_i) + \lambda_{\text{lpips}} L_{\text{lpips}}(\hat{I}_i, I_i) \right) $$

In this equation, the loss seems to be calculated between the 32 generated frames $\hat{I}_i$ and their GT frames $I_i$.
Which GT frames are you comparing against? Are they the sparse input views, or frames from the training dataset videos?

Thank you in advance for your time to reply to this issue.

Best regards
Frank


wzy-99 commented Sep 19, 2024

Hello, @liuff19

Could you give more detail about the process of obtaining [image]?

As stated in the paper:

[image]

Does this equation produce only one image for view i when training with equation (13)? Is it possible to optimize equation (13) with only one sample?

Or does the equation return a set of images for view i, as in:
[image]

Then we could use that set of data to optimize equation (13) and obtain [image] and [image].
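
To make the single-sample question concrete, here is a minimal sketch of equation (13) for one pixel. It assumes a fixed residual between the generated and ground-truth colors and a single scalar sigma. The pixel values, the grid search, and the function name `gaussian_nll` are all hypothetical. Nothing in this thread confirms how the authors actually estimate sigma.

```python
import math

def gaussian_nll(c_hat, c, sigma):
    """Equation (13) for one pixel: -log of a Gaussian likelihood with std sigma."""
    sq_err = sum((a - b) ** 2 for a, b in zip(c_hat, c))
    return sq_err / (2.0 * sigma ** 2) + 0.5 * math.log(2.0 * math.pi * sigma ** 2)

# Hypothetical single sample: generated vs. ground-truth RGB values.
c_hat = (0.60, 0.40, 0.20)
c = (0.50, 0.45, 0.20)

# Scan sigma over a grid from 0.01 to 1.0 and keep the value with the lowest loss.
sigmas = [0.01 + 0.0005 * k for k in range(1981)]
best_sigma = min(sigmas, key=lambda s: gaussian_nll(c_hat, c, s))

# Setting the derivative with respect to sigma^2 to zero gives sigma^2 = squared error.
closed_form_sigma = math.sqrt(sum((a - b) ** 2 for a, b in zip(c_hat, c)))

print(best_sigma, closed_form_sigma)
```

Under these assumptions the loss has a well-defined minimizer even with one sample: the sigma that minimizes it equals the magnitude of the residual. Whether that is the intended behavior, or whether a set of samples per view is used instead, is exactly what the question above asks the authors to clarify.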
