MD5 check error while installing environment #748

Open
2 tasks done
romulorosa opened this issue Nov 13, 2024 · 9 comments


@romulorosa

Checklist

  • I added a descriptive title
  • I searched open reports and couldn't find a duplicate

What happened?

I generated a conda-lock file and later tried to create an environment based on the conda-lock file. However, while resolving the packages, an error was thrown saying that the expected MD5 did not match the current MD5.

In some cases, I checked the MD5 values in my conda-lock file and they matched the ones at https://conda.anaconda.org and https://repo.anaconda.com, yet conda-lock install still reported a different MD5.
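
A minimal sketch of that kind of check, assuming the package archive is already on disk (for example in the package cache): it hashes the file with Python's hashlib and compares the digest to the MD5 recorded in the lockfile. The function names are illustrative and not part of conda or conda-lock.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """True if the file's MD5 equals the expected hex digest (case-insensitive)."""
    return file_md5(path) == expected_md5.strip().lower()
```

Running matches_expected on a cached archive with the MD5 from the lockfile shows whether the file on disk is what the lockfile expects, independently of what the install step reports.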

What solved my problem was to delete all the content in ~/miniconda3/pkgs/.

How to reproduce:

When I regenerated the conda-lock file for the first time, I noticed that it contained many .tar.bz2 packages instead of the usual .conda packages. When I tried to create the environment, conda-lock install returned 404 for most of the .tar.bz2 files.

I then generated the conda-lock file once more, and this time most of the packages were .conda files. When I tried to install the environment again, the 404 errors were gone but the MD5 errors appeared.
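
To compare two lockfile generations like this, a small sketch can count how many package URLs use each archive format. It assumes the URLs have already been extracted from the lockfile into a list of strings; the helper names are hypothetical.

```python
from collections import Counter


def package_format(url):
    """Classify a package URL as '.conda', '.tar.bz2', or 'other' by file name."""
    # Drop any '#<hash>' suffix, then keep only the last path component.
    name = url.split("#", 1)[0].rsplit("/", 1)[-1]
    if name.endswith(".conda"):
        return ".conda"
    if name.endswith(".tar.bz2"):
        return ".tar.bz2"
    return "other"


def count_formats(urls):
    """Return a Counter mapping archive format to number of URLs."""
    return Counter(package_format(u) for u in urls)
```

A sudden shift in these counts between two runs of conda-lock on the same environment file would match the behaviour described above.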

So, in order to reproduce the error do the following:

  1. Create a conda environment file that contains at least one package that is available as both .conda and .tar.bz2 (such as aiohappyeyeballs)
  2. Install the environment from the conda-lock file
  3. Edit the conda-lock file to switch the package to the other extension (if it was originally .conda, change it to .tar.bz2) and update the MD5 and SHA to the values listed in the repository (I was using https://repo.anaconda.com/pkgs/main/linux-64/)
  4. Try to install the environment again and you should see the error

Additional Context

Setup:
conda 24.9.2
conda-lock, version 2.5.6
OS: Debian Linux f1fabfdb4ca8 5.15.49-linuxkit #1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 x86_64 GNU/Linux


@maresb
Contributor

maresb commented Nov 13, 2024

Thanks @romulorosa for the detailed report!

It seems to me like the root of the problem is that conda-lock is incorrectly changing the extensions of the files. You should expect an MD5 error if the extensions change, right?

Could you please check if you can reproduce the error from main? You can install it via

pipx install --force 'git+https://github.com/conda/conda-lock@main'

where pipx can be replaced with pip and main can be replaced with a current Git SHA.

@ctcjab

ctcjab commented Nov 18, 2024

All the conda-lock users that I support at my firm have been hitting this issue too, and unfortunately it still reproduces with the latest development version of conda-lock.

> conda-lock --version
conda-lock, version 2.5.8.dev333+g5c5d6c1

> conda-lock install -n repro conda-lock.yml
ERROR:root:Conda detected a mismatch between the expected content and downloaded content
ERROR:root:for url 'https://artifactory.../linux-64/protobuf-5.27.5-py310hf71b8c6_0.conda'.
ERROR:root:  download saved to: /.../protobuf-5.27.5-py310hf71b8c6_0.conda
ERROR:root:  expected md5: 6b29f0a551bc9d7829b5ba90cc19a365
ERROR:root:  actual md5: ba0e6b7b9ab20a576e7cecff14816c5d
...

@maresb
Contributor

maresb commented Nov 18, 2024

Thanks for the additional info! I haven't been able to reproduce this yet.

I'm using Docker in an attempt to achieve a portable reproducer.

Dockerfile:

FROM mambaorg/micromamba:2.0.2

RUN micromamba install -y conda-lock
COPY environment.yml /tmp
ARG MAMBA_DOCKERFILE_ACTIVATE=1
RUN conda-lock
RUN conda-lock install -n conda-lock-748-test

environment.yml:

channels:
- conda-forge
dependencies:
- protobuf
platforms:
- linux-64

Any ideas? Thanks!

@ctcjab

ctcjab commented Nov 19, 2024

I was able to intercept the temporary file that conda-lock install is creating, save a copy, and transform it for external reproducibility (e.g. remove packages that are only available internally, and rewrite URLs to use conda-forge directly rather than go through our internal Artifactory instance). Please see attached repro.txt.
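
For anyone inspecting such a file, here is a minimal sketch of a parser. It assumes repro.txt follows conda's explicit spec format: an @EXPLICIT marker followed by one package URL per line, each optionally suffixed with #<md5>. The function name is hypothetical.

```python
def parse_explicit_spec(text):
    """Return a list of (url, md5_or_None) pairs from explicit-spec text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines, and the @EXPLICIT marker itself.
        if not line or line.startswith("#") or line == "@EXPLICIT":
            continue
        url, _, md5 = line.partition("#")
        entries.append((url, md5 or None))
    return entries
```

The resulting pairs give the expected MD5 for each URL, which can then be compared against whatever is sitting in the package cache.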

> conda create -y -n tmp --file repro.txt  # mimic the 'conda create' command that 'conda-lock install' runs

Downloading and Extracting Packages:


ChecksumMismatchError: Conda detected a mismatch between the expected content and downloaded content
for url 'https://conda.anaconda.org/conda-forge/linux-64/unicodedata2-15.1.0-py310h2372a71_0.conda'.
  download saved to: /ctc/users/bronsonj/conda_pkgs/unicodedata2-15.1.0-py310h2372a71_0.conda
  expected md5: 72637c58d36d9475fda24700c9796f19
  actual md5: 6aa8b34b52cf3ef421104720cac95423

ChecksumMismatchError: Conda detected a mismatch between the expected content and downloaded content
for url 'https://conda.anaconda.org/conda-forge/linux-64/unicodedata2-15.1.0-py310h2372a71_0.conda'.
  download saved to: /ctc/users/bronsonj/conda_pkgs/unicodedata2-15.1.0-py310h2372a71_0.conda
  expected md5: 72637c58d36d9475fda24700c9796f19
  actual md5: 6aa8b34b52cf3ef421104720cac95423

Please note that the output will be nondeterministic -- conda mentions only one of the many packages that caused a checksum mismatch error, and picks a different one every time -- though after running the above command 100 times, it resulted in a checksum mismatch failure every time.
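
A repeated run like this can be scripted. The sketch below is an assumption about how one might do it, not the exact procedure used here: it runs a command a fixed number of times and counts non-zero exit codes.

```python
import subprocess


def count_failures(cmd, runs):
    """Run `cmd` (a list of arguments) `runs` times; return how many runs failed."""
    failures = 0
    for _ in range(runs):
        # Capture output so repeated runs do not flood the terminal.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
    return failures
```

For example, count_failures(["conda", "create", "-y", "-n", "tmp", "--file", "repro.txt"], 100) would mirror the 100 runs mentioned above, assuming conda is on PATH.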

I attempted to minimize the attached lockfile to only include one of the packages associated with a checksum mismatch error, but the issue did not reproduce in that case.

Any ideas? Thanks for your help with this!

@jab

jab commented Nov 20, 2024

(Hi from my personal account)

Meant to say, like the OP, I am also using Linux. I just found conda/conda#13488 which also mentions conda-lock but they are using Windows. Related?

@ctcjab

ctcjab commented Nov 20, 2024

Looked into conda/conda#13488 further, and confirmed I too have .partial files in my pkgs_dir, and when I remove those, I can no longer reproduce this. So that does look like the same issue, and it seems it is more likely to occur when using conda-lock.

@maresb
Contributor

maresb commented Nov 20, 2024

Amazing effort, thanks so much @ctcjab for tracking this down, it's much appreciated!!!

Now there's the issue of how to proceed. Based on the upstream discussion, I really don't understand how this mismatch actually arises. It gives me the impression that there is a subtle bug that nobody has figured out yet.

Some approaches I can think of:

  1. Fixing the issue upstream in conda
  2. Recommending a workaround of deleting .partial files in pkgs/ before running conda-lock install
  3. Having conda-lock itself delete the .partial files
  4. Something else I'm missing?

Obviously 1. would be ideal, but I'm not going to manage that. I think that @ctcjab has a good chance at uncovering the underlying bug by pushing a little bit deeper. I like 2. because it's low-effort, but it's not so ideal for users. As for 3. it feels a bit convoluted, but it could work.
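
As a rough illustration of what options 2 and 3 involve, here is a minimal sketch that finds leftover .partial files directly inside a package cache directory and deletes them only when asked. The function name and the dry_run parameter are hypothetical; nothing like this exists in conda-lock today, and the location of the cache directory is left to the caller.

```python
from pathlib import Path


def remove_partial_files(pkgs_dir, dry_run=True):
    """Return the *.partial files in pkgs_dir, deleting them unless dry_run is True."""
    found = sorted(p for p in Path(pkgs_dir).glob("*.partial") if p.is_file())
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

The default dry run only reports what would be removed, which seems a safer starting point for a directory that other conda processes may be writing to.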

What do you think?

@dholth

dholth commented Nov 20, 2024

Are we using the same .partial for .conda and .tar.bz2?

@maresb
Contributor

maresb commented Nov 20, 2024

I don't know, I don't think conda-lock does anything directly with the .partial files. Isn't that more of a question for conda/conda#13488?
