As a user I can manage labels on the content #3338

Closed
ipanova opened this issue Oct 20, 2022 · 12 comments
Labels
COPR Features desired by the COPR Team Feature

Comments

@ipanova
Member

ipanova commented Oct 20, 2022

Is your feature request related to a problem? Please describe.
Scope (these are not all usecases):
As a user I can filter content by specific labels
As a user I can set labels on the content I am uploading to a destination repository
As a user I can set labels on the content which is already present in a repo
As a user I can update/remove labels on the content which is already present in a repo
As a user I can preserve labels by specifying a flag --preserve-labels=True when copying content between repos (by default, content is copied without labels)
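
A minimal sketch of the copy use case, assuming labels are stored per repository. `Repo`, `copy_content`, and the `preserve_labels` parameter are hypothetical names for illustration only, not Pulp APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical in-memory stand-in for a repository."""
    name: str
    # Maps a content id to that content's labels within this repo.
    content: dict[str, dict[str, str]] = field(default_factory=dict)


def copy_content(src: Repo, dst: Repo, content_ids: list[str],
                 preserve_labels: bool = False) -> None:
    """Copy content from src to dst; labels travel only when requested."""
    for cid in content_ids:
        # Copy the label dict so later edits in dst do not leak back to src.
        labels = dict(src.content[cid]) if preserve_labels else {}
        dst.content[cid] = labels


dev = Repo("Dev", {"rpm.foo": {"release": "rhel8"}})
test = Repo("Test")
prod = Repo("Prod")

copy_content(dev, test, ["rpm.foo"])                        # default: labels dropped
copy_content(dev, prod, ["rpm.foo"], preserve_labels=True)  # labels kept
```

With the default, `test` receives rpm.foo unlabelled, while `prod` receives its own copy of the labels.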

Additional context
We already have labels on repositories, remotes, distributions, etc.
Let's add labels to the content too.
Content labels will be k8s-style labels.
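
To illustrate what k8s-style filtering could look like, here is a rough sketch of Kubernetes' equality-based selectors (`=`, `==`, `!=`) applied to a label dict. The function names are made up for this example, and the filter syntax Pulp ends up supporting may differ.

```python
def parse_selector(selector: str) -> list[tuple[str, str, str]]:
    """Parse 'k=v,k2!=v2' into (key, operator, value) triples."""
    requirements = []
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        if "!=" in part:
            key, value = part.split("!=", 1)
            op = "!="
        elif "==" in part:
            key, value = part.split("==", 1)
            op = "="
        elif "=" in part:
            key, value = part.split("=", 1)
            op = "="
        else:
            raise ValueError(f"unsupported selector term: {part!r}")
        requirements.append((key.strip(), op, value.strip()))
    return requirements


def matches(labels: dict[str, str], selector: str) -> bool:
    """Return True if the labels satisfy every term of the selector."""
    for key, op, value in parse_selector(selector):
        if op == "=" and labels.get(key) != value:
            return False
        # As in Kubernetes, '!=' also matches when the key is absent.
        if op == "!=" and labels.get(key) == value:
            return False
    return True
```

For example, `matches({"release": "rhel8", "env": "dev"}, "release=rhel8,env!=prod")` holds, while `matches({"release": "rhel8"}, "env=dev")` does not.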

Questions:

  1. When labeling content, should labels be shared across all the repos/repo versions it is present in? Example: Alice labels rpm.foo as "release: rhel8"; this label is applied to every repo: Dev, Test, Prod. (Not sure how this will work with RBAC: what if Alice, who labels rpm.foo, has no RBAC access to the repo Prod, only to Dev and Test?)
  2. When labeling content, does it make sense to be able to label content differently based on the repo/repo version it belongs to, i.e. content labels are not shared across repos/repo versions? Example: Alice labels rpm.foo as {"release": "rhel8", "env": "dev"} in repo Dev. This label is applied only to repo Dev. Alice labels rpm.foo as {"env": "test"} in repo Test.
  3. Should the content be searched by labels per repo or across repos? I guess this will depend on 1 or 2.

Additional info:
https://www.jfrog.com/confluence/display/JFROG/Property+Sets
https://www.jfrog.com/confluence/display/JFROG/Artifactory+REST+API#ArtifactoryRESTAPI-RetrieveArtifact
https://www.jfrog.com/confluence/display/JFROG/Artifactory+REST+API#ArtifactoryRESTAPI-PropertySearch

@mdellweg
Member

I do not think option 1 fits the Pulp philosophy.
What if we make ContentLabel a new content-type?

@ipanova
Member Author

ipanova commented Oct 21, 2022

@mdellweg the problem with this approach is that it adds overhead to ContentLabel management. You would need to ensure that labels are removed when content is removed from a repo (labels cannot stay in the repo without their content). Then there is the question of whether labels should be automatically copied between repos when content is copied.

@ipanova
Member Author

ipanova commented Oct 21, 2022

One of the ideas was to use the RepositoryContent model to make the labels content- and repo-aware. @dralley do you remember the details?

@FrostyX
Contributor

FrostyX commented Jul 29, 2024

I thought that I needed labels for artifacts but labels for content might work for me as well. Let me elaborate on my use-case.

When a Copr build finishes and I want to upload the artifacts, I do:

  1. Make sure a correct repository and distribution exists
  2. For all produced RPM packages:
    1. Upload the RPM package as an artifact
    2. Create a content for the artifact (effectively adding it to the repository)
  3. Create a new publication for the repository
  4. Update my distribution to point to the new publication

Somewhere along the way, I would like to label the artifacts/content with some additional information such as Copr build ID.

For some other processes, I have a requirement to work with RPM packages that came from a specific Copr build. For example, remove them once they are no longer needed. In that case I do:

  1. Get the latest repository version (I'd prefer to get a repository but a repository version is needed for another step)
  2. List all packages for that repository version
  3. I need to filter only the packages labeled copr_build_id=1234 but this is not possible to do yet. I think this RFE should solve this. Also, for performance reasons it would be better if this wasn't a separate step but rather part of the previous step.
  4. Remove the packages
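
Steps 2 to 4 above could be sketched roughly as follows. This assumes content records would expose a `pulp_labels` dict the way repositories do (not available at the time of writing) and that removal goes through the repository modify call's `remove_content_units` field; the hrefs are placeholders.

```python
def removal_payload(packages: list[dict], build_id: str) -> dict:
    """Select packages labelled with a Copr build ID and build a removal body."""
    hrefs = [
        pkg["pulp_href"]
        for pkg in packages
        # 'pulp_labels' on content is an assumption; treat a missing field as no labels.
        if (pkg.get("pulp_labels") or {}).get("copr_build_id") == build_id
    ]
    return {"remove_content_units": hrefs}


# Stand-in for the package listing of the latest repository version.
packages = [
    {"pulp_href": "/hypothetical/href/a/", "pulp_labels": {"copr_build_id": "1234"}},
    {"pulp_href": "/hypothetical/href/b/", "pulp_labels": {"copr_build_id": "5678"}},
    {"pulp_href": "/hypothetical/href/c/", "pulp_labels": {"copr_build_id": "1234"}},
    {"pulp_href": "/hypothetical/href/d/"},
]

payload = removal_payload(packages, "1234")
```

Server-side filtering by label, as requested in step 3, would make the list comprehension unnecessary: the listing call itself would return only the matching packages.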

It is quite possible that I am doing something that doesn't make sense or that I don't really know what I want. Please let me know if you need more information about something.

@ggainey
Contributor

ggainey commented Aug 13, 2024

Notes from the team discussion around this issue, 2024-08-13:

  • Three (so far) possible implementation approaches
    a) Label Content itself
    b) Label the RepositoryVersion
    c) Label the RepositoryContent table

Labelling Content collides, in multi-user Pulp installs, with RBAC and Content-deduplication.

  • example: Two users upload the same (by sha256) Content into their "own" Repositories. Both can label it - and both will "see" the other user's label (and could unlabel it!)
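
A toy model of that collision, assuming labels hang directly off deduplicated Content keyed by sha256 (the names and the digest are placeholders, not Pulp code):

```python
# One label dict per unique content digest, shared by every repository.
content_labels: dict[str, dict[str, str]] = {}


def label_content(sha256: str, key: str, value: str) -> None:
    content_labels.setdefault(sha256, {})[key] = value


def unlabel_content(sha256: str, key: str) -> None:
    content_labels.get(sha256, {}).pop(key, None)


SHA = "abc123"  # placeholder digest of the package both users uploaded

label_content(SHA, "owner", "alice")   # Alice labels "her" copy
label_content(SHA, "team", "bob-team") # Bob labels "his" copy

# Alice now sees Bob's label next to her own ...
visible_to_alice = dict(content_labels[SHA])

# ... and Bob can remove Alice's label, since there is only one record.
unlabel_content(SHA, "owner")
```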

Labelling the RepositoryVersion says "everything added in this RV is labelled 'foo'". This collides with "deleting repo-versions collapses 'added' into the remaining repo-version"

  • example: content is added and labelled in RV-1, RV-2, and RV-3. Deleting RV-1 and RV-2 results in labels for RV-3 saying "content here is a mix of label-1, label-2, and label-3", with no way to differentiate what content started with which label.
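
The loss of provenance can be sketched with a small model. How labels would actually be merged when a repo-version is deleted is not specified in this thread, so the merge in `delete_version` is an assumption that mirrors the way "added" content collapses.

```python
from dataclasses import dataclass, field


@dataclass
class RepoVersion:
    number: int
    added: set[str] = field(default_factory=set)
    labels: dict[str, str] = field(default_factory=dict)


def delete_version(versions: list[RepoVersion], number: int) -> None:
    """Delete a version, folding its 'added' set and labels into its successor."""
    idx = next(i for i, v in enumerate(versions) if v.number == number)
    removed = versions.pop(idx)
    if idx < len(versions):
        successor = versions[idx]
        successor.added |= removed.added
        # Assumed behaviour: labels are merged just like the 'added' content.
        for key, value in removed.labels.items():
            successor.labels.setdefault(key, value)


rv1 = RepoVersion(1, {"a.rpm"}, {"label-1": "true"})
rv2 = RepoVersion(2, {"b.rpm"}, {"label-2": "true"})
rv3 = RepoVersion(3, {"c.rpm"}, {"label-3": "true"})
versions = [rv1, rv2, rv3]

delete_version(versions, 1)
delete_version(versions, 2)
```

After both deletions, RV-3 carries all three packages and all three labels, with nothing left to say which package started with which label.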

Labelling at the RepositoryContent model says "This Content in This Repository is labelled 'foo'". This "feels like" the use case COPR (specifically) needs. It does require the user to provide both content and repository info.
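
As a rough illustration, labels keyed by the (repository, content) pair stay independent even when the content itself is deduplicated. The store and helper names below are invented for this sketch and do not reflect the actual RepositoryContent model.

```python
# (repository name, content sha256) -> labels for that content in that repository
repo_content_labels: dict[tuple[str, str], dict[str, str]] = {}


def set_repo_content_label(repository: str, sha256: str, key: str, value: str) -> None:
    """Label one piece of content within one repository only."""
    repo_content_labels.setdefault((repository, sha256), {})[key] = value


def find_in_repo(repository: str, key: str, value: str) -> list[str]:
    """Return digests of content in the repository carrying key=value."""
    return sorted(
        sha
        for (repo, sha), labels in repo_content_labels.items()
        if repo == repository and labels.get(key) == value
    )


SHA = "abc123"  # placeholder digest shared by both repositories

set_repo_content_label("alice-repo", SHA, "copr_build_id", "1234")
set_repo_content_label("bob-repo", SHA, "copr_build_id", "5678")
```

Note that both calls need the repository and the content, which is the extra information the user has to provide under this approach.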

Meanwhile, pulp_ansible has CollectionMarks to accomplish a similar goal - maybe this approach should be generalized into pulpcore?

Discussion continues - but the priority is "agree on an approach and implement Soon"

@ggainey
Contributor

ggainey commented Aug 13, 2024

@FrostyX just a quick note: your "For all produced RPM packages:" steps can be reduced to "upload Package to Repository" as one API call, specifying "file" and "repository". You don't have to "create Artifact / turn into Package / assign to Repo" as separate steps.
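
A hedged sketch of what that single call might look like. The `file` and `repository` fields come from the note above; the endpoint path, host, and repository href are assumptions for illustration, and the helper only assembles the request rather than sending it.

```python
from urllib.parse import urljoin

# Assumed path of the RPM package content endpoint; check your Pulp API docs.
PACKAGES_PATH = "/pulp/api/v3/content/rpm/packages/"


def build_upload_request(base_url: str, repository_href: str, rpm_path: str) -> dict:
    """Describe a one-step 'upload Package to Repository' request."""
    return {
        "method": "POST",
        "url": urljoin(base_url, PACKAGES_PATH),
        # Form field naming the destination repository.
        "data": {"repository": repository_href},
        # Multipart field name and local path of the RPM to upload.
        "file_field": ("file", rpm_path),
    }


request = build_upload_request(
    "https://pulp.example.com",
    "/hypothetical/repository/href/",
    "hello-1.0-1.x86_64.rpm",
)
```

With the `requests` library this could be sent as `requests.post(request["url"], data=request["data"], files={"file": open(rpm_path, "rb")})`, plus whatever authentication the instance requires.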

@mdellweg
Member

For completeness' sake: the marks mentioned are a type of content themselves, which allows adding attributes to content depending on the repository version.
E.g. you can mark a piece of content "deprecated" starting in some repository version. The big question is how sticky the marks are when it comes to copying content from one repository to another.

@daviddavis
Contributor

We would like to be able to tag content when it's uploaded/created. Our use case is that we need to keep track of which team in our org uploaded the content initially, and then filter that content later on. I think being able to tag content within the context of a repo would also be useful although we don't have an immediate need for it.

@daviddavis
Contributor

Labelling Content collides, in multi-user Pulp installs, with RBAC and Content-deduplication.

  • example: Two users upload the same (by sha256) Content into their "own" Repositories. Both can label it - and both will "see" the other user's label (and could unlabel it!)

We only have a single user installation of Pulp but I wonder if it'd be possible for a user to only see his/her own labels?

@ggainey
Contributor

ggainey commented Aug 19, 2024

We only have a single user installation of Pulp but I wonder if it'd be possible for a user to only see his/her own labels?

Not if the label(s) are on the Content itself - because Content is deduplicated/shared, so you'd see any labels anybody who has access to a specific content-id applied, and not know who had applied them.

@daviddavis
Contributor

Right, I was imagining that Pulp would have to keep track of who created the label.

@daviddavis
Contributor

I haven't used Pulp's RBAC so forgive my ignorance, but would another option be to only allow admins (or certain roles) to edit/read/etc content labels? Or perhaps have specific permissions for users to read/edit/etc content labels globally?

FrostyX added a commit to FrostyX/copr that referenced this issue Aug 28, 2024
Fix pulp/pulp_rpm#3719

Instead of uploading RPM packages as artifacts in one API call, and
then creating a content from them in a separate call, we will now
create the content directly.

There are multiple reasons to do so:

- One API call instead of two
- It fixes the issue with installing packages mentioned above
- @dkliban says there is an effort to not allow uploading artifacts on
  shared Pulp instances

There is only one disadvantage of doing this: we lose track of what
RPM packages belong to a specific Copr build ID. We will use labels
for this, once they are implemented.

See pulp/pulpcore#3338 (comment)
praiskup pushed a commit to fedora-copr/copr that referenced this issue Sep 2, 2024
@ipanova ipanova added the COPR Features desired by the COPR Team label Sep 10, 2024
ggainey added a commit to ggainey/pulpcore that referenced this issue Feb 25, 2025
Uses the same approach used by Repository, Distribution, and Remote.

closes pulp#3338.