
Severe Performance Degradation in Nexus Repository OSS After Migration to New Cluster #508

Open
mrdiogon opened this issue Nov 7, 2024 · 0 comments
Labels: triage (Issues that need to be investigated, replicated)

mrdiogon commented Nov 7, 2024

Description: Since migrating our Nexus Repository OSS (running in swarm mode) to a new cluster on Red Hat 9 with Docker 26, we have been experiencing significant performance degradation during artifact uploads. Initially, the upload speed is around 26MB/s, but it gradually decreases to 250KB/s over time. Restarting Nexus temporarily restores performance, but it degrades again within one or two days.

Steps to Reproduce:

  • Deploy Nexus Repository OSS in a Docker Swarm environment on Red Hat 9.
  • Start uploading artifacts to the repository.
  • Monitor the upload speed over time.

Expected Behavior: The upload speed should remain consistent and not degrade over time.

Actual Behavior: The upload speed starts at 26MB/s but gradually decreases to 250KB/s. Restarting Nexus temporarily restores the speed, but it degrades again within one or two days.
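For reference, a minimal Swarm stack file matching the reported setup might look like the following; the image tag reflects the version in the report, but the port, volume path, and service layout are assumptions, not details from the issue:

```yaml
version: "3.8"
services:
  nexus:
    # 3.70.0 is the upgraded version mentioned in the report;
    # the exact base image variant used is not stated.
    image: sonatype/nexus3:3.70.0
    ports:
      - "8081:8081"
    volumes:
      - nexus-data:/nexus-data
    deploy:
      replicas: 1
volumes:
  nexus-data:
```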

Workaround: To mitigate the issue temporarily, we implemented an automatic daily restart of Nexus. This solution worked for about a week, but now our reverse proxy returns errors because Nexus exceeds the read timeout, despite our attempts to adjust it.
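The daily-restart workaround described above could be implemented as a cron entry on a Swarm manager node; this is a sketch, and the service name `nexus_nexus` is a placeholder for the actual stack/service name:

```shell
# Force a rolling restart of the Nexus service every day at 03:00.
# "nexus_nexus" is hypothetical; substitute the real service name
# from `docker service ls`.
0 3 * * * docker service update --force nexus_nexus
```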

Feature or Behavior Required: Consistent and reliable upload performance is essential for our CI/CD pipeline and overall development workflow.

Nexus Repository Deployment:

  • Version: 3.68.1 (upgraded to 3.70.0)
  • Operating System: Red Hat 9
  • Docker Version: 26
  • Database: OrientDB

Additional Information:

  • Numerous resolutions attempted: adjusting open file limits; updating Nexus; using different base images (Alpine, various Java versions); bypassing the reverse proxy and swarm managers; verifying MTU settings; increasing the container memory limit from 10GB to 16GB; and adjusting heap and direct-memory settings.
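The memory and heap adjustments listed above would typically be expressed in the stack file via the `INSTALL4J_ADD_VM_PARAMS` environment variable (the mechanism the official `sonatype/nexus3` image uses for JVM flags); the specific heap and direct-memory values below are illustrative assumptions, not the values the reporter used:

```yaml
services:
  nexus:
    environment:
      # Example heap and direct-memory settings; exact values are assumptions.
      INSTALL4J_ADD_VM_PARAMS: "-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g"
    deploy:
      resources:
        limits:
          # 16GB limit as described in the report
          memory: 16g
```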
  • Network tests indicate the issue is specific to Nexus. Other services (e.g., GitLab) migrated without issues.
  • Interesting observation: Downloading a large file using curl from within the Nexus container is extremely slow, but normal from another container on the same swarm node.
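The container-level comparison described in the last observation can be reproduced with commands along these lines; the container name and test URL are placeholders:

```shell
# From inside the Nexus container (slow in the reported scenario):
docker exec -it <nexus-container> \
  curl -o /dev/null -w '%{speed_download}\n' <large-file-url>

# From a fresh container on the same Swarm node (normal speed reported):
docker run --rm curlimages/curl \
  -o /dev/null -w '%{speed_download}\n' <large-file-url>
```

Comparing the `speed_download` values from both runs isolates whether the bottleneck is the Nexus container's network namespace rather than the host or Swarm overlay network.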
@mrprescott added the triage label (Issues that need to be investigated, replicated) and removed the pending label on Nov 12, 2024