The following is a suggested action plan for managing our posture on known vulnerabilities. Please provide corrections, suggestions and addenda.
1. Context
The OS2ai ecosystem incorporates numerous upstream software components. Currently, the core team lacks granular control over these dependencies, and visibility into their security lifecycle is limited.
We are currently unable to:
- Automatically detect new CVEs in our stack.
- Act swiftly on security threats, because our build process depends on externally maintained components.
- Guarantee a hardened environment for users.
2. Drivers
- Security: Minimizing the risk of supply chain attacks.
- Agility: Empowering the core team to patch vulnerabilities without waiting for third-party updates.
- Compliance: Meeting stakeholders' expectations for security monitoring in AI tools.
3. Options
- Option 1: Maintain the status quo (relying on upstream image maintainers).
- Option 2: Implement automated scheduled vulnerability scanning on existing images but keep the current build process.
- Option 3 (Proposed): Implement automated scheduled vulnerability scanning, migrate to internally managed Docker builds, and block new releases if a new CVE is detected in an upstream component during CI.
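As a rough illustration of the release gate in Option 3, the sketch below decides whether a CI job should fail based on the findings of a vulnerability scan. The finding format (dicts with `id` and `severity`), the function names and the default `HIGH` threshold are assumptions made for illustration. They do not correspond to any particular scanner's output, which would first have to be mapped to this shape.

```python
# Hypothetical release gate: block a release when a scan report contains
# findings at or above a configured severity. The report format is assumed.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def blocking_findings(findings, threshold="HIGH"):
    """Return the findings whose severity is at or above the threshold."""
    minimum = SEVERITY_ORDER.index(threshold)
    return [
        finding for finding in findings
        if SEVERITY_ORDER.index(finding["severity"]) >= minimum
    ]


def gate_exit_code(findings, threshold="HIGH"):
    """Return 1 (fail the CI job) if any finding blocks the release, else 0."""
    return 1 if blocking_findings(findings, threshold) else 0


# Example report with made-up identifiers.
example_report = [
    {"id": "CVE-0000-0001", "severity": "LOW"},
    {"id": "CVE-0000-0002", "severity": "CRITICAL"},
]
for finding in blocking_findings(example_report):
    print(f"blocking: {finding['id']} ({finding['severity']})")
```

In a pipeline, the value returned by `gate_exit_code` could be passed to `sys.exit` so that a blocking finding fails the job. A severity outside `SEVERITY_ORDER` raises a `ValueError`, which seems preferable to silently letting an unknown rating through.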
4. Outcome
The core team will proceed with Option 3. To ensure immediate security coverage, we will utilize Option 2 as a temporary stop-gap measure during the transition.
Consequences
Positive:
- Full Traceability: The core team knows exactly what goes into every image.
- Proactive Defense: Automated alerts allow the core team to address vulnerabilities before they are exploited.
Negative:
- Maintenance Overhead: The core team must now maintain Dockerfiles and CI/CD workflows.
- Triage Responsibility: The core team must dedicate time to triaging and resolving vulnerability alerts.
5. Implementation Plan
- Immediate Stop-gap (Option 2): Implement automated scanning on our currently used images to gain immediate visibility into existing CVEs.
- Governance: Draft a "Security Guideline" defining how the core team must act upon vulnerability reports based on severity.
- Docker Migration: Transition to core team ownership of the OS2ai Docker image build process to allow for direct patching and image hardening.
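One way the severity-based rules of the Security Guideline could be made checkable is a lookup from severity to a remediation window. The sketch below assumes such a mapping. The numbers of days are placeholders, not proposed policy, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Placeholder remediation windows in days per severity. The actual values
# are for the Security Guideline to define.
REMEDIATION_DAYS = {"CRITICAL": 2, "HIGH": 7, "MEDIUM": 30, "LOW": 90}


def remediation_deadline(severity, reported_on):
    """Return the date by which a finding of this severity should be handled."""
    return reported_on + timedelta(days=REMEDIATION_DAYS[severity])


def is_overdue(severity, reported_on, today):
    """Return True if the remediation deadline has passed as of `today`."""
    return today > remediation_deadline(severity, reported_on)
```

For example, with these placeholder values a `HIGH` finding reported on 2024-01-01 would be due on 2024-01-08. A check like `is_overdue` could feed a recurring report so that overdue findings stay visible to the core team.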
6. Estimate
Implementing the plan with all steps included is estimated at Large complexity, roughly a week of work for a senior professional. This includes review and any refactoring requested during review.
Another important factor is the recurring cost of the added explicit responsibility: patching, mitigations and pipeline maintenance. This would amount to a Small complexity every month until the system is retired and has reached EOL. Over successive releases this workload will probably grow to a Medium unless it is mitigated in time through strict EOL policies and the removal of deprecated images.