forked from archetana/cmbcluster
Open
Labels
- aws: Amazon Web Services related
- infrastructure: Infrastructure and deployment issues
Description
User Story
As a CMBCluster user on AWS, I want seamless S3 storage mounting so that my research environments can access persistent storage with the same functionality as GCP Cloud Storage.
Description
Replace the current gcsfuse implementation with s3fs-fuse to enable S3 bucket mounting in user environments when deployed on AWS, maintaining feature parity with the GCP storage experience.
Current GCS Implementation Analysis
Based on codebase analysis:
- Setup Script: scripts/setup-cluster.sh enables the GcsFuseCsiDriver addon
- Storage Integration: Used for mounting Cloud Storage buckets in user environments
- User Experience: Transparent file system access to cloud storage
- Configuration: Automated through Kubernetes CSI driver
AWS S3 Equivalent Requirements
- S3 CSI Driver: Replace GcsFuseCsiDriver with s3fs-fuse integration
- Bucket Mounting: Mount S3 buckets as filesystems in user pods
- Authentication: Use IRSA (IAM Roles for Service Accounts) instead of Workload Identity
- Performance: Maintain similar performance characteristics
- Compatibility: Ensure existing user workflows continue to work
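To make the IRSA requirement concrete, here is a minimal sketch of the two pieces IRSA generally needs: a Kubernetes ServiceAccount annotated with an IAM role ARN, and an IAM trust policy that lets that service account assume the role through the cluster's OIDC provider. The function names, account ID, OIDC provider ID, namespace, and service account name are all hypothetical placeholders, not values from the CMBCluster codebase.

```python
import json


def irsa_service_account(name, namespace, role_arn):
    """Build a ServiceAccount manifest carrying the IRSA role annotation."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # EKS reads this annotation to inject web-identity credentials.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build the IAM trust policy for the role named in the annotation.

    oidc_provider is the issuer host and path without the https:// scheme.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All identifiers below are illustrative placeholders.
    provider = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    role = "arn:aws:iam::111122223333:role/cmbcluster-user-s3"
    print(json.dumps(irsa_service_account("user-env", "cmbcluster", role), indent=2))
    print(json.dumps(irsa_trust_policy("111122223333", provider, "cmbcluster", "user-env"), indent=2))
```

Scoping the `sub` condition to a single namespace and service account keeps the role from being assumable by other pods in the cluster.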
Technical Implementation
- CSI Driver Replacement
  - Remove GcsFuseCsiDriver dependency in AWS deployment
  - Implement s3fs-fuse mounting solution
  - Configure appropriate IAM permissions for S3 access
- Storage Class Updates
  - Create AWS-specific storage classes
  - Update volume provisioning for S3 integration
  - Maintain compatibility with existing PVC patterns
- Pod Configuration
  - Update user environment pod templates for s3fs-fuse
  - Configure S3 credentials and access patterns
  - Ensure proper filesystem permissions and security
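As a starting point for the "appropriate IAM permissions" item, the sketch below generates a least-privilege policy for one bucket: listing on the bucket itself, and object read/write/delete on its contents. The helper name and bucket name are hypothetical, and the exact action list is an assumption; a real deployment may need more (for example multipart upload actions) or fewer permissions.

```python
import json


def s3_mount_policy(bucket, read_only=False):
    """Build an IAM policy document for mounting a single S3 bucket.

    Bucket-level actions target the bucket ARN; object-level actions
    target the objects under it, so two statements are required.
    """
    object_actions = ["s3:GetObject"]
    if not read_only:
        object_actions += ["s3:PutObject", "s3:DeleteObject"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "cmbcluster-user-data" is an illustrative bucket name.
    print(json.dumps(s3_mount_policy("cmbcluster-user-data"), indent=2))
```

Generating one policy per user bucket, rather than a wildcard over all buckets, keeps each environment limited to its own storage.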
Acceptance Criteria
- s3fs-fuse successfully mounts S3 buckets in user environments
- User environments can read/write to S3 storage transparently
- Performance is comparable to gcsfuse implementation
- Authentication works correctly with IRSA
- Storage management APIs work with S3 buckets
- Existing user workflows remain functional
- Proper error handling and logging implemented
- Documentation covers S3 storage configuration
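The read/write criterion could be checked from inside a user pod with a small round-trip probe like the one below. This is a sketch only: the function name and probe file naming are hypothetical, and the demo runs against a temporary directory rather than a real S3 mount.

```python
import os
import tempfile
import uuid


def check_mount_roundtrip(mount_dir, require_mountpoint=False):
    """Write, read back, and delete a probe file under mount_dir.

    Returns True only if every step succeeds. With require_mountpoint=True
    the check also fails when mount_dir is not an actual mount point, which
    catches the case where the FUSE mount silently did not happen.
    """
    if require_mountpoint and not os.path.ismount(mount_dir):
        return False
    probe = os.path.join(mount_dir, f".cmbcluster-probe-{uuid.uuid4().hex}")
    payload = b"cmbcluster storage probe\n"
    try:
        with open(probe, "wb") as fh:
            fh.write(payload)
        with open(probe, "rb") as fh:
            intact = fh.read() == payload
        os.remove(probe)
        return intact and not os.path.exists(probe)
    except OSError:
        # Permission errors, missing directory, I/O errors from the mount.
        return False


if __name__ == "__main__":
    # Demo against a local temporary directory, not an S3 mount.
    with tempfile.TemporaryDirectory() as tmp:
        print(check_mount_roundtrip(tmp))
```

In a pod, the same function would be pointed at the mount path with `require_mountpoint=True`, so that a plain empty directory does not pass as a working mount.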
Key Technical Differences
- Authentication: IRSA vs Workload Identity
- Mount Process: s3fs-fuse vs gcsfuse mounting
- Performance: Different caching and performance characteristics
- Configuration: S3-specific mount options and parameters
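For the S3-specific mount options, a sketch of how the s3fs command line might be assembled is shown below. The helper and its parameters are hypothetical; the options used (`url`, `endpoint`, `use_cache`, `uid`, `gid`, `allow_other`) are standard s3fs/FUSE options. Credential options are deliberately left out: how the installed s3fs version picks up IRSA web-identity credentials is an open question that should be verified before settling on flags.

```python
import shlex


def s3fs_mount_args(bucket, mountpoint, region, cache_dir=None,
                    uid=None, gid=None, allow_other=True):
    """Return the argv list for mounting an S3 bucket with s3fs.

    Only non-credential options are set here; authentication is assumed
    to be configured separately.
    """
    opts = [
        f"url=https://s3.{region}.amazonaws.com",
        f"endpoint={region}",
    ]
    if cache_dir is not None:
        opts.append(f"use_cache={cache_dir}")  # local disk cache for objects
    if uid is not None:
        opts.append(f"uid={uid}")  # owner shown for mounted files
    if gid is not None:
        opts.append(f"gid={gid}")
    if allow_other:
        opts.append("allow_other")  # let non-mounting users access the mount
    args = ["s3fs", bucket, mountpoint]
    for opt in opts:
        args += ["-o", opt]
    return args


if __name__ == "__main__":
    # Bucket, path, and region are illustrative placeholders.
    argv = s3fs_mount_args("cmbcluster-user-data", "/mnt/storage", "us-east-1",
                           cache_dir="/tmp/s3fs-cache", uid=1000, gid=1000)
    print(shlex.join(argv))
```

Returning an argv list rather than a shell string lets the caller pass it straight to `subprocess.run` without quoting concerns.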
Files to Modify
- AWS deployment scripts for s3fs-fuse setup
- Helm charts with AWS-specific storage configuration
- User environment pod templates
- Storage management backend code for S3 API integration
- Documentation for AWS storage setup
Testing Requirements
- Functional testing of S3 bucket mounting
- Performance comparison with GCS implementation
- User environment compatibility testing
- Error handling and edge case validation
- Integration testing with CMBCluster storage APIs
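For the performance comparison, a simple sequential write/read timing probe could be run against both a gcsfuse mount and an s3fs mount with the same payload size. The sketch below is a hypothetical helper, demonstrated on a temporary directory. Its read figure may be inflated by local caching, so it is a rough indicator rather than a benchmark.

```python
import os
import tempfile
import time


def time_write_read(directory, size_bytes=1_000_000):
    """Time one sequential write and one sequential read of a probe file.

    Returns throughput in MB/s for each direction plus an integrity flag.
    """
    path = os.path.join(directory, "throughput-probe.bin")
    data = os.urandom(size_bytes)

    start = time.perf_counter()
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # push data past the OS buffer before stopping the clock
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as fh:
        read_back = fh.read()
    read_s = time.perf_counter() - start

    os.remove(path)
    megabytes = size_bytes / 1_000_000
    return {
        "write_mb_s": megabytes / max(write_s, 1e-9),
        "read_mb_s": megabytes / max(read_s, 1e-9),
        "intact": read_back == data,
    }


if __name__ == "__main__":
    # Demo on local disk; point `directory` at each mount to compare.
    with tempfile.TemporaryDirectory() as tmp:
        print(time_write_read(tmp))
```

Running the probe several times per mount and with a few payload sizes would give a fairer picture, since object-store mounts tend to behave very differently for small and large files.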
Related to
Epic #22 - Multi-Cloud Support
Definition of Done
- S3 storage mounting works reliably in user environments
- Feature parity with GCS storage is achieved
- Performance meets user expectations
- Integration is seamless for end users
- Documentation is complete and accurate