Add Terraform example for AWS deployment #13
Deploys loreserver on ECS Fargate with S3/DynamoDB storage. DynamoDB schemas and IAM permissions verified against lore-aws source.
Signed-off-by: Sam Biggins <sabiggin@amazon.com>

- Explain that the Dockerfile build auto-registers lore-aws plugin
- Document that the task runs in private subnets (VPC access required)
- Add ingress to the Customize section for production paths
Signed-off-by: Sam Biggins <sabiggin@amazon.com>

- Add s3:DeleteObjectVersion (required for versioned bucket cleanup)
- Add edge pod service with replicated+remote stores via Cloud Map
- Add Cloud Map private DNS for edge→primary discovery
- Add internal SG rules for node-to-node QUIC+gRPC
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
Force-pushed from 565e71f to bc4cff3
- Generate CA + server cert via tls provider (SAN: primary.lore.internal)
- Store certs in Secrets Manager, provision via init containers
- Primary: enables quic_internal:41340 with cert for edge replication
- Edge: trusts primary CA via SSL_CERT_FILE, connects replicated+remote
- Both services confirmed running in deployment test
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
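For readers unfamiliar with the tls provider, the certificate chain described in this commit could look roughly like the sketch below. It is not taken from the PR: the resource names, key algorithm, validity periods, and CA common name are all assumptions, and only the SAN `primary.lore.internal` comes from the commit message.

```hcl
terraform {
  required_providers {
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
  }
}

# Key pair and self-signed certificate for a private CA.
resource "tls_private_key" "ca" {
  algorithm   = "ECDSA"
  ecdsa_curve = "P256"
}

resource "tls_self_signed_cert" "ca" {
  private_key_pem       = tls_private_key.ca.private_key_pem
  is_ca_certificate     = true
  validity_period_hours = 8760 # assumed: one year
  allowed_uses          = ["cert_signing", "crl_signing"]

  subject {
    common_name = "lore-internal-ca" # hypothetical name
  }
}

# Key pair and signing request for the primary's server certificate.
resource "tls_private_key" "server" {
  algorithm   = "ECDSA"
  ecdsa_curve = "P256"
}

resource "tls_cert_request" "server" {
  private_key_pem = tls_private_key.server.private_key_pem
  dns_names       = ["primary.lore.internal"] # SAN from the commit message

  subject {
    common_name = "primary.lore.internal"
  }
}

# Server certificate signed by the CA above.
resource "tls_locally_signed_cert" "server" {
  cert_request_pem      = tls_cert_request.server.cert_request_pem
  ca_private_key_pem    = tls_private_key.ca.private_key_pem
  ca_cert_pem           = tls_self_signed_cert.ca.cert_pem
  validity_period_hours = 8760 # assumed: one year
  allowed_uses          = ["server_auth", "digital_signature", "key_encipherment"]
}
```

In a layout like this, the edge would only need `tls_self_signed_cert.ca.cert_pem` to trust the primary. Note that the tls provider keeps private keys in Terraform state, so the state backend needs to be treated as sensitive.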
Force-pushed from bc4cff3 to 1ef073c
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
Hey @sambiggins-aws this is awesome, thanks for contributing this. Is it possible to add some sort of integration test just to make sure it's not going stale with changes in tf versions, AWS resources or Lore itself (where there are dependencies)?
Validates resource schemas, variable wiring, and service configuration without AWS credentials. Catches breakage from Terraform/provider version upgrades or changes to the Lore AWS plugin config contract.
Run: cd examples/aws && terraform init && terraform test
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
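As an illustration of how a credential-free `terraform test` can be structured, here is a minimal sketch of a `.tftest.hcl` file. It is not the test file from this PR: the file path, the variable `name`, and the resource address `aws_dynamodb_table.example` are hypothetical, and `mock_provider` assumes Terraform 1.7 or later.

```hcl
# Hypothetical file: examples/aws/tests/plan.tftest.hcl

# Replace the real AWS provider with a mock so no credentials are required.
mock_provider "aws" {}

# Hypothetical input variable; the real module may use different names.
variables {
  name = "lore-test"
}

run "plan_succeeds_with_mock_provider" {
  # Plan only: nothing is created, but schemas and variable wiring are checked.
  command = plan

  assert {
    # Hypothetical resource address and attribute, shown only for the assertion shape.
    condition     = aws_dynamodb_table.example.billing_mode == "PAY_PER_REQUEST"
    error_message = "Expected the table to use on-demand billing."
  }
}
```

A plan-only run against a mock provider fails when a provider upgrade renames or removes an argument, which is the kind of staleness the review comment asks about. It does not prove that the deployed services actually start.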
@ragnarula let me know if this is what you had in mind, or if you are looking for something more along the lines of a GitHub Action?
A GH action to run your test when those files change sounds like a good idea. Thanks!
Force-pushed from 9eb1103 to d7a7309
Replace Fargate with ECS on EC2 to demonstrate Lore's core value proposition: NVMe-cached edge nodes with high-throughput serving.
- c8gd.8xlarge default (32 vCPU, 64GB, 1.9TB NVMe, 25Gbps)
- Composite store: local NVMe cache + S3 durable (primary)
- Composite store: local NVMe cache + replicated durable (edge)
- Separate IAM roles (primary has S3+DDB, edge has none)
- Cloud Map for both primary and edge (client-facing DNS)
- TLS cert SANs include both primary and edge DNS names
- HMAC key via Secrets Manager for presigned URLs
- Health check grace periods (120s primary, 300s edge)
- DynamoDB PITR on all tables, S3 lifecycle for multipart cleanup
- GSI key_schema (provider 6.x), runtime_platform ARM64
- Cache sized to 80% of NVMe (1.52TB on c8gd.8xlarge)
- e2e test script (scripts/e2e-test.sh) for post-deploy validation
- Full Lore CLI workflow documented in README
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
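The 80% cache sizing in this commit is simple arithmetic, and could be derived in Terraform rather than hard-coded. The sketch below is an assumption about how that might be written, not the PR's code: the local and output names are hypothetical, and 1.9 TB is taken as decimal bytes.

```hcl
locals {
  # Assumption: 1.9 TB of instance-store NVMe, counted in decimal bytes.
  nvme_capacity_bytes = 1900000000000

  # Assumption: the cache may use 80% of the disk, leaving 20% headroom.
  cache_percent = 80

  # Integer percentage arithmetic avoids rounding surprises from a 0.8 literal.
  cache_max_size_bytes = floor(local.nvme_capacity_bytes * local.cache_percent / 100)
}

output "cache_max_size" {
  # Rendered as a decimal string, the form an environment variable value needs.
  value = format("%d", local.cache_max_size_bytes)
}
```

With these inputs the result is 1520000000000 bytes, which is the 1.52TB figure quoted in the commit message. Deriving the value means a change of instance type only requires updating the capacity.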
Force-pushed from d7a7309 to f9b7ec6
Runs terraform fmt, validate, and test on changes to examples/aws/. Uses mock providers (no AWS credentials needed).
- hashicorp/setup-terraform@v4 pinned to 1.15.3
- Concurrency group cancels superseded runs
- Self-triggering path filter for workflow changes
Signed-off-by: Sam Biggins <sabiggin@amazon.com>
Force-pushed from f9b7ec6 to af9b711
@ragnarula added. cheers!
      run:
        working-directory: examples/aws
    steps:
      - uses: actions/checkout@v4
Hi @sambiggins-aws - can you pin all actions used by this workflow to use explicit version hashes, similar to how the existing workflows do it?
Pin actions/checkout to v6.0.3 and hashicorp/setup-terraform to v4.0.1 using explicit commit SHAs, matching the convention in dco.yml and lint.yml. Signed-off-by: Sam Biggins <sabiggin@amazon.com>
Hi @sambiggins-aws, just reviewing the rest of this PR and was wondering whether you've actually spun up the infrastructure in this PR and tested it end-to-end? If you haven't already, can you do that and detail exactly what you've proven as working in the PR description under a "Test Plan" heading? Thanks
{ name = "LORE__IMMUTABLE_STORE__COMPOSITE__LOCAL__LOCAL__MAX_SIZE", value = "1520000000000" },
{ name = "LORE__IMMUTABLE_STORE__COMPOSITE__LOCAL__LOCAL__FLUSH_DELAY_SECONDS", value = "10" },
{ name = "LORE__IMMUTABLE_STORE__COMPOSITE__DURABLE__MODE", value = "replicated" },
{ name = "LORE__IMMUTABLE_STORE__COMPOSITE__DURABLE__REPLICATED__REMOTE_URL", value = "lore://primary.${local.name}.internal:${local.port_replication}" },
The ReplicatedStore is built on a QUIC client, so the URL scheme here should be quic:// or quics://, preferably the latter.
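A sketch of what the suggested change might look like, assuming the environment entries are kept in a local list. The local names and the value of `name` are hypothetical. The port 41340 is the `quic_internal` port mentioned in an earlier commit, and using it for `port_replication` is an assumption.

```hcl
locals {
  name             = "lore" # hypothetical deployment name
  port_replication = 41340  # assumed to be the quic_internal port

  # Edge replication settings with the quics:// scheme, as the review suggests.
  edge_replication_env = [
    { name = "LORE__IMMUTABLE_STORE__COMPOSITE__DURABLE__MODE", value = "replicated" },
    {
      name  = "LORE__IMMUTABLE_STORE__COMPOSITE__DURABLE__REPLICATED__REMOTE_URL"
      value = "quics://primary.${local.name}.internal:${local.port_replication}"
    },
  ]
}
```

Only the scheme changes relative to the line under review. The host and port interpolation stay the same.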
Self-contained Terraform configuration at examples/aws/ that deploys a Lore primary + edge topology on ECS Fargate with durable S3/DynamoDB storage.
Creates
Usage
See README.md for more information.