-
Notifications
You must be signed in to change notification settings - Fork 31
Add a Terraform configuration to deploy lnt.llvm.org #128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This patch adds a Terraform configuration file that should allow deploying to an EC2 instance. It requires a few secrets to be made available to Github Actions.
| } | ||
|
|
||
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@boomanaiden154 Are we OK with hardcoding the AMI? What do you folks usually do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not familiar with how AWS does things. Hard coding it doesn't seem like a big deal. But we want to be able to change it, which would probably force instance recreation. I think we should do what I suggested above where the instance is a clean slate on every boot but mounts a persistent volume that has the DB info.
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 | ||
| instance_type = "t2.micro" | ||
| key_name = "test-key-name" # TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what to put here, I presume this needs to match a key in the LLVM Foundation's actual AWS account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any keys should be specified in the provider. I think this is a different type of key.
|
|
||
| on: | ||
| push: | ||
| tags: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am deploying on tags at the moment: I don't think we want to re-deploy at every commit since we risk bringing down the instance. Actually, I even wonder whether that should be a manually triggered job. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to tag images by commit SHA, but explicitly version them in the terraform. That means we get a new image per commit, but only redeploy when we explicitly bump the commit of the images we're running.
| - name: Initialize Terraform | ||
| run: terraform init | ||
|
|
||
| - name: Apply Terraform changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the instance already exists and then we re-deploy it, what's going to happen? My understanding is that we'd start over from scratch with an empty EC2 instance, which means we would lose all of the existing data stored on the instance. Is that not the case?
Do you understand the mechanism by which the data we store in VOLUMES in the Docker container end up being persisted across re-deployments of the EC2 instance? I don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure it just calls whatever AWS API calls it needs to update the instance to match your terraform file, it won't get destroyed. Terraform holds state about this type of stuff.
The volume will just be stored on the root block device since we haven't attached any EBS storage or anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure it just calls whatever AWS API calls it needs to update the instance to match your terraform file, it won't get destroyed. Terraform holds state about this type of stuff.
I see. But the Terraform state is not kept across invocations of the Github Action, so I don't really understand how Terraform can tell that we even already have an instance.
|
CC @lukel97 I went ahead and gave this a shot, I was curious to understand the whole pipeline |
|
@boomanaiden154 I also created the appropriately-named secrets in the Github Actions of this repository, however they all have fake values at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fleshing this out, I'm not sure if you've tried deploying this to a test AWS account yet but it looks like it's missing a security group/ingress rules etc., so the web server won't be reachable by any public traffic IIUC.
I've also got some terraform files written here, it would be good to collaborate on this. I don't want us to step on each others toes so I'll just leave review comments for now but let me know if you'd rather have me just commit directly to the branch.
| - name: Initialize Terraform | ||
| run: terraform init | ||
|
|
||
| - name: Apply Terraform changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure it just calls whatever AWS API calls it needs to update the instance to match your terraform file, it won't get destroyed. Terraform holds state about this type of stuff.
The volume will just be stored on the root block device since we haven't attached any EBS storage or anything.
Please feel free to commit directly to the branch. Sorry, I didn't know you had started on this already. I did try to deploy an EC2 instance in my personal account, however that account is blocked right now (IDK why) so I haven't gotten very far. This was intended to be a starting point. Feel free to push whatever changes you have to the branch. |
|
|
||
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 | ||
| instance_type = "t2.micro" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default block storage on these devices are tiny (~8GB IIRC?), you probably want to expand it to a few GB more
| instance_type = "t2.micro" | |
| instance_type = "t2.micro" | |
| root_block_device { | |
| volume_size = 64 | |
| volume_type = "gp3" | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple GB boot disk should be fine, but slightly bigger might be good. The DB should probably be on a separate volume.
|
|
||
| LNT_DB_PASSWORD=${__db_password__} | ||
| LNT_AUTH_TOKEN=${__auth_token__} | ||
| docker compose --file compose.yaml up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to daemonize this otherwise cloudinit will never finish
| docker compose --file compose.yaml up | |
| docker compose --file compose.yaml up -d |
|
|
||
| LNT_DB_PASSWORD=${__db_password__} | ||
| LNT_AUTH_TOKEN=${__auth_token__} | ||
| docker compose --file compose.yaml up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC these user data scripts are only called when the instance is first initialized, but not e.g. rebooted. So we probably want to change the docker-compose restart policy to be unless-stopped so the containers get relaunched on a reboot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends upon how we set it up. I was thinking it might be better to setup the machine to be a clean slate on every boot, and mount a persistent volume that actually contains the DB. That makes it super easy to change system software inside TF.
Hah, my AWS account was also blocked, I'm currently waiting for AWS support to verify my identity. I feel your pain :) |
|
|
||
| on: | ||
| push: | ||
| tags: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to tag images by commit SHA, but explicitly version them in the terraform. That means we get a new image per commit, but only redeploy when we explicitly bump the commit of the images we're running.
| sudo usermod -a -G docker ec2-user | ||
| sudo chkconfig docker on | ||
|
|
||
| LNT_DB_PASSWORD=${__db_password__} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are these env variables coming from?
|
|
||
| LNT_DB_PASSWORD=${__db_password__} | ||
| LNT_AUTH_TOKEN=${__auth_token__} | ||
| docker compose --file compose.yaml up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends upon how we set it up. I was thinking it might be better to setup the machine to be a clean slate on every boot, and mount a persistent volume that actually contains the DB. That makes it super easy to change system software inside TF.
| provider "aws" { | ||
| region = "us-west-2" | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need a way to set the terraform state. We use a GCS bucket in the premerge cluster to do this. https://github.com/llvm/llvm-zorg/blob/87d07e600970abf419046d2ab6083b2d64240bce/premerge/main.tf#L31
Otherwise state isn't saved across checkouts, which means things won't work.
| region = "us-west-2" | ||
| } | ||
|
|
||
| variable "lnt_db_password" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should probably be data resources that reference secrets stored inside AWS's secret manager.
https://github.com/llvm/llvm-zorg/blob/87d07e600970abf419046d2ab6083b2d64240bce/premerge/main.tf#L113 is how we set this up for premerge. Not sure exactly how to do this for AWS.
| } | ||
|
|
||
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not familiar with how AWS does things. Hard coding it doesn't seem like a big deal. But we want to be able to change it, which would probably force instance recreation. I think we should do what I suggested above where the instance is a clean slate on every boot but mounts a persistent volume that has the DB info.
|
|
||
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 | ||
| instance_type = "t2.micro" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple GB boot disk should be fine, but slightly bigger might be good. The DB should probably be on a separate volume.
| resource "aws_instance" "docker_server" { | ||
| ami = "ami-0c97bd51d598d45e4" # Amazon Linux 2023 kernel-6.12 AMI in us-west-2 | ||
| instance_type = "t2.micro" | ||
| key_name = "test-key-name" # TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any keys should be specified in the provider. I think this is a different type of key.
This patch adds a Terraform configuration file that should allow deploying to an EC2 instance. It requires a few secrets to be made available to Github Actions.