docs: add simplified deploy of sidecar using default config
This change does the following:
- Make the `job` label come directly from the config. This makes
  the behaviour consistent with `PodMonitoring` in the GKE operator and
  is also covered in go/run-gmp-scrape
- Add a cloudbuild script for a simple deploy that doesn't involve
  secret manager
- Update docs to clarify that the secret manager steps are only
  necessary if you want to customize the config.
- Refactor the README to split out the default config and custom
  config journeys
- Create new Cloud Run service config for when secret manager is not
  needed

Change-Id: Ibc7ff0d8e22244369ab290cc124b27d597da656c
Signed-off-by: Ridwan Sharif <[email protected]>
ridwanmsharif committed Dec 1, 2023
1 parent 65f8d7b commit 3ff0923
Showing 11 changed files with 238 additions and 33 deletions.
92 changes: 73 additions & 19 deletions README.md
Expand Up @@ -25,7 +25,6 @@ To enable the depending service APIs with `gcloud` command, you can the followin
```console
gcloud services enable run.googleapis.com --quiet
gcloud services enable artifactregistry.googleapis.com --quiet
gcloud services enable cloudtrace.googleapis.com --quiet
gcloud services enable monitoring.googleapis.com --quiet
```

Expand All @@ -34,45 +33,47 @@ Account](https://cloud.google.com/run/docs/configuring/service-accounts) has, at
minimum, the following IAM roles:

* `roles/monitoring.metricWriter`
* `roles/cloudtrace.agent`
* `roles/logging.logWriter`

The default Compute Engine Service Account has these roles already.

### Run sample

#### Cloud Build
### Run sample (automated)

This sample requires `docker` or a similar container build system for a Linux runtime. If you do not have local Docker support, you can use Cloud Build instead. To use Cloud Build, enable the Cloud Build API in your Google Cloud project.

```console
gcloud services enable cloudbuild.googleapis.com --quiet
```

The bundled configuration file for Cloud Build (`cloudbuild.yaml`) requires a new service account with the following roles or stronger:
The bundled configuration file for Cloud Build (`cloudbuild-simple.yaml`) requires a new service account with the following roles or stronger:

* `roles/iam.serviceAccountUser`
* `roles/storage.objectViewer`
* `roles/monitoring.metricWriter`
* `roles/logging.logWriter`
* `roles/artifactregistry.createOnPushWriter`
* `roles/run.admin`
* `roles/secretmanager.admin` (Needed for custom configs only)
* `roles/secretmanager.secretAccessor` (Needed for custom configs only)

Running `create-service-account.sh` creates a new service account `run-gmp-sa@<project-id>.iam.gserviceaccount.com` for you. Then launch a Cloud Build task with `gcloud` command.
Running `create-sa-and-ar.sh` creates a new service account `run-gmp-sa@<project-id>.iam.gserviceaccount.com` for you, along with an Artifact Registry repo for the images. Then launch a Cloud Build task with the `gcloud` command.

```console
./create-service-account.sh
gcloud builds submit . --config=cloudbuild.yaml
./create-sa-and-ar.sh
gcloud builds submit . --config=cloudbuild-simple.yaml
```

> **_NOTE:_** If you have an Org policy that prevents unauthenticated access, then you might see a failure in the final step. You can safely ignore this failure.
After the build, run the following command to check the endpoint URL.

```console
gcloud run services describe run-gmp-sidecar-service --region=us-east1 --format="value(status.url)"
```

#### Build and Run Manually
### Run sample (manual steps)

##### Build the sample app
#### Build the sample app

The `app` directory contains a sample app written in Go. This app generates some
simple prometheus metrics (a gauge and a counter).
Expand Down Expand Up @@ -104,7 +105,7 @@ docker push us-east1-docker.pkg.dev/$GCP_PROJECT/run-gmp/sample-app
popd
```

##### Build the Collector image
#### Build the Collector image

The `collector` directory contains a Dockerfile and OpenTelemetry Collector
config file. The Dockerfile builds a Collector image that bundles the local
Expand All @@ -117,22 +118,47 @@ docker build -t us-east1-docker.pkg.dev/$GCP_PROJECT/run-gmp/collector .
docker push us-east1-docker.pkg.dev/$GCP_PROJECT/run-gmp/collector
```

#### Create RunMonitoring config and store as a secret
#### Create the Cloud Run Service (default config)

The `run-service-simple.yaml` file defines a multicontainer Cloud Run Service with the
sample app and Collector images built above. This will run with the default config, which scrapes an application emitting metrics at port `8080` at the path `/metrics`.

Replace the `%SAMPLE_APP_IMAGE%` and `%OTELCOL_IMAGE%` placeholders in
`run-service-simple.yaml` with the images you built above, ie:

```
sed -i s@%OTELCOL_IMAGE%@us-east1-docker.pkg.dev/${GCP_PROJECT}/run-gmp/collector@g run-service-simple.yaml
sed -i s@%SAMPLE_APP_IMAGE%@us-east1-docker.pkg.dev/${GCP_PROJECT}/run-gmp/sample-app@g run-service-simple.yaml
```
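
Because `sed` exits successfully even when nothing matches, a mistyped placeholder can go unnoticed until the deploy fails. The following is a minimal sketch of a check that no `%UPPER_CASE%` placeholder is left in the file. `check_placeholders` is a hypothetical helper, not part of the repository, and it assumes every placeholder follows the `%NAME%` convention used above.

```shell
# check_placeholders FILE
# Hypothetical helper: prints every line that still contains a
# %UPPER_CASE% placeholder and returns non-zero if any is found.
check_placeholders() {
  local file="$1"
  if grep -nE '%[A-Z_]+%' "$file"; then
    echo "unreplaced placeholders remain in ${file}" >&2
    return 1
  fi
  return 0
}
```

Running `check_placeholders run-service-simple.yaml` before `gcloud run services replace` would stop the deploy when a substitution was missed.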

Create the Service with the following command:

```
gcloud run services replace run-service-simple.yaml
```

This command will return an external URL for your Service’s endpoint. Save this
and use it in the next section to trigger the sample app so you can see the
telemetry collected by OpenTelemetry.

#### Create the Cloud Run Service (custom config)

##### Create RunMonitoring config and store as a secret

Create a `RunMonitoring` config and store it in secret manager. In this example, we use
`run-gmp-config` as the secret name.
`run-gmp-config` as the secret name. The file we're using is `default-config.yaml` and it scrapes the main container at port `8080` using the path `/metrics`. You can replace this with any `RunMonitoring` config file that you want the sidecar to use.

```
gcloud secrets create ${RUN_GMP_CONFIG} --data-file=default-config.yaml
```

##### Create the Cloud Run Service
##### Deploy the service

The `run-service.yaml` file defines a multicontainer Cloud Run Service with the
sample app and Collector images built above.
sample app and Collector images built above, using the config you placed in secret manager.

Replace the `%SAMPLE_APP_IMAGE%` and `%OTELCOL_IMAGE%` placeholders in
`run-service.yaml` with the images you built above, ie:
Replace the `%SAMPLE_APP_IMAGE%`, `%OTELCOL_IMAGE%`, `%PROJECT%` and `%SECRET%`
placeholders in `run-service.yaml` with the images you built above, your project ID, and the secret name, e.g.:

```
sed -i s@%OTELCOL_IMAGE%@us-east1-docker.pkg.dev/${GCP_PROJECT}/run-gmp/collector@g run-service.yaml
Expand All @@ -151,20 +177,24 @@ This command will return an external URL for your Service’s endpoint. Save thi
and use it in the next section to trigger the sample app so you can see the
telemetry collected by OpenTelemetry.

#### Allow unauthenticated HTTP access

Finally, before you make the request to the URL, you need to change
the Cloud Run service policy to accept unauthenticated HTTP access.

```
gcloud run services set-iam-policy run-gmp-sidecar-service policy.yaml
```

> **_NOTE:_** If you have an Org policy that prevents unauthenticated access, then this step will fail. But fear not, you can simply curl the endpoint using `curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" <ENDPOINT>` instead.
### View telemetry in Google Cloud

Use `curl` to make a request to your Cloud Run Service’s endpoint URL:

```
export SERVICE_URL=<service-url>
curl $SERVICE_URL/metrics
curl $SERVICE_URL
```

This should return the following output on success:
Expand All @@ -173,6 +203,30 @@ This should return the following output on success:
User request received!
```

> **_NOTE:_** If you get permission errors because of unauthenticated access, then this will fail. But fear not, you can simply curl the endpoint using `curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" $SERVICE_URL` instead.
You should now be able to use Cloud Monitoring to find metrics from the application. The `app` container emits the following metrics:
- `foo_metric`: A `gauge` metric that emits the current time as a float
- `bar_metric`: A `counter` metric that emits the current time as a float
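
The request above and its authenticated fallback can be combined into one helper. This is a sketch only: `fetch_service` is a hypothetical name, and it assumes the active `gcloud` account is allowed to invoke the service.

```shell
# fetch_service URL
# Hypothetical helper: tries an unauthenticated request first and, if
# that fails, retries with an identity token from the active gcloud account.
fetch_service() {
  local url="$1"
  curl --fail --silent --show-error "$url" && return 0
  curl --fail --silent --show-error \
    -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$url"
}
```

With `SERVICE_URL` exported as shown earlier, `fetch_service "$SERVICE_URL"` should print the sample app's response in either case. When the first attempt is rejected, `curl` reports that error on stderr before the retry.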

#### Troubleshooting the sidecar

The sidecar reports self metrics and self logs to Cloud Monitoring and Cloud Logging respectively.

##### Self observability metrics
You should also check out the sidecar's self metrics:
- `agent_uptime`: Uptime of the sidecar collector
- `agent_memory_usage`: Memory in use by the sidecar collector
- `agent_api_request_count`: Count of API requests from the sidecar collector
- `agent_monitoring_point_count`: Count of metric points written to Cloud Monitoring by the sidecar collector

Querying these metrics using the Google Cloud Monitoring UI is left as an
exercise for the reader. Be sure to check out the resource and metric labels for
added homework.

##### Self observability logs
Logs from the sidecar are written against the `Cloud Run Revision` [monitored resource](https://cloud.google.com/monitoring/api/resources#tag_cloud_run_revision) in Cloud Logging.
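
As a rough sketch, these logs can also be read from the command line with `gcloud logging read`. The helper names below are hypothetical. The filter selects every log entry for the service's revisions, so it returns the sample app's logs as well as the sidecar's; narrowing it to the sidecar container alone would need an extra label that is not shown here.

```shell
# sidecar_log_filter SERVICE
# Hypothetical helper: builds a Cloud Logging filter for the
# Cloud Run Revision monitored resource of one service.
sidecar_log_filter() {
  local service="$1"
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="%s"' "$service"
}

# read_sidecar_logs SERVICE [LIMIT]
# Reads the most recent matching entries (20 by default).
read_sidecar_logs() {
  local service="$1" limit="${2:-20}"
  gcloud logging read "$(sidecar_log_filter "$service")" --limit="$limit"
}
```

For the service deployed in this sample, that would be `read_sidecar_logs run-gmp-sidecar-service`.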

### Clean up

After running the demo, please make sure to clean up your project so that you don't consume unexpected resources and get charged.
Expand Down
6 changes: 5 additions & 1 deletion clean-up-cloud-run.sh
Expand Up @@ -18,7 +18,11 @@ SA_NAME="run-gmp-sa"
REGION="us-east1"

gcloud run services delete run-gmp-sidecar-service --region ${REGION} --quiet
gcloud secrets delete run-gmp-config
# Delete secret if we created it before
if gcloud secrets list --filter="name ~ .*run-gmp-config.*" | grep run-gmp-sidecar
then
gcloud secrets delete run-gmp-config
fi
gcloud artifacts repositories delete run-gmp \
--location=${REGION} \
--quiet
Expand Down
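
Note that the check in the hunk above greps the secret list for `run-gmp-sidecar`, while the secret itself is named `run-gmp-config`, so the condition may never match. A possible alternative, sketched here under the assumption that `gcloud secrets describe` exits non-zero when the secret does not exist, is to test for the secret directly. `delete_secret_if_exists` is a hypothetical helper, not part of this commit.

```shell
# delete_secret_if_exists NAME
# Hypothetical helper: deletes the secret only when describing it succeeds,
# so the clean-up also works after a default-config deploy that never
# created the secret.
delete_secret_if_exists() {
  local name="$1"
  if gcloud secrets describe "$name" >/dev/null 2>&1; then
    gcloud secrets delete "$name" --quiet
  fi
}
```

In the clean-up script this would be called as `delete_secret_if_exists run-gmp-config`.
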
107 changes: 107 additions & 0 deletions cloudbuild-simple.yaml
@@ -0,0 +1,107 @@
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

steps:
- name: "gcr.io/cloud-builders/docker"
args: ["build", "-t", "${_IMAGE_APP}", "./sample-apps/simple-app"]
id: BUILD_SAMPLE_APP
waitFor: ["-"]

- name: "gcr.io/cloud-builders/docker"
args: ["push", "${_IMAGE_APP}"]
id: PUSH_SAMPLE_APP
waitFor:
- BUILD_SAMPLE_APP

- name: "gcr.io/cloud-builders/docker"
args: ["build", "-t", "${_IMAGE_COLLECTOR}", "."]
id: BUILD_COLLECTOR
waitFor: ["-"]

- name: "gcr.io/cloud-builders/docker"
args: ["push", "${_IMAGE_COLLECTOR}"]
id: PUSH_COLLECTOR
waitFor:
- BUILD_COLLECTOR

- name: "ubuntu"
env:
- "IMAGE_APP=${_IMAGE_APP}"
- "IMAGE_COLLECTOR=${_IMAGE_COLLECTOR}"
- "PROJECT=${_GCP_PROJECT}"
script: |
sed -i s@%OTELCOL_IMAGE%@${IMAGE_COLLECTOR}@g run-service-simple.yaml
sed -i s@%SAMPLE_APP_IMAGE%@${IMAGE_APP}@g run-service-simple.yaml
id: REPLACE_YAML_VALUE
waitFor:
- PUSH_COLLECTOR

- name: "gcr.io/google.com/cloudsdktool/cloud-sdk:slim"
entrypoint: gcloud
args:
[
"run",
"services",
"replace",
"run-service-simple.yaml",
"--region",
"${_REGION}",
]
id: DEPLOY_MULTICONTAINER
waitFor:
- PUSH_SAMPLE_APP
- PUSH_COLLECTOR
- REPLACE_YAML_VALUE

- name: "gcr.io/google.com/cloudsdktool/cloud-sdk:slim"
entrypoint: gcloud
args:
[
"run",
"services",
"set-iam-policy",
"run-gmp-sidecar-service",
"policy.yaml",
"--region",
"${_REGION}",
"--quiet",
]
id: ALLOW_UNAUTHENTICATED
waitFor:
- DEPLOY_MULTICONTAINER

substitutions:
_REGION: us-east1
_GCP_PROJECT: ${PROJECT_ID}
_REGISTRY: ${_REGION}-docker.pkg.dev/${_GCP_PROJECT}/run-gmp
_IMAGE_APP: ${_REGISTRY}/sample-app
_IMAGE_COLLECTOR: ${_REGISTRY}/collector
_SA_NAME: run-gmp-sa

images:
- ${_IMAGE_APP}
- ${_IMAGE_COLLECTOR}

# comment out the following line if you want to run Cloud Build with the existing
# service account with the following roles.
# * roles/iam.serviceAccountUser
# * roles/storage.objectViewer
# * roles/logging.logWriter
# * roles/artifactregistry.createOnPushWriter
# * roles/run.admin
serviceAccount: "projects/${_GCP_PROJECT}/serviceAccounts/${_SA_NAME}@${_GCP_PROJECT}.iam.gserviceaccount.com"

options:
dynamic_substitutions: true
logging: CLOUD_LOGGING_ONLY
2 changes: 1 addition & 1 deletion confgenerator/agentmetrics.go
Expand Up @@ -33,7 +33,7 @@ func (r AgentSelfMetrics) OTelReceiverPipeline() otel.ReceiverPipeline {
Config: map[string]interface{}{
"config": map[string]interface{}{
"scrape_configs": []map[string]interface{}{{
"job_name": "run-gmp-sidecar",
"job_name": "run-gmp-sidecar-self-metrics",
"scrape_interval": "1m",
"static_configs": []map[string]interface{}{{
"targets": []string{fmt.Sprintf("0.0.0.0:%d", r.Port)},
Expand Down
6 changes: 2 additions & 4 deletions confgenerator/config.go
Expand Up @@ -270,7 +270,7 @@ func (rc *RunMonitoringConfig) endpointScrapeConfig(index int) (*promconfig.Scra
}
relabelCfgs := relabelingsForMetadata(metadataLabels, rc.Env)
return endpointScrapeConfig(
fmt.Sprintf("RunMonitoring/%s", rc.Name),
rc.Name,
rc.Spec.Endpoints[index],
relabelCfgs,
rc.Spec.Limits,
Expand Down Expand Up @@ -378,9 +378,7 @@ func endpointScrapeConfig(id string, ep ScrapeEndpoint, relabelCfgs []*relabel.C
}

scrapeCfg := &promconfig.ScrapeConfig{
// Generate a job name to make it easy to track what generated the scrape configuration.
// The actual job label attached to its metrics is overwritten via relabeling.
JobName: fmt.Sprintf("%s/%s", id, ep.Port),
JobName: id,
ServiceDiscoveryConfigs: discoveryCfgs,
MetricsPath: metricsPath,
Scheme: ep.Scheme,
Expand Down
4 changes: 2 additions & 2 deletions confgenerator/testdata/add-metadata-labels/golden/otel.yaml
Expand Up @@ -93,7 +93,7 @@ receivers:
allow_cumulative_resets: true
config:
scrape_configs:
- job_name: RunMonitoring/run-run-run/8080
- job_name: run-run-run
honor_timestamps: false
scrape_interval: 1m
scrape_timeout: 1m
Expand Down Expand Up @@ -127,7 +127,7 @@ receivers:
prometheus/run-gmp-self-metrics:
config:
scrape_configs:
- job_name: run-gmp-sidecar
- job_name: run-gmp-sidecar-self-metrics
metric_relabel_configs:
- action: replace
replacement: "42"
Expand Down
4 changes: 2 additions & 2 deletions confgenerator/testdata/builtin/golden/otel.yaml
Expand Up @@ -98,7 +98,7 @@ receivers:
allow_cumulative_resets: true
config:
scrape_configs:
- job_name: RunMonitoring/run-gmp-sidecar/8080
- job_name: run-gmp-sidecar
honor_timestamps: false
scrape_interval: 1m
scrape_timeout: 1m
Expand Down Expand Up @@ -139,7 +139,7 @@ receivers:
prometheus/run-gmp-self-metrics:
config:
scrape_configs:
- job_name: run-gmp-sidecar
- job_name: run-gmp-sidecar-self-metrics
metric_relabel_configs:
- action: replace
replacement: "42"
Expand Down
4 changes: 2 additions & 2 deletions confgenerator/testdata/relabel-labels/golden/otel.yaml
Expand Up @@ -98,7 +98,7 @@ receivers:
allow_cumulative_resets: true
config:
scrape_configs:
- job_name: RunMonitoring/mycollector/8080
- job_name: mycollector
honor_timestamps: false
scrape_interval: 10s
scrape_timeout: 10s
Expand Down Expand Up @@ -143,7 +143,7 @@ receivers:
prometheus/run-gmp-self-metrics:
config:
scrape_configs:
- job_name: run-gmp-sidecar
- job_name: run-gmp-sidecar-self-metrics
metric_relabel_configs:
- action: replace
replacement: "42"
Expand Down