From 092ebba183b534d57aa56773c16b26134d478a21 Mon Sep 17 00:00:00 2001
From: Ganga Mahesh Siddem
Date: Fri, 8 Oct 2021 14:27:35 -0700
Subject: [PATCH] Gangams/dev to prod merge oct2021 release (#667)

* separate build yamls for ci_prod branch (#415) * re-enable adx path (#420) * Gangams/release changes (#419) * updates related to release * updates related to release * fix the incorrect version * fix pr feedback * fix some typos in the release notes * fix for zero filled metrics (#423) * consolidate windows agent image docker files (#422) * consolidate windows agent image docker files * revert docker file consolidation * revert readme updates * merge back windows dockerfiles * image tag update * Gangams/cluster creation scripts (#414) * onprem k8s script * script updates * scripts for creating non-aks clusters * fix minor text update * updates * script updates * fix * script updates * fix scripts to install docker * fix: Pin to a particular version of ltsc2019 by SHA (#427) * enable collecting npm metrics (optionally) (#425) * enable collecting npm metrics (optionally) * fix default enrichment value * fix adx * Saaror patch 3 (#426) * Create README.MD Creating content for Kubecon lab * Update README.MD * Update README.MD * Gangams/add containerd support to windows agent (#428) * wip * wip * wip * wip * bug fix related to uri * wip * wip * fix bug with ignore cert validation * logic to ignore cert validation * minor * fix minor debug log issue * improve log message * debug message * fix bug with nullorempty check * remove debug statements * refactor parsers * add debug message * clean up * chart updates * fix formatting issues * Gangams/arc k8s metrics (#413) * cluster identity token * wip * fix exception * fix exceptions * fix exception * fix bug * fix bug * minor update * refactor the code * more refactoring * fix bug * typo fix * fix typo * wait for 1min after token renewal request * add proxy support for arc k8s mdm endpoint * avoid additional get call * minor
line ending fix * wip * have separate log for arc k8s cluster identity * fix bug on creating crd resource * remove update permission since not required * fixed some bugs * fix pr feedback * remove list since it's not required * fix: Reverting back to ltsc2019 tag (#429) * more kubelet metrics (#430) * more kubelet metrics * clean up new config * fix nom issue when config is empty (#432) * support multiple docker paths when docker root is updated thru knode (#433) * Gangams/doc and other related updates (#434) * bring back nodeselector changes for windows agent ds * readme updates * chart updates for azure cluster resourceid and region * set cluster region during onboarding for managed clusters * wip * fix for onboarding script * add sp support for the login * update help * add sp support for powershell * script updates for sp login * wip * wip * wip * readme updates * update the links to use ci_prod branch * fix links * fix image link * some more readme updates * add missing serviceprincipal in ps scripts (#435) * fix telemetry bug (#436) * Gangams/readmeupdates non aks 09162020 (#437) * changes for ciprod09162020 non-aks release * fix script to handle cross sub scenario * fix minor comment * fix date in version file * fix pr comments * Gangams/fix weird conflicts (#439) * separate build yamls for ci_prod branch (#415) (#416) * [Merge] dev to prod for ciprod08072020 release (#424) * separate build yamls for ci_prod branch (#415) * re-enable adx path (#420) * Gangams/release changes (#419) * updates related to release * updates related to release * fix the incorrect version * fix pr feedback * fix some typos in the release notes * fix for zero filled metrics (#423) * consolidate windows agent image docker files (#422) * consolidate windows agent image docker files * revert docker file consolidation * revert readme updates * merge back windows dockerfiles * image tag update Co-authored-by: Vishwanath Co-authored-by: rashmichandrashekar Co-authored-by: Vishwanath
Co-authored-by: rashmichandrashekar * fix quote issue for the region (#441) * fix cpucapacity/limit bug (#442) * grwehner/pv-usage-metrics (#431) - Send persistent volume usage and capacity metrics to LA for PVs with PVCs at the pod level; config to include or exclude kube-system namespace. - Send PV usage percentage to MDM if over the configurable threshold. - Add PV usage recommended alert template. * add new custom metric regions (#444) * add new custom metric regions * fix commas * add 'Terminating' state (#443) * Gangams/sept agent release tasks (#445) * turnoff mdm nonsupported cluster types * enable validation of server cert for ai ruby http client * add kubelet operations total and total error metrics * node selector label change * label update * wip * wip * wip * revert quotes * grwehner/pv-collect-volume-name (#448) Collect and send the volume name as another tag for pvUsedBytes in InsightsMetrics, so that it can be displayed in the workload workbook. Does not affect the PV MDM metric * Changes for september agent release (#449) Moving from v1beta1 to v1 for health CRD Adding timer for zero filling Adding zero filling for PV metrics * Gangams/arc k8s related scripts, charts and doc updates (#450) * checksum annotations * script update for chart from mcr * chart updates * update chart version to match with chart release * script updates * latest chart updates * version updates for chart release * script updates * script updates * doc updates * doc updates * update comments * fix bug in ps script * fix bug in ps script * minor update * release process updates * use consistent name across scripts * use consistent names * Install CA certs from wireserver (#451) * grwehner/pv-volume-name-in-mdm (#452) Add volume name for PV to mdm dimensions and zero fill it * Release changes for 10052020 release (#453) * Release changes for 10052020 release * remove redundant kubelet metrics as part of PR feedback * Update onboarding_instructions.md (#456) * Update 
onboarding_instructions.md Updated the documentation to reflect where to update the config map. * Update onboarding_instructions.md * Update onboarding_instructions.md * Update onboarding_instructions.md Updated the link * chart update for sept2020 release (#457) * add missing version update in the script (#458) * November release fixes - activate one agent, adx schema v2, win perf issue, syslog deactivation (#459) * activate one agent, adx schema v2, win perf issue, syslog deactivation * update chart * remove hyphen for params in chart (#462) Merging as it's a simple fix (remove hyphen) * Changes for cutting a new build for ciprod10272020 release (#460) * using latest stable version of msys2 (#465) * fixing the windows-perf-dups (#466) * chart updates related to new microsoft/charts repo (#467) * Changes for creating 11092020 release (#468) * MDM exception aggregation (#470) * grwehner/mdm custom metric regions (#471) Remove custom metrics region check for public cloud * updating rs limit to 1gb (#474) * grwehner/pv inventory (#455) Add fluentd plugin to request persistent volume info from the kubernetes api and send to LA * Gangams/fix for build release pipeline issue (#476) * use isolated cdpx acr * correct comment * add pv fluentd plugin config to helm rs config (#477) * add pv fluentd plugin to helm rs config * helm rbac permissions for pv api calls * Gangams/fix rs ooming (#473) * optimize kpi * optimize kube node inventory * add flags for events, deployments and hpa * have separate function parseNodeLimits * refactor code * fix crash * fix bug with service name * fix bugs related to get service name * update oom fix test agent * debug logs * fix service label issue * update to latest agent and enable ephemeral annotation * change stream size to 200 from 250 * update yaml * adjust chunksizes * add ruby gc env * yaml changes for cioomtest11282020-3 * telemetry to track pods latency * service count telemetry * rename variables * wip * nodes inventory telemetry *
configmap changes * add emit streams in configmap * yaml updates * fix copy and paste bug * add todo comments * fix node latency telemetry bug * update yaml with latest test image * fix bug * upping rs memory change * fix mdm bug with final emit stream * update to latest image * fix pr feedback * fix pr feedback * rename health config to agent config * fix max allowed hpa chunk size * update to use 1k pod chunk since validated on 1.18+ * remove debug logs * minor updates * move defaults to common place * chart updates * final oomfix agent * update to use prod image so that can be validated with build pipeline * fix typo in comment * Gangams/enable arc onboarding to ff (#478) * wip * updates * trigger login if the ctx cloud not same as specified cloud * add missed commit * Convert PV type dictionary to json for telemetry so it shows up in logs (#480) * fix 2 windows tasks - 1) Don't log to termination log 2) enable ADX route for containerlogs in windows (for O365) (#482) * fix ci envvar collection in large pods (#483) * grwehner/jan agent tasks (#481) - Windows agent fix to use log filtering settings in config map. - Error handling for kubelet_utils get_node_capacity in case /metrics/cadvisor endpoint fails.
- Remove env variable for workspace key for windows agent * updating fbit version and cpu limit (#485) * reverting to older version (#487) * Gangams/add fbsettings configurable via configmap (#486) * wip * fbit config settings * add config warn message * handle one config provided but not other * fixed pr feedback * fix copy paste error * rename config parameter names * fix typo * fix fbit crash in helm path * fix nil check * Gangams/jan agent release tasks (#484) * wip * explicit amd64 affinity for hybrid workloads * fix space issue * wip * revert vscode setting file * remove per container logs in ci (#488) * updates for ciprod01112021 release (#489) * new yaml files (#491) * Use cloud-specific instrumentation keys (#494) If APPLICATIONINSIGHTS_AUTH_URL is set/non-empty then the agent will now grab a custom IKey from a URL stored in APPLICATIONINSIGHTS_AUTH_URL * upgrade apt to latest version (#492) * upgrade apt to latest version * fix pr feedback * Gangams/add support for extension msi for arc k8s cluster (#495) * wip * add env var for the arc k8s extension name * chart update * extension msi updates * fix bug * revert chart and image to prod version * minor text changes * image tag to prod * wip * wip * wip * wip * final updates * fix whitespaces * simplify crd yaml * Gangams/arm template arc k8s extension (#496) * arm templates for arc k8s extension * update to use official extension type name * update * add identity property * add proxyendpointurl parameter * add default values * Gangams/aks monitoring via policy (#497) * enable monitoring through policy * wip * handle tags * wip * add alias * wip * working * updates * working * with deployment name * doc updates * doc updates * fix typo in the docs * revert to use operatingSystem from osImage for node os telemetry (#498) * Container log v2 schema changes (#499) * make pod name in mdsd definition as str for consistency. msgp has no type checking, as it has type metadata in the message itself.
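On the msgp point above: msgpack-style serializers embed each value's runtime type in the payload, so a pod name emitted as a non-string round-trips as that type downstream; the fix is to coerce at the producer. A trivial sketch of that coercion (plain Python, not the agent's actual code; the record shape and helper name are hypothetical):

```python
def make_mdsd_record(pod_name):
    # msgpack carries type metadata per value, so there is no schema to
    # coerce against downstream; enforce str at the producer instead.
    return {"PodName": str(pod_name)}

# A numeric pod identifier still arrives as a string:
print(make_mdsd_record(42)["PodName"])  # -> "42"
```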
* Add priority class to the daemonsets (#500) * Add priority class to the daemonsets Add a priority class for omsagent and have the daemonsets use it to ensure the pods get scheduled.

Daemonset pods are constrained to run on specific nodes. This is done by the daemonset controller: when a node shows up, it creates a pod with a strong affinity to that node; when a node goes away, it deletes the pod with the affinity to that node. Kubernetes pod scheduling does not know the pod belongs to a daemonset, but it does know the pod is tied to a specific node.

With default scheduling, it is possible for the pods to be "frozen out" of a node because the node is already full. This can happen because "normal" pods may already exist and be looking for a node to get scheduled on when a node is added to the cluster. The daemonset controller only creates the pod for the new node at around the same time, and the kubernetes scheduler runs async from all of this, so there can be a race as to who gets scheduled on the node.

The pod priority class (and thus the pod priority) is a way to indicate that the pod has a higher scheduling priority than a default pod. By default, all pods are at priority 0, and higher numbers are higher priority. Setting the priority to something greater than zero allows the omsagent daemonsets to win a race against "normal" pods for resources on a node - and also allows for graceful eviction in case the node is too full. Without this, omsagent can be left off a node in clusters that are very busy, especially in dynamic scaling situations. I did not test the windows pod as we have no windows clusters.
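The mechanism described above is a standard Kubernetes PriorityClass referenced from the pod spec; a minimal sketch of the shape (the name and value here are illustrative, not necessarily what the chart actually ships):

```yaml
# Illustrative PriorityClass; name and value are assumptions, not the chart's settings
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: omsagent-priority
value: 1000           # > 0, so these pods outrank default-priority (0) pods
globalDefault: false
description: "Priority for the monitoring daemonset pods"
```

The daemonset's pod template then references it via `priorityClassName: omsagent-priority`, which is what lets the scheduler preempt lower-priority pods gracefully when a node is full.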
* CR feedback * fix node metric issue (#502) * Bug fixes for Feb release (#504) * bug fix for mdm metrics with no limits * fix exception bug * Gangams/feb 2021 agent bug fix (#505) * fix npe in getKubeServiceRecords * use image fields from spec * fix typo * cover all cases * handle scenario only digest specified * changes for release -ciprod02232021 (#506) * Gangams/e2e test framework (#503) * add agent e2e fw and tests * doc and script updates * add validation script * doc updates * yaml updates * fix typo * doc updates * more doc updates * add ISTEST for helm chart to use arc conf * refactor test code * fix pr feedback * fix pr feedback * fix pr feedback * fix pr feedback * scrape new kubelet pod count metric name (#508) * Adding explicit json output to az commands as the script fails if az is configured with Table output #409 (#513) * Gangams/arc proxy contract and token renewal updates (#511) * fix issue with crd status updates * handle renewal token delays * add proxy contract * updates for proxy cert for linux * remove proxycert related changes * fix whitespace issue * fix whitespace issue * remove proxy in arm template * doc updates for microsoft charts repo release (#512) * doc updates for microsoft charts repo release * wip * Update enable-monitoring.sh (#514) Lines 314 and 343 seem to have trailing spaces for some subscriptions which is exiting the script even for valid scenarios Co-authored-by: Ganga Mahesh Siddem * Prometheus scraping from sidecar and OSM changes (#515) * add liveness timeout for exec (#518) * chart and other updates (#519) * Saaror osmdoc (#523) * Create ReadMe.md * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * Add files via upload * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * Update ReadMe.md * telemetry bug fix (#527) * Fix conflicting logrotate settings (#526) The node and the omsagent container both have a cron.daily file to rotate certain logs daily.
These settings are the same for some files in /var/log (mounted from the node with read/write access), causing the rotation to fail when both try to rotate at the same time. So then the /var/log/*.1 file is written to forever. Since these files are always written to and never rotated, it causes high memory usage on the node after a while. This fix removes the container logrotate settings for /var/log, which the container does not write to. * bug fix (#528) * Gangams/arc ev2 deployment (#522) * ev2 deployment for arc k8s extension * fix charts path issue * rename scripts tar * add notifications * fix line endings * fix line endings * update with prod repo * fix file endings * added liveness and telemetry for telegraf (#517) * added liveness and telemetry for telegraf * code transfer * removed windows liveness probe * done * Windows metric fix (#530) * changes * about to remove container fix * moved caching code to existing loop * removed un-necessary changes * removed a few more un-necessary changes * added windows node check * fixed a bug * everything works confirmed * OSM doc update (#533) * Adding MDM metrics for threshold violation (#531) * Rashmi/april agent 2021 (#538) * add Read_from_Head config for all fluentbit tail plugins (#539) See the commit message of: fluent/fluent-bit@70e33fa for details explaining the fluentbit change and what Read_from_Head does when set to true. 
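For context on that fluent-bit setting: in a tail input, `Read_from_Head true` makes the plugin replay a file from its beginning on first pickup instead of only following new writes. The relevant stanza looks roughly like this (the Path and DB values are illustrative, not the agent's actual configuration):

```
# Read_from_Head true: read existing file content from the start on first pickup
# (Path and DB values here are illustrative)
[INPUT]
    Name            tail
    Path            /var/log/containers/*.log
    Read_from_Head  true
    DB              /var/opt/fbit-tail.db
```

The `DB` position file is what keeps a restart from re-ingesting everything the plugin has already read.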
* fix programdata mount issue on containerd win nodes (#542) * Update sidecar mem limits (#541) * David/release 4 22 2021 (#544) * updating image tag and agent version * updated liveness probe * updated release notes again * fixed date in version file * 1m, 1m, 1s by default (#543) * 1m, 1m, 1s by default * setting default through a different method * David/aad stage 1 release (#556) * update to latest omsagent, add eastus2 to mdsd regions * copied oneagent bits to a CI repository release * mdsd inmem mode * yaml for cl scale test * yaml for cl scale test * reverting dockerProviderVersion to 15.0.0 * prepping for release (updated image version, dockerProviderVersion, and release notes) * container log scaletest yamls * forgot to update image version in chart * fixing windows tag in dockerfile, changing release notes wording * missed windows tag in one more place * forgot to change the windows dockerProviderVersion back Co-authored-by: Ganga Mahesh Siddem * Update ReleaseNotes.md (#558) fix imagetag in the release notes * Add wait time for telegraf and also force mdm egress to use tls 1.2 (#560) * Add wait time for telegraf and also force mdm egress to use tls 1.2 * add wait for all telegraf dependencies across all containers (ds & rs) * remove ssl change so we don't include as part of the other fix until we test with att nodes.
* partially disabled telegraf liveness probe check, we'll still have telemetry but the probe won't fail if telegraf isn't running (#561) * changes for 05202021 release (#563) * changes for 05202021 release * fixed typos * Rashmi/jedi wireserver (#566) * Update ReadMe.md (#565) * Update ReadMe.md * Update ReadMe.md Included feedback from OSM team and fixed * Gangams/aad stage2 full switch to mdsd (#559) * full switch to mdsd, upgrade to ruby v1 & omsagent removal * add odsdirect as fallback option * cleanup * cleanup * move customRegion to stage3 * updates related to containerlog route * make xml eventschema consistent * add buffer settings * address HTTPServerException deprecation in ruby 2.6 * update to official mdsd version * fix log message issue * fix pr feedback * get rid of unused code from omscommon * fix pr feedback * fix pr feedback * clean up * clean up * fix missing conf * Send perf metrics to MDM from windows daemonset (#568) * updating json gem to address CVE-2020-10663 (#567) * updating json gem to address CVE-2020-10663 * updating json gem to address CVE-2020-10663 * update recommended alerts readme (#570) @dcbrown16 pointed out that this page links to the wrong document in [this issue](https://github.com/microsoft/Docker-Provider/issues/475). The content in the currently linked page is identical to the page which should be linked, so it's a simple fix.
* trying again to fix the json gem (#571) * trying again to fix the json gem * removing installation of newer json gem * Addressing PR comments for - https://github.com/microsoft/Docker-Provider/pull/568 (#569) * Mem_Buf_limit is configurable via ConfigMap (#574) * add log rotation settings for fluentd logs (#577) * Gangams/release 06112021 (#578) * updates related to ciprod06112021 release * minor update * release note update (#579) * Make sidecar fluentbit chunk size configurable (#573) * Fix vulnerabilities (#583) * test * test1 * test-2 * test-3 * 3 * 4 * test * 2 * 3 * 4 * 5 * 6 * rename gem for windows * fix * fix * Windows build optimization (#582) * fix windows build failure due to msys2 version * Fix telegraf startup issue when endpoint is unreachable (#587) * revert fbit tail plugins defaults to std defaults (#586) * fixed another bug (#593) * feat: add new metrics to MDM for allocatable % calculation of cpu and memory usage (#584) * feat: allocatable cpu and memory % metrics for MDM * maybe * linux is working * windows....
* some more * comment * better * syntax * ruby * revert omsagent.yaml * comments * pr feedback * pr feedback * testing msys2 version update * better * update adx sdk for perf issue (#601) * remove md check * Gangams/release notes update for hotfix (#596) * release notes updates * release notes updates for ciprod06112021-1 * Cherry picking hotfix changes to ci_dev (#605) * release changes (#607) * Gangams/aad stage3 msi auth (#585) * changes related to aad msi auth feature * use existing envvars * fix imds token expiry interval * refactor the windows agent ingestion token code * code cleanup * fix build errors * code clean up * code clean up * code clean up * code clean up * more refactoring * fix bug * fix bug * add debug logs * add nil checks * revert changes * revert yaml change since this added in aks side * fix pr feedback * fix pr feedback * refine retry code * update mdsd env as per official build * cleanup * update env vars per mdsd * update with mdsd official build * skip cert gen & renewal incase of aad msi auth * add nil check * cherry windows agent nodeip issue * fix merge issue Co-authored-by: rashmichandrashekar * Gangams/remove chart version dependency (#589) * remove chart version dependency * remove unused code * fix resource type * fix * handle weird cli chars * update release process * Gangams/july 2021 release tasks 3 (#613) * use artifact and pipeline creds for image push * minor update * add vuln fix here so that pr can be merged * remove un-used output plugin (#614) * fix telegraf telemetry and improve fluentd liveness (#611) * fix telegraf telemetry and improve fluentd liveness * address identified vuln with libsystemd0 * fix exported image file extension * Gangams/july 2021 release tasks 2 (#612) * tail rs mdsd err logs * configure mdsd log rotation * log rotation for mdsd log files * Fix out_oms.go dependency vulnerabilities (#623) * revert libsystemd0 update (#616) * updates for ci-prod release instructions (#619) * cherry pick changes 
from ci_prod (#622) * Support az login for passwords starting with dash ('-') (#626) Co-authored-by: Vladimir Babichev * Gangams/add telemetry fbit settings (#628) * add telemetry to track fbit settings * add telemetry to track fbit settings * check onboarding status (#629) * Gangams/arc k8s conformance test updates (#617) * conf test updates * clean up * wip * update with mcr cidev image * handle log path * cleanup * clean up * wip * working * update for mcr image * minor * image update * handle latency of connected cluster resource creation * update conftest image * upgrade golang version for windows in pipeline build and locally (#630) * Updating a link in Readme.md (#632) The link to the build pipelines now goes directly to our build pipelines (instead of to all github-private pipelines) * Updating omsagent yaml to have parity with omsagent yaml file in AKS RP (#615) * Unit test tooling (#625) Added tooling and examples for unit tests * run unit tests after a merge too (#634) * flag stale PRs & issues * Adding script to collect logs (for troubleshooting) (#636) * added script for collecting logs * added windows daemonset and prometheus sidecar, as well as some explanatory prints * added kubectl describe and kubectl logs output * changed message to make it more clear some errors are expected * Sarah/ev2 (#640) * ev2 artifacts for release pipeline * update parameters reference * add artifacts tar file * changes to rollout and service model * change agentimage path * adding agentimage to artifact script * removing charts from tarball * change script to use blob storage * change blob variables * echo variables * change blob uri * use release id for blob prefix * change to delete blob file * add check for if blob storage file exists * fix script errors * update check for file in storage * change true check * comments and change storage account info to pipeline variables * Changes for windows tar file * PR changes * documenting fbit tail plugin configmap settings.
(#638) * documenting fbit tail plugin configmap settings. * Install unzip package on shell extension (#642) * Changing installation in ev2 script (#644) * Adjust release pipeline to use cdpx acr (#647) * Adjust release pipeline to use cdpx acr * Adjust release pipeline to use cdpx acr * Update CDPX ACR path * Add check for cdpx repo variable * Sarah/ev2 prod (#649) * Ev2 changes for prod * CDPX repo naming change (#652) * Sarah/ev2 update (#654) * remove acr name from repo path * add check to make sure tag does not exist in mcr repo * change tag syntax for mcr repo check (#655) * Gangams/optimize win livenessprobe (#653) * livenessprobe optimization * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * optimize windows agent liveness probe * Gangams/addon token adapter image tag to telemetry (#656) * addon token adapter image tag * addon token adapter image tag * Sarah/ev2 helm (#658) * Use MSI for Arc Release * Use CIPROD_ACR AME subscription for shell extension * remove extra line endings * Sarah/ev2 pipeline (#661) * testing build artifact dir changes * add .pipelines directory and omsagent.yaml to build artifacts * add charts directory to build artifacts (#662) * Sarah/remove cdpx creds (#664) * don't use cdpx acr creds from kv * add e2etest.yaml to build output * keep cdpx creds for now * chart updates for rbac api version change (#660) * chart updates for rbac api version change * include windows ds for arc * proxy support (for non-aks) (#665) * changes related to aad msi auth feature * use existing envvars * fix imds token expiry interval * initial proxy support * merge? * cleaning up some files which should've merged differently * proxy should be working, but most tables don't have any data. 
About to merge, maybe whatever was wrong is now fixed * linux AMA proxy works * about to merge * proxy support appears to be working, final mdsd build location will still change * removing some unnecessary changes * forgot to remove one last change * redirected mdsd stderr to stdout instead of stdin * addressing proxy password location comment Co-authored-by: Ganga Mahesh Siddem * updates for the release ciprod10082021 and win-ciprod10082021 * updates for the release ciprod10082021 and win-ciprod10082021 * updates for the release ciprod10082021 and win-ciprod10082021 * updates for the release ciprod10082021 and win-ciprod10082021 Co-authored-by: Vishwanath Co-authored-by: rashmichandrashekar Co-authored-by: bragi92 Co-authored-by: saaror <31900410+saaror@users.noreply.github.com> Co-authored-by: Grace Wehner Co-authored-by: deagraw Co-authored-by: David Michelman Co-authored-by: Michael Sinz <36865706+Michael-Sinz@users.noreply.github.com> Co-authored-by: Nicolas Yuen Co-authored-by: seenu433 Co-authored-by: Tsubasa Nomura Co-authored-by: Vladimir Co-authored-by: Vladimir Babichev Co-authored-by: sarahpeiffer <46665092+sarahpeiffer@users.noreply.github.com> --- .github/workflows/pr-checker.yml | 4 +- .github/workflows/run_unit_tests.yml | 34 + .github/workflows/stale.yml | 28 + .gitignore | 4 + .pipelines/build-linux.sh | 5 + ...l.all_tag.all_phase.all_config.ci_prod.yml | 27 +- .pipelines/pipeline.user.linux.yml | 20 + ...l.all_tag.all_phase.all_config.ci_prod.yml | 6 +- .pipelines/pipeline.user.windows.yml | 6 +- .pipelines/release-agent.sh | 74 + Dev Guide.md | 125 ++ Documentation/AgentSettings/ReadMe.md | 26 + README.md | 44 +- ReleaseNotes.md | 28 +- ReleaseProcess.md | 39 +- .../scripts/td-agent-bit-conf-customizer.rb | 8 +- .../scripts/tomlparser-prom-agent-config.rb | 102 + .../conf/td-agent-bit-prom-side-car.conf | 6 +- .../conf/telegraf-prom-side-car.conf | 8 +- build/linux/installer/conf/telegraf-rs.conf | 36 +- build/linux/installer/conf/telegraf.conf 
| 28 +- build/linux/installer/conf/test.json | 1 + .../installer/datafiles/base_container.data | 15 +- .../linux/installer/scripts/livenessprobe.sh | 18 +- build/windows/Makefile.ps1 | 27 +- .../installer/livenessprobe/livenessprobe.cpp | 137 ++ .../installer/scripts/livenessprobe.cmd | 36 - .../in_heartbeat_request.rb | 20 +- .../templates/omsagent-crd.yaml | 2 +- .../templates/omsagent-daemonset-windows.yaml | 4 +- .../templates/omsagent-rbac.yaml | 10 +- .../ContainerInsights.Linux.Parameters.json | 68 + .../ContainerInsights.Windows.Parameters.json | 68 + .../RolloutSpecs/RolloutSpecs.json | 36 + .../ScopeBindings/Public.ScopeBindings.json | 51 + .../Scripts/pushAgentToAcr.sh | 72 + .../ServiceModels/Public.ServiceModel.json | 56 + .../ServiceGroupRoot/buildver.txt | 1 + ...ContainerInsightsExtension.Parameters.json | 28 +- .../Public.Canary.RolloutSpec.json | 6 +- .../ScopeBindings/Public.ScopeBindings.json | 74 +- .../Scripts/pushChartToAcr.sh | 134 +- .../ServiceModels/Public.ServiceModel.json | 32 +- kubernetes/container-azm-ms-agentconfig.yaml | 20 +- kubernetes/linux/Dockerfile | 4 +- .../linux/defaultpromenvvariables-sidecar | 3 + kubernetes/linux/logrotate.conf | 39 + kubernetes/linux/main.sh | 248 ++- kubernetes/linux/setup.sh | 7 +- kubernetes/omsagent.yaml | 146 +- kubernetes/windows/Dockerfile | 5 +- kubernetes/windows/Dockerfile-dev-base-image | 43 + kubernetes/windows/Dockerfile-dev-image | 44 + .../build-and-publish-dev-docker-image.ps1 | 64 + .../dockerbuild/build-dev-base-image.ps1 | 32 + kubernetes/windows/main.ps1 | 71 +- .../windows/install-build-pre-requisites.ps1 | 4 +- .../ci-extension-dcr-streams.md | 186 ++ scripts/dcr-onboarding/ci-extension-dcr.json | 59 + .../onboarding/managed/disable-monitoring.sh | 2 +- .../onboarding/managed/enable-monitoring.ps1 | 23 +- .../onboarding/managed/enable-monitoring.sh | 44 +- .../onboarding/managed/upgrade-monitoring.sh | 46 +- scripts/troubleshoot/collect_logs.sh | 54 + 
source/plugins/go/src/extension/extension.go | 103 + .../go/src/extension/extension_test.go | 74 + source/plugins/go/src/extension/interfaces.go | 34 + .../plugins/go/src/extension/socket_writer.go | 110 ++ source/plugins/go/src/go.mod | 33 +- source/plugins/go/src/go.sum | 497 ++++- .../plugins/go/src/ingestion_token_utils.go | 516 +++++ source/plugins/go/src/oms.go | 89 +- source/plugins/go/src/telemetry.go | 24 +- source/plugins/go/src/utils.go | 160 +- source/plugins/go/src/utils_test.go | 79 + .../ruby/ApplicationInsightsUtility.rb | 12 +- .../plugins/ruby/CAdvisorMetricsAPIClient.rb | 71 +- source/plugins/ruby/CustomMetricsUtils.rb | 4 +- source/plugins/ruby/KubernetesApiClient.rb | 63 +- source/plugins/ruby/MdmMetricsGenerator.rb | 22 +- source/plugins/ruby/constants.rb | 30 +- source/plugins/ruby/extension.rb | 77 + source/plugins/ruby/extension_utils.rb | 27 + source/plugins/ruby/filter_cadvisor2mdm.rb | 82 +- .../ruby/filter_health_model_builder.rb | 23 +- source/plugins/ruby/in_cadvisor_perf.rb | 26 +- source/plugins/ruby/in_containerinventory.rb | 51 +- source/plugins/ruby/in_kube_events.rb | 22 +- source/plugins/ruby/in_kube_nodes.rb | 243 ++- source/plugins/ruby/in_kube_nodes_test.rb | 171 ++ source/plugins/ruby/in_kube_podinventory.rb | 53 +- source/plugins/ruby/in_kube_pvinventory.rb | 23 +- .../plugins/ruby/in_kubestate_deployments.rb | 21 +- source/plugins/ruby/in_kubestate_hpa.rb | 18 +- source/plugins/ruby/in_win_cadvisor_perf.rb | 12 + source/plugins/ruby/kubelet_utils.rb | 108 ++ source/plugins/ruby/oms_common.rb | 143 ++ source/plugins/ruby/omslog.rb | 50 + source/plugins/ruby/out_mdm.rb | 27 +- test/e2e/conformance.yaml | 15 + test/e2e/e2e-tests.yaml | 21 +- test/e2e/src/common/constants.py | 6 +- test/e2e/src/core/Dockerfile | 17 +- test/e2e/src/core/conftest.py | 37 +- test/e2e/src/core/e2e_tests.sh | 200 +- test/e2e/src/core/setup_failure_handler.py | 18 + test/e2e/src/tests/test_ds_workflows.py | 28 +- 
test/e2e/src/tests/test_e2e_workflows.py | 231 +-- .../tests/test_node_metrics_e2e_workflow.py | 61 +- .../tests/test_pod_metrics_e2e_workflow.py | 14 +- test/e2e/src/tests/test_resource_status.py | 12 +- test/e2e/src/tests/test_rs_workflows.py | 17 +- .../kube-nodes-malformed.txt | 1674 +++++++++++++++++ .../canned-api-responses/kube-nodes.txt | 851 +++++++++ test/unit-tests/run_go_tests.sh | 12 + test/unit-tests/run_ruby_tests.sh | 13 + test/unit-tests/test_driver.rb | 13 + 117 files changed, 7996 insertions(+), 1040 deletions(-) create mode 100644 .github/workflows/run_unit_tests.yml create mode 100644 .github/workflows/stale.yml create mode 100644 .pipelines/release-agent.sh create mode 100644 Dev Guide.md create mode 100644 Documentation/AgentSettings/ReadMe.md create mode 100644 build/common/installer/scripts/tomlparser-prom-agent-config.rb create mode 100644 build/linux/installer/conf/test.json create mode 100644 build/windows/installer/livenessprobe/livenessprobe.cpp delete mode 100644 build/windows/installer/scripts/livenessprobe.cmd create mode 100644 deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Linux.Parameters.json create mode 100644 deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Windows.Parameters.json create mode 100644 deployment/agent-deployment/ServiceGroupRoot/RolloutSpecs/RolloutSpecs.json create mode 100644 deployment/agent-deployment/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json create mode 100644 deployment/agent-deployment/ServiceGroupRoot/Scripts/pushAgentToAcr.sh create mode 100644 deployment/agent-deployment/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json create mode 100644 deployment/agent-deployment/ServiceGroupRoot/buildver.txt create mode 100644 kubernetes/linux/logrotate.conf create mode 100644 kubernetes/windows/Dockerfile-dev-base-image create mode 100644 kubernetes/windows/Dockerfile-dev-image create mode 100644 
kubernetes/windows/dockerbuild/build-and-publish-dev-docker-image.ps1 create mode 100644 kubernetes/windows/dockerbuild/build-dev-base-image.ps1 create mode 100644 scripts/dcr-onboarding/ci-extension-dcr-streams.md create mode 100644 scripts/dcr-onboarding/ci-extension-dcr.json create mode 100755 scripts/troubleshoot/collect_logs.sh create mode 100644 source/plugins/go/src/extension/extension.go create mode 100644 source/plugins/go/src/extension/extension_test.go create mode 100644 source/plugins/go/src/extension/interfaces.go create mode 100644 source/plugins/go/src/extension/socket_writer.go create mode 100644 source/plugins/go/src/ingestion_token_utils.go create mode 100644 source/plugins/go/src/utils_test.go create mode 100644 source/plugins/ruby/extension.rb create mode 100644 source/plugins/ruby/extension_utils.rb create mode 100644 source/plugins/ruby/in_kube_nodes_test.rb create mode 100644 source/plugins/ruby/oms_common.rb create mode 100644 source/plugins/ruby/omslog.rb create mode 100644 test/e2e/conformance.yaml create mode 100644 test/e2e/src/core/setup_failure_handler.py create mode 100644 test/unit-tests/canned-api-responses/kube-nodes-malformed.txt create mode 100644 test/unit-tests/canned-api-responses/kube-nodes.txt create mode 100755 test/unit-tests/run_go_tests.sh create mode 100755 test/unit-tests/run_ruby_tests.sh create mode 100644 test/unit-tests/test_driver.rb diff --git a/.github/workflows/pr-checker.yml b/.github/workflows/pr-checker.yml index 0bb2da7f5..ec6e623b8 100644 --- a/.github/workflows/pr-checker.yml +++ b/.github/workflows/pr-checker.yml @@ -56,7 +56,7 @@ jobs: format: 'table' severity: 'CRITICAL,HIGH' vuln-type: 'os,library' - skip-dirs: 'opt/telegraf,usr/sbin/telegraf,opt/td-agent-bit/bin/out_oms.so' + skip-dirs: 'opt/telegraf,usr/sbin/telegraf' exit-code: '1' timeout: '5m0s' WINDOWS-build: @@ -94,4 +94,4 @@ jobs: cd ./kubernetes/windows/ && docker build . 
--file Dockerfile -t $env:IMAGETAG --build-arg IMAGE_TAG=$env:IMAGETAG_TELEMETRY - name: List-docker-images run: docker images --digests --all - + diff --git a/.github/workflows/run_unit_tests.yml b/.github/workflows/run_unit_tests.yml new file mode 100644 index 000000000..94ac4371a --- /dev/null +++ b/.github/workflows/run_unit_tests.yml @@ -0,0 +1,34 @@ +name: Run Unit Tests +on: + pull_request: + types: [opened, synchronize, reopened] + branches: + - ci_dev + - ci_prod + push: + branches: + - ci_dev + - ci_prod +jobs: + Golang-Tests: + runs-on: ubuntu-latest + steps: + - name: Check out repository code + uses: actions/checkout@v2 + - name: Run unit tests + run: | + cd ${{ github.workspace }} + ./test/unit-tests/run_go_tests.sh + Ruby-Tests: + runs-on: ubuntu-latest + steps: + - name: Check out repository code + uses: actions/checkout@v2 + - name: install fluent + run: | + sudo gem install fluentd -v "1.12.2" --no-document + sudo fluentd --setup ./fluent + - name: Run unit tests + run: | + cd ${{ github.workspace }} + ./test/unit-tests/run_ruby_tests.sh diff --git a/.github/workflows/stale.yml b/.github/workflows/stale.yml new file mode 100644 index 000000000..1d91df09d --- /dev/null +++ b/.github/workflows/stale.yml @@ -0,0 +1,28 @@ +name: Mark stale issues and pull requests + +on: + schedule: + - cron: "30 10 * * *" + +jobs: + stale: + + runs-on: ubuntu-latest + permissions: + issues: write + pull-requests: write + + steps: + - uses: actions/stale@v3 + with: + repo-token: ${{ secrets.GITHUB_TOKEN }} + days-before-issue-stale: 7 + days-before-pr-stale: 7 + stale-issue-message: 'This issue is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.' + stale-pr-message: 'This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.' 
+ close-issue-message: 'This issue was closed because it has been stalled for 12 days with no activity.' + close-pr-message: 'This PR was closed because it has been stalled for 12 days with no activity.' + days-before-issue-close: 5 + days-before-pr-close: 5 + stale-issue-label: 'no-issue-activity' + stale-pr-label: 'no-pr-activity' diff --git a/.gitignore b/.gitignore index 2e2978e91..b0467519c 100644 --- a/.gitignore +++ b/.gitignore @@ -26,3 +26,7 @@ intermediate kubernetes/linux/Linux_ULINUX_1.0_x64_64_Release # ignore generated .h files for go source/plugins/go/src/*.h +*_mock.go +*_log.txt +*.log +*.byebug_history diff --git a/.pipelines/build-linux.sh b/.pipelines/build-linux.sh index 53f6a3a07..1441a7ede 100644 --- a/.pipelines/build-linux.sh +++ b/.pipelines/build-linux.sh @@ -15,6 +15,11 @@ echo "----------- Build Docker Provider -------------------------------" make cd $DIR +echo "------------ Bundle Shell Extension Scripts for Agent Release -------------------------" +cd $DIR/../deployment/agent-deployment/ServiceGroupRoot/Scripts +tar -czvf ../artifacts.tar.gz pushAgentToAcr.sh +cd $DIR + echo "------------ Bundle Shell Extension Scripts & HELM chart -------------------------" cd $DIR/../deployment/arc-k8s-extension/ServiceGroupRoot/Scripts tar -czvf ../artifacts.tar.gz ../../../../charts/azuremonitor-containers/ pushChartToAcr.sh diff --git a/.pipelines/pipeline.user.linux.official.all_tag.all_phase.all_config.ci_prod.yml b/.pipelines/pipeline.user.linux.official.all_tag.all_phase.all_config.ci_prod.yml index d47a60ffe..fa0d779b2 100644 --- a/.pipelines/pipeline.user.linux.official.all_tag.all_phase.all_config.ci_prod.yml +++ b/.pipelines/pipeline.user.linux.official.all_tag.all_phase.all_config.ci_prod.yml @@ -28,6 +28,27 @@ build: name: 'Build Docker Provider Shell Bundle' command: '.pipelines/build-linux.sh' fail_on_stderr: false + artifacts: + - from: 'deployment' + to: 'build' + include: + - '**' + - from: '.pipelines' + to: 'build' + include: + 
- '*.sh' + - from: 'kubernetes' + to: 'build' + include: + - '*.yaml' + - from: 'charts' + to: 'build' + include: + - '**' + - from: 'test/e2e' + to: 'build' + include: + - '*.yaml' package: commands: @@ -40,5 +61,9 @@ package: # to be named differently. Defaults to Dockerfile. # In effect, the -f option value passed to docker build will be repository_checkout_folder/src/DockerFinal/Foo.dockerfile. repository_name: 'cdpxlinux' # only supported ones are cdpx acr repos - tag: 'ciprod' # OPTIONAL: Defaults to latest. The tag for the built image. Final tag will be 1.0.0alpha, 1.0.0-timestamp-commitID. + tag: 'ciprod' # OPTIONAL: Defaults to latest. The tag for the built image. Final tag will be 1.0.0alpha, 1.0.0-timestamp-commitID. latest: false # OPTIONAL: Defaults to false. If tag is not set to latest and this flag is set, then tag as latest as well and push latest as well. + publish_unique_tag: true # If set, the image in the registry is tagged with the unique tag generated by CDPx + metadata_file: + artifact_path: 'linux-image-meta.json' # If defined, the drop outputs relative path to the file into which JSON metadata about the created image is emitted. + export_to_artifact_path: 'agentimage.tar.gz' # path for exported image and use this instead of fixed tag diff --git a/.pipelines/pipeline.user.linux.yml b/.pipelines/pipeline.user.linux.yml index 565661d64..9f12cbcbd 100644 --- a/.pipelines/pipeline.user.linux.yml +++ b/.pipelines/pipeline.user.linux.yml @@ -33,6 +33,22 @@ build: to: 'build' include: - '**' + - from: '.pipelines' + to: 'build' + include: + - '*.sh' + - from: 'kubernetes' + to: 'build' + include: + - '*.yaml' + - from: 'charts' + to: 'build' + include: + - '**' + - from: 'test/e2e' + to: 'build' + include: + - '*.yaml' package: commands: @@ -47,3 +63,7 @@ package: repository_name: 'cdpxlinux' # only supported ones are cdpx acr repos tag: 'cidev' # OPTIONAL: Defaults to latest. The tag for the built image. 
Final tag will be 1.0.0alpha, 1.0.0-timestamp-commitID. latest: false # OPTIONAL: Defaults to false. If tag is not set to latest and this flag is set, then tag as latest as well and push latest as well. + publish_unique_tag: true # If set, the image in the registry is tagged with the unique tag generated by CDPx + metadata_file: + artifact_path: 'linux-image-meta.json' # If defined, the drop outputs relative path to the file into which JSON metadata about the created image is emitted. + export_to_artifact_path: 'agentimage.tar.gz' # path for exported image and use this instead of fixed tag diff --git a/.pipelines/pipeline.user.windows.official.all_tag.all_phase.all_config.ci_prod.yml b/.pipelines/pipeline.user.windows.official.all_tag.all_phase.all_config.ci_prod.yml index e0286fbd6..d31def95c 100644 --- a/.pipelines/pipeline.user.windows.official.all_tag.all_phase.all_config.ci_prod.yml +++ b/.pipelines/pipeline.user.windows.official.all_tag.all_phase.all_config.ci_prod.yml @@ -5,7 +5,7 @@ environment: version: '2019' runtime: provider: 'appcontainer' - image: 'cdpxwin1809.azurecr.io/user/azure-monitor/container-insights:6.0' + image: 'cdpxwin1809.azurecr.io/user/azure-monitor/container-insights:latest' source_mode: 'map' version: @@ -53,3 +53,7 @@ package: repository_name: 'cdpxwin1809' # only supported ones are cdpx acr repos tag: 'win-ciprod' # OPTIONAL: Defaults to latest. The tag for the built image. Final tag will be 1.0.0alpha, 1.0.0-timestamp-commitID. latest: false # OPTIONAL: Defaults to false. If tag is not set to latest and this flag is set, then tag as latest as well and push latest as well. + publish_unique_tag: true # If set, the image in the registry is tagged with the unique tag generated by CDPx + metadata_file: + artifact_path: 'windows-image-meta.json' # If defined, the drop outputs relative path to the file into which JSON metadata about the created image is emitted. 
+ export_to_artifact_path: 'agentimage.tar.zip' # path for exported image and use this instead of fixed tag diff --git a/.pipelines/pipeline.user.windows.yml b/.pipelines/pipeline.user.windows.yml index 2b7a54ae9..8be92a316 100644 --- a/.pipelines/pipeline.user.windows.yml +++ b/.pipelines/pipeline.user.windows.yml @@ -5,7 +5,7 @@ environment: version: '2019' runtime: provider: 'appcontainer' - image: 'cdpxwin1809.azurecr.io/user/azure-monitor/container-insights:6.0' + image: 'cdpxwin1809.azurecr.io/user/azure-monitor/container-insights:latest' source_mode: 'map' version: @@ -53,3 +53,7 @@ package: repository_name: 'cdpxwin1809' # only supported ones are cdpx acr repos tag: 'win-cidev' # OPTIONAL: Defaults to latest. The tag for the built image. Final tag will be 1.0.0alpha, 1.0.0-timestamp-commitID. latest: false # OPTIONAL: Defaults to false. If tag is not set to latest and this flag is set, then tag as latest as well and push latest as well. + publish_unique_tag: true # If set, the image in the registry is tagged with the unique tag generated by CDPx + metadata_file: + artifact_path: 'windows-image-meta.json' # If defined, the drop outputs relative path to the file into which JSON metadata about the created image is emitted. 
+ export_to_artifact_path: 'agentimage.tar.zip' # path for exported image and use this instead of fixed tag diff --git a/.pipelines/release-agent.sh b/.pipelines/release-agent.sh new file mode 100644 index 000000000..b34dd9995 --- /dev/null +++ b/.pipelines/release-agent.sh @@ -0,0 +1,74 @@ +#!/bin/bash + +# Note - This script is used in the pipeline as an inline script + +# These are plain pipeline variables which can be modified by anyone in the team +# AGENT_RELEASE=cidev +# AGENT_IMAGE_TAG_SUFFIX=07222021 +
+#Name of the ACR for ciprod & cidev images +ACR_NAME=containerinsightsprod.azurecr.io +AGENT_IMAGE_FULL_PATH=${ACR_NAME}/public/azuremonitor/containerinsights/${AGENT_RELEASE}:${AGENT_RELEASE}${AGENT_IMAGE_TAG_SUFFIX} +AGENT_IMAGE_TAR_FILE_NAME=agentimage.tar.gz + +if [ -z $AGENT_IMAGE_TAG_SUFFIX ]; then + echo "-e error value of AGENT_IMAGE_TAG_SUFFIX variable shouldn't be empty" + exit 1 +fi + +if [ -z $AGENT_RELEASE ]; then + echo "-e error AGENT_RELEASE shouldn't be empty" + exit 1 +fi + +echo "ACR NAME - ${ACR_NAME}" +echo "AGENT RELEASE - ${AGENT_RELEASE}" +echo "AGENT IMAGE TAG SUFFIX - ${AGENT_IMAGE_TAG_SUFFIX}" +echo "AGENT IMAGE FULL PATH - ${AGENT_IMAGE_FULL_PATH}" +echo "AGENT IMAGE TAR FILE PATH - ${AGENT_IMAGE_TAR_FILE_NAME}" + +echo "loading linuxagent image tarball" +IMAGE_NAME=$(docker load -i ${AGENT_IMAGE_TAR_FILE_NAME}) +# capture the docker load exit code before echo overwrites $? +DOCKER_LOAD_EXIT=$? +echo IMAGE_NAME: $IMAGE_NAME +if [ $DOCKER_LOAD_EXIT 
-ne 0 ]; then + echo "-e error on loading the linux agent tarball from ${AGENT_IMAGE_TAR_FILE_NAME}" + echo "** Please check whether this was caused by a build error **" + exit 1 +else + echo "successfully loaded linux agent image tarball" +fi +# IMAGE_ID=$(docker images $IMAGE_NAME | awk '{print $3 }' | tail -1) +# echo "Image Id is : ${IMAGE_ID}" +prefix="Loadedimage:" +IMAGE_NAME=$(echo $IMAGE_NAME | tr -d '"' | tr -d "[:space:]") +IMAGE_NAME=${IMAGE_NAME/#$prefix} +echo "*** trimmed image name-:${IMAGE_NAME}" +echo "tagging the image $IMAGE_NAME as ${AGENT_IMAGE_FULL_PATH}" +# docker tag $IMAGE_NAME ${AGENT_IMAGE_FULL_PATH} +docker tag $IMAGE_NAME $AGENT_IMAGE_FULL_PATH + +if [ $? -ne 0 ]; then + echo "-e error tagging the image $IMAGE_NAME as ${AGENT_IMAGE_FULL_PATH}" + exit 1 +else + echo "successfully tagged the image $IMAGE_NAME as ${AGENT_IMAGE_FULL_PATH}" +fi + +# use the pipeline identity to push the image to the ciprod acr +echo "logging in to acr: ${ACR_NAME}" +az acr login --name ${ACR_NAME} +if [ $? -ne 0 ]; then + echo "-e error log into acr failed: ${ACR_NAME}" + exit 1 +else + echo "successfully logged into acr:${ACR_NAME}" +fi + +echo "pushing ${AGENT_IMAGE_FULL_PATH}" +docker push ${AGENT_IMAGE_FULL_PATH} +if [ $? -ne 0 ]; then + echo "-e error on pushing the image ${AGENT_IMAGE_FULL_PATH}" + exit 1 +else + echo "Successfully pushed the image ${AGENT_IMAGE_FULL_PATH}" +fi diff --git a/Dev Guide.md b/Dev Guide.md new file mode 100644 index 000000000..7057a4afe --- /dev/null +++ b/Dev Guide.md @@ -0,0 +1,125 @@ +# Dev Guide + +More advanced information needed to develop or build the docker provider lives here. + + + +## Testing +Last updated 8/18/2021 + +To run all unit tests, run `test/unit-tests/run_go_tests.sh` and `test/unit-tests/run_ruby_tests.sh` + +#### Conventions: +1. Unit tests should go in their own file, but in the same folder as the source code they're testing. For example, the tests for `in_kube_nodes.rb` are in `in_kube_nodes_test.rb`.
Both files are in the folder `source/plugins/ruby`. + +### Ruby +Sample tests are provided in [in_kube_nodes_test.rb](source/plugins/ruby/in_kube_nodes_test.rb). They are meant to demo the tooling used for unit tests (as opposed to being comprehensive tests). Basic techniques like mocking are demonstrated there. + +#### Conventions: +1. When modifying a fluentd plugin for unit testing, any mocked classes (like KubernetesApiClient, applicationInsightsUtility, env, etc.) should be passed in as optional arguments of initialize. For example: +``` + def initialize + super +``` +would be turned into +``` + def initialize (kubernetesApiClient=nil, applicationInsightsUtility=nil, extensionUtils=nil, env=nil) + super() +``` + +2. Having end-to-end tests of all fluentd plugins is a long shot. We care more about unit testing smaller blocks of functionality (like all the helper functions in KubeNodeInventory.rb). Unit tests for fluentd plugins are not expected. + +### Golang + +Since golang is statically compiled, mocking requires a lot more work than in ruby. Sample tests are provided in [utils_test.go](source/plugins/go/src/utils_test.go) and [extension_test.go](source/plugins/go/src/extension/extension_test.go). Again, they are meant to demo the tooling used for unit tests (as opposed to being comprehensive tests). Basic techniques like mocking are demonstrated there. + +#### Mocking: +Mocks are generated with gomock (mockgen). +* Mock files should be called *_mock.go (socket_writer.go => socket_writer_mock.go) +* Mocks should not be checked in to git (they have been added to the .gitignore) +* The command to generate mock files should go in a `//go:generate` comment at the top of the mocked file (see [socket_writer.go](source/plugins/go/src/extension/socket_writer.go) for an example). This way mocks can be generated by the unit test script. +* Mocks also go in the same folder as the mocked files.
This is unfortunate, but necessary to avoid circular package dependencies (anyone else feel free to figure out how to move mocks to a separate folder) + +Using mocks is also a little tricky. In order to mock functions in a package with gomock, they must be converted to receiver methods of a struct. This way the struct can be swapped out at runtime to change which implementations of a method are called. See the example below: + +``` +// declare all functions to be mocked in this interface +type registrationPreCheckerInterface interface { + FUT(string) bool +} + +// Create a struct which implements the above interface +type regPreCheck struct{} + +func (r regPreCheck) FUT(email string) bool { + fmt.Println("real FUT() called") + return true +} + +// Create a global variable and assign it to the struct +var regPreCondVar registrationPreCheckerInterface + +func init() { + regPreCondVar = regPreCheck{} +} +``` + +Now any code wishing to call FUT() will call `regPreCondVar.FUT("")` + +A unit test can substitute its own implementation of FUT() like so + +``` +// This will hold the mock of FUT we want to substitute +var FUTMock func(email string) bool + +// create a new struct which implements the earlier interface +type regPreCheckMock struct{} + +func (u regPreCheckMock) FUT(email string) bool { + return FUTMock(email) +} +``` + +Everything is set up. Now a unit test can substitute in a mock like so: + +``` +func someUnitTest() { + // This will call the actual implementation of FUT() + regPreCondVar.FUT("") + + // Now the test creates another struct to substitute. After this line all calls to FUT() will be diverted + regPreCondVar = regPreCheckMock{} + + // substitute another function to run instead of FUT() + FUTMock = func(email string) bool { + fmt.Println("FUT 1 called") + return false + } + // This will call the function defined right above + regPreCondVar.FUT("") + + // We can substitute another implementation + FUTMock = func(email string) bool { + fmt.Println("FUT 2 called") + return false + } + regPreCondVar.FUT("") + + // put the old behavior back + regPreCondVar = regPreCheck{} + // this will call the actual implementation of FUT() + regPreCondVar.FUT("") + +} +``` + +A concrete example of this can be found in [socket_writer.go](source/plugins/go/src/extension/socket_writer.go) and [extension_test.go](source/plugins/go/src/extension/extension_test.go). Again, if anybody has a better way feel free to update this guide. + + + +A simpler way to test a specific function is to write wrapper functions. Test code calls the inner function (ReadFileContentsImpl) and product code calls the wrapper function (ReadFileContents). The wrapper function provides any outside state which a unit test would want to control (like a function to read a file). This option makes product code more verbose, but probably easier to read too. Either way is acceptable. +``` +func ReadFileContents(fullPathToFileName string) (string, error) { + return ReadFileContentsImpl(fullPathToFileName, ioutil.ReadFile) +} +``` diff --git a/Documentation/AgentSettings/ReadMe.md b/Documentation/AgentSettings/ReadMe.md new file mode 100644 index 000000000..3e55d7d44 --- /dev/null +++ b/Documentation/AgentSettings/ReadMe.md @@ -0,0 +1,26 @@ +## Configurable agent settings for high scale prometheus metric scraping using pod annotations with prometheus sidecar + +The Container Insights agent runs the native prometheus telegraf plugin to scrape prometheus metrics using pod annotations. +The metrics scraped by the telegraf plugin are sent to the fluent bit tcp listener.
+In order to support higher volumes of prometheus metrics scraping, some of the tcp listener settings can be tuned. +[Fluent Bit TCP listener](https://docs.fluentbit.io/manual/pipeline/inputs/tcp) + +* Chunk Size - This can be increased to process bigger chunks of data. + +* Buffer Size - This should be greater than or equal to the chunk size. + +* Mem Buf Limit - This can be increased to buffer more data in memory, but the memory limit on the sidecar also needs to be increased accordingly. +Note that this can only be achieved using the helm chart today. + + +**Note** - The LA ingestion team also states that higher chunk sizes might not necessarily mean higher throughput since there are pipeline limitations. + +``` + agent-settings: |- + # prometheus scrape fluent bit settings for high scale + # buffer size should be greater than or equal to chunk size else we set it to chunk size. + [agent_settings.prometheus_fbit_settings] + tcp_listener_chunk_size = 10 + tcp_listener_buffer_size = 10 + tcp_listener_mem_buf_limit = 200 +``` diff --git a/README.md b/README.md index 555234c61..85b74695d 100644 --- a/README.md +++ b/README.md @@ -210,6 +210,32 @@ powershell -ExecutionPolicy bypass # switch to powershell if you are not on pow .\build-and-publish-docker-image.ps1 -image /: # trigger build code and image and publish docker hub or acr ``` +##### Developer Build optimizations +If you do not want to build the image from scratch every time you make changes during development, you can choose to build the docker images that are separated out by +* Base image and dependencies including agent bootstrap (setup.ps1) +* Agent conf and plugin changes + +To do this, the very first time you start developing you need to execute the instructions below in an elevated PowerShell prompt.
+This builds the base image (omsagent-win-base) with all the package dependencies +``` +cd %userprofile%\Docker-Provider\kubernetes\windows\dockerbuild # based on your repo path +docker login # if you want to publish the image to acr then login to acr via `docker login ` +powershell -ExecutionPolicy bypass # switch to powershell if you are not on powershell already +.\build-dev-base-image.ps1 # builds base image and dependencies +``` + +And then run the script to build the image consisting of code and conf changes. +``` +.\build-and-publish-dev-docker-image.ps1 -image /: # trigger build code and image and publish docker hub or acr +``` + +For the subsequent builds, you can just run - + +``` +.\build-and-publish-dev-docker-image.ps1 -image /: # trigger build code and image and publish docker hub or acr +``` +###### Note - If you have changes in setup.ps1 and want to test those changes, uncomment the section consisting of setup.ps1 in the Dockerfile-dev-image file. + #### Option 2 - Using WSL2 to Build the Windows agent ##### On WSL2, Build Certificate Generator Source code and Out OMS Go plugin code @@ -233,7 +259,7 @@ docker push /: # Azure DevOps Build Pipeline -Navigate to https://github-private.visualstudio.com/microsoft/_build?view=pipelines to see Linux and Windows Agent build pipelines. These pipelines are configured with CI triggers for ci_dev and ci_prod. +Navigate to https://github-private.visualstudio.com/microsoft/_build?definitionScope=%5CCDPX%5Cdocker-provider to see Linux and Windows Agent build pipelines. These pipelines are configured with CI triggers for ci_dev and ci_prod. Docker Images will be pushed to CDPX ACR repos and these need to be retagged and pushed to the corresponding ACR or docker hub. Only an onboarded Azure AD AppId has permission to pull the images from CDPx ACRs. @@ -276,13 +302,13 @@ For DEV and PROD branches, automatically deployed latest yaml with latest agent ## For executing tests 1. Deploy the omsagent.yaml with your agent image.
In the yaml, make sure the `ISTEST` environment variable is set to `true` if it's not set already -2. Update the Service Principal CLIENT_ID, CLIENT_SECRET and TENANT_ID placeholder values and apply e2e-tests.yaml to execute the tests +2. Update the Service Principal CLIENT_ID, CLIENT_SECRET and TENANT_ID placeholder values and apply e2e-tests.yaml to execute the tests > Note: Service Principal requires reader role on log analytics workspace and cluster resource to query LA and metrics ``` - cd ~/Docker-Provider/test/e2e # based on your repo path - kubectl apply -f e2e-tests.yaml # this will trigger job to run the tests in sonobuoy namespace - kubectl get po -n sonobuoy # to check the pods and jobs associated to tests - ``` + cd ~/Docker-Provider/test/e2e # based on your repo path + kubectl apply -f e2e-tests.yaml # this will trigger job to run the tests in sonobuoy namespace + kubectl get po -n sonobuoy # to check the pods and jobs associated to tests + ``` 3. Download [sonobuoy](https://github.com/vmware-tanzu/sonobuoy/releases) on your dev box to view the results of the tests ``` results=$(sonobuoy retrieve) # downloads tar file which has logs and test results @@ -293,14 +319,14 @@ For DEV and PROD branches, automatically deployed latest yaml with latest agent ## For adding new tests 1. Add the test python file with your test code under `tests` directory -2. Build the docker image, recommended to use ACR & MCR +2. Build the docker image, recommended to use ACR & MCR ``` - cd ~/Docker-Provider/test/e2e/src # based on your repo path + cd ~/Docker-Provider/test/e2e/src # based on your repo path docker login -u -p # login to acr docker build -f ./core/Dockerfile -t /: . docker push /: ``` -3. update existing agentest image tag in e2e-tests.yaml with newly built image tag with MCR repo +3. 
Update the existing agentest image tag in e2e-tests.yaml & conformance.yaml with the newly built image tag from the MCR repo # Scenario Tests Clusters used in the release pipeline already have the yamls under test\scenario deployed. Make sure to validate these scenarios. diff --git a/ReleaseNotes.md b/ReleaseNotes.md index 4f060c925..bf20030b5 100644 --- a/ReleaseNotes.md +++ b/ReleaseNotes.md @@ -11,6 +11,32 @@ additional questions or comments. Note : The agent version(s) below has dates (ciprod), which indicate the agent build dates (not release dates) +### 10/08/2021 - +##### Version microsoft/oms:ciprod10082021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10082021 (linux) +##### Version microsoft/oms:win-ciprod10082021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10082021 (windows) +##### Code change log +- Linux Agent + - MDSD Proxy support for non-AKS + - Log rotation for mdsd log files {err, warn, info & qos} + - Onboarding status + - AAD Auth MSI changes (not usable externally yet) + - Upgrade k8s and adx go packages to fix vulnerabilities + - Fix missing telegraf metrics (TelegrafMetricsSentCount & TelegrafMetricsSendErrorCount) in mdsd route + - Improve fluentd liveness probe checks to handle both supervisor and worker process + - Fix telegraf startup issue when endpoint is unreachable +- Windows Agent + - Windows liveness probe optimization +- Common + - Add new metrics to MDM for allocatable % calculation of cpu and memory usage +- Other changes + - Helm chart updates for removal of rbac api version and deprecation of .Capabilities.KubeVersion.GitVersion in favor of .Capabilities.KubeVersion.Version + - Updates to build and release ev2 + - Scripts to collect troubleshooting logs + - Unit test tooling + - Yaml updates in parity with aks rp yaml + - Upgrade golang version for windows in pipelines + - Conformance test updates + ### 09/02/2021 - ##### Version microsoft/oms:ciprod08052021-1 Version 
mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1 (linux) ##### Code change log @@ -37,7 +63,7 @@ Note : The agent version(s) below has dates (ciprod), which indicate t ##### Version microsoft/oms:ciprod06112021-1 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1 (linux) ##### Version microsoft/oms:win-ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021 (windows) ##### Code change log -- Hotfix for crash (which triggered when scaling down of windows nodes) in clean_cache method in in_kube_node_inventory plugin +- Hotfix for crash in clean_cache in in_kube_node_inventory plugin - We didn't rebuild windows container, so the image version for windows container stays the same as last release (ciprod:win-ciprod06112021) before this hotfix ### 06/11/2021 - diff --git a/ReleaseProcess.md b/ReleaseProcess.md index 8ec91546c..7bd858561 100644 --- a/ReleaseProcess.md +++ b/ReleaseProcess.md @@ -13,14 +13,12 @@ Here are the high-level instructions to get the CIPROD`
` image for 2. Make PR to ci_dev branch and once the PR is approved, merge the changes to ci_dev 3. Latest bits of ci_dev are automatically deployed to the CIDEV cluster in the build subscription, so just validate E2E to make sure everything works 4. If everything is validated in DEV, make a merge PR from ci_dev to ci_prod and merge once it is reviewed by the dev team -6. Update following pipeline variables under ReleaseCandiate with version of chart and image tag - - CIHELMCHARTVERSION # For example, 2.7.4 - - CIImageTagSuffix # ciprod08072020 or ciprod08072020-1 etc. -7. Merge ci_dev and ci_prod branch which will trigger automatic deployment of latest bits to CIPROD cluster with CIPROD`
` image to test and scale cluters, AKS, AKS-Engine - > Note: production image automatically pushed to CIPROD Public cloud ACR which will inturn replicated to Public cloud MCR. +5. Once the PR to ci_prod is approved, go ahead and merge, and wait for the ci_prod build to complete successfully +6. Once the merged PR build has completed successfully, update the value of the AGENT_IMAGE_TAG_SUFFIX pipeline variable by editing the Release [ci-prod-release](https://github-private.visualstudio.com/microsoft/_release?_a=releases&view=mine&definitionId=38) + > Note - the value of the AGENT_IMAGE_TAG_SUFFIX pipeline variable should be in the format `
` for our releases +7. Create a release by selecting the targeted build version of the _docker-provider_Official-ci_prod release 8. Validate all the scenarios against clusters in the build subscription and scale clusters - # 2. Perf and scale testing Deploy the latest omsagent yaml with the release candidate agent image into supported k8s versions and validate all the critical scenarios. In particular, thoroughly validate the updates going out as part of this release and also make sure there are no regressions. If this passes, deploy onto the scale cluster and validate perf and scale aspects. The scale cluster is in the AME cloud, so co-ordinate with the agent team who has access to this cluster to deploy the release candidate onto it. @@ -39,48 +37,49 @@ Image automatically synched to MCR CN from Public cloud MCR. Make PR against [AKS-Engine](https://github.com/Azure/aks-engine). Refer PR https://github.com/Azure/aks-engine/pull/2318 -## Arc for Kubernetes -Ev2 pipeline used to deploy the chart of the Arc K8s Container Insights Extension as per Safe Deployment Process. +## Arc for Kubernetes +The Ev2 pipeline is used to deploy the chart of the Arc K8s Container Insights Extension as per the Safe Deployment Process. Here is the high level process ``` 1. Specify chart version of the release candidate and trigger [container-insights-arc-k8s-extension-ci_prod-release](https://github-private.visualstudio.com/microsoft/_release?_a=releases&view=all) 2. Get the approval from one of the team members for the release - 3. Once the approved, release should be triggered automatically + 3. Once approved, the release should be triggered automatically 4. use `cimon-arck8s-eastus2euap` for validating latest release in canary region 5. TBD - Notify vendor team for the validation on all Arc K8s supported platforms ``` ## Microsoft Charts Repo release for On-prem K8s +> Note: This chart repo is used in the ARO v4 onboarding script as well. 
-Since HELM charts repo being deprecated, Microsoft charts repo being used for HELM chart release of on-prem K8s clusters. -To make chart release PR, fork [Microsoft-charts-repo]([https://github.com/microsoft/charts/tree/gh-pages) and make the PR against `gh-pages` branch of the upstream repo. +Since HELM charts repo being deprecated, Microsoft charts repo being used for HELM chart release of on-prem K8s clusters. +To make chart release PR, fork [Microsoft-charts-repo]([https://github.com/microsoft/charts/tree/gh-pages) and make the PR against `gh-pages` branch of the upstream repo. Refer PR - https://github.com/microsoft/charts/pull/23 for example. Once the PR merged, latest version of HELM chart should be available in couple of mins in https://microsoft.github.io/charts/repo and https://artifacthub.io/. Instructions to create PR ``` -# 1. create helm package for the release candidate +# 1. create helm package for the release candidate git clone git@github.com:microsoft/Docker-Provider.git git checkout ci_prod cd ~/Docker-Provider/charts/azuremonitor-containers # this path based on where you have cloned the repo - helm package . + helm package . -# 2. clone your fork repo and checkout gh_pages branch # gh_pages branch used as release branch - cd ~ +# 2. clone your fork repo and checkout gh_pages branch # gh_pages branch used as release branch + cd ~ git clone cd ~/charts # assumed the root dir of the clone is charts git checkout gh_pages -# 3. copy release candidate helm package - cd ~/charts/repo/azuremonitor-containers +# 3. copy release candidate helm package + cd ~/charts/repo/azuremonitor-containers # update chart version value with the version of chart being released - cp ~/Docker-Provider/charts/azuremonitor-containers/azuremonitor-containers-.tgz . + cp ~/Docker-Provider/charts/azuremonitor-containers/azuremonitor-containers-.tgz . cd ~/charts/repo - # update repo index file + # update repo index file helm repo index . - + # 4. 
Review the changes and make the PR. Please note that you may need to revert unrelated changes automatically added by the `helm repo index .` command ``` diff --git a/build/common/installer/scripts/td-agent-bit-conf-customizer.rb b/build/common/installer/scripts/td-agent-bit-conf-customizer.rb index 82c6c1d17..f29c87407 100644 --- a/build/common/installer/scripts/td-agent-bit-conf-customizer.rb +++ b/build/common/installer/scripts/td-agent-bit-conf-customizer.rb @@ -3,9 +3,7 @@ @td_agent_bit_conf_path = "/etc/opt/microsoft/docker-cimprov/td-agent-bit.conf" -@default_service_interval = "1" -@default_buffer_chunk_size = "1" -@default_buffer_max_size = "1" +@default_service_interval = "15" @default_mem_buf_limit = "10" def is_number?(value) @@ -25,9 +23,9 @@ def substituteFluentBitPlaceHolders serviceInterval = (!interval.nil? && is_number?(interval) && interval.to_i > 0 ) ? interval : @default_service_interval serviceIntervalSetting = "Flush " + serviceInterval - tailBufferChunkSize = (!bufferChunkSize.nil? && is_number?(bufferChunkSize) && bufferChunkSize.to_i > 0) ? bufferChunkSize : @default_buffer_chunk_size + tailBufferChunkSize = (!bufferChunkSize.nil? && is_number?(bufferChunkSize) && bufferChunkSize.to_i > 0) ? bufferChunkSize : nil - tailBufferMaxSize = (!bufferMaxSize.nil? && is_number?(bufferMaxSize) && bufferMaxSize.to_i > 0) ? bufferMaxSize : @default_buffer_max_size = "1" + tailBufferMaxSize = (!bufferMaxSize.nil? && is_number?(bufferMaxSize) && bufferMaxSize.to_i > 0) ? bufferMaxSize : nil if ((!tailBufferChunkSize.nil? && tailBufferMaxSize.nil?) || (!tailBufferChunkSize.nil? && !tailBufferMaxSize.nil?
&& tailBufferChunkSize.to_i > tailBufferMaxSize.to_i)) puts "config:warn buffer max size must be greater or equal to chunk size" diff --git a/build/common/installer/scripts/tomlparser-prom-agent-config.rb b/build/common/installer/scripts/tomlparser-prom-agent-config.rb new file mode 100644 index 000000000..be9d08e59 --- /dev/null +++ b/build/common/installer/scripts/tomlparser-prom-agent-config.rb @@ -0,0 +1,102 @@ +#!/usr/local/bin/ruby + +#this should be require relative in Linux and require in windows, since it is a gem install on windows +@os_type = ENV["OS_TYPE"] +if !@os_type.nil? && !@os_type.empty? && @os_type.strip.casecmp("windows") == 0 + require "tomlrb" +else + require_relative "tomlrb" +end + +require_relative "ConfigParseErrorLogger" + +@configMapMountPath = "/etc/config/settings/agent-settings" +@configSchemaVersion = "" + +@promFbitChunkSize = 10 +@promFbitBufferSize = 10 +@promFbitMemBufLimit = 200 + +def is_number?(value) + true if Integer(value) rescue false +end + +# Use parser to parse the configmap toml file to a ruby structure +def parseConfigMap + begin + # Check to see if config map is created + if (File.file?(@configMapMountPath)) + puts "config::configmap container-azm-ms-agentconfig for sidecar agent settings mounted, parsing values" + parsedConfig = Tomlrb.load_file(@configMapMountPath, symbolize_keys: true) + puts "config::Successfully parsed mounted config map" + return parsedConfig + else + puts "config::configmap container-azm-ms-agentconfig for sidecar agent settings not mounted, using defaults" + return nil + end + rescue => errorStr + ConfigParseErrorLogger.logError("Exception while parsing config map for sidecar agent settings : #{errorStr}, using defaults, please check config map for errors") + return nil + end +end + +# Use the ruby structure created after config parsing to set the right values to be used as environment variables +def populateSettingValuesFromConfigMap(parsedConfig) + begin + if !parsedConfig.nil? 
&& !parsedConfig[:agent_settings].nil? + # fbit config settings + prom_fbit_config = parsedConfig[:agent_settings][:prometheus_fbit_settings] + if !prom_fbit_config.nil? + chunk_size = prom_fbit_config[:tcp_listener_chunk_size] + if !chunk_size.nil? && is_number?(chunk_size) && chunk_size.to_i > 0 + @promFbitChunkSize = chunk_size.to_i + puts "Using config map value: AZMON_SIDECAR_FBIT_CHUNK_SIZE = #{@promFbitChunkSize.to_s + "m"}" + end + buffer_size = prom_fbit_config[:tcp_listener_buffer_size] + if !buffer_size.nil? && is_number?(buffer_size) && buffer_size.to_i > 0 + @promFbitBufferSize = buffer_size.to_i + puts "Using config map value: AZMON_SIDECAR_FBIT_BUFFER_SIZE = #{@promFbitBufferSize.to_s + "m"}" + if @promFbitBufferSize < @promFbitChunkSize + @promFbitBufferSize = @promFbitChunkSize + puts "Setting Fbit buffer size equal to chunk size since it is set to less than chunk size - AZMON_SIDECAR_FBIT_BUFFER_SIZE = #{@promFbitBufferSize.to_s + "m"}" + end + end + mem_buf_limit = prom_fbit_config[:tcp_listener_mem_buf_limit] + if !mem_buf_limit.nil? && is_number?(mem_buf_limit) && mem_buf_limit.to_i > 0 + @promFbitMemBufLimit = mem_buf_limit.to_i + puts "Using config map value: AZMON_SIDECAR_FBIT_MEM_BUF_LIMIT = #{@promFbitMemBufLimit.to_s + "m"}" + end + end + end + rescue => errorStr + puts "config::error:Exception while reading config settings for sidecar agent configuration setting - #{errorStr}, using defaults" + end +end + +@configSchemaVersion = ENV["AZMON_AGENT_CFG_SCHEMA_VERSION"] +puts "****************Start Sidecar Agent Config Processing********************" +if !@configSchemaVersion.nil? && !@configSchemaVersion.empty? && @configSchemaVersion.strip.casecmp("v1") == 0 #note v1 is the only supported schema version , so hardcoding it + configMapSettings = parseConfigMap + if !configMapSettings.nil? 
+ populateSettingValuesFromConfigMap(configMapSettings) + end +else + if (File.file?(@configMapMountPath)) + ConfigParseErrorLogger.logError("config::unsupported/missing config schema version - '#{@configSchemaVersion}' , using defaults, please use supported schema version") + end + @enable_health_model = false +end + +# Write the settings to file, so that they can be set as environment variables +file = File.open("side_car_fbit_config_env_var", "w") + +if !file.nil? + file.write("export AZMON_SIDECAR_FBIT_CHUNK_SIZE=#{@promFbitChunkSize.to_s + "m"}\n") + file.write("export AZMON_SIDECAR_FBIT_BUFFER_SIZE=#{@promFbitBufferSize.to_s + "m"}\n") + file.write("export AZMON_SIDECAR_FBIT_MEM_BUF_LIMIT=#{@promFbitMemBufLimit.to_s + "m"}\n") + # Close file after writing all environment variables + file.close +else + puts "Exception while opening file for writing config environment variables" + puts "****************End Sidecar Agent Config Processing********************" +end diff --git a/build/linux/installer/conf/td-agent-bit-prom-side-car.conf b/build/linux/installer/conf/td-agent-bit-prom-side-car.conf index 8a69f7995..2c85a4200 100644 --- a/build/linux/installer/conf/td-agent-bit-prom-side-car.conf +++ b/build/linux/installer/conf/td-agent-bit-prom-side-car.conf @@ -29,9 +29,9 @@ Tag oms.container.perf.telegraf.* Listen 0.0.0.0 Port 25229 - Chunk_Size 10m - Buffer_Size 10m - Mem_Buf_Limit 200m + Chunk_Size ${AZMON_SIDECAR_FBIT_CHUNK_SIZE} + Buffer_Size ${AZMON_SIDECAR_FBIT_BUFFER_SIZE} + Mem_Buf_Limit ${AZMON_SIDECAR_FBIT_MEM_BUF_LIMIT} [OUTPUT] Name oms diff --git a/build/linux/installer/conf/telegraf-prom-side-car.conf b/build/linux/installer/conf/telegraf-prom-side-car.conf index b3b4ba1d3..f5128d720 100644 --- a/build/linux/installer/conf/telegraf-prom-side-car.conf +++ b/build/linux/installer/conf/telegraf-prom-side-car.conf @@ -109,7 +109,7 @@ ## more about them here: ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md data_format = 
"json" - namedrop = ["agent_telemetry"] + namedrop = ["agent_telemetry", "file"] ############################################################################### # PROCESSOR PLUGINS # @@ -119,6 +119,12 @@ [processors.converter.fields] float = ["*"] +# Dummy plugin to test out toml parsing happens properly +[[inputs.file]] + interval = "24h" + files = ["test.json"] + data_format = "json" + #Prometheus Custom Metrics [[inputs.prometheus]] interval = "$AZMON_TELEGRAF_CUSTOM_PROM_INTERVAL" diff --git a/build/linux/installer/conf/telegraf-rs.conf b/build/linux/installer/conf/telegraf-rs.conf index ee1cf8819..038b40bc2 100644 --- a/build/linux/installer/conf/telegraf-rs.conf +++ b/build/linux/installer/conf/telegraf-rs.conf @@ -121,29 +121,9 @@ ## more about them here: ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md data_format = "json" - namedrop = ["agent_telemetry"] + namedrop = ["agent_telemetry", "file"] #tagdrop = ["AgentVersion","AKS_RESOURCE_ID", "ACS_RESOURCE_NAME", "Region","ClusterName","ClusterType", "Computer", "ControllerType"] -[[outputs.application_insights]] - ## Instrumentation key of the Application Insights resource. - instrumentation_key = "$TELEMETRY_APPLICATIONINSIGHTS_KEY" - - ## Timeout for closing (default: 5s). - # timeout = "5s" - - ## Enable additional diagnostic logging. - # enable_diagnostic_logging = false - - ## Context Tag Sources add Application Insights context tags to a tag value. 
- ## - ## For list of allowed context tag keys see: - ## https://github.com/Microsoft/ApplicationInsights-Go/blob/master/appinsights/contracts/contexttagkeys.go - # [outputs.application_insights.context_tag_sources] - # "ai.cloud.role" = "kubernetes_container_name" - # "ai.cloud.roleInstance" = "kubernetes_pod_name" - namepass = ["agent_telemetry"] - #tagdrop = ["nodeName"] - ############################################################################### # PROCESSOR PLUGINS # ############################################################################### @@ -389,7 +369,7 @@ # report_active = true # fieldpass = ["usage_active","cluster","node","host","device"] # taginclude = ["cluster","cpu","node"] - + # Read metrics about disk usage by mount point @@ -397,7 +377,7 @@ ## By default stats will be gathered for all mount points. ## Set mount_points will restrict the stats to only the specified mount points. # mount_points = ["/"] - + ## Ignore mount points by filesystem type. # ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"] # fieldpass = ["free", "used", "used_percent"] @@ -538,16 +518,22 @@ #tagexclude = ["AgentVersion","AKS_RESOURCE_ID","ACS_RESOURCE_NAME", "Region", "ClusterName", "ClusterType", "Computer", "ControllerType"] # [inputs.prometheus.tagpass] +# Dummy plugin to test out toml parsing happens properly +[[inputs.file]] + interval = "24h" + files = ["test.json"] + data_format = "json" + #Prometheus Custom Metrics [[inputs.prometheus]] interval = "$AZMON_TELEGRAF_CUSTOM_PROM_INTERVAL" ## An array of urls to scrape metrics from. urls = $AZMON_TELEGRAF_CUSTOM_PROM_URLS - + ## An array of Kubernetes services to scrape metrics from. 
kubernetes_services = $AZMON_TELEGRAF_CUSTOM_PROM_K8S_SERVICES - + ## Scrape Kubernetes pods for the following prometheus annotations: ## - prometheus.io/scrape: Enable scraping for this pod ## - prometheus.io/scheme: If the metrics endpoint is secured then you will need to diff --git a/build/linux/installer/conf/telegraf.conf b/build/linux/installer/conf/telegraf.conf index 5a5bb2d8c..0e4824e70 100644 --- a/build/linux/installer/conf/telegraf.conf +++ b/build/linux/installer/conf/telegraf.conf @@ -120,7 +120,7 @@ ## more about them here: ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md data_format = "json" - namedrop = ["agent_telemetry"] + namedrop = ["agent_telemetry", "file"] #tagdrop = ["AgentVersion","AKS_RESOURCE_ID", "ACS_RESOURCE_NAME", "Region","ClusterName","ClusterType", "Computer", "ControllerType"] # Output to send MDM metrics to fluent bit and then route it to fluentD @@ -158,26 +158,6 @@ namepass = ["container.azm.ms/disk"] #fieldpass = ["used_percent"] -[[outputs.application_insights]] - ## Instrumentation key of the Application Insights resource. - instrumentation_key = "$TELEMETRY_APPLICATIONINSIGHTS_KEY" - - ## Timeout for closing (default: 5s). - # timeout = "5s" - - ## Enable additional diagnostic logging. - # enable_diagnostic_logging = false - - ## Context Tag Sources add Application Insights context tags to a tag value. 
- ## - ## For list of allowed context tag keys see: - ## https://github.com/Microsoft/ApplicationInsights-Go/blob/master/appinsights/contracts/contexttagkeys.go - # [outputs.application_insights.context_tag_sources] - # "ai.cloud.role" = "kubernetes_container_name" - # "ai.cloud.roleInstance" = "kubernetes_pod_name" - namepass = ["agent_telemetry"] - #tagdrop = ["nodeName"] - ############################################################################### # PROCESSOR PLUGINS # ############################################################################### @@ -425,7 +405,11 @@ # fieldpass = ["usage_active","cluster","node","host","device"] # taginclude = ["cluster","cpu","node"] - +# Dummy plugin to test out toml parsing happens properly +[[inputs.file]] + interval = "24h" + files = ["test.json"] + data_format = "json" # Read metrics about disk usage by mount point [[inputs.disk]] diff --git a/build/linux/installer/conf/test.json b/build/linux/installer/conf/test.json new file mode 100644 index 000000000..9e26dfeeb --- /dev/null +++ b/build/linux/installer/conf/test.json @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/build/linux/installer/datafiles/base_container.data b/build/linux/installer/datafiles/base_container.data index de8ccbba0..4ed413028 100644 --- a/build/linux/installer/datafiles/base_container.data +++ b/build/linux/installer/datafiles/base_container.data @@ -36,13 +36,15 @@ MAINTAINER: 'Microsoft Corporation' /etc/opt/microsoft/docker-cimprov/td-agent-bit-rs.conf; build/linux/installer/conf/td-agent-bit-rs.conf; 644; root; root /etc/opt/microsoft/docker-cimprov/azm-containers-parser.conf; build/linux/installer/conf/azm-containers-parser.conf; 644; root; root /etc/opt/microsoft/docker-cimprov/out_oms.conf; build/linux/installer/conf/out_oms.conf; 644; root; root +/etc/opt/microsoft/docker-cimprov/test.json; build/linux/installer/conf/test.json; 644; root; root /etc/opt/microsoft/docker-cimprov/telegraf.conf; 
build/linux/installer/conf/telegraf.conf; 644; root; root /etc/opt/microsoft/docker-cimprov/telegraf-prom-side-car.conf; build/linux/installer/conf/telegraf-prom-side-car.conf; 644; root; root /etc/opt/microsoft/docker-cimprov/telegraf-rs.conf; build/linux/installer/conf/telegraf-rs.conf; 644; root; root /opt/microsoft/docker-cimprov/bin/TelegrafTCPErrorTelemetry.sh; build/linux/installer/scripts/TelegrafTCPErrorTelemetry.sh; 755; root; root /opt/livenessprobe.sh; build/linux/installer/scripts/livenessprobe.sh; 755; root; root /opt/tomlparser-prom-customconfig.rb; build/common/installer/scripts/tomlparser-prom-customconfig.rb; 755; root; root -/opt/tomlparser-mdm-metrics-config.rb; build/common/installer/scripts/tomlparser-mdm-metrics-config.rb; 755; root; root +/opt/tomlparser-prom-agent-config.rb; build/common/installer/scripts/tomlparser-prom-agent-config.rb; 755; root; root +/opt/tomlparser-mdm-metrics-config.rb; build/common/installer/scripts/tomlparser-mdm-metrics-config.rb; 755; root; root /opt/tomlparser-metric-collection-config.rb; build/linux/installer/scripts/tomlparser-metric-collection-config.rb; 755; root; root @@ -52,6 +54,8 @@ MAINTAINER: 'Microsoft Corporation' /opt/ConfigParseErrorLogger.rb; build/common/installer/scripts/ConfigParseErrorLogger.rb; 755; root; root /opt/tomlparser-npm-config.rb; build/linux/installer/scripts/tomlparser-npm-config.rb; 755; root; root /opt/tomlparser-osm-config.rb; build/linux/installer/scripts/tomlparser-osm-config.rb; 755; root; root +/opt/test.json; build/linux/installer/conf/test.json; 644; root; root + /etc/opt/microsoft/docker-cimprov/health/healthmonitorconfig.json; build/linux/installer/conf/healthmonitorconfig.json; 644; root; root @@ -144,8 +148,11 @@ MAINTAINER: 'Microsoft Corporation' /etc/fluent/plugin/MdmMetricsGenerator.rb; source/plugins/ruby/MdmMetricsGenerator.rb; 644; root; root /etc/fluent/plugin/MdmAlertTemplates.rb; source/plugins/ruby/MdmAlertTemplates.rb; 644; root; root 
-/etc/fluent/plugin/omslog.rb; source/plugins/utils/omslog.rb; 644; root; root -/etc/fluent/plugin/oms_common.rb; source/plugins/utils/oms_common.rb; 644; root; root +/etc/fluent/plugin/omslog.rb; source/plugins/ruby/omslog.rb; 644; root; root +/etc/fluent/plugin/oms_common.rb; source/plugins/ruby/oms_common.rb; 644; root; root +/etc/fluent/plugin/extension.rb; source/plugins/ruby/extension.rb; 644; root; root +/etc/fluent/plugin/extension_utils.rb; source/plugins/ruby/extension_utils.rb; 644; root; root + /etc/fluent/kube.conf; build/linux/installer/conf/kube.conf; 644; root; root /etc/fluent/container.conf; build/linux/installer/conf/container.conf; 644; root; root @@ -302,7 +309,7 @@ if ${{PERFORMING_UPGRADE_NOT}}; then rmdir /etc/opt/microsoft/docker-cimprov/conf 2> /dev/null rmdir /etc/opt/microsoft/docker-cimprov 2> /dev/null rmdir /etc/opt/microsoft 2> /dev/null - rmdir /etc/opt 2> /dev/null + rmdir /etc/opt 2> /dev/null fi %Preinstall_0 diff --git a/build/linux/installer/scripts/livenessprobe.sh b/build/linux/installer/scripts/livenessprobe.sh index 252f471e9..8ecb7fe44 100644 --- a/build/linux/installer/scripts/livenessprobe.sh +++ b/build/linux/installer/scripts/livenessprobe.sh @@ -11,13 +11,29 @@ fi #optionally test to exit non zero value if fluentd is not running #fluentd not used in sidecar container -if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then +if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then (ps -ef | grep "fluentd" | grep -v "grep") if [ $? -ne 0 ] then echo "fluentd is not running" > /dev/termination-log exit 1 fi + # fluentd launches by default supervisor and worker process + # so adding the liveness checks individually to handle scenario if any of the process dies + # supervisor process + (ps -ef | grep "fluentd" | grep "supervisor" | grep -v "grep") + if [ $? 
-ne 0 ] + then + echo "fluentd supervisor is not running" > /dev/termination-log + exit 1 + fi + # worker process + (ps -ef | grep "fluentd" | grep -v "supervisor" | grep -v "grep" ) + if [ $? -ne 0 ] + then + echo "fluentd worker is not running" > /dev/termination-log + exit 1 + fi fi #test to exit non zero value if fluentbit is not running diff --git a/build/windows/Makefile.ps1 b/build/windows/Makefile.ps1 index 737abc92a..9f3c438b0 100644 --- a/build/windows/Makefile.ps1 +++ b/build/windows/Makefile.ps1 @@ -3,6 +3,7 @@ # 1. Builds the certificate generator code in .NET and copy the binaries in zip file to ..\..\kubernetes\windows\omsagentwindows # 2. Builds the out_oms plugin code in go lang into the shared object(.so) file and copy the out_oms.so file to ..\..\kubernetes\windows\omsagentwindows # 3. copy the files under installer directory to ..\..\kubernetes\windows\omsagentwindows +# 4. Builds the livenessprobe cpp and copy the executable to the under directory ..\..\kubernetes\windows\omsagentwindows $dotnetcoreframework = "netcoreapp3.1" @@ -157,7 +158,7 @@ if ($isCDPxEnvironment) { Write-Host("getting latest go modules ...") go get - Write-Host("successfyullt got latest go modules") -ForegroundColor Green + Write-Host("successfully got latest go modules") -ForegroundColor Green go build -ldflags "-X 'main.revision=$buildVersionString' -X 'main.builddate=$buildVersionDate'" -buildmode=c-shared -o out_oms.so . 
} @@ -167,27 +168,35 @@ Write-Host("copying out_oms.so file to : $publishdir") Copy-Item -Path (Join-path -Path $outomsgoplugindir -ChildPath "out_oms.so") -Destination $publishdir -Force Write-Host("successfully copied out_oms.so file to : $publishdir") -ForegroundColor Green +# compile and build the liveness probe cpp code +Write-Host("Start:build livenessprobe cpp code") +$livenessprobesrcpath = Join-Path -Path $builddir -ChildPath "windows\installer\livenessprobe\livenessprobe.cpp" +$livenessprobeexepath = Join-Path -Path $builddir -ChildPath "windows\installer\livenessprobe\livenessprobe.exe" +g++ $livenessprobesrcpath -o $livenessprobeexepath -municode +Write-Host("End:build livenessprobe cpp code") +if (Test-Path -Path $livenessprobeexepath){ + Write-Host("livenessprobe.exe exists which indicates cpp build step succeeded") -ForegroundColor Green +} else { + Write-Host("livenessprobe.exe doesnt exist which indicates cpp build step failed") -ForegroundColor Red + exit +} $installerdir = Join-Path -Path $builddir -ChildPath "common\installer" Write-Host("copying common installer files conf and scripts from :" + $installerdir + " to :" + $publishdir + " ...") -$exclude = @('*.cs','*.csproj') +$exclude = @('*.cs','*.csproj', '*.cpp') Copy-Item -Path $installerdir -Destination $publishdir -Recurse -Force -Exclude $exclude Write-Host("successfully copied installer files conf and scripts from :" + $installerdir + " to :" + $publishdir + " ") -ForegroundColor Green $installerdir = Join-Path -Path $builddir -ChildPath "windows\installer" Write-Host("copying installer files conf and scripts from :" + $installerdir + " to :" + $publishdir + " ...") -$exclude = @('*.cs','*.csproj') +$exclude = @('*.cs','*.csproj', '*.cpp') Copy-Item -Path $installerdir -Destination $publishdir -Recurse -Force -Exclude $exclude Write-Host("successfully copied installer files conf and scripts from :" + $installerdir + " to :" + $publishdir + " ") -ForegroundColor Green $rubyplugindir = 
Join-Path -Path $rootdir -ChildPath "source\plugins\ruby" Write-Host("copying ruby source files from :" + $rubyplugindir + " to :" + $publishdir + " ...") Copy-Item -Path $rubyplugindir -Destination $publishdir -Recurse -Force +Get-ChildItem $Path | Where{$_.Name -Match ".*_test\.rb"} | Remove-Item Write-Host("successfully copied ruby source files from :" + $rubyplugindir + " to :" + $publishdir + " ") -ForegroundColor Green -$utilsplugindir = Join-Path -Path $rootdir -ChildPath "source\plugins\utils" -Write-Host("copying ruby util files from :" + $utilsplugindir + " to :" + $publishdir + " ...") -Copy-Item -Path $utilsplugindir -Destination $publishdir -Recurse -Force -Write-Host("successfully copied ruby util files from :" + $utilsplugindir + " to :" + $publishdir + " ") -ForegroundColor Green - -Set-Location $currentdir \ No newline at end of file +Set-Location $currentdir diff --git a/build/windows/installer/livenessprobe/livenessprobe.cpp b/build/windows/installer/livenessprobe/livenessprobe.cpp new file mode 100644 index 000000000..eea792686 --- /dev/null +++ b/build/windows/installer/livenessprobe/livenessprobe.cpp @@ -0,0 +1,137 @@ +#ifndef UNICODE +#define UNICODE +#endif + +#ifndef _UNICODE +#define _UNICODE +#endif + +#include <windows.h> +#include <tlhelp32.h> +#include <tchar.h> + +#define SUCCESS 0x00000000 +#define NO_FLUENT_BIT_PROCESS 0x00000001 +#define FILESYSTEM_WATCHER_FILE_EXISTS 0x00000002 +#define CERTIFICATE_RENEWAL_REQUIRED 0x00000003 +#define FLUENTDWINAKS_SERVICE_NOT_RUNNING 0x00000004 +#define UNEXPECTED_ERROR 0xFFFFFFFF + +/* + check whether the process with the given exe file name is running +*/ +bool IsProcessRunning(const wchar_t *const executableName) +{ + PROCESSENTRY32 entry; + entry.dwSize = sizeof(PROCESSENTRY32); + + const auto snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, NULL); + + if (!Process32First(snapshot, &entry)) + { + CloseHandle(snapshot); + wprintf_s(L"ERROR:IsProcessRunning::Process32First failed"); + return false; + } + + do + { + if
(!_wcsicmp(entry.szExeFile, executableName)) + { + CloseHandle(snapshot); + return true; + } + } while (Process32Next(snapshot, &entry)); + + CloseHandle(snapshot); + return false; +} + +/* + check if the file exists +*/ +bool IsFileExists(const wchar_t *const fileName) +{ + DWORD dwAttrib = GetFileAttributes(fileName); + return dwAttrib != INVALID_FILE_ATTRIBUTES; +} + +/* + Get the status of the service for the given service name +*/ +int GetServiceStatus(const wchar_t *const serviceName) +{ + SC_HANDLE theService, scm; + SERVICE_STATUS_PROCESS ssStatus; + DWORD dwBytesNeeded; + + scm = OpenSCManager(nullptr, nullptr, SC_MANAGER_ENUMERATE_SERVICE); + if (!scm) + { + wprintf_s(L"ERROR:GetServiceStatus::OpenSCManager failed"); + return UNEXPECTED_ERROR; + } + + theService = OpenService(scm, serviceName, SERVICE_QUERY_STATUS); + if (!theService) + { + CloseServiceHandle(scm); + wprintf_s(L"ERROR:GetServiceStatus::OpenService failed"); + return UNEXPECTED_ERROR; + } + + auto result = QueryServiceStatusEx(theService, SC_STATUS_PROCESS_INFO, + reinterpret_cast<LPBYTE>(&ssStatus), sizeof(SERVICE_STATUS_PROCESS), + &dwBytesNeeded); + + CloseServiceHandle(theService); + CloseServiceHandle(scm); + + if (result == 0) + { + wprintf_s(L"ERROR:GetServiceStatus:QueryServiceStatusEx failed"); + return UNEXPECTED_ERROR; + } + + return ssStatus.dwCurrentState; +} + +/** + +**/ +int _tmain(int argc, wchar_t *argv[]) +{ + if (argc < 5) + { + wprintf_s(L"ERROR:unexpected number of arguments; expected 5"); + return UNEXPECTED_ERROR; + } + + if (!IsProcessRunning(argv[1])) + { + wprintf_s(L"ERROR:Process:%s is not running\n", argv[1]); + return NO_FLUENT_BIT_PROCESS; + } + + DWORD dwStatus = GetServiceStatus(argv[2]); + + if (dwStatus != SERVICE_RUNNING) + { + wprintf_s(L"ERROR:Service:%s is not running\n", argv[2]); + return FLUENTDWINAKS_SERVICE_NOT_RUNNING; + } + + if (IsFileExists(argv[3])) + { + wprintf_s(L"INFO:File:%s exists indicates Config Map Updated since agent started.\n", argv[3]); +
return FILESYSTEM_WATCHER_FILE_EXISTS; + } + + if (IsFileExists(argv[4])) + { + wprintf_s(L"INFO:File:%s exists indicates Certificate needs to be renewed.\n", argv[4]); + return CERTIFICATE_RENEWAL_REQUIRED; + } + + return SUCCESS; +} diff --git a/build/windows/installer/scripts/livenessprobe.cmd b/build/windows/installer/scripts/livenessprobe.cmd deleted file mode 100644 index 19d0b69d7..000000000 --- a/build/windows/installer/scripts/livenessprobe.cmd +++ /dev/null @@ -1,36 +0,0 @@ -REM "Checking if fluent-bit is running" - -tasklist /fi "imagename eq fluent-bit.exe" /fo "table" | findstr fluent-bit - -IF ERRORLEVEL 1 ( - echo "Fluent-Bit is not running" - exit /b 1 -) - -REM "Checking if config map has been updated since agent start" - -IF EXIST C:\etc\omsagentwindows\filesystemwatcher.txt ( - echo "Config Map Updated since agent started" - exit /b 1 -) - -REM "Checking if certificate needs to be renewed (aka agent restart required)" - -IF EXIST C:\etc\omsagentwindows\renewcertificate.txt ( - echo "Certificate needs to be renewed" - exit /b 1 -) - -REM "Checking if fluentd service is running" -sc query fluentdwinaks | findstr /i STATE | findstr RUNNING - -IF ERRORLEVEL 1 ( - echo "Fluentd Service is NOT Running" - exit /b 1 -) - -exit /b 0 - - - - diff --git a/build/windows/installer/scripts/rubyKeepCertificateAlive/in_heartbeat_request.rb b/build/windows/installer/scripts/rubyKeepCertificateAlive/in_heartbeat_request.rb index e255c4a71..e525d8681 100644 --- a/build/windows/installer/scripts/rubyKeepCertificateAlive/in_heartbeat_request.rb +++ b/build/windows/installer/scripts/rubyKeepCertificateAlive/in_heartbeat_request.rb @@ -36,14 +36,18 @@ def start def enumerate begin - puts "Calling certificate renewal code..." - maintenance = OMS::OnboardingHelper.new( - ENV["WSID"], - ENV["DOMAIN"], - ENV["CI_AGENT_GUID"] - ) - ret_code = maintenance.register_certs() - puts "Return code from register certs : #{ret_code}" + if !ENV["AAD_MSI_AUTH_MODE"].nil? 
&& !ENV["AAD_MSI_AUTH_MODE"].empty? && ENV["AAD_MSI_AUTH_MODE"].downcase == "true" + puts "skipping certificate renewal code since AAD MSI auth configured" + else + puts "Calling certificate renewal code..." + maintenance = OMS::OnboardingHelper.new( + ENV["WSID"], + ENV["DOMAIN"], + ENV["CI_AGENT_GUID"] + ) + ret_code = maintenance.register_certs() + puts "Return code from register certs : #{ret_code}" + end rescue => errorStr puts "in_heartbeat_request::enumerate:Failed in enumerate: #{errorStr}" # STDOUT telemetry should alredy be going to Traces in AI. diff --git a/charts/azuremonitor-containers/templates/omsagent-crd.yaml b/charts/azuremonitor-containers/templates/omsagent-crd.yaml index bbaf89a52..46c5341cc 100644 --- a/charts/azuremonitor-containers/templates/omsagent-crd.yaml +++ b/charts/azuremonitor-containers/templates/omsagent-crd.yaml @@ -1,4 +1,4 @@ -{{- if semverCompare "<1.19-0" .Capabilities.KubeVersion.GitVersion }} +{{- if semverCompare "<1.19-0" .Capabilities.KubeVersion.Version }} apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: diff --git a/charts/azuremonitor-containers/templates/omsagent-daemonset-windows.yaml b/charts/azuremonitor-containers/templates/omsagent-daemonset-windows.yaml index 580ef9d15..efed76f7d 100644 --- a/charts/azuremonitor-containers/templates/omsagent-daemonset-windows.yaml +++ b/charts/azuremonitor-containers/templates/omsagent-daemonset-windows.yaml @@ -1,4 +1,4 @@ -{{- if and (ne .Values.omsagent.secret.key "") (ne .Values.omsagent.secret.wsid "") (or (ne .Values.omsagent.env.clusterName "") (ne .Values.omsagent.env.clusterId ""))}} +{{- if and (ne .Values.omsagent.secret.key "") (ne .Values.omsagent.secret.wsid "") (or (ne .Values.omsagent.env.clusterName "") (ne .Values.omsagent.env.clusterId "") (ne .Values.Azure.Cluster.ResourceId "") )}} apiVersion: apps/v1 kind: DaemonSet metadata: @@ -32,7 +32,7 @@ spec: options: - name: ndots value: "3" -{{- if semverCompare ">=1.14-0" 
.Capabilities.KubeVersion.GitVersion }} +{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.Version }} nodeSelector: kubernetes.io/os: windows {{- else }} diff --git a/charts/azuremonitor-containers/templates/omsagent-rbac.yaml b/charts/azuremonitor-containers/templates/omsagent-rbac.yaml index c0a6e3722..d9bca069d 100644 --- a/charts/azuremonitor-containers/templates/omsagent-rbac.yaml +++ b/charts/azuremonitor-containers/templates/omsagent-rbac.yaml @@ -10,7 +10,11 @@ metadata: heritage: {{ .Release.Service }} --- kind: ClusterRole +{{- if .Capabilities.APIVersions.Has "rbac.authorization.k8s.io/v1" }} +apiVersion: rbac.authorization.k8s.io/v1 +{{- else }} apiVersion: rbac.authorization.k8s.io/v1beta1 +{{- end }} metadata: name: omsagent-reader labels: @@ -33,7 +37,7 @@ rules: verbs: ["get", "create", "patch"] - nonResourceURLs: ["/metrics"] verbs: ["get"] -#arc k8s extension model grants access as part of the extension msi +#arc k8s extension model grants access as part of the extension msi #remove this explicit permission once the extension available in public preview {{- if (empty .Values.Azure.Extension.Name) }} - apiGroups: [""] @@ -43,7 +47,11 @@ rules: {{- end }} --- kind: ClusterRoleBinding +{{- if .Capabilities.APIVersions.Has "rbac.authorization.k8s.io/v1" }} +apiVersion: rbac.authorization.k8s.io/v1 +{{- else }} apiVersion: rbac.authorization.k8s.io/v1beta1 +{{- end }} metadata: name: omsagentclusterrolebinding labels: diff --git a/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Linux.Parameters.json b/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Linux.Parameters.json new file mode 100644 index 000000000..70d0950a2 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Linux.Parameters.json @@ -0,0 +1,68 @@ +{ + "$schema": "http://schema.express.azure.com/schemas/2015-01-01-alpha/RolloutParameters.json", + "contentVersion": "1.0.0.0", + "wait": [ + { + 
"name": "waitSdpBakeTime", + "properties": { + "duration": "PT24H" + } + } + ], + "shellExtensions": [ + { + "name": "PushAgentToACR", + "type": "ShellExtensionType", + "properties": { + "maxexecutiontime": "PT1H" + }, + "package": { + "reference": { + "path": "artifacts.tar.gz" + } + }, + "launch": { + "command": [ + "/bin/bash", + "pushAgentToAcr.sh" + ], + "environmentVariables": [ + { + "name": "ACR_NAME", + "value": "__ACR_NAME__" + }, + { + "name": "AGENT_RELEASE", + "value": "__AGENT_RELEASE__" + }, + { + "name": "AGENT_IMAGE_TAG_SUFFIX", + "value": "__AGENT_IMAGE_TAG_SUFFIX__" + }, + { + "name": "AGENT_IMAGE_FULL_PATH", + "value": "public/azuremonitor/containerinsights/__AGENT_RELEASE__:__AGENT_RELEASE____AGENT_IMAGE_TAG_SUFFIX__" + }, + { + "name": "CDPX_REGISTRY", + "value": "__CDPX_LINUX_REGISTRY__" + }, + { + "name": "CDPX_REPO_NAME", + "value": "__CDPX_LINUX_REPO_NAME__" + }, + { + "name": "CDPX_TAG", + "value": "__CDPX_LINUX_TAG__" + } + ], + "identity": { + "type": "userAssigned", + "userAssignedIdentities": [ + "__MANAGED_IDENTITY__" + ] + } + } + } + ] + } \ No newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Windows.Parameters.json b/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Windows.Parameters.json new file mode 100644 index 000000000..b6a31ed10 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/Parameters/ContainerInsights.Windows.Parameters.json @@ -0,0 +1,68 @@ +{ + "$schema": "http://schema.express.azure.com/schemas/2015-01-01-alpha/RolloutParameters.json", + "contentVersion": "1.0.0.0", + "wait": [ + { + "name": "waitSdpBakeTime", + "properties": { + "duration": "PT24H" + } + } + ], + "shellExtensions": [ + { + "name": "PushAgentToACR", + "type": "ShellExtensionType", + "properties": { + "maxexecutiontime": "PT1H" + }, + "package": { + "reference": { + "path": "artifacts.tar.gz" + } + }, + "launch": { + "command": [ + "/bin/bash", + 
"pushAgentToAcr.sh" + ], + "environmentVariables": [ + { + "name": "ACR_NAME", + "value": "__ACR_NAME__" + }, + { + "name": "AGENT_RELEASE", + "value": "__AGENT_RELEASE__" + }, + { + "name": "AGENT_IMAGE_TAG_SUFFIX", + "value": "__AGENT_IMAGE_TAG_SUFFIX__" + }, + { + "name": "AGENT_IMAGE_FULL_PATH", + "value": "public/azuremonitor/containerinsights/__AGENT_RELEASE__:win-__AGENT_RELEASE____AGENT_IMAGE_TAG_SUFFIX__" + }, + { + "name": "CDPX_REGISTRY", + "value": "__CDPX_WINDOWS_REGISTRY__" + }, + { + "name": "CDPX_REPO_NAME", + "value": "__CDPX_WINDOWS_REPO_NAME__" + }, + { + "name": "CDPX_TAG", + "value": "__CDPX_WINDOWS_TAG__" + } + ], + "identity": { + "type": "userAssigned", + "userAssignedIdentities": [ + "__MANAGED_IDENTITY__" + ] + } + } + } + ] + } \ No newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/RolloutSpecs/RolloutSpecs.json b/deployment/agent-deployment/ServiceGroupRoot/RolloutSpecs/RolloutSpecs.json new file mode 100644 index 000000000..f015cf5d3 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/RolloutSpecs/RolloutSpecs.json @@ -0,0 +1,36 @@ +{ + "$schema": "https://ev2schema.azure.net/schemas/2020-01-01/rolloutSpecification.json", + "ContentVersion": "1.0.0.0", + "RolloutMetadata": { + "ServiceModelPath": "ServiceModels//Public.ServiceModel.json", + "ScopeBindingsPath": "ScopeBindings//Public.ScopeBindings.json", + "Name": "ContainerInsightsAgent", + "RolloutType": "Major", + "BuildSource": { + "Parameters": { + "VersionFile": "buildver.txt" + } + }, + "Notification": { + "Email": { + "To": "omscontainers@microsoft.com" + } + } + }, + "OrchestratedSteps": [ + { + "name": "PushLinuxAgent", + "targetType": "ServiceResource", + "targetName": "PushLinuxAgent", + "actions": [ "Shell/PushAgentToACR" ], + "dependsOn": [ ] + }, + { + "name": "PushWindowsAgent", + "targetType": "ServiceResource", + "targetName": "PushWindowsAgent", + "actions": [ "Shell/PushAgentToACR" ], + "dependsOn": [ ] + } + ] + } \ No 
newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json b/deployment/agent-deployment/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json new file mode 100644 index 000000000..cbc6db8b3 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json @@ -0,0 +1,51 @@ +{ + "$schema": "https://ev2schema.azure.net/schemas/2020-01-01/scopeBindings.json", + "contentVersion": "0.0.0.1", + "scopeBindings": [ + { + "scopeTagName": "Global", + "bindings": [ + { + "find": "__ACR_NAME__", + "replaceWith": "$(ACRName)" + }, + { + "find": "__AGENT_RELEASE__", + "replaceWith": "$(AgentRelease)" + }, + { + "find": "__AGENT_IMAGE_TAG_SUFFIX__", + "replaceWith": "$(AgentImageTagSuffix)" + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + }, + { + "find": "__CDPX_LINUX_REGISTRY__", + "replaceWith": "$(CDPXLinuxRegistry)" + }, + { + "find": "__CDPX_WINDOWS_REGISTRY__", + "replaceWith": "$(CDPXWindowsRegistry)" + }, + { + "find": "__CDPX_LINUX_TAG__", + "replaceWith": "$(CDPXLinuxTag)" + }, + { + "find": "__CDPX_WINDOWS_TAG__", + "replaceWith": "$(CDPXWindowsTag)" + }, + { + "find": "__CDPX_LINUX_REPO_NAME__", + "replaceWith": "$(CDPXLinuxRepoName)" + }, + { + "find": "__CDPX_WINDOWS_REPO_NAME__", + "replaceWith": "$(CDPXWindowsRepoName)" + } + ] + } + ] +} \ No newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/Scripts/pushAgentToAcr.sh b/deployment/agent-deployment/ServiceGroupRoot/Scripts/pushAgentToAcr.sh new file mode 100644 index 000000000..d39cedde0 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/Scripts/pushAgentToAcr.sh @@ -0,0 +1,72 @@ +#!/bin/bash +set -e + +# Note - This script is used in the pipeline as an inline script + +if [ -z $AGENT_IMAGE_TAG_SUFFIX ]; then + echo "-e error value of AGENT_IMAGE_TAG_SUFFIX variable shouldn't be empty. 
check release variables" + exit 1 +fi + +if [ -z $AGENT_RELEASE ]; then + echo "-e error AGENT_RELEASE shouldn't be empty. check release variables" + exit 1 +fi + +#Make sure that tag being pushed will not overwrite an existing tag in mcr +MCR_TAG_RESULT="`wget -qO- https://mcr.microsoft.com/v2/azuremonitor/containerinsights/ciprod/tags/list`" +if [ $? -ne 0 ]; then + echo "-e error unable to get list of mcr tags for azuremonitor/containerinsights/ciprod repository" + exit 1 +fi +TAG_EXISTS=$(echo $MCR_TAG_RESULT | jq '.tags | contains(["'"$AGENT_RELEASE$AGENT_IMAGE_TAG_SUFFIX"'"])') + +if $TAG_EXISTS; then + echo "-e error ${AGENT_IMAGE_TAG_SUFFIX} already exists in mcr. make sure the image tag is unique" + exit 1 +fi + +if [ -z $AGENT_IMAGE_FULL_PATH ]; then + echo "-e error AGENT_IMAGE_FULL_PATH shouldn't be empty. check release variables" + exit 1 +fi + +if [ -z $CDPX_TAG ]; then + echo "-e error value of CDPX_TAG shouldn't be empty. check release variables" + exit 1 +fi + +if [ -z $CDPX_REGISTRY ]; then + echo "-e error value of CDPX_REGISTRY shouldn't be empty. check release variables" + exit 1 +fi + +if [ -z $CDPX_REPO_NAME ]; then + echo "-e error value of CDPX_REPO_NAME shouldn't be empty. check release variables" + exit 1 +fi + +if [ -z $ACR_NAME ]; then + echo "-e error value of ACR_NAME shouldn't be empty. check release variables" + exit 1 +fi + + +#Login to az cli and authenticate to acr +echo "Login cli using managed identity" +az login --identity +if [ $? -eq 0 ]; then + echo "Logged in successfully" +else + echo "-e error failed to login to az with managed identity credentials" + exit 1 +fi + +echo "Pushing ${AGENT_IMAGE_FULL_PATH} to ${ACR_NAME}" +az acr import --name $ACR_NAME --registry $CDPX_REGISTRY --source official/${CDPX_REPO_NAME}:${CDPX_TAG} --image $AGENT_IMAGE_FULL_PATH +if [ $? 
-eq 0 ]; then + echo "Retagged and pushed image successfully" +else + echo "-e error failed to retag and push image to destination ACR" + exit 1 +fi \ No newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json b/deployment/agent-deployment/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json new file mode 100644 index 000000000..8c5c7c1b6 --- /dev/null +++ b/deployment/agent-deployment/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json @@ -0,0 +1,56 @@ +{ + "$schema": "https://ev2schema.azure.net/schemas/2020-01-01/serviceModel.json", + "contentVersion": "1.0.0.2", + "ServiceMetadata": { + "ServiceGroup": "ContainerInsightsAgent", + "Environment": "Prod" + }, + "ServiceResourceGroupDefinitions": [ + { + "Name": "CI-Agent-ServiceResourceGroupDefinition", + "ServiceResourceDefinitions": [ + { + "Name": "ShellExtension", + "ComposedOf": { + "Extension": { + "Shell": [ + { + "type": "ShellExtensionType", + "properties": { + "imageName": "adm-ubuntu-1804-l", + "imageVersion": "v18" + } + } + ] + } + } + } + ] + } + ], + "ServiceResourceGroups": [ + { + "AzureResourceGroupName": "ContainerInsights-Agent-Release", + "Location": "eastus2", + "InstanceOf": "CI-Agent-ServiceResourceGroupDefinition", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", + "ScopeTags": [ + { + "Name": "Global" + } + ], + "ServiceResources": [ + { + "Name": "PushLinuxAgent", + "InstanceOf": "ShellExtension", + "RolloutParametersPath": "Parameters\\ContainerInsights.Linux.Parameters.json" + }, + { + "Name": "PushWindowsAgent", + "InstanceOf": "ShellExtension", + "RolloutParametersPath": "Parameters\\ContainerInsights.Windows.Parameters.json" + } + ] + } + ] + } \ No newline at end of file diff --git a/deployment/agent-deployment/ServiceGroupRoot/buildver.txt b/deployment/agent-deployment/ServiceGroupRoot/buildver.txt new file mode 100644 index 000000000..bd2666abb --- /dev/null +++ 
b/deployment/agent-deployment/ServiceGroupRoot/buildver.txt @@ -0,0 +1 @@ +1.0.0.0 \ No newline at end of file diff --git a/deployment/arc-k8s-extension/ServiceGroupRoot/Parameters/ContainerInsightsExtension.Parameters.json b/deployment/arc-k8s-extension/ServiceGroupRoot/Parameters/ContainerInsightsExtension.Parameters.json index a8a99e9f6..c38c67e00 100644 --- a/deployment/arc-k8s-extension/ServiceGroupRoot/Parameters/ContainerInsightsExtension.Parameters.json +++ b/deployment/arc-k8s-extension/ServiceGroupRoot/Parameters/ContainerInsightsExtension.Parameters.json @@ -31,26 +31,6 @@ "name": "RELEASE_STAGE", "value": "__RELEASE_STAGE__" }, - { - "name": "ACR_APP_ID", - "reference": { - "provider": "AzureKeyVault", - "parameters": { - "secretId": "https://cibuildandreleasekv.vault.azure.net/secrets/ciprodacrappid/e8f47bf7505741ebaf65a4db16ff9fa7" - } - }, - "asSecureValue": "true" - }, - { - "name": "ACR_APP_SECRET", - "reference": { - "provider": "AzureKeyVault", - "parameters": { - "secretId": "https://cibuildandreleasekv.vault.azure.net/secrets/ciprodacrappsecret/8718afcdac114accb8b26f613cef1e1e" - } - }, - "asSecureValue": "true" - }, { "name": "ACR_NAME", "value": "__ACR_NAME__" @@ -59,7 +39,13 @@ "name": "CHART_VERSION", "value": "__CHART_VERSION__" } - ] + ], + "identity": { + "type": "userAssigned", + "userAssignedIdentities": [ + "__MANAGED_IDENTITY__" + ] + } } } ] diff --git a/deployment/arc-k8s-extension/ServiceGroupRoot/RolloutSpecs/Public.Canary.RolloutSpec.json b/deployment/arc-k8s-extension/ServiceGroupRoot/RolloutSpecs/Public.Canary.RolloutSpec.json index cde103633..ea396bbe4 100644 --- a/deployment/arc-k8s-extension/ServiceGroupRoot/RolloutSpecs/Public.Canary.RolloutSpec.json +++ b/deployment/arc-k8s-extension/ServiceGroupRoot/RolloutSpecs/Public.Canary.RolloutSpec.json @@ -2,8 +2,8 @@ "$schema": "http://schema.express.azure.com/schemas/2015-01-01-alpha/RolloutSpec.json", "ContentVersion": "1.0.0.0", "RolloutMetadata": { - "ServiceModelPath": 
"ServiceModels//Public.ServiceModel.json", - "ScopeBindingsPath": "ScopeBindings//Public.ScopeBindings.json", + "ServiceModelPath": "ServiceModels//Public.ServiceModel.json", + "ScopeBindingsPath": "ScopeBindings//Public.ScopeBindings.json", "Name": "ContainerInsightsExtension-Canary", "RolloutType": "Major", "BuildSource": { @@ -15,7 +15,7 @@ "email": { "to": "omscontainers@microsoft.com" } - } + } }, "orchestratedSteps": [ { diff --git a/deployment/arc-k8s-extension/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json b/deployment/arc-k8s-extension/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json index 516eba3e2..97f103efa 100644 --- a/deployment/arc-k8s-extension/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json +++ b/deployment/arc-k8s-extension/ServiceGroupRoot/ScopeBindings/Public.ScopeBindings.json @@ -1,22 +1,26 @@ { "$schema": "https://ev2schema.azure.net/schemas/2020-01-01/scopeBindings.json", "contentVersion": "0.0.0.1", - "scopeBindings": [ + "scopeBindings": [ { "scopeTagName": "Canary", "bindings": [ { "find": "__RELEASE_STAGE__", "replaceWith": "Canary" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -25,15 +29,19 @@ { "find": "__RELEASE_STAGE__", "replaceWith": "Pilot" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -42,15 +50,19 @@ { "find": "__RELEASE_STAGE__", "replaceWith": "MediumLow" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -59,15 +71,19 @@ { "find": 
"__RELEASE_STAGE__", "replaceWith": "MediumHigh" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -76,15 +92,19 @@ { "find": "__RELEASE_STAGE__", "replaceWith": "HighLoad" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -93,15 +113,19 @@ { "find": "__RELEASE_STAGE__", "replaceWith": "FF" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] }, { @@ -110,16 +134,20 @@ { "find": "__RELEASE_STAGE__", "replaceWith": "MC" - }, + }, { "find": "__ACR_NAME__", "replaceWith": "$(ACRName)" - }, + }, { "find": "__CHART_VERSION__", "replaceWith": "$(ChartVersion)" - } + }, + { + "find": "__MANAGED_IDENTITY__", + "replaceWith": "$(ManagedIdentity)" + } ] - } + } ] } diff --git a/deployment/arc-k8s-extension/ServiceGroupRoot/Scripts/pushChartToAcr.sh b/deployment/arc-k8s-extension/ServiceGroupRoot/Scripts/pushChartToAcr.sh index 520557592..0451a038b 100644 --- a/deployment/arc-k8s-extension/ServiceGroupRoot/Scripts/pushChartToAcr.sh +++ b/deployment/arc-k8s-extension/ServiceGroupRoot/Scripts/pushChartToAcr.sh @@ -10,7 +10,7 @@ export REPO_TYPE="stable" export CANARY_REGION_REPO_PATH="azuremonitor/containerinsights/canary/${REPO_TYPE}/azuremonitor-containers" # pilot region export PILOT_REGION_REPO_PATH="azuremonitor/containerinsights/prod1/${REPO_TYPE}/azuremonitor-containers" -# light load regions +# light load regions export LIGHT_LOAD_REGION_REPO_PATH="azuremonitor/containerinsights/prod2/${REPO_TYPE}/azuremonitor-containers" # medium 
load regions export MEDIUM_LOAD_REGION_REPO_PATH="azuremonitor/containerinsights/prod3/${REPO_TYPE}/azuremonitor-containers" @@ -18,7 +18,7 @@ export MEDIUM_LOAD_REGION_REPO_PATH="azuremonitor/containerinsights/prod3/${REPO export HIGH_LOAD_REGION_REPO_PATH="azuremonitor/containerinsights/prod4/${REPO_TYPE}/azuremonitor-containers" # FairFax regions export FF_REGION_REPO_PATH="azuremonitor/containerinsights/prod5/${REPO_TYPE}/azuremonitor-containers" -# Mooncake regions +# Mooncake regions export MC_REGION_REPO_PATH="azuremonitor/containerinsights/prod6/${REPO_TYPE}/azuremonitor-containers" # pull chart from previous stage mcr and push chart to next stage acr @@ -35,7 +35,7 @@ pull_chart_from_source_mcr_to_push_to_dest_acr() { echo "-e error dest acr path must be provided " exit 1 fi - + echo "Pulling chart from MCR:${srcMcrFullPath} ..." helm chart pull ${srcMcrFullPath} if [ $? -eq 0 ]; then @@ -43,34 +43,34 @@ pull_chart_from_source_mcr_to_push_to_dest_acr() { else echo "-e error Pulling chart from MCR:${srcMcrFullPath} failed. Please review Ev2 pipeline logs for more details on the error." exit 1 - fi + fi - echo "Exporting chart to current directory ..." + echo "Exporting chart to current directory ..." helm chart export ${srcMcrFullPath} if [ $? -eq 0 ]; then echo "Exporting chart to current directory completed successfully." else echo "-e error Exporting chart to current directory failed. Please review Ev2 pipeline logs for more details on the error." exit 1 - fi + fi - echo "save the chart locally with dest acr full path : ${destAcrFullPath} ..." - helm chart save azuremonitor-containers/ ${destAcrFullPath} - if [ $? -eq 0 ]; then + echo "save the chart locally with dest acr full path : ${destAcrFullPath} ..." + helm chart save azuremonitor-containers/ ${destAcrFullPath} + if [ $? -eq 0 ]; then echo "save the chart locally with dest acr full path : ${destAcrFullPath} completed successfully." 
- else + else echo "-e error save the chart locally with dest acr full path : ${destAcrFullPath} failed. Please review Ev2 pipeline logs for more details on the error." exit 1 - fi - + fi + echo "pushing the chart to acr path: ${destAcrFullPath} ..." - helm chart push ${destAcrFullPath} - if [ $? -eq 0 ]; then + helm chart push ${destAcrFullPath} + if [ $? -eq 0 ]; then echo "pushing the chart to acr path: ${destAcrFullPath} completed successfully." - else + else echo "-e error pushing the chart to acr path: ${destAcrFullPath} failed. Please review Ev2 pipeline logs for more details on the error." exit 1 - fi + fi } # push to local release candidate chart to canary region @@ -81,23 +81,23 @@ push_local_chart_to_canary_region() { exit 1 fi - echo "save the chart locally with dest acr full path : ${destAcrFullPath} ..." + echo "save the chart locally with dest acr full path : ${destAcrFullPath} ..." helm chart save charts/azuremonitor-containers/ $destAcrFullPath - if [ $? -eq 0 ]; then + if [ $? -eq 0 ]; then echo "save the chart locally with dest acr full path : ${destAcrFullPath} completed." - else + else echo "-e error save the chart locally with dest acr full path : ${destAcrFullPath} failed. Please review Ev2 pipeline logs for more details on the error." exit 1 - fi + fi echo "pushing the chart to acr path: ${destAcrFullPath} ..." helm chart push $destAcrFullPath - if [ $? -eq 0 ]; then + if [ $? -eq 0 ]; then echo "pushing the chart to acr path: ${destAcrFullPath} completed successfully." - else + else echo "-e error pushing the chart to acr path: ${destAcrFullPath} failed.Please review Ev2 pipeline logs for more details on the error." exit 1 - fi + fi } echo "START - Release stage : ${RELEASE_STAGE}" @@ -106,71 +106,87 @@ echo "START - Release stage : ${RELEASE_STAGE}" echo "Using acr : ${ACR_NAME}" echo "Using acr repo type: ${REPO_TYPE}" +#Login to az cli and authenticate to acr +echo "Login cli using managed identity" +az login --identity +if [ $? 
-eq 0 ]; then + echo "Logged in successfully" +else + echo "-e error az login with managed identity credentials failed. Please review the Ev2 pipeline logs for more details on the error." + exit 1 +fi + +ACCESS_TOKEN=$(az acr login --name ${ACR_NAME} --expose-token --output tsv --query accessToken) +if [ $? -ne 0 ]; then + echo "-e error az acr login failed. Please review the Ev2 pipeline logs for more details on the error." + exit 1 +fi + echo "login to acr:${ACR_NAME} using helm ..." -echo $ACR_APP_SECRET | helm registry login $ACR_NAME --username $ACR_APP_ID --password-stdin +echo $ACCESS_TOKEN | helm registry login $ACR_NAME -u 00000000-0000-0000-0000-000000000000 --password-stdin if [ $? -eq 0 ]; then echo "login to acr:${ACR_NAME} using helm completed successfully." else echo "-e error login to acr:${ACR_NAME} using helm failed. Please review Ev2 pipeline logs for more details on the error." exit 1 -fi +fi case $RELEASE_STAGE in Canary) echo "START: Release stage - Canary" - destAcrFullPath=${ACR_NAME}/public/${CANARY_REGION_REPO_PATH}:${CHART_VERSION} - push_local_chart_to_canary_region $destAcrFullPath + destAcrFullPath=${ACR_NAME}/public/${CANARY_REGION_REPO_PATH}:${CHART_VERSION} + push_local_chart_to_canary_region $destAcrFullPath echo "END: Release stage - Canary" ;; - Pilot | Prod1) - echo "START: Release stage - Pilot" - srcMcrFullPath=${MCR_NAME}/${CANARY_REGION_REPO_PATH}:${CHART_VERSION} - destAcrFullPath=${ACR_NAME}/public/${PILOT_REGION_REPO_PATH}:${CHART_VERSION} - pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath - echo "END: Release stage - Pilot" + Pilot | Prod1) + echo "START: Release stage - Pilot" + srcMcrFullPath=${MCR_NAME}/${CANARY_REGION_REPO_PATH}:${CHART_VERSION} + destAcrFullPath=${ACR_NAME}/public/${PILOT_REGION_REPO_PATH}:${CHART_VERSION} + pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath + echo "END: Release stage - Pilot" ;; - LightLoad | Pord2) - echo "START: Release 
stage - Light Load Regions" + LightLoad | Prod2) + echo "START: Release stage - Light Load Regions" srcMcrFullPath=${MCR_NAME}/${PILOT_REGION_REPO_PATH}:${CHART_VERSION} destAcrFullPath=${ACR_NAME}/public/${LIGHT_LOAD_REGION_REPO_PATH}:${CHART_VERSION} - pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath - echo "END: Release stage - Light Load Regions" + pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath + echo "END: Release stage - Light Load Regions" ;; - - MediumLoad | Prod3) - echo "START: Release stage - Medium Load Regions" + + MediumLoad | Prod3) + echo "START: Release stage - Medium Load Regions" srcMcrFullPath=${MCR_NAME}/${LIGHT_LOAD_REGION_REPO_PATH}:${CHART_VERSION} destAcrFullPath=${ACR_NAME}/public/${MEDIUM_LOAD_REGION_REPO_PATH}:${CHART_VERSION} - pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath + pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath echo "END: Release stage - Medium Load Regions" ;; - HighLoad | Prod4) - echo "START: Release stage - High Load Regions" - srcMcrFullPath=${MCR_NAME}/${MEDIUM_LOAD_REGION_REPO_PATH}:${CHART_VERSION} - destAcrFullPath=${ACR_NAME}/public/${HIGH_LOAD_REGION_REPO_PATH}:${CHART_VERSION} - pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath - echo "END: Release stage - High Load Regions" - ;; - - FF | Prod5) - echo "START: Release stage - FF" - srcMcrFullPath=${MCR_NAME}/${HIGH_LOAD_REGION_REPO_PATH}:${CHART_VERSION} + HighLoad | Prod4) + echo "START: Release stage - High Load Regions" + srcMcrFullPath=${MCR_NAME}/${MEDIUM_LOAD_REGION_REPO_PATH}:${CHART_VERSION} + destAcrFullPath=${ACR_NAME}/public/${HIGH_LOAD_REGION_REPO_PATH}:${CHART_VERSION} + pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath + echo "END: Release stage - High Load Regions" + ;; + + FF | Prod5) + echo "START: Release stage - FF" + 
srcMcrFullPath=${MCR_NAME}/${HIGH_LOAD_REGION_REPO_PATH}:${CHART_VERSION} destAcrFullPath=${ACR_NAME}/public/${FF_REGION_REPO_PATH}:${CHART_VERSION} - pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath - echo "END: Release stage - FF" - ;; + pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath + echo "END: Release stage - FF" + ;; MC | Prod6) - echo "START: Release stage - MC" - srcMcrFullPath=${MCR_NAME}/${FF_REGION_REPO_PATH}:${CHART_VERSION} + echo "START: Release stage - MC" + srcMcrFullPath=${MCR_NAME}/${FF_REGION_REPO_PATH}:${CHART_VERSION} destAcrFullPath=${ACR_NAME}/public/${MC_REGION_REPO_PATH}:${CHART_VERSION} pull_chart_from_source_mcr_to_push_to_dest_acr $srcMcrFullPath $destAcrFullPath - echo "END: Release stage - MC" - ;; + echo "END: Release stage - MC" + ;; *) echo -n "unknown release stage" diff --git a/deployment/arc-k8s-extension/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json b/deployment/arc-k8s-extension/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json index 71081661a..c53bb5aca 100644 --- a/deployment/arc-k8s-extension/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json +++ b/deployment/arc-k8s-extension/ServiceGroupRoot/ServiceModels/Public.ServiceModel.json @@ -28,17 +28,17 @@ ] } ], - "ServiceResourceGroups": [ + "ServiceResourceGroups": [ { "AzureResourceGroupName": "ContainerInsightsExtension-Canary-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "Canary" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-Canary", @@ -51,12 +51,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-Pilot-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + 
"AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "Pilot" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-Pilot", @@ -69,12 +69,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-LightLoad-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "LightLoad" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-LightLoad", @@ -87,12 +87,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-MediumLoad-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "MediumLoad" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-MediumLoad", @@ -105,12 +105,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-HighLoad-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "HighLoad" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-HighLoad", @@ -123,12 +123,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-FF-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - "AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "FF" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-FF", @@ -141,12 +141,12 @@ "AzureResourceGroupName": "ContainerInsightsExtension-MC-Release", "Location": "eastus2", "InstanceOf": "ARC-Extension-ServiceResourceGroupDefinition", - 
"AzureSubscriptionId": "5fab7b6f-6150-42fe-89e1-0f07a0a9a46f", + "AzureSubscriptionId": "30c56c3a-54da-46ea-b004-06eb33432687", "ScopeTags": [ { "Name": "MC" } - ], + ], "ServiceResources": [ { "Name": "PushChartToACR-MC", @@ -154,6 +154,6 @@ "RolloutParametersPath": "Parameters\\ContainerInsightsExtension.Parameters.json" } ] - } + } ] } diff --git a/kubernetes/container-azm-ms-agentconfig.yaml b/kubernetes/container-azm-ms-agentconfig.yaml index 543f270c1..dff8223ad 100644 --- a/kubernetes/container-azm-ms-agentconfig.yaml +++ b/kubernetes/container-azm-ms-agentconfig.yaml @@ -129,12 +129,30 @@ data: # Alertable metrics configuration settings for completed jobs count [alertable_metrics_configuration_settings.job_completion_threshold] - # Threshold for completed job count , metric will be sent only for those jobs which were completed earlier than the following threshold + # Threshold for completed job count , metric will be sent only for those jobs which were completed earlier than the following threshold job_completion_threshold_time_minutes = 360 integrations: |- [integrations.azure_network_policy_manager] collect_basic_metrics = false collect_advanced_metrics = false + +# Doc - https://github.com/microsoft/Docker-Provider/blob/ci_prod/Documentation/AgentSettings/ReadMe.md + agent-settings: |- + # prometheus scrape fluent bit settings for high scale + # buffer size should be greater than or equal to chunk size else we set it to chunk size. + [agent_settings.prometheus_fbit_settings] + tcp_listener_chunk_size = 10 + tcp_listener_buffer_size = 10 + tcp_listener_mem_buf_limit = 200 + + # The following settings are "undocumented", we don't recommend uncommenting them unless directed by Microsoft. + # They increase the maximum stdout/stderr log collection rate but will also cause higher cpu/memory usage. 
+ # [agent_settings.fbit_config] + # log_flush_interval_secs = "1" # default value is 15 + # tail_mem_buf_limit_megabytes = "10" # default value is 10 + # tail_buf_chunksize_megabytes = "1" # default value is 32kb (comment out this line for default) + # tail_buf_maxsize_megabytes = "1" # default value is 32kb (comment out this line for default) + metadata: name: container-azm-ms-agentconfig namespace: kube-system diff --git a/kubernetes/linux/Dockerfile b/kubernetes/linux/Dockerfile index 77ab97d21..fd408b9b2 100644 --- a/kubernetes/linux/Dockerfile +++ b/kubernetes/linux/Dockerfile @@ -2,7 +2,7 @@ FROM ubuntu:18.04 MAINTAINER OMSContainers@microsoft.com LABEL vendor=Microsoft\ Corp \ com.microsoft.product="Azure Monitor for containers" -ARG IMAGE_TAG=ciprod08052021-1 +ARG IMAGE_TAG=ciprod10082021 ENV AGENT_VERSION ${IMAGE_TAG} ENV tmpdir /opt ENV APPLICATIONINSIGHTS_AUTH NzAwZGM5OGYtYTdhZC00NThkLWI5NWMtMjA3ZjM3NmM3YmRi @@ -17,7 +17,7 @@ ENV KUBE_CLIENT_BACKOFF_BASE 1 ENV KUBE_CLIENT_BACKOFF_DURATION 0 ENV RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR 0.9 RUN /usr/bin/apt-get update && /usr/bin/apt-get install -y libc-bin wget openssl curl sudo python-ctypes init-system-helpers net-tools rsyslog cron vim dmidecode apt-transport-https gnupg && rm -rf /var/lib/apt/lists/* -COPY setup.sh main.sh defaultpromenvvariables defaultpromenvvariables-rs defaultpromenvvariables-sidecar mdsd.xml envmdsd $tmpdir/ +COPY setup.sh main.sh defaultpromenvvariables defaultpromenvvariables-rs defaultpromenvvariables-sidecar mdsd.xml envmdsd logrotate.conf $tmpdir/ WORKDIR ${tmpdir} # copy docker provider shell bundle to use the agent image diff --git a/kubernetes/linux/defaultpromenvvariables-sidecar b/kubernetes/linux/defaultpromenvvariables-sidecar index 3301488d8..68388f88e 100644 --- a/kubernetes/linux/defaultpromenvvariables-sidecar +++ b/kubernetes/linux/defaultpromenvvariables-sidecar @@ -7,3 +7,6 @@ export AZMON_TELEGRAF_CUSTOM_PROM_PLUGINS_WITH_NAMESPACE_FILTER="" export 
AZMON_TELEGRAF_OSM_PROM_PLUGINS="" export AZMON_TELEGRAF_CUSTOM_PROM_KUBERNETES_LABEL_SELECTOR="kubernetes_label_selector = ''" export AZMON_TELEGRAF_CUSTOM_PROM_KUBERNETES_FIELD_SELECTOR="kubernetes_field_selector = ''" +export AZMON_SIDECAR_FBIT_CHUNK_SIZE="10m" +export AZMON_SIDECAR_FBIT_BUFFER_SIZE="10m" +export AZMON_SIDECAR_FBIT_MEM_BUF_LIMIT="200m" diff --git a/kubernetes/linux/logrotate.conf b/kubernetes/linux/logrotate.conf new file mode 100644 index 000000000..921371fd0 --- /dev/null +++ b/kubernetes/linux/logrotate.conf @@ -0,0 +1,39 @@ +/var/opt/microsoft/linuxmonagent/log/mdsd.err { + copytruncate + rotate 7 + missingok + notifempty + delaycompress + compress + size 10M +} + +/var/opt/microsoft/linuxmonagent/log/mdsd.warn { + copytruncate + rotate 7 + missingok + notifempty + delaycompress + compress + size 10M +} + +/var/opt/microsoft/linuxmonagent/log/mdsd.info { + copytruncate + rotate 7 + missingok + notifempty + delaycompress + compress + size 10M +} + +/var/opt/microsoft/linuxmonagent/log/mdsd.qos { + copytruncate + rotate 7 + missingok + notifempty + delaycompress + compress + size 10M +} diff --git a/kubernetes/linux/main.sh b/kubernetes/linux/main.sh index b9e338fa9..57a6deab8 100644 --- a/kubernetes/linux/main.sh +++ b/kubernetes/linux/main.sh @@ -12,7 +12,7 @@ waitforlisteneronTCPport() { echo "${FUNCNAME[0]} called with incorrect arguments<$1 , $2>. Required arguments <#port, #wait-time-in-seconds>" return -1 else - + if [[ $port =~ $numeric ]] && [[ $waittimesecs =~ $numeric ]]; then #local varlistener=$(netstat -lnt | awk '$6 == "LISTEN" && $4 ~ ":25228$"') while true @@ -38,6 +38,51 @@ waitforlisteneronTCPport() { fi } +checkAgentOnboardingStatus() { + local sleepdurationsecs=1 + local totalsleptsecs=0 + local isaadmsiauthmode=$1 + local waittimesecs=$2 + local numeric='^[0-9]+$' + + if [ -z "$1" ] || [ -z "$2" ]; then + echo "${FUNCNAME[0]} called with incorrect arguments<$1 , $2>. 
Required arguments <#isaadmsiauthmode, #wait-time-in-seconds>" + return -1 + else + + if [[ $waittimesecs =~ $numeric ]]; then + successMessage="Onboarding success" + failureMessage="Failed to register certificate with OMS Homing service, giving up" + if [ "${isaadmsiauthmode}" == "true" ]; then + successMessage="Loaded data sources" + failureMessage="Failed to load data sources into config" + fi + while true + do + if [ $totalsleptsecs -gt $waittimesecs ]; then + echo "${FUNCNAME[0]} giving up checking agent onboarding status after $totalsleptsecs secs" + return 1 + fi + + if grep "$successMessage" "${MDSD_LOG}/mdsd.info"; then + echo "Onboarding success" + return 0 + elif grep "$failureMessage" "${MDSD_LOG}/mdsd.err"; then + echo "Onboarding Failure: Reason: Failed to onboard the agent" + echo "Onboarding Failure: Please verify log analytics workspace configuration such as existence of the workspace, workspace key and workspace enabled for public ingestion" + return 1 + fi + sleep $sleepdurationsecs + totalsleptsecs=$(($totalsleptsecs+1)) + done + else + echo "${FUNCNAME[0]} called with non-numeric arguments<$2>. 
Required arguments <#wait-time-in-seconds>" + return -1 + fi + fi +} + + #using /var/opt/microsoft/docker-cimprov/state instead of /var/opt/microsoft/omsagent/state since the latter gets deleted during onboarding mkdir -p /var/opt/microsoft/docker-cimprov/state @@ -57,7 +102,11 @@ else export customResourceId=$AKS_RESOURCE_ID echo "export customResourceId=$AKS_RESOURCE_ID" >> ~/.bashrc source ~/.bashrc - echo "customResourceId:$customResourceId" + echo "customResourceId:$customResourceId" + export customRegion=$AKS_REGION + echo "export customRegion=$AKS_REGION" >> ~/.bashrc + source ~/.bashrc + echo "customRegion:$customRegion" fi #set agent config schema version @@ -146,6 +195,21 @@ if [ -e "/etc/omsagent-secret/WSID" ]; then else echo "successfully validated provided proxy endpoint is valid and expected format" fi + + echo $pwd > /opt/microsoft/docker-cimprov/proxy_password + + export MDSD_PROXY_MODE=application + echo "export MDSD_PROXY_MODE=$MDSD_PROXY_MODE" >> ~/.bashrc + export MDSD_PROXY_ADDRESS=$proto$hostport + echo "export MDSD_PROXY_ADDRESS=$MDSD_PROXY_ADDRESS" >> ~/.bashrc + export MDSD_PROXY_USERNAME=$user + echo "export MDSD_PROXY_USERNAME=$MDSD_PROXY_USERNAME" >> ~/.bashrc + export MDSD_PROXY_PASSWORD_FILE=/opt/microsoft/docker-cimprov/proxy_password + echo "export MDSD_PROXY_PASSWORD_FILE=$MDSD_PROXY_PASSWORD_FILE" >> ~/.bashrc + + #TODO: Compression + proxy creates a deserialization error in ODS. This needs a fix in MDSD + export MDSD_ODS_COMPRESSION_LEVEL=0 + echo "export MDSD_ODS_COMPRESSION_LEVEL=$MDSD_ODS_COMPRESSION_LEVEL" >> ~/.bashrc fi if [ ! 
-z "$PROXY_ENDPOINT" ]; then @@ -194,9 +258,15 @@ fi if [ -z $domain ]; then ClOUD_ENVIRONMENT="unknown" elif [ $domain == "opinsights.azure.com" ]; then - CLOUD_ENVIRONMENT="public" -else - CLOUD_ENVIRONMENT="national" + CLOUD_ENVIRONMENT="azurepubliccloud" +elif [ $domain == "opinsights.azure.cn" ]; then + CLOUD_ENVIRONMENT="azurechinacloud" +elif [ $domain == "opinsights.azure.us" ]; then + CLOUD_ENVIRONMENT="azureusgovernmentcloud" +elif [ $domain == "opinsights.azure.eaglex.ic.gov" ]; then + CLOUD_ENVIRONMENT="usnat" +elif [ $domain == "opinsights.azure.microsoft.scloud" ]; then + CLOUD_ENVIRONMENT="ussec" fi export CLOUD_ENVIRONMENT=$CLOUD_ENVIRONMENT echo "export CLOUD_ENVIRONMENT=$CLOUD_ENVIRONMENT" >> ~/.bashrc @@ -233,9 +303,9 @@ if [ ${#APPLICATIONINSIGHTS_AUTH_URL} -ge 1 ]; then # (check if APPLICATIONINSI fi -aikey=$(echo $APPLICATIONINSIGHTS_AUTH | base64 --decode) -export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey -echo "export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey" >> ~/.bashrc +aikey=$(echo $APPLICATIONINSIGHTS_AUTH | base64 --decode) +export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey +echo "export TELEMETRY_APPLICATIONINSIGHTS_KEY=$aikey" >> ~/.bashrc source ~/.bashrc @@ -306,6 +376,21 @@ if [ -e "telemetry_prom_config_env_var" ]; then source telemetry_prom_config_env_var fi +#Parse sidecar agent settings for custom configuration +if [ ! -e "/etc/config/kube.conf" ]; then + if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then + #Parse the agent configmap to create a file with new custom settings. + /usr/bin/ruby2.6 tomlparser-prom-agent-config.rb + #Sourcing config environment variable file if it exists + if [ -e "side_car_fbit_config_env_var" ]; then + cat side_car_fbit_config_env_var | while read line; do + echo $line >> ~/.bashrc + done + source side_car_fbit_config_env_var + fi + fi +fi + #Parse the configmap to set the right environment variables for MDM metrics configuration for Alerting. 
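The `CLOUD_ENVIRONMENT` branch added in this hunk maps the Log Analytics workspace domain to a cloud-environment name with an if/elif chain. The same mapping can be sketched more compactly as a shell `case` statement — an illustrative refactor only, not the code the patch ships:

```shell
# Illustrative only: the patch implements this with an if/elif chain in main.sh.
# Maps a Log Analytics workspace domain to its CLOUD_ENVIRONMENT value.
map_cloud_environment() {
  case "$1" in
    opinsights.azure.com)              echo "azurepubliccloud" ;;
    opinsights.azure.cn)               echo "azurechinacloud" ;;
    opinsights.azure.us)               echo "azureusgovernmentcloud" ;;
    opinsights.azure.eaglex.ic.gov)    echo "usnat" ;;
    opinsights.azure.microsoft.scloud) echo "ussec" ;;
    *)                                 echo "unknown" ;;
  esac
}
```

The fallback branch mirrors the patch's behavior of leaving the value as "unknown" when the domain is empty or unrecognized.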
if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then @@ -406,7 +491,7 @@ export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_docker_operations_error if [ "$CONTAINER_RUNTIME" != "docker" ]; then # these metrics are avialble only on k8s versions <1.18 and will get deprecated from 1.18 export KUBELET_RUNTIME_OPERATIONS_METRIC="kubelet_runtime_operations" - export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_runtime_operations_errors" + export KUBELET_RUNTIME_OPERATIONS_ERRORS_METRIC="kubelet_runtime_operations_errors" fi echo "set caps for ruby process to read container env from proc" @@ -431,24 +516,24 @@ echo "DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION" export DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION echo "export DOCKER_CIMPROV_VERSION=$DOCKER_CIMPROV_VERSION" >> ~/.bashrc echo "*** activating oneagent in legacy auth mode ***" -CIWORKSPACE_id="$(cat /etc/omsagent-secret/WSID)" +CIWORKSPACE_id="$(cat /etc/omsagent-secret/WSID)" #use the file path as its secure than env -CIWORKSPACE_keyFile="/etc/omsagent-secret/KEY" +CIWORKSPACE_keyFile="/etc/omsagent-secret/KEY" cat /etc/mdsd.d/envmdsd | while read line; do echo $line >> ~/.bashrc done source /etc/mdsd.d/envmdsd echo "setting mdsd workspaceid & key for workspace:$CIWORKSPACE_id" export CIWORKSPACE_id=$CIWORKSPACE_id -echo "export CIWORKSPACE_id=$CIWORKSPACE_id" >> ~/.bashrc +echo "export CIWORKSPACE_id=$CIWORKSPACE_id" >> ~/.bashrc export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile echo "export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile" >> ~/.bashrc export OMS_TLD=$domain -echo "export OMS_TLD=$OMS_TLD" >> ~/.bashrc +echo "export OMS_TLD=$OMS_TLD" >> ~/.bashrc export MDSD_FLUENT_SOCKET_PORT="29230" echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc -#skip imds lookup since not used in legacy auth path +#skip imds lookup since not used in legacy auth path export SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH="true" echo "export 
SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH=$SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH" >> ~/.bashrc @@ -456,8 +541,8 @@ source ~/.bashrc dpkg -l | grep mdsd | awk '{print $2 " " $3}' -if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then - echo "starting mdsd with mdsd-port=26130, fluentport=26230 and influxport=26330 in legacy auth mode in sidecar container..." +if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then + echo "starting mdsd with mdsd-port=26130, fluentport=26230 and influxport=26330 in legacy auth mode in sidecar container..." #use tenant name to avoid unix socket conflict and different ports for port conflict #roleprefix to use container specific mdsd socket export TENANT_NAME="${CONTAINER_TYPE}" @@ -468,22 +553,22 @@ if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then mkdir /var/run/mdsd-${CONTAINER_TYPE} # add -T 0xFFFF for full traces mdsd -r ${MDSD_ROLE_PREFIX} -p 26130 -f 26230 -i 26330 -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos & -else +else echo "starting mdsd in legacy auth mode in main container..." # add -T 0xFFFF for full traces - mdsd -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos & + mdsd -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos & fi -# no dependency on fluentd for prometheus side car container -if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then +# no dependency on fluentd for prometheus side car container +if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then if [ ! 
-e "/etc/config/kube.conf" ]; then echo "*** starting fluentd v1 in daemonset" fluentd -c /etc/fluent/container.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 & else echo "*** starting fluentd v1 in replicaset" fluentd -c /etc/fluent/kube.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 & - fi -fi + fi +fi #If config parsing was successful, a copy of the conf file with replaced custom settings file is created if [ ! -e "/etc/config/kube.conf" ]; then @@ -515,6 +600,121 @@ else fi fi +#skip imds lookup since not used either legacy or aad msi auth path +export SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH="true" +echo "export SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH=$SKIP_IMDS_LOOKUP_FOR_LEGACY_AUTH" >> ~/.bashrc +# this used by mdsd to determine cloud specific LA endpoints +export OMS_TLD=$domain +echo "export OMS_TLD=$OMS_TLD" >> ~/.bashrc +cat /etc/mdsd.d/envmdsd | while read line; do + echo $line >> ~/.bashrc +done +source /etc/mdsd.d/envmdsd +MDSD_AAD_MSI_AUTH_ARGS="" +# check if its AAD Auth MSI mode via USING_AAD_MSI_AUTH +export AAD_MSI_AUTH_MODE=false +if [ "${USING_AAD_MSI_AUTH}" == "true" ]; then + echo "*** activating oneagent in aad auth msi mode ***" + # msi auth specific args + MDSD_AAD_MSI_AUTH_ARGS="-a -A" + export AAD_MSI_AUTH_MODE=true + echo "export AAD_MSI_AUTH_MODE=true" >> ~/.bashrc + # this used by mdsd to determine the cloud specific AMCS endpoints + export customEnvironment=$CLOUD_ENVIRONMENT + echo "export customEnvironment=$customEnvironment" >> ~/.bashrc + export MDSD_FLUENT_SOCKET_PORT="28230" + echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc + export ENABLE_MCS="true" + echo "export ENABLE_MCS=$ENABLE_MCS" >> ~/.bashrc + export MONITORING_USE_GENEVA_CONFIG_SERVICE="false" + echo "export MONITORING_USE_GENEVA_CONFIG_SERVICE=$MONITORING_USE_GENEVA_CONFIG_SERVICE" >> ~/.bashrc + export MDSD_USE_LOCAL_PERSISTENCY="false" + 
echo "export MDSD_USE_LOCAL_PERSISTENCY=$MDSD_USE_LOCAL_PERSISTENCY" >> ~/.bashrc +else + echo "*** activating oneagent in legacy auth mode ***" + CIWORKSPACE_id="$(cat /etc/omsagent-secret/WSID)" + #use the file path as its secure than env + CIWORKSPACE_keyFile="/etc/omsagent-secret/KEY" + echo "setting mdsd workspaceid & key for workspace:$CIWORKSPACE_id" + export CIWORKSPACE_id=$CIWORKSPACE_id + echo "export CIWORKSPACE_id=$CIWORKSPACE_id" >> ~/.bashrc + export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile + echo "export CIWORKSPACE_keyFile=$CIWORKSPACE_keyFile" >> ~/.bashrc + export MDSD_FLUENT_SOCKET_PORT="29230" + echo "export MDSD_FLUENT_SOCKET_PORT=$MDSD_FLUENT_SOCKET_PORT" >> ~/.bashrc +fi +source ~/.bashrc + +dpkg -l | grep mdsd | awk '{print $2 " " $3}' + +if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then + echo "starting mdsd with mdsd-port=26130, fluentport=26230 and influxport=26330 in sidecar container..." + #use tenant name to avoid unix socket conflict and different ports for port conflict + #roleprefix to use container specific mdsd socket + export TENANT_NAME="${CONTAINER_TYPE}" + echo "export TENANT_NAME=$TENANT_NAME" >> ~/.bashrc + export MDSD_ROLE_PREFIX=/var/run/mdsd-${CONTAINER_TYPE}/default + echo "export MDSD_ROLE_PREFIX=$MDSD_ROLE_PREFIX" >> ~/.bashrc + source ~/.bashrc + mkdir /var/run/mdsd-${CONTAINER_TYPE} + # add -T 0xFFFF for full traces + mdsd ${MDSD_AAD_MSI_AUTH_ARGS} -r ${MDSD_ROLE_PREFIX} -p 26130 -f 26230 -i 26330 -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos & +else + echo "starting mdsd mode in main container..." + # add -T 0xFFFF for full traces + mdsd ${MDSD_AAD_MSI_AUTH_ARGS} -e ${MDSD_LOG}/mdsd.err -w ${MDSD_LOG}/mdsd.warn -o ${MDSD_LOG}/mdsd.info -q ${MDSD_LOG}/mdsd.qos 2>> /dev/null & +fi + +# Set up a cron job for logrotation +if [ ! 
-f /etc/cron.d/ci-agent ]; then + echo "setting up cronjob for ci agent log rotation" + echo "*/5 * * * * root /usr/sbin/logrotate -s /var/lib/logrotate/ci-agent-status /etc/logrotate.d/ci-agent >/dev/null 2>&1" > /etc/cron.d/ci-agent +fi + +# no dependency on fluentd for prometheus side car container +if [ "${CONTAINER_TYPE}" != "PrometheusSidecar" ]; then + if [ ! -e "/etc/config/kube.conf" ]; then + echo "*** starting fluentd v1 in daemonset" + fluentd -c /etc/fluent/container.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 & + else + echo "*** starting fluentd v1 in replicaset" + fluentd -c /etc/fluent/kube.conf -o /var/opt/microsoft/docker-cimprov/log/fluentd.log --log-rotate-age 5 --log-rotate-size 20971520 & + fi +fi + +#If config parsing was successful, a copy of the conf file with replaced custom settings file is created +if [ ! -e "/etc/config/kube.conf" ]; then + if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ] && [ -e "/opt/telegraf-test-prom-side-car.conf" ]; then + echo "****************Start Telegraf in Test Mode**************************" + /opt/telegraf --config /opt/telegraf-test-prom-side-car.conf --input-filter file -test + if [ $? -eq 0 ]; then + mv "/opt/telegraf-test-prom-side-car.conf" "/etc/opt/microsoft/docker-cimprov/telegraf-prom-side-car.conf" + echo "Moving test conf file to telegraf side-car conf since test run succeeded" + fi + echo "****************End Telegraf Run in Test Mode**************************" + else + if [ -e "/opt/telegraf-test.conf" ]; then + echo "****************Start Telegraf in Test Mode**************************" + /opt/telegraf --config /opt/telegraf-test.conf --input-filter file -test + if [ $? 
-eq 0 ]; then + mv "/opt/telegraf-test.conf" "/etc/opt/microsoft/docker-cimprov/telegraf.conf" + echo "Moving test conf file to telegraf daemonset conf since test run succeeded" + fi + echo "****************End Telegraf Run in Test Mode**************************" + fi + fi +else + if [ -e "/opt/telegraf-test-rs.conf" ]; then + echo "****************Start Telegraf in Test Mode**************************" + /opt/telegraf --config /opt/telegraf-test-rs.conf --input-filter file -test + if [ $? -eq 0 ]; then + mv "/opt/telegraf-test-rs.conf" "/etc/opt/microsoft/docker-cimprov/telegraf-rs.conf" + echo "Moving test conf file to telegraf replicaset conf since test run succeeded" + fi + echo "****************End Telegraf Run in Test Mode**************************" + fi +fi + #telegraf & fluentbit requirements if [ ! -e "/etc/config/kube.conf" ]; then if [ "${CONTAINER_TYPE}" == "PrometheusSidecar" ]; then @@ -616,8 +816,10 @@ service rsyslog stop echo "getting rsyslog status..." service rsyslog status +checkAgentOnboardingStatus $AAD_MSI_AUTH_MODE 30 + shutdown() { - pkill -f mdsd + pkill -f mdsd } trap "shutdown" SIGTERM diff --git a/kubernetes/linux/setup.sh b/kubernetes/linux/setup.sh index 8bb722377..371d26fa5 100644 --- a/kubernetes/linux/setup.sh +++ b/kubernetes/linux/setup.sh @@ -9,13 +9,16 @@ sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \ dpkg-reconfigure --frontend=noninteractive locales && \ update-locale LANG=en_US.UTF-8 -#install oneagent - Official bits (08/04/2021) -wget https://github.com/microsoft/Docker-Provider/releases/download/08042021-oneagent/azure-mdsd_1.10.1-build.master.251_x86_64.deb +#install oneagent - Official bits (10/7/2021) +wget https://github.com/microsoft/Docker-Provider/releases/download/1.14/azure-mdsd_1.14.0-build.master.279_x86_64.deb /usr/bin/dpkg -i $TMPDIR/azure-mdsd*.deb cp -f $TMPDIR/mdsd.xml /etc/mdsd.d cp -f $TMPDIR/envmdsd /etc/mdsd.d +#log rotate conf for mdsd and can be extended for other log files 
as well +cp -f $TMPDIR/logrotate.conf /etc/logrotate.d/ci-agent + #download inotify tools for watching configmap changes sudo apt-get update sudo apt-get install inotify-tools -y diff --git a/kubernetes/omsagent.yaml b/kubernetes/omsagent.yaml index 88d6ad608..97e32c0e1 100644 --- a/kubernetes/omsagent.yaml +++ b/kubernetes/omsagent.yaml @@ -368,16 +368,22 @@ spec: value: "3" containers: - name: omsagent - image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1" + image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10082021" imagePullPolicy: IfNotPresent resources: limits: cpu: 500m - memory: 600Mi + memory: 750Mi requests: cpu: 75m - memory: 225Mi + memory: 325Mi env: + - name: FBIT_SERVICE_FLUSH_INTERVAL + value: "15" + - name: FBIT_TAIL_BUFFER_CHUNK_SIZE + value: "1" + - name: FBIT_TAIL_BUFFER_MAX_SIZE + value: "1" # azure devops pipeline uses AKS_RESOURCE_ID and AKS_REGION hence ensure to uncomment these - name: AKS_RESOURCE_ID value: "VALUE_AKS_RESOURCE_ID_VALUE" @@ -400,6 +406,8 @@ spec: value: "" - name: AZMON_CONTAINERLOGS_ONEAGENT_REGIONS value: "koreacentral,norwayeast,eastus2" + - name: USING_AAD_MSI_AUTH + value: "false" securityContext: privileged: true ports: @@ -445,59 +453,65 @@ spec: periodSeconds: 60 timeoutSeconds: 15 #Only in sidecar scraping mode - - name: omsagent-prometheus - image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1" - imagePullPolicy: IfNotPresent - resources: - limits: - cpu: 500m - memory: 1Gi - requests: - cpu: 75m - memory: 225Mi - env: - # azure devops pipeline uses AKS_RESOURCE_ID and AKS_REGION hence ensure to uncomment these - - name: AKS_RESOURCE_ID - value: "VALUE_AKS_RESOURCE_ID_VALUE" - - name: AKS_REGION - value: "VALUE_AKS_RESOURCE_REGION_VALUE" - #Uncomment below two lines for ACS clusters and set the cluster names manually. 
Also comment out the above two lines for ACS clusters - #- name: ACS_RESOURCE_NAME - # value: "my_acs_cluster_name" - - name: CONTAINER_TYPE - value: "PrometheusSidecar" - - name: CONTROLLER_TYPE - value: "DaemonSet" - - name: NODE_IP - valueFrom: - fieldRef: - fieldPath: status.hostIP - # Update this with the user assigned msi client id for omsagent - - name: USER_ASSIGNED_IDENTITY_CLIENT_ID - value: "" - securityContext: - privileged: true - volumeMounts: - - mountPath: /etc/kubernetes/host - name: azure-json-path - - mountPath: /etc/omsagent-secret - name: omsagent-secret - readOnly: true - - mountPath: /etc/config/settings - name: settings-vol-config - readOnly: true - - mountPath: /etc/config/osm-settings - name: osm-settings-vol-config - readOnly: true - livenessProbe: - exec: - command: - - /bin/bash - - -c - - /opt/livenessprobe.sh - initialDelaySeconds: 60 - periodSeconds: 60 - timeoutSeconds: 15 + # - name: omsagent-prometheus + # image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021" + # imagePullPolicy: IfNotPresent + # resources: + # limits: + # cpu: 500m + # memory: 1Gi + # requests: + # cpu: 75m + # memory: 225Mi + # env: + # # azure devops pipeline uses AKS_RESOURCE_ID and AKS_REGION hence ensure to uncomment these + # - name: AKS_CLUSTER_NAME + # value: "VALUE_AKS_CLUSTER_NAME" + # - name: AKS_RESOURCE_ID + # value: "VALUE_AKS_RESOURCE_ID_VALUE" + # - name: AKS_REGION + # value: "VALUE_AKS_RESOURCE_REGION_VALUE" + # - name: AKS_NODE_RESOURCE_GROUP + # value: "VALUE_AKS_NODE_RESOURCE_GROUP" + # #Uncomment below two lines for ACS clusters and set the cluster names manually. 
Also comment out the above two lines for ACS clusters + # #- name: ACS_RESOURCE_NAME + # # value: "my_acs_cluster_name" + # - name: CONTAINER_TYPE + # value: "PrometheusSidecar" + # - name: CONTROLLER_TYPE + # value: "DaemonSet" + # - name: NODE_IP + # valueFrom: + # fieldRef: + # fieldPath: status.hostIP + # # Update this with the user assigned msi client id for omsagent + # - name: USER_ASSIGNED_IDENTITY_CLIENT_ID + # value: "" + # - name: USING_AAD_MSI_AUTH + # value: "false" + # securityContext: + # privileged: true + # volumeMounts: + # - mountPath: /etc/kubernetes/host + # name: azure-json-path + # - mountPath: /etc/omsagent-secret + # name: omsagent-secret + # readOnly: true + # - mountPath: /etc/config/settings + # name: settings-vol-config + # readOnly: true + # - mountPath: /etc/config/osm-settings + # name: osm-settings-vol-config + # readOnly: true + # livenessProbe: + # exec: + # command: + # - /bin/bash + # - -c + # - /opt/livenessprobe.sh + # initialDelaySeconds: 60 + # periodSeconds: 60 + # timeoutSeconds: 15 affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: @@ -589,7 +603,7 @@ spec: serviceAccountName: omsagent containers: - name: omsagent - image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1" + image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10082021" imagePullPolicy: IfNotPresent resources: limits: @@ -620,7 +634,9 @@ spec: value: "" # Add the below environment variable to true only in sidecar enabled regions, else set it to false - name: SIDECAR_SCRAPING_ENABLED - value: "true" + value: "false" + - name: USING_AAD_MSI_AUTH + value: "false" securityContext: privileged: true ports: @@ -760,7 +776,7 @@ spec: value: "3" containers: - name: omsagent-win - image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021-2" + image: "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10082021" imagePullPolicy: IfNotPresent resources: limits: @@ 
-789,13 +805,13 @@ spec: fieldRef: fieldPath: status.hostIP - name: SIDECAR_SCRAPING_ENABLED - value: "true" + value: "false" # Update this with the user assigned msi client id for omsagent - name: USER_ASSIGNED_IDENTITY_CLIENT_ID value: "" # Add this only for clouds that require cert bootstrapping - - name: REQUIRES_CERT_BOOTSTRAP - value: "true" + # - name: REQUIRES_CERT_BOOTSTRAP + # value: "true" volumeMounts: - mountPath: C:\ProgramData\docker\containers name: docker-windows-containers @@ -823,7 +839,11 @@ spec: command: - cmd - /c - - C:\opt\omsagentwindows\scripts\cmd\livenessProbe.cmd + - C:\opt\omsagentwindows\scripts\cmd\livenessprobe.exe + - fluent-bit.exe + - fluentdwinaks + - "C:\\etc\\omsagentwindows\\filesystemwatcher.txt" + - "C:\\etc\\omsagentwindows\\renewcertificate.txt" periodSeconds: 60 initialDelaySeconds: 180 timeoutSeconds: 15 diff --git a/kubernetes/windows/Dockerfile b/kubernetes/windows/Dockerfile index 0ba64cd75..76667f389 100644 --- a/kubernetes/windows/Dockerfile +++ b/kubernetes/windows/Dockerfile @@ -3,7 +3,7 @@ MAINTAINER OMSContainers@microsoft.com LABEL vendor=Microsoft\ Corp \ com.microsoft.product="Azure Monitor for containers" -ARG IMAGE_TAG=win-ciprod06112021-2 +ARG IMAGE_TAG=win-ciprod10082021 # Do not split this into multiple RUN! 
# Docker creates a layer for every RUN-Statement @@ -46,7 +46,7 @@ RUN ./setup.ps1 COPY main.ps1 /opt/omsagentwindows/scripts/powershell COPY ./omsagentwindows/installer/scripts/filesystemwatcher.ps1 /opt/omsagentwindows/scripts/powershell -COPY ./omsagentwindows/installer/scripts/livenessprobe.cmd /opt/omsagentwindows/scripts/cmd/ +COPY ./omsagentwindows/installer/livenessprobe/livenessprobe.exe /opt/omsagentwindows/scripts/cmd/ COPY setdefaulttelegrafenvvariables.ps1 /opt/omsagentwindows/scripts/powershell # copy ruby scripts to /opt folder @@ -71,7 +71,6 @@ COPY ./omsagentwindows/installer/scripts/rubyKeepCertificateAlive/*.rb /etc/flue #Copy fluentd ruby plugins COPY ./omsagentwindows/ruby/ /etc/fluent/plugin/ -COPY ./omsagentwindows/utils/*.rb /etc/fluent/plugin/ ENV AGENT_VERSION ${IMAGE_TAG} ENV OS_TYPE "windows" diff --git a/kubernetes/windows/Dockerfile-dev-base-image b/kubernetes/windows/Dockerfile-dev-base-image new file mode 100644 index 000000000..0081f9c53 --- /dev/null +++ b/kubernetes/windows/Dockerfile-dev-base-image @@ -0,0 +1,43 @@ +FROM mcr.microsoft.com/windows/servercore:ltsc2019 +MAINTAINER OMSContainers@microsoft.com +LABEL vendor=Microsoft\ Corp \ + com.microsoft.product="Azure Monitor for containers" + +# Do not split this into multiple RUN! 
+# Docker creates a layer for every RUN-Statement +RUN powershell -Command "Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" +# Fluentd depends on cool.io whose fat gem is only available for Ruby < 2.5, so need to specify --platform ruby when install Ruby > 2.5 and install msys2 to get dev tools +RUN choco install -y ruby --version 2.6.5.1 --params "'/InstallDir:C:\ruby26'" \ +&& choco install -y msys2 --version 20210604.0.0 --params "'/NoPath /NoUpdate /InstallDir:C:\ruby26\msys64'" \ +&& choco install -y vim + +# gangams - optional MSYS2 update via ridk failing in merged docker file so skipping that since we dont need optional update +RUN refreshenv \ +&& ridk install 3 \ +&& echo gem: --no-document >> C:\ProgramData\gemrc \ +&& gem install cool.io -v 1.5.4 --platform ruby \ +&& gem install oj -v 3.3.10 \ +&& gem install json -v 2.2.0 \ +&& gem install fluentd -v 1.12.2 \ +&& gem install win32-service -v 1.0.1 \ +&& gem install win32-ipc -v 0.7.0 \ +&& gem install win32-event -v 0.6.3 \ +&& gem install windows-pr -v 1.2.6 \ +&& gem install tomlrb -v 1.3.0 \ +&& gem install gyoku -v 1.3.1 \ +&& gem sources --clear-all + +# Remove gem cache and chocolatey +RUN powershell -Command "Remove-Item -Force C:\ruby26\lib\ruby\gems\2.6.0\cache\*.gem; Remove-Item -Recurse -Force 'C:\ProgramData\chocolatey'" + +SHELL ["powershell"] + +ENV tmpdir /opt/omsagentwindows/scripts/powershell + +WORKDIR /opt/omsagentwindows/scripts/powershell + +# copy certificate generator binaries zip +COPY ./omsagentwindows/*.zip /opt/omsagentwindows/ + +COPY setup.ps1 /opt/omsagentwindows/scripts/powershell +RUN ./setup.ps1 \ No newline at end of file diff --git a/kubernetes/windows/Dockerfile-dev-image b/kubernetes/windows/Dockerfile-dev-image new file mode 100644 index 000000000..35aa83bd9 --- /dev/null +++ b/kubernetes/windows/Dockerfile-dev-image @@ -0,0 +1,44 @@ +FROM omsagent-win-base +MAINTAINER 
OMSContainers@microsoft.com +LABEL vendor=Microsoft\ Corp \ + com.microsoft.product="Azure Monitor for containers" + +#Uncomment below to test setup.ps1 changes +#COPY setup.ps1 /opt/omsagentwindows/scripts/powershell +#RUN ./setup.ps1 +COPY main.ps1 /opt/omsagentwindows/scripts/powershell +COPY ./omsagentwindows/installer/scripts/filesystemwatcher.ps1 /opt/omsagentwindows/scripts/powershell +COPY ./omsagentwindows/installer/scripts/livenessprobe.cmd /opt/omsagentwindows/scripts/cmd/ +COPY setdefaulttelegrafenvvariables.ps1 /opt/omsagentwindows/scripts/powershell + +# copy ruby scripts to /opt folder +COPY ./omsagentwindows/installer/scripts/*.rb /opt/omsagentwindows/scripts/ruby/ + +# copy out_oms.so file +COPY ./omsagentwindows/out_oms.so /opt/omsagentwindows/out_oms.so + +# copy fluent, fluent-bit and out_oms conf files +COPY ./omsagentwindows/installer/conf/fluent.conf /etc/fluent/ +# copy fluent docker and cri parser conf files +COPY ./omsagentwindows/installer/conf/fluent-cri-parser.conf /etc/fluent/ +COPY ./omsagentwindows/installer/conf/fluent-docker-parser.conf /etc/fluent/ +COPY ./omsagentwindows/installer/conf/fluent-bit.conf /etc/fluent-bit +COPY ./omsagentwindows/installer/conf/out_oms.conf /etc/omsagentwindows + +# copy telegraf conf file +COPY ./omsagentwindows/installer/conf/telegraf.conf /etc/telegraf/ + +# copy keepcert alive ruby scripts +COPY ./omsagentwindows/installer/scripts/rubyKeepCertificateAlive/*.rb /etc/fluent/plugin/ + +#Copy fluentd ruby plugins +COPY ./omsagentwindows/ruby/ /etc/fluent/plugin/ + +ENV AGENT_VERSION ${IMAGE_TAG} +ENV OS_TYPE "windows" +ENV APPLICATIONINSIGHTS_AUTH "NzAwZGM5OGYtYTdhZC00NThkLWI5NWMtMjA3ZjM3NmM3YmRi" +ENV AZMON_COLLECT_ENV False +ENV CI_CERT_LOCATION "C://oms.crt" +ENV CI_KEY_LOCATION "C://oms.key" + +ENTRYPOINT ["powershell", "C:\\opt\\omsagentwindows\\scripts\\powershell\\main.ps1"] diff --git a/kubernetes/windows/dockerbuild/build-and-publish-dev-docker-image.ps1 
b/kubernetes/windows/dockerbuild/build-and-publish-dev-docker-image.ps1
new file mode 100644
index 000000000..0fde7f379
--- /dev/null
+++ b/kubernetes/windows/dockerbuild/build-and-publish-dev-docker-image.ps1
@@ -0,0 +1,64 @@
+<#
+    .DESCRIPTION
+    Builds the Windows Agent code and Docker Image and pushes the docker image to the specified repo
+
+    .PARAMETER image
+    docker image. Format should be <repo>/<imagename>:<imagetag>
+#>
+param(
+    [Parameter(mandatory = $true)]
+    [string]$image
+)
+
+$currentdir = $PSScriptRoot
+Write-Host("current script dir : " + $currentdir + " ")
+
+if ($false -eq (Test-Path -Path $currentdir)) {
+    Write-Host("Invalid current dir : " + $currentdir + " ") -ForegroundColor Red
+    exit
+}
+
+if ([string]::IsNullOrEmpty($image)) {
+    Write-Host "Image parameter shouldn't be null or empty" -ForegroundColor Red
+    exit
+}
+
+$imageparts = $image.split(":")
+if (($imageparts.Length -ne 2)){
+    Write-Host "Image not in valid format. Expected format should be <repo>/<imagename>:<imagetag>" -ForegroundColor Red
+    exit
+}
+
+$imagetag = $imageparts[1].ToLower()
+$imagerepo = $imageparts[0]
+
+if ($imagetag.StartsWith("win-") -eq $false)
+{
+    Write-Host "adding win- prefix to the image tag since it was not provided"
+    $imagetag = "win-$imagetag"
+}
+
+Write-Host "image tag used is :$imagetag"
+
+Write-Host "start:Building the cert generator and out oms code via Makefile.ps1"
+..\..\..\build\windows\Makefile.ps1
+Write-Host "end:Building the cert generator and out oms code via Makefile.ps1"
+
+$dockerFileDir = Split-Path -Path $currentdir
+Write-Host("builddir dir : " + $dockerFileDir + " ")
+if ($false -eq (Test-Path -Path $dockerFileDir)) {
+    Write-Host("Invalid dockerFile Dir : " + $dockerFileDir + " ") -ForegroundColor Red
+    exit
+}
+
+Write-Host "changing directory to DockerFile dir: $dockerFileDir"
+Set-Location -Path $dockerFileDir
+
+$updateImage = ${imagerepo} + ":" + ${imageTag}
+Write-Host "START:Triggering docker image build: $updateImage"
+docker build -t $updateImage --build-arg IMAGE_TAG=$imageTag -f
Dockerfile-dev-image .
+Write-Host "END:Triggering docker image build: $updateImage"
+
+Write-Host "START:pushing docker image : $updateImage"
+docker push $updateImage
+Write-Host "END:pushing docker image : $updateImage"
diff --git a/kubernetes/windows/dockerbuild/build-dev-base-image.ps1 b/kubernetes/windows/dockerbuild/build-dev-base-image.ps1
new file mode 100644
index 000000000..142e20c3f
--- /dev/null
+++ b/kubernetes/windows/dockerbuild/build-dev-base-image.ps1
@@ -0,0 +1,32 @@
+<#
+    .DESCRIPTION
+    Builds the Docker Image locally for the server core ltsc base and installs dependencies
+
+#>
+
+$currentdir = $PSScriptRoot
+Write-Host("current script dir : " + $currentdir + " ")
+
+if ($false -eq (Test-Path -Path $currentdir)) {
+    Write-Host("Invalid current dir : " + $currentdir + " ") -ForegroundColor Red
+    exit
+}
+
+Write-Host "start:Building the cert generator and out oms code via Makefile.ps1"
+..\..\..\build\windows\Makefile.ps1
+Write-Host "end:Building the cert generator and out oms code via Makefile.ps1"
+
+$dockerFileDir = Split-Path -Path $currentdir
+Write-Host("builddir dir : " + $dockerFileDir + " ")
+if ($false -eq (Test-Path -Path $dockerFileDir)) {
+    Write-Host("Invalid dockerFile Dir : " + $dockerFileDir + " ") -ForegroundColor Red
+    exit
+}
+
+Write-Host "changing directory to DockerFile dir: $dockerFileDir"
+Set-Location -Path $dockerFileDir
+
+$updateImage = "omsagent-win-base"
+Write-Host "START:Triggering base docker image build: $updateImage"
+docker build -t $updateImage -f Dockerfile-dev-base-image .
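For reference, the image-argument handling in build-and-publish-dev-docker-image.ps1 above — split on `:`, reject anything that is not exactly a repo:tag pair, lowercase the tag, and prepend `win-` when the caller omits it — can be sketched as a small POSIX-shell helper. This is a hypothetical port for illustration only; the repository itself ships the PowerShell version:

```shell
# Hypothetical POSIX-shell port of the tag normalization done by the
# PowerShell build script; illustrative only.
normalize_win_tag() {
  image="$1"
  repo="${image%%:*}"   # text before the first ':'
  tag="${image#*:}"     # text after the first ':'
  # reject inputs that are not exactly one repo part and one tag part
  if [ "$repo" = "$image" ] || [ -z "$tag" ]; then
    return 1
  fi
  case "$tag" in *:*) return 1 ;; esac   # more than one ':'
  tag=$(printf '%s' "$tag" | tr '[:upper:]' '[:lower:]')
  case "$tag" in
    win-*) ;;              # already prefixed
    *) tag="win-$tag" ;;   # add the win- prefix, as the script does
  esac
  printf '%s:%s\n' "$repo" "$tag"
}
```

The `win-` prefix convention distinguishes Windows agent images from Linux ones pushed to the same repo.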
+Write-Host "END:Triggering docker image build: $updateImage" \ No newline at end of file diff --git a/kubernetes/windows/main.ps1 b/kubernetes/windows/main.ps1 index 1bb9a3468..7f41c860f 100644 --- a/kubernetes/windows/main.ps1 +++ b/kubernetes/windows/main.ps1 @@ -43,17 +43,49 @@ function Start-FileSystemWatcher { function Set-EnvironmentVariables { $domain = "opinsights.azure.com" - $cloud_environment = "public" + $mcs_endpoint = "monitor.azure.com" + $cloud_environment = "azurepubliccloud" if (Test-Path /etc/omsagent-secret/DOMAIN) { # TODO: Change to omsagent-secret before merging $domain = Get-Content /etc/omsagent-secret/DOMAIN - $cloud_environment = "national" + if (![string]::IsNullOrEmpty($domain)) { + if ($domain -eq "opinsights.azure.com") { + $cloud_environment = "azurepubliccloud" + $mcs_endpoint = "monitor.azure.com" + } elseif ($domain -eq "opinsights.azure.cn") { + $cloud_environment = "azurechinacloud" + $mcs_endpoint = "monitor.azure.cn" + } elseif ($domain -eq "opinsights.azure.us") { + $cloud_environment = "azureusgovernmentcloud" + $mcs_endpoint = "monitor.azure.us" + } elseif ($domain -eq "opinsights.azure.eaglex.ic.gov") { + $cloud_environment = "usnat" + $mcs_endpoint = "monitor.azure.eaglex.ic.gov" + } elseif ($domain -eq "opinsights.azure.microsoft.scloud") { + $cloud_environment = "ussec" + $mcs_endpoint = "monitor.azure.microsoft.scloud" + } else { + Write-Host "Invalid or Unsupported domain name $($domain). EXITING....." + exit 1 + } + } else { + Write-Host "Domain name either null or empty. EXITING....." 
+ exit 1 + } } + Write-Host "Log analytics domain: $($domain)" + Write-Host "MCS endpoint: $($mcs_endpoint)" + Write-Host "Cloud Environment: $($cloud_environment)" + # Set DOMAIN [System.Environment]::SetEnvironmentVariable("DOMAIN", $domain, "Process") [System.Environment]::SetEnvironmentVariable("DOMAIN", $domain, "Machine") + # Set MCS Endpoint + [System.Environment]::SetEnvironmentVariable("MCS_ENDPOINT", $mcs_endpoint, "Process") + [System.Environment]::SetEnvironmentVariable("MCS_ENDPOINT", $mcs_endpoint, "Machine") + # Set CLOUD_ENVIRONMENT [System.Environment]::SetEnvironmentVariable("CLOUD_ENVIRONMENT", $cloud_environment, "Process") [System.Environment]::SetEnvironmentVariable("CLOUD_ENVIRONMENT", $cloud_environment, "Machine") @@ -158,7 +190,6 @@ function Set-EnvironmentVariables { Write-Host $_.Exception } } - # Check if the fetched IKey was properly encoded. if not then turn off telemetry if ($aiKeyFetched -match '^[A-Za-z0-9=]+$') { Write-Host "Using cloud-specific instrumentation key" @@ -229,6 +260,21 @@ function Set-EnvironmentVariables { Write-Host "Failed to set environment variable HOSTNAME for target 'machine' since it is either null or empty" } + # check if its AAD Auth MSI mode via USING_AAD_MSI_AUTH environment variable + $isAADMSIAuth = [System.Environment]::GetEnvironmentVariable("USING_AAD_MSI_AUTH", "process") + if (![string]::IsNullOrEmpty($isAADMSIAuth)) { + [System.Environment]::SetEnvironmentVariable("AAD_MSI_AUTH_MODE", $isAADMSIAuth, "Process") + [System.Environment]::SetEnvironmentVariable("AAD_MSI_AUTH_MODE", $isAADMSIAuth, "Machine") + Write-Host "Successfully set environment variable AAD_MSI_AUTH_MODE - $($isAADMSIAuth) for target 'machine'..." 
+ } + + # check if use token proxy endpoint set via USE_IMDS_TOKEN_PROXY_END_POINT environment variable + $useIMDSTokenProxyEndpoint = [System.Environment]::GetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", "process") + if (![string]::IsNullOrEmpty($useIMDSTokenProxyEndpoint)) { + [System.Environment]::SetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", $useIMDSTokenProxyEndpoint, "Process") + [System.Environment]::SetEnvironmentVariable("USE_IMDS_TOKEN_PROXY_END_POINT", $useIMDSTokenProxyEndpoint, "Machine") + Write-Host "Successfully set environment variable USE_IMDS_TOKEN_PROXY_END_POINT - $($useIMDSTokenProxyEndpoint) for target 'machine'..." + } $nodeIp = [System.Environment]::GetEnvironmentVariable("NODE_IP", "process") if (![string]::IsNullOrEmpty($nodeIp)) { [System.Environment]::SetEnvironmentVariable("NODE_IP", $nodeIp, "machine") @@ -427,7 +473,15 @@ function Start-Telegraf { else { Write-Host "Failed to set environment variable KUBERNETES_SERVICE_PORT for target 'machine' since it is either null or empty" } - + $nodeIp = [System.Environment]::GetEnvironmentVariable("NODE_IP", "process") + if (![string]::IsNullOrEmpty($nodeIp)) { + [System.Environment]::SetEnvironmentVariable("NODE_IP", $nodeIp, "machine") + Write-Host "Successfully set environment variable NODE_IP - $($nodeIp) for target 'machine'..." 
+ } + else { + Write-Host "Failed to set environment variable NODE_IP for target 'machine' since it is either null or empty" + } + Write-Host "Installing telegraf service" C:\opt\telegraf\telegraf.exe --service install --config "C:\etc\telegraf\telegraf.conf" @@ -524,8 +578,13 @@ if (![string]::IsNullOrEmpty($requiresCertBootstrap) -and ` Bootstrap-CACertificates } -Generate-Certificates -Test-CertificatePath +$isAADMSIAuth = [System.Environment]::GetEnvironmentVariable("USING_AAD_MSI_AUTH") +if (![string]::IsNullOrEmpty($isAADMSIAuth) -and $isAADMSIAuth.ToLower() -eq 'true') { + Write-Host "skipping agent onboarding via cert since AAD MSI Auth configured" +} else { + Generate-Certificates + Test-CertificatePath +} Start-Fluent-Telegraf # List all powershell processes running. This should have main.ps1 and filesystemwatcher.ps1 diff --git a/scripts/build/windows/install-build-pre-requisites.ps1 b/scripts/build/windows/install-build-pre-requisites.ps1 index 3bb56ac2a..7f1c9b54f 100755 --- a/scripts/build/windows/install-build-pre-requisites.ps1 +++ b/scripts/build/windows/install-build-pre-requisites.ps1 @@ -13,8 +13,8 @@ function Install-Go { exit } - $url = "https://dl.google.com/go/go1.14.1.windows-amd64.msi" - $output = Join-Path -Path $tempGo -ChildPath "go1.14.1.windows-amd64.msi" + $url = "https://dl.google.com/go/go1.15.14.windows-amd64.msi" + $output = Join-Path -Path $tempGo -ChildPath "go1.15.14.windows-amd64.msi" Write-Host("downloading go msi into directory path : " + $output + " ...") Invoke-WebRequest -Uri $url -OutFile $output -ErrorAction Stop Write-Host("downloading of go msi into directory path : " + $output + " completed") diff --git a/scripts/dcr-onboarding/ci-extension-dcr-streams.md b/scripts/dcr-onboarding/ci-extension-dcr-streams.md new file mode 100644 index 000000000..cbac41838 --- /dev/null +++ b/scripts/dcr-onboarding/ci-extension-dcr-streams.md @@ -0,0 +1,186 @@ +# 1 - ContainerLogV2 +> Note- Please note, this table uses NG schema +``` 
+stream-id: Microsoft-ContainerLogV2 +data-type: CONTAINERINSIGHTS_CONTAINERLOGV2 +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: ContainerLogV2 +alias-stream-id: Microsoft-ContainerLogV2 +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 2 - InsightsMetrics +``` +stream-id: Microsoft-InsightsMetrics +data-type: INSIGHTS_METRICS_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: InsightsMetrics +alias-stream-id: Microsoft-InsightsMetrics +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 3 - ContainerInventory + +``` +stream-id: Microsoft-ContainerInventory +data-type: CONTAINER_INVENTORY_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: ContainerInventory +alias-stream-id: Microsoft-ContainerInventory +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 4 - ContainerLog + +``` +stream-id: Microsoft-ContainerLog +data-type: CONTAINER_LOG_BLOB +intelligence-pack: Containers +solutions: ContainerInsights +platform: Any +la-table-name: ContainerLog +alias-stream-id: Microsoft-ContainerLog +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 5 - ContainerNodeInventory + +``` +stream-id: Microsoft-ContainerNodeInventory +data-type: CONTAINER_NODE_INVENTORY_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: ContainerNodeInventory +alias-stream-id: Microsoft-ContainerNodeInventory +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 6 - KubePodInventory +``` +stream-id: Microsoft-KubePodInventory +data-type: KUBE_POD_INVENTORY_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubePodInventory +alias-stream-id: Microsoft-KubePodInventory 
+contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 7 - KubeNodeInventory +``` +stream-id: Microsoft-KubeNodeInventory +data-type: KUBE_NODE_INVENTORY_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubeNodeInventory +alias-stream-id: Microsoft-KubeNodeInventory +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 8 - KubePVInventory +``` +stream-id: Microsoft-KubePVInventory +data-type: KUBE_PV_INVENTORY_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubePVInventory +alias-stream-id: Microsoft-KubePVInventory +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 9 - KubeEvents +``` +stream-id: Microsoft-KubeEvents +data-type: KUBE_EVENTS_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubeEvents +alias-stream-id: Microsoft-KubeEvents +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 10 - KubeServices +``` +stream-id: Microsoft-KubeServices +data-type: KUBE_SERVICES_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubeServices +alias-stream-id: Microsoft-KubeServices +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 11 - KubeMonAgentEvents +``` +stream-id: Microsoft-KubeMonAgentEvents +data-type: KUBE_MON_AGENT_EVENTS_BLOB +intelligence-pack: Containers +solutions: ContainerInsights +platform: Any +la-table-name: KubeMonAgentEvents +alias-stream-id: Microsoft-KubeMonAgentEvents +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 12 - KubeHealth +``` +stream-id: Microsoft-KubeHealth +data-type: KUBE_HEALTH_BLOB +intelligence-pack: ContainerInsights +solutions: ContainerInsights +platform: Any +la-table-name: KubeHealth +alias-stream-id: 
Microsoft-KubeHealth +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` + +# 13 - Perf +``` +> Note - This stream already exists +stream-id: Microsoft-Perf +data-type: LINUX_PERF_BLOB +intelligence-pack: LogManagement +solutions: ContainerInsights +platform: Any +la-table-name: LogManagement +alias-stream-id: Microsoft-Perf +contact-alias: OMScontainers@microsoft.com +stage: to review +tags: agent +``` diff --git a/scripts/dcr-onboarding/ci-extension-dcr.json b/scripts/dcr-onboarding/ci-extension-dcr.json new file mode 100644 index 000000000..f3fbec79b --- /dev/null +++ b/scripts/dcr-onboarding/ci-extension-dcr.json @@ -0,0 +1,59 @@ +{ + "location": "", + "properties": { + "dataSources": { + "extensions": [ + { + "name": "ContainerInsightsExtension", + "streams": [ + "Microsoft-Perf", + "Microsoft-ContainerInventory", + "Microsoft-ContainerLog", + "Microsoft-ContainerLogV2", + "Microsoft-ContainerNodeInventory", + "Microsoft-KubeEvents", + "Microsoft-KubeHealth", + "Microsoft-KubeMonAgentEvents", + "Microsoft-KubeNodeInventory", + "Microsoft-KubePodInventory", + "Microsoft-KubePVInventory", + "Microsoft-KubeServices", + "Microsoft-InsightsMetrics" + + ], + "extensionName": "ContainerInsights" + } + ] + }, + "destinations": { + "logAnalytics": [ + { + "workspaceResourceId": "/subscriptions//resourcegroups//providers/microsoft.operationalinsights/workspaces/", + "name": "ciworkspace" + } + ] + }, + "dataFlows": [ + { + "streams": [ + "Microsoft-Perf", + "Microsoft-ContainerInventory", + "Microsoft-ContainerLog", + "Microsoft-ContainerLogV2", + "Microsoft-ContainerNodeInventory", + "Microsoft-KubeEvents", + "Microsoft-KubeHealth", + "Microsoft-KubeMonAgentEvents", + "Microsoft-KubeNodeInventory", + "Microsoft-KubePodInventory", + "Microsoft-KubePVInventory", + "Microsoft-KubeServices", + "Microsoft-InsightsMetrics" + ], + "destinations": [ + "ciworkspace" + ] + } + ] + } +} diff --git a/scripts/onboarding/managed/disable-monitoring.sh 
b/scripts/onboarding/managed/disable-monitoring.sh index 29b755331..40b0793bc 100644 --- a/scripts/onboarding/managed/disable-monitoring.sh +++ b/scripts/onboarding/managed/disable-monitoring.sh @@ -116,7 +116,7 @@ remove_monitoring_tags() if [ "$isUsingServicePrincipal" = true ] ; then echo "login to the azure using provided service principal creds" - az login --service-principal --username $servicePrincipalClientId --password $servicePrincipalClientSecret --tenant $servicePrincipalTenantId + az login --service-principal --username="$servicePrincipalClientId" --password="$servicePrincipalClientSecret" --tenant="$servicePrincipalTenantId" else echo "login to the azure interactively" az login --use-device-code diff --git a/scripts/onboarding/managed/enable-monitoring.ps1 b/scripts/onboarding/managed/enable-monitoring.ps1 index 828d061ac..e79ef2138 100644 --- a/scripts/onboarding/managed/enable-monitoring.ps1 +++ b/scripts/onboarding/managed/enable-monitoring.ps1 @@ -62,11 +62,10 @@ $isArcK8sCluster = $false $isAksCluster = $false $isUsingServicePrincipal = $false -# released chart version in mcr -$mcr = "mcr.microsoft.com" -$mcrChartVersion = "2.8.3" -$mcrChartRepoPath = "azuremonitor/containerinsights/preview/azuremonitor-containers" -$helmLocalRepoName = "." 
+# microsoft helm chart repo +$microsoftHelmRepo="https://microsoft.github.io/charts/repo" +$microsoftHelmRepoName="microsoft" + $omsAgentDomainName="opinsights.azure.com" if ([string]::IsNullOrEmpty($azureCloudName) -eq $true) { @@ -547,16 +546,12 @@ Write-Host "Helm version" : $helmVersion Write-Host("Installing or upgrading if exists, Azure Monitor for containers HELM chart ...") try { - Write-Host("pull the chart from mcr.microsoft.com") - [System.Environment]::SetEnvironmentVariable("HELM_EXPERIMENTAL_OCI", 1, "Process") - - Write-Host("pull the chart from mcr.microsoft.com") - helm chart pull ${mcr}/${mcrChartRepoPath}:${mcrChartVersion} - - Write-Host("export the chart from local cache to current directory") - helm chart export ${mcr}/${mcrChartRepoPath}:${mcrChartVersion} --destination . + Write-Host("Add helm chart repo- ${microsoftHelmRepoName} with repo path: ${microsoftHelmRepo}") + helm repo add ${microsoftHelmRepoName} ${microsoftHelmRepo} + Write-Host("Updating the helm chart repo- ${microsoftHelmRepoName} to get latest chart versions") + helm repo update ${microsoftHelmRepoName} - $helmChartRepoPath = "${helmLocalRepoName}" + "/" + "${helmChartName}" + $helmChartRepoPath = "${microsoftHelmRepoName}" + "/" + "${helmChartName}" Write-Host("helmChartRepoPath is : ${helmChartRepoPath}") diff --git a/scripts/onboarding/managed/enable-monitoring.sh b/scripts/onboarding/managed/enable-monitoring.sh index f27f944fd..5fc241517 100644 --- a/scripts/onboarding/managed/enable-monitoring.sh +++ b/scripts/onboarding/managed/enable-monitoring.sh @@ -43,11 +43,9 @@ defaultAzureCloud="AzureCloud" # default domain will be for public cloud omsAgentDomainName="opinsights.azure.com" -# released chart version in mcr -mcrChartVersion="2.8.3" -mcr="mcr.microsoft.com" -mcrChartRepoPath="azuremonitor/containerinsights/preview/azuremonitor-containers" -helmLocalRepoName="." 
+# microsoft helm chart repo +microsoftHelmRepo="https://microsoft.github.io/charts/repo" +microsoftHelmRepoName="microsoft" helmChartName="azuremonitor-containers" # default release name used during onboarding @@ -435,9 +433,10 @@ create_default_log_analytics_workspace() { workspaceResourceGroup="DefaultResourceGroup-"$workspaceRegionCode isRGExists=$(az group exists -g $workspaceResourceGroup) + isRGExists=$(echo $isRGExists | tr -d '"\r\n') workspaceName="DefaultWorkspace-"$subscriptionId"-"$workspaceRegionCode - if $isRGExists; then + if [ "${isRGExists}" == "true" ]; then echo "using existing default resource group:"$workspaceResourceGroup else echo "creating resource group: $workspaceResourceGroup in region: $workspaceRegion" @@ -455,7 +454,7 @@ create_default_log_analytics_workspace() { fi workspaceResourceId=$(az resource show -g $workspaceResourceGroup -n $workspaceName --resource-type $workspaceResourceProvider --query id -o json) - workspaceResourceId=$(echo $workspaceResourceId | tr -d '"') + workspaceResourceId=$(echo $workspaceResourceId | tr -d '"\r\n') echo "workspace resource Id: ${workspaceResourceId}" } @@ -495,10 +494,16 @@ install_helm_chart() { adminUserName=$(az aro list-credentials -g $clusterResourceGroup -n $clusterName --query 'kubeadminUsername' -o tsv) adminPassword=$(az aro list-credentials -g $clusterResourceGroup -n $clusterName --query 'kubeadminPassword' -o tsv) apiServer=$(az aro show -g $clusterResourceGroup -n $clusterName --query apiserverProfile.url -o tsv) + # certain az cli versions add \r\n so trim them + adminUserName=$(echo $adminUserName | tr -d '"\r\n') + adminPassword=$(echo $adminPassword | tr -d '"\r\n') + apiServer=$(echo $apiServer | tr -d '"\r\n') echo "login to the cluster via oc login" oc login $apiServer -u $adminUserName -p $adminPassword - echo "creating project azure-monitor-for-containers" + echo "creating project: azure-monitor-for-containers" oc new-project $openshiftProjectName + echo
"switching to project: azure-monitor-for-containers" + oc project $openshiftProjectName echo "getting config-context of aro v4 cluster" kubeconfigContext=$(oc config current-context) fi @@ -513,15 +518,7 @@ install_helm_chart() { clusterRegion=$(az resource show --ids ${clusterResourceId} --query location -o tsv) echo "cluster region is : ${clusterRegion}" - echo "pull the chart version ${mcrChartVersion} from ${mcr}/${mcrChartRepoPath}" - export HELM_EXPERIMENTAL_OCI=1 - helm chart pull $mcr/$mcrChartRepoPath:$mcrChartVersion - - echo "export the chart from local cache to current directory" - helm chart export $mcr/$mcrChartRepoPath:$mcrChartVersion --destination . - - helmChartRepoPath=$helmLocalRepoName/$helmChartName - + helmChartRepoPath=$microsoftHelmRepoName/$helmChartName echo "helm chart repo path: ${helmChartRepoPath}" if [ ! -z "$proxyEndpoint" ]; then @@ -550,7 +547,7 @@ install_helm_chart() { login_to_azure() { if [ "$isUsingServicePrincipal" = true ]; then echo "login to the azure using provided service principal creds" - az login --service-principal --username $servicePrincipalClientId --password $servicePrincipalClientSecret --tenant $servicePrincipalTenantId + az login --service-principal --username="$servicePrincipalClientId" --password="$servicePrincipalClientSecret" --tenant="$servicePrincipalTenantId" else echo "login to the azure interactively" az login --use-device-code @@ -581,6 +578,14 @@ enable_aks_monitoring_addon() { echo "status after enabling of aks monitoring addon:$status" } +# add helm chart repo and update repo to get latest chart version +add_and_update_helm_chart_repo() { + echo "adding helm repo: ${microsoftHelmRepoName} with repo path: ${microsoftHelmRepo}" + helm repo add ${microsoftHelmRepoName} ${microsoftHelmRepo} + echo "updating helm repo: ${microsoftHelmRepoName} to get local charts updated with latest ones" + helm repo update +} + # parse and validate args parse_args $@ @@ -644,6 +649,9 @@ else attach_monitoring_tags
fi +# add helm repo & update to get the latest chart version +add_and_update_helm_chart_repo + # install helm chart install_helm_chart diff --git a/scripts/onboarding/managed/upgrade-monitoring.sh b/scripts/onboarding/managed/upgrade-monitoring.sh index 5456a7072..edd48c938 100644 --- a/scripts/onboarding/managed/upgrade-monitoring.sh +++ b/scripts/onboarding/managed/upgrade-monitoring.sh @@ -19,14 +19,14 @@ set -e set -o pipefail -# released chart version for Azure Arc enabled Kubernetes public preview -mcrChartVersion="2.8.3" -mcr="mcr.microsoft.com" -mcrChartRepoPath="azuremonitor/containerinsights/preview/azuremonitor-containers" - +# microsoft helm chart repo +microsoftHelmRepo="https://microsoft.github.io/charts/repo" +microsoftHelmRepoName="microsoft" # default to public cloud since only supported cloud is azure public cloud defaultAzureCloud="AzureCloud" -helmLocalRepoName="." helmChartName="azuremonitor-containers" # default release name used during onboarding @@ -38,6 +38,9 @@ arcK8sResourceProvider="Microsoft.Kubernetes/connectedClusters" # default of resourceProvider is Azure Arc enabled Kubernetes and this will get updated based on the provider cluster resource resourceProvider="Microsoft.Kubernetes/connectedClusters" +# resource provider for azure redhat openshift v4 cluster +aroV4ResourceProvider="Microsoft.RedHatOpenShift/OpenShiftClusters" + # Azure Arc enabled Kubernetes cluster resource isArcK8sCluster=false @@ -235,10 +238,14 @@ upgrade_helm_chart_release() { adminUserName=$(az aro list-credentials -g $clusterResourceGroup -n $clusterName --query 'kubeadminUsername' -o tsv) adminPassword=$(az aro list-credentials -g $clusterResourceGroup -n $clusterName --query 'kubeadminPassword' -o tsv) apiServer=$(az aro show -g $clusterResourceGroup -n $clusterName --query apiserverProfile.url -o tsv) + # certain az cli versions add
\r\n so trim them + adminUserName=$(echo $adminUserName |tr -d '"\r\n') + adminPassword=$(echo $adminPassword |tr -d '"\r\n') + apiServer=$(echo $apiServer |tr -d '"\r\n') echo "login to the cluster via oc login" oc login $apiServer -u $adminUserName -p $adminPassword - echo "creating project azure-monitor-for-containers" - oc new-project $openshiftProjectName + echo "switching to project azure-monitor-for-containers" + oc project $openshiftProjectName echo "getting config-context of aro v4 cluster" kubeconfigContext=$(oc config current-context) fi @@ -249,15 +256,7 @@ upgrade_helm_chart_release() { echo "installing Azure Monitor for containers HELM chart on to the cluster with kubecontext:${kubeconfigContext} ..." fi - export HELM_EXPERIMENTAL_OCI=1 - - echo "pull the chart from ${mcr}/${mcrChartRepoPath}:${mcrChartVersion}" - helm chart pull ${mcr}/${mcrChartRepoPath}:${mcrChartVersion} - - echo "export the chart from local cache to current directory" - helm chart export ${mcr}/${mcrChartRepoPath}:${mcrChartVersion} --destination .
- - helmChartRepoPath=$microsoftHelmRepoName/$helmChartName echo "upgrading the release: $releaseName to the latest chart version" helm get values $releaseName -o yaml | helm upgrade --install $releaseName $helmChartRepoPath -f - @@ -267,7 +266,7 @@ upgrade_helm_chart_release() { login_to_azure() { if [ "$isUsingServicePrincipal" = true ]; then echo "login to the azure using provided service principal creds" - az login --service-principal --username $servicePrincipalClientId --password $servicePrincipalClientSecret --tenant $servicePrincipalTenantId + az login --service-principal --username="$servicePrincipalClientId" --password="$servicePrincipalClientSecret" --tenant="$servicePrincipalTenantId" else echo "login to the azure interactively" az login --use-device-code @@ -296,6 +295,14 @@ validate_and_configure_supported_cloud() { fi } +# add helm chart repo and update repo to get latest chart version +add_and_update_helm_chart_repo() { + echo "adding helm repo: ${microsoftHelmRepoName} with repo path: ${microsoftHelmRepo}" + helm repo add ${microsoftHelmRepoName} ${microsoftHelmRepo} + echo "updating helm repo: ${microsoftHelmRepoName} to get local charts updated with latest ones" + helm repo update +} + # parse and validate args parse_args $@ @@ -322,6 +329,9 @@ fi # validate the cluster has monitoring tags validate_monitoring_tags +# add helm repo & update to get the latest chart version +add_and_update_helm_chart_repo + # upgrade helm chart release upgrade_helm_chart_release diff --git a/scripts/troubleshoot/collect_logs.sh b/scripts/troubleshoot/collect_logs.sh new file mode 100755 index 000000000..99a9ad302 --- /dev/null +++ b/scripts/troubleshoot/collect_logs.sh @@ -0,0 +1,54 @@ +#!/bin/bash + +# This script pulls logs from the replicaset agent pod and a random daemonset pod.
This script is to make troubleshooting faster + +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +mkdir azure-monitor-logs-tmp +cd azure-monitor-logs-tmp + +export ds_pod=$(kubectl get pods -n kube-system -o custom-columns=NAME:.metadata.name | grep -E omsagent-[a-z0-9]{5} | head -n 1) +export ds_win_pod=$(kubectl get pods -n kube-system -o custom-columns=NAME:.metadata.name | grep -E omsagent-win-[a-z0-9]{5} | head -n 1) +export rs_pod=$(kubectl get pods -n kube-system -o custom-columns=NAME:.metadata.name | grep -E omsagent-rs-[a-z0-9]{5} | head -n 1) + +echo -e "Collecting logs from ${ds_pod}, ${ds_win_pod}, and ${rs_pod}" +echo -e "${CYAN}Note: some errors about pods and files not existing are expected in clusters without windows nodes or sidecar prometheus scraping. They can safely be disregarded ${NC}" + +# grab `kubectl describe` and `kubectl log` +echo "collecting kubectl describe and kubectl log output" + +kubectl describe pod ${ds_pod} --namespace=kube-system > describe_${ds_pod}.txt +kubectl logs ${ds_pod} --container omsagent --namespace=kube-system > logs_${ds_pod}.txt +kubectl logs ${ds_pod} --container omsagent-prometheus --namespace=kube-system > logs_${ds_pod}_prom.txt + +kubectl describe pod ${ds_win_pod} --namespace=kube-system > describe_${ds_win_pod}.txt +kubectl logs ${ds_win_pod} --container omsagent-win --namespace=kube-system > logs_${ds_win_pod}.txt + +kubectl describe pod ${rs_pod} --namespace=kube-system > describe_${rs_pod}.txt +kubectl logs ${rs_pod} --container omsagent --namespace=kube-system > logs_${rs_pod}.txt + + +# now collect log files from in containers +echo "Collecting log files from inside agent containers" + +kubectl cp ${ds_pod}:/var/opt/microsoft/docker-cimprov/log omsagent-daemonset --namespace=kube-system --container omsagent +kubectl cp ${ds_pod}:/var/opt/microsoft/linuxmonagent/log omsagent-daemonset-mdsd --namespace=kube-system --container omsagent + +kubectl cp ${ds_pod}:/var/opt/microsoft/docker-cimprov/log 
omsagent-prom-daemonset --namespace=kube-system --container omsagent-prometheus +kubectl cp ${ds_pod}:/var/opt/microsoft/linuxmonagent/log omsagent-prom-daemonset-mdsd --namespace=kube-system --container omsagent-prometheus + +# for some reason copying logs out of /etc/omsagentwindows doesn't work (gives a permission error), but exec then cat does work. +# skip collecting these logs for now, would be good to come back and fix this next time a windows support case comes up +# kubectl cp ${ds_win_pod}:/etc/omsagentwindows omsagent-win-daemonset --namespace=kube-system +kubectl cp ${ds_win_pod}:/etc/fluent-bit omsagent-win-daemonset-fbit --namespace=kube-system + +kubectl cp ${rs_pod}:/var/opt/microsoft/docker-cimprov/log omsagent-replicaset --namespace=kube-system +kubectl cp ${rs_pod}:/var/opt/microsoft/linuxmonagent/log omsagent-replicaset-mdsd --namespace=kube-system + +zip -r -q ../azure-monitor-logs.zip * + +cd .. +rm -rf azure-monitor-logs-tmp +echo +echo "log files have been written to azure-monitor-logs.zip" diff --git a/source/plugins/go/src/extension/extension.go b/source/plugins/go/src/extension/extension.go new file mode 100644 index 000000000..4d78380bc --- /dev/null +++ b/source/plugins/go/src/extension/extension.go @@ -0,0 +1,103 @@ +package extension + +import ( + "encoding/json" + "fmt" + "log" + "strings" + "sync" + + uuid "github.com/google/uuid" + "github.com/ugorji/go/codec" +) + +type Extension struct { + datatypeStreamIdMap map[string]string +} + +var singleton *Extension +var once sync.Once +var extensionconfiglock sync.Mutex +var logger *log.Logger +var containerType string + +func GetInstance(flbLogger *log.Logger, containertype string) *Extension { + once.Do(func() { + singleton = &Extension{make(map[string]string)} + flbLogger.Println("Extension Instance created") + }) + logger = flbLogger + containerType = containertype + return singleton +} + +func (e *Extension) GetOutputStreamId(datatype string) string { + extensionconfiglock.Lock() + 
defer extensionconfiglock.Unlock() + if len(e.datatypeStreamIdMap) > 0 && e.datatypeStreamIdMap[datatype] != "" { + message := fmt.Sprintf("OutputstreamId: %s for the datatype: %s", e.datatypeStreamIdMap[datatype], datatype) + logger.Printf(message) + return e.datatypeStreamIdMap[datatype] + } + var err error + e.datatypeStreamIdMap, err = getDataTypeToStreamIdMapping() + if err != nil { + message := fmt.Sprintf("Error getting datatype to streamid mapping: %s", err.Error()) + logger.Printf(message) + } + return e.datatypeStreamIdMap[datatype] +} + +func getDataTypeToStreamIdMapping() (map[string]string, error) { + logger.Printf("extensionconfig::getDataTypeToStreamIdMapping:: getting extension config from fluent socket - start") + guid := uuid.New() + datatypeOutputStreamMap := make(map[string]string) + + taggedData := map[string]interface{}{"Request": "AgentTaggedData", "RequestId": guid.String(), "Tag": "ContainerInsights", "Version": "1"} + jsonBytes, err := json.Marshal(taggedData) + // TODO: this error is unhandled + + var data []byte + enc := codec.NewEncoderBytes(&data, new(codec.MsgpackHandle)) + if err := enc.Encode(string(jsonBytes)); err != nil { + return datatypeOutputStreamMap, err + } + + fs := &FluentSocket{} + fs.sockAddress = "/var/run/mdsd/default_fluent.socket" + if containerType != "" && strings.Compare(strings.ToLower(containerType), "prometheussidecar") == 0 { + fs.sockAddress = fmt.Sprintf("/var/run/mdsd-%s/default_fluent.socket", containerType) + } + responseBytes, err := FluentSocketWriter.writeAndRead(fs, data) + defer FluentSocketWriter.disconnect(fs) + logger.Printf("Info::mdsd::Making call to FluentSocket: %s to write and read the config data", fs.sockAddress) + if err != nil { + return datatypeOutputStreamMap, err + } + response := string(responseBytes) // TODO: why is this converted to a string then back into a []byte? 
+ + var responseObject AgentTaggedDataResponse + err = json.Unmarshal([]byte(response), &responseObject) + if err != nil { + logger.Printf("Error::mdsd::Failed to unmarshal config data. Error message: %s", err.Error()) + return datatypeOutputStreamMap, err + } + + var extensionData TaggedData + if err := json.Unmarshal([]byte(responseObject.TaggedData), &extensionData); err != nil { + logger.Printf("Error::mdsd::Failed to unmarshal TaggedData. Error message: %s", err.Error()) + return datatypeOutputStreamMap, err + } + + extensionConfigs := extensionData.ExtensionConfigs + logger.Printf("Info::mdsd::build the datatype and streamid map -- start") + for _, extensionConfig := range extensionConfigs { + outputStreams := extensionConfig.OutputStreams + for dataType, outputStreamID := range outputStreams { + logger.Printf("Info::mdsd::datatype: %s, outputstreamId: %s", dataType, outputStreamID) + datatypeOutputStreamMap[dataType] = outputStreamID.(string) + } + } + logger.Printf("Info::mdsd::build the datatype and streamid map -- end") + + logger.Printf("extensionconfig::getDataTypeToStreamIdMapping:: getting extension config from fluent socket-end") + + return datatypeOutputStreamMap, nil +} diff --git a/source/plugins/go/src/extension/extension_test.go b/source/plugins/go/src/extension/extension_test.go new file mode 100644 index 000000000..c3b5ef472 --- /dev/null +++ b/source/plugins/go/src/extension/extension_test.go @@ -0,0 +1,74 @@ +package extension + +import ( + "fmt" + "log" + "os" + reflect "reflect" + "testing" + + "github.com/golang/mock/gomock" +) + +type FluentSocketWriterMock struct{} + +func Test_getDataTypeToStreamIdMapping(t *testing.T) { + + type test_struct struct { + testName string + mdsdResponse string + fluentSocket FluentSocket + output map[string]string + err error + } + + // This is a pretty useless unit test, but it demonstrates the concept (putting together a real test + // would require some large json structs). If getDataTypeToStreamIdMapping() is ever updated, that + // would be a good opportunity to add some real test cases.
+ tests := []test_struct{ + { + "basic test", + "{}", + FluentSocket{}, + map[string]string{}, + nil, + }, + } + + for _, tt := range tests { + t.Run(tt.testName, func(t *testing.T) { + mockCtrl := gomock.NewController(t) + defer mockCtrl.Finish() + mock := NewMockIFluentSocketWriter(mockCtrl) + sock := &FluentSocket{} + sock.sockAddress = "/var/run/mdsd/default_fluent.socket" + mock.EXPECT().writeAndRead(sock, gomock.Any()).Return([]byte(tt.mdsdResponse), nil).Times(1) + mock.EXPECT().disconnect(sock).Return(nil).Times(1) + + // This is where calls to the normal socket writer are redirected to the mock. + ActualFluentSocketWriter := FluentSocketWriter // save the old struct so that we can put it back later + FluentSocketWriter = mock + + logfile, err := os.Create("logFile.txt") + if err != nil { + fmt.Println(err.Error()) + } + + // use an actual logger here. Using a real logger then cleaning up the log file later is easier than mocking the logger. + GetInstance(log.New(logfile, "", 0), "ContainerType") + defer os.Remove("logFile.txt") + + got, reterr := getDataTypeToStreamIdMapping() + if reterr != nil { + t.Errorf("got error") + t.Errorf(reterr.Error()) + } + if !reflect.DeepEqual(got, tt.output) { + t.Errorf("getDataTypeToStreamIdMapping() = %v, want %v", got, tt.output) + } + + // stop redirecting method calls to the mock + FluentSocketWriter = ActualFluentSocketWriter + }) + } +} diff --git a/source/plugins/go/src/extension/interfaces.go b/source/plugins/go/src/extension/interfaces.go new file mode 100644 index 000000000..c70ef17b8 --- /dev/null +++ b/source/plugins/go/src/extension/interfaces.go @@ -0,0 +1,34 @@ +package extension + +// AgentTaggedDataResponse struct for response from AgentTaggedData request +type AgentTaggedDataResponse struct { + Request string `json:"Request"` + RequestID string `json:"RequestId"` + Version string `json:"Version"` + Success bool `json:"Success"` + Description string `json:"Description"` + TaggedData string
`json:"TaggedData"` +} + +// TaggedData structure for response +type TaggedData struct { + SchemaVersion int `json:"schemaVersion"` + Version int `json:"version"` + ExtensionName string `json:"extensionName"` + ExtensionConfigs []ExtensionConfig `json:"extensionConfigurations"` + OutputStreamDefinitions map[string]StreamDefinition `json:"outputStreamDefinitions"` +} + +// StreamDefinition structure for named pipes +type StreamDefinition struct { + NamedPipe string `json:"namedPipe"` +} + +// ExtensionConfig structure for extension definition in DCR +type ExtensionConfig struct { + ID string `json:"id"` + OriginIds []string `json:"originIds"` + ExtensionSettings map[string]interface{} `json:"extensionSettings"` + InputStreams map[string]interface{} `json:"inputStreams"` + OutputStreams map[string]interface{} `json:"outputStreams"` +} diff --git a/source/plugins/go/src/extension/socket_writer.go b/source/plugins/go/src/extension/socket_writer.go new file mode 100644 index 000000000..bfd35f5e6 --- /dev/null +++ b/source/plugins/go/src/extension/socket_writer.go @@ -0,0 +1,110 @@ +package extension + +import ( + "net" +) + +//go:generate mockgen -destination=socket_writer_mock.go -package=extension Docker-Provider/source/plugins/go/src/extension IFluentSocketWriter + +//MaxRetries for trying to write data to the socket +const MaxRetries = 5 + +//ReadBufferSize for reading data from sockets +//Current CI extension config size is ~5KB and going with 20KB to handle any future scenarios +const ReadBufferSize = 20480 + +//FluentSocket represents a connection to AMA's default fluent socket +type FluentSocket struct { + socket net.Conn + sockAddress string +} + +// begin mocking boilerplate +type IFluentSocketWriter interface { + connect(fluentSocket *FluentSocket) error + disconnect(fluentSocket *FluentSocket) error + writeWithRetries(fluentSocket *FluentSocket, data []byte) (int, error) + read(fluentSocket *FluentSocket) ([]byte, error) + write(fluentSocket *FluentSocket, payload
[]byte) (int, error) + writeAndRead(fluentSocket *FluentSocket, payload []byte) ([]byte, error) +} + +type FluentSocketWriterImpl struct{} + +// Methods in this file can be mocked by replacing FluentSocketWriter with a different struct. The methods +// in this file are all tied to the FluentSocketWriterImpl struct, but other structs could implement +// IFluentSocketWriter and be used instead +var FluentSocketWriter IFluentSocketWriter + +func init() { + FluentSocketWriter = FluentSocketWriterImpl{} +} + +// end mocking boilerplate + +func (FluentSocketWriterImpl) connect(fs *FluentSocket) error { + c, err := net.Dial("unix", fs.sockAddress) + if err != nil { + return err + } + fs.socket = c + return nil +} + +func (FluentSocketWriterImpl) disconnect(fs *FluentSocket) error { + if fs.socket != nil { + fs.socket.Close() + fs.socket = nil + } + return nil +} + +func (FluentSocketWriterImpl) writeWithRetries(fs *FluentSocket, data []byte) (int, error) { + var ( + err error + n int + ) + for i := 0; i < MaxRetries; i++ { + n, err = fs.socket.Write(data) + if err == nil { + return n, nil + } + } + if err, ok := err.(net.Error); !ok || !err.Temporary() { + // so that connect() is called next time if write fails + // this happens when mdsd is restarted + _ = fs.socket.Close() // no need to log the socket closing error + fs.socket = nil + } + return 0, err +} + +func (FluentSocketWriterImpl) read(fs *FluentSocket) ([]byte, error) { + buf := make([]byte, ReadBufferSize) + n, err := fs.socket.Read(buf) + if err != nil { + return nil, err + } + return buf[:n], nil + +} + +func (FluentSocketWriterImpl) write(fs *FluentSocket, payload []byte) (int, error) { + if fs.socket == nil { + // previous write failed with permanent error and socket was closed.
+ if err := FluentSocketWriter.connect(fs); err != nil { + return 0, err + } + } + + return FluentSocketWriter.writeWithRetries(fs, payload) +} + +//writeAndRead writes data to the socket and returns the response +func (FluentSocketWriterImpl) writeAndRead(fs *FluentSocket, payload []byte) ([]byte, error) { + _, err := FluentSocketWriter.write(fs, payload) + if err != nil { + return nil, err + } + return FluentSocketWriter.read(fs) +} diff --git a/source/plugins/go/src/go.mod b/source/plugins/go/src/go.mod index 5b5c735e5..58e668597 100644 --- a/source/plugins/go/src/go.mod +++ b/source/plugins/go/src/go.mod @@ -3,33 +3,18 @@ module Docker-Provider/source/plugins/go/src go 1.14 require ( - code.cloudfoundry.org/clock v1.0.1-0.20200131002207-86534f4ca3a5 // indirect - github.com/Azure/azure-kusto-go v0.1.4-0.20200427191510-041d4ed55f86 + github.com/Azure/azure-kusto-go v0.3.2 github.com/Azure/go-autorest/autorest/azure/auth v0.4.2 + github.com/Azure/go-autorest/autorest/to v0.4.0 // indirect + github.com/dnaeon/go-vcr v1.2.0 // indirect github.com/fluent/fluent-bit-go v0.0.0-20171103221316-c4a158a6e3a7 - github.com/ghodss/yaml v0.0.0-20150909031657-73d445a93680 // indirect - github.com/gogo/protobuf v0.0.0-20170330071051-c0656edd0d9e // indirect - github.com/golang/glog v0.0.0-20141105023935-44145f04b68c // indirect - github.com/google/btree v0.0.0-20160524151835-7d79101e329e // indirect - github.com/google/gofuzz v0.0.0-20161122191042-44d81051d367 // indirect - github.com/google/uuid v1.1.1 - github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d // indirect - github.com/gregjones/httpcache v0.0.0-20170728041850-787624de3eb7 // indirect - github.com/json-iterator/go v0.0.0-20180612202835-f2b4162afba3 // indirect + github.com/golang/mock v1.4.1 + github.com/google/uuid v1.1.2 github.com/microsoft/ApplicationInsights-Go v0.4.3 - github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect - github.com/modern-go/reflect2
v0.0.0-20180320133207-05fbef0ca5da // indirect - github.com/peterbourgon/diskv v2.0.1+incompatible // indirect - github.com/philhofer/fwd v1.0.0 // indirect - github.com/satori/go.uuid v1.2.1-0.20181028125025-b2ce2384e17b // indirect + github.com/philhofer/fwd v1.1.1 // indirect github.com/tinylib/msgp v1.1.2 - github.com/ugorji/go v1.1.2-0.20180813092308-00b869d2f4a5 // indirect - golang.org/x/net v0.0.0-20200421231249-e086a090c8fd // indirect - golang.org/x/time v0.0.0-20161028155119-f51c12702a4d // indirect - gopkg.in/inf.v0 v0.9.0 // indirect + github.com/ugorji/go v1.1.2-0.20180813092308-00b869d2f4a5 gopkg.in/natefinch/lumberjack.v2 v2.0.0-20170531160350-a96e63847dc3 - k8s.io/api v0.0.0-20180628040859-072894a440bd // indirect - k8s.io/apimachinery v0.0.0-20180621070125-103fd098999d - k8s.io/client-go v8.0.0+incompatible - golang.org/x/crypto v0.0.0-20201216223049-8b5274cf687f + k8s.io/apimachinery v0.21.0 + k8s.io/client-go v0.21.0 ) diff --git a/source/plugins/go/src/go.sum b/source/plugins/go/src/go.sum index 64745749f..ad9e40089 100644 --- a/source/plugins/go/src/go.sum +++ b/source/plugins/go/src/go.sum @@ -1,29 +1,54 @@ +cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= +cloud.google.com/go v0.34.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= +cloud.google.com/go v0.38.0/go.mod h1:990N+gfupTy94rShfmMCWGDn0LpTmnzTp2qbd1dvSRU= +cloud.google.com/go v0.44.1/go.mod h1:iSa0KzasP4Uvy3f1mN/7PiObzGgflwredwwASm/v6AU= +cloud.google.com/go v0.44.2/go.mod h1:60680Gw3Yr4ikxnPRS/oxxkBccT6SA1yMk63TGekxKY= +cloud.google.com/go v0.45.1/go.mod h1:RpBamKRgapWJb87xiFSdk4g1CME7QZg3uwTez+TSTjc= +cloud.google.com/go v0.46.3/go.mod h1:a6bKKbmY7er1mI7TEI4lsAkts/mkhTSZK8w33B4RAg0= +cloud.google.com/go v0.50.0/go.mod h1:r9sluTvynVuxRIOHXQEHMFffphuXHOMZMycpNR5e6To= +cloud.google.com/go v0.52.0/go.mod h1:pXajvRH/6o3+F9jDHZWQ5PbGhn+o8w9qiu/CffaVdO4= +cloud.google.com/go v0.53.0/go.mod h1:fp/UouUEsRkN6ryDKNW/Upv/JBKnv6WDthjR6+vze6M= 
+cloud.google.com/go v0.54.0/go.mod h1:1rq2OEkV3YMf6n/9ZvGWI3GWw0VoqH/1x2nd8Is/bPc= +cloud.google.com/go/bigquery v1.0.1/go.mod h1:i/xbL2UlR5RvWAURpBYZTtm/cXjCha9lbfbpx4poX+o= +cloud.google.com/go/bigquery v1.3.0/go.mod h1:PjpwJnslEMmckchkHFfq+HTD2DmtT67aNFKH1/VBDHE= +cloud.google.com/go/bigquery v1.4.0/go.mod h1:S8dzgnTigyfTmLBfrtrhyYhwRxG72rYxvftPBK2Dvzc= +cloud.google.com/go/datastore v1.0.0/go.mod h1:LXYbyblFSglQ5pkeyhO+Qmw7ukd3C+pD7TKLgZqpHYE= +cloud.google.com/go/datastore v1.1.0/go.mod h1:umbIZjpQpHh4hmRpGhH4tLFup+FVzqBi1b3c64qFpCk= +cloud.google.com/go/pubsub v1.0.1/go.mod h1:R0Gpsv3s54REJCy4fxDixWD93lHJMoZTyQ2kNxGRt3I= +cloud.google.com/go/pubsub v1.1.0/go.mod h1:EwwdRX2sKPjnvnqCa270oGRyludottCI76h+R3AArQw= +cloud.google.com/go/pubsub v1.2.0/go.mod h1:jhfEVHT8odbXTkndysNHCcx0awwzvfOlguIAii9o8iA= +cloud.google.com/go/storage v1.0.0/go.mod h1:IhtSnM/ZTZV8YYJWCY8RULGVqBDmpoyjwiyrjsg+URw= +cloud.google.com/go/storage v1.5.0/go.mod h1:tpKbwo567HUNpVclU5sGELwQWBDZ8gh0ZeosJ0Rtdos= +cloud.google.com/go/storage v1.6.0/go.mod h1:N7U0C8pVQ/+NIKOBQyamJIeKQKkZ+mxpohlUTyfDhBk= +code.cloudfoundry.org/clock v0.0.0-20180518195852-02e53af36e6c h1:5eeuG0BHx1+DHeT3AP+ISKZ2ht1UjGhm581ljqYpVeQ= code.cloudfoundry.org/clock v0.0.0-20180518195852-02e53af36e6c/go.mod h1:QD9Lzhd/ux6eNQVUDVRJX/RKTigpewimNYBi7ivZKY8= -code.cloudfoundry.org/clock v1.0.1-0.20200131002207-86534f4ca3a5 h1:LTlZ2AD8IV/d1JRzB+HHfZfF1M+K8lyOlN28zDEpw7U= -code.cloudfoundry.org/clock v1.0.1-0.20200131002207-86534f4ca3a5/go.mod h1:QD9Lzhd/ux6eNQVUDVRJX/RKTigpewimNYBi7ivZKY8= -github.com/Azure/azure-kusto-go v0.1.3 h1:0u+YqfIvwj5PHd+moXwtlxVePt8xTLU1ixM8Q6PjJ3o= -github.com/Azure/azure-kusto-go v0.1.3/go.mod h1:55hwXJ3PaahmWZFP7VC4+PlgsSUuetSA30rFtYFabfc= -github.com/Azure/azure-kusto-go v0.1.4-0.20200427191510-041d4ed55f86 h1:vyhCediIKg1gZ9H/kMcutU8F8BFNhxLk76Gti8UAOzo= -github.com/Azure/azure-kusto-go v0.1.4-0.20200427191510-041d4ed55f86/go.mod h1:55hwXJ3PaahmWZFP7VC4+PlgsSUuetSA30rFtYFabfc= 
+dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU= +github.com/Azure/azure-kusto-go v0.3.2 h1:XpS9co6GvEDl2oICF9HsjEsQVwEpRK6wbNWb9Z+uqsY= +github.com/Azure/azure-kusto-go v0.3.2/go.mod h1:wd50n4qlsSxh+G4f80t+Fnl2ShK9AcXD+lMOstiKuYo= github.com/Azure/azure-pipeline-go v0.1.8/go.mod h1:XA1kFWRVhSK+KNFiOhfv83Fv8L9achrP7OxIzeTn1Yg= github.com/Azure/azure-pipeline-go v0.2.1 h1:OLBdZJ3yvOn2MezlWvbrBMTEUQC72zAftRZOMdj5HYo= github.com/Azure/azure-pipeline-go v0.2.1/go.mod h1:UGSo8XybXnIGZ3epmeBw7Jdz+HiUVpqIlpz/HKHylF4= +github.com/Azure/azure-sdk-for-go v44.1.0+incompatible h1:l1UGvaaoMCUwVGUauvHzeB4t+Y0yPX5iJwBhzc0LqyE= +github.com/Azure/azure-sdk-for-go v44.1.0+incompatible/go.mod h1:9XXNKU+eRnpl9moKnB4QOLf1HestfXbmab5FXxiDBjc= github.com/Azure/azure-storage-blob-go v0.8.0 h1:53qhf0Oxa0nOjgbDeeYPUeyiNmafAFEY95rZLK0Tj6o= github.com/Azure/azure-storage-blob-go v0.8.0/go.mod h1:lPI3aLPpuLTeUwh1sViKXFxwl2B6teiRqI0deQUvsw0= github.com/Azure/azure-storage-queue-go v0.0.0-20191125232315-636801874cdd h1:b3wyxBl3vvr15tUAziPBPK354y+LSdfPCpex5oBttHo= github.com/Azure/azure-storage-queue-go v0.0.0-20191125232315-636801874cdd/go.mod h1:K6am8mT+5iFXgingS9LUc7TmbsW6XBw3nxaRyaMyWc8= -github.com/Azure/go-autorest v1.1.1 h1:4G9tVCqooRY3vDTB2bA1Z01PlSALtnUbji0AfzthUSs= -github.com/Azure/go-autorest v14.1.1+incompatible h1:m2F62e1Zk5DV3HENGdH/wEuzvJZIynHG4fHF7oiQwgE= +github.com/Azure/go-autorest v14.2.0+incompatible h1:V5VMDjClD3GiElqLWO7mz2MxNAK/vTfRHdAubSIPRgs= +github.com/Azure/go-autorest v14.2.0+incompatible/go.mod h1:r+4oMnoxhatjLLJ6zxSWATqVooLgysK6ZNox3g/xq24= github.com/Azure/go-autorest/autorest v0.9.0/go.mod h1:xyHB1BMZT0cuDHU7I0+g046+BFDTQ8rEZB0s4Yfa6bI= github.com/Azure/go-autorest/autorest v0.9.3/go.mod h1:GsRuLYvwzLjjjRoWEIyMUaYq8GNUx2nRB378IPt/1p0= github.com/Azure/go-autorest/autorest v0.10.0 h1:mvdtztBqcL8se7MdrUweNieTNi4kfNG6GOJuurQJpuY= github.com/Azure/go-autorest/autorest v0.10.0/go.mod 
h1:/FALq9T/kS7b5J5qsQ+RSTUdAmGFqi0vUdVNNx8q630= -github.com/Azure/go-autorest/autorest v0.10.2 h1:NuSF3gXetiHyUbVdneJMEVyPUYAe5wh+aN08JYAf1tI= +github.com/Azure/go-autorest/autorest v0.11.12 h1:gI8ytXbxMfI+IVbI9mP2JGCTXIuhHLgRlvQ9X4PsnHE= +github.com/Azure/go-autorest/autorest v0.11.12/go.mod h1:eipySxLmqSyC5s5k1CLupqet0PSENBEDP93LQ9a8QYw= github.com/Azure/go-autorest/autorest/adal v0.5.0/go.mod h1:8Z9fGy2MpX0PvDjB1pEgQTmVqjGhiHBW7RJJEciWzS0= github.com/Azure/go-autorest/autorest/adal v0.8.0/go.mod h1:Z6vX6WXXuyieHAXwMj0S6HY6e6wcHn37qQMBQlvY3lc= github.com/Azure/go-autorest/autorest/adal v0.8.1/go.mod h1:ZjhuQClTqx435SRJ2iMlOxPYt3d2C/T/7TiQCVZSn3Q= github.com/Azure/go-autorest/autorest/adal v0.8.2 h1:O1X4oexUxnZCaEUGsvMnr8ZGj8HI37tNezwY4npRqA0= github.com/Azure/go-autorest/autorest/adal v0.8.2/go.mod h1:ZjhuQClTqx435SRJ2iMlOxPYt3d2C/T/7TiQCVZSn3Q= +github.com/Azure/go-autorest/autorest/adal v0.9.5 h1:Y3bBUV4rTuxenJJs41HU3qmqsb+auo+a3Lz+PlJPpL0= +github.com/Azure/go-autorest/autorest/adal v0.9.5/go.mod h1:B7KF7jKIeC9Mct5spmyCB/A8CG/sEz1vwIRGv/bbw7A= github.com/Azure/go-autorest/autorest/azure/auth v0.4.2 h1:iM6UAvjR97ZIeR93qTcwpKNMpV+/FTWjwEbuPD495Tk= github.com/Azure/go-autorest/autorest/azure/auth v0.4.2/go.mod h1:90gmfKdlmKgfjUpnCEpOJzsUEjrWDSLwHIG73tSXddM= github.com/Azure/go-autorest/autorest/azure/cli v0.3.1 h1:LXl088ZQlP0SBppGFsRZonW6hSvwgL5gRByMbvUbx8U= @@ -31,126 +56,472 @@ github.com/Azure/go-autorest/autorest/azure/cli v0.3.1/go.mod h1:ZG5p860J94/0kI9 github.com/Azure/go-autorest/autorest/date v0.1.0/go.mod h1:plvfp3oPSKwf2DNjlBjWF/7vwR+cUD/ELuzDCXwHUVA= github.com/Azure/go-autorest/autorest/date v0.2.0 h1:yW+Zlqf26583pE43KhfnhFcdmSWlm5Ew6bxipnr/tbM= github.com/Azure/go-autorest/autorest/date v0.2.0/go.mod h1:vcORJHLJEh643/Ioh9+vPmf1Ij9AEBM5FuBIXLmIy0g= +github.com/Azure/go-autorest/autorest/date v0.3.0 h1:7gUk1U5M/CQbp9WoqinNzJar+8KY+LPI6wiWrP/myHw= +github.com/Azure/go-autorest/autorest/date v0.3.0/go.mod h1:BI0uouVdmngYNUzGWeSYnokU+TrmwEsOqdt8Y6sso74= 
github.com/Azure/go-autorest/autorest/mocks v0.1.0/go.mod h1:OTyCOPRA2IgIlWxVYxBee2F5Gr4kF2zd2J5cFRaIDN0= github.com/Azure/go-autorest/autorest/mocks v0.2.0/go.mod h1:OTyCOPRA2IgIlWxVYxBee2F5Gr4kF2zd2J5cFRaIDN0= github.com/Azure/go-autorest/autorest/mocks v0.3.0/go.mod h1:a8FDP3DYzQ4RYfVAxAN3SVSiiO77gL2j2ronKKP0syM= +github.com/Azure/go-autorest/autorest/mocks v0.4.1 h1:K0laFcLE6VLTOwNgSxaGbUcLPuGXlNkbVvq4cW4nIHk= +github.com/Azure/go-autorest/autorest/mocks v0.4.1/go.mod h1:LTp+uSrOhSkaKrUy935gNZuuIPPVsHlr9DSOxSayd+k= +github.com/Azure/go-autorest/autorest/to v0.4.0 h1:oXVqrxakqqV1UZdSazDOPOLvOIz+XA683u8EctwboHk= +github.com/Azure/go-autorest/autorest/to v0.4.0/go.mod h1:fE8iZBn7LQR7zH/9XU2NcPR4o9jEImooCeWJcYV/zLE= github.com/Azure/go-autorest/logger v0.1.0 h1:ruG4BSDXONFRrZZJ2GUXDiUyVpayPmb1GnWeHDdaNKY= github.com/Azure/go-autorest/logger v0.1.0/go.mod h1:oExouG+K6PryycPJfVSxi/koC6LSNgds39diKLz7Vrc= +github.com/Azure/go-autorest/logger v0.2.0 h1:e4RVHVZKC5p6UANLJHkM4OfR1UKZPj8Wt8Pcx+3oqrE= +github.com/Azure/go-autorest/logger v0.2.0/go.mod h1:T9E3cAhj2VqvPOtCYAvby9aBXkZmbF5NWuPV8+WeEW8= github.com/Azure/go-autorest/tracing v0.5.0 h1:TRn4WjSnkcSy5AEG3pnbtFSwNtwzjr4VYyQflFE619k= github.com/Azure/go-autorest/tracing v0.5.0/go.mod h1:r/s2XiOKccPW3HrqB+W0TQzfbtp2fGCgRFtBroKn4Dk= -github.com/Microsoft/ApplicationInsights-Go v0.4.2 h1:HIZoGXMiKNwAtMAgCSSX35j9mP+DjGF9ezfBvxMDLLg= -github.com/Microsoft/ApplicationInsights-Go v0.4.2/go.mod h1:CukZ/G66zxXtI+h/VcVn3eVVDGDHfXM2zVILF7bMmsg= +github.com/Azure/go-autorest/tracing v0.6.0 h1:TYi4+3m5t6K48TGI9AUdb+IzbnSxvnvUMfuitfgcfuo= +github.com/Azure/go-autorest/tracing v0.6.0/go.mod h1:+vhtPC754Xsa23ID7GlGsrdKBpUA79WCAKPPZVC2DeU= +github.com/BurntSushi/toml v0.3.1 h1:WXkYYl6Yr3qBf1K79EBnL4mak0OimBfB0XUf9Vl28OQ= +github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= +github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo= 
+github.com/NYTimes/gziphandler v0.0.0-20170623195520-56545f4a5d46/go.mod h1:3wb06e3pkSAbeQ52E9H9iFoQsEEwGN64994WTCIhntQ= +github.com/PuerkitoBio/purell v1.1.1/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbtSwDGJws/X0= +github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE= +github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY= +github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= +github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI= +github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI= +github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU= +github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= +github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= +github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= +github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/dgrijalva/jwt-go v3.2.0+incompatible h1:7qlOGliEKZXTDg6OTjfoBKDXWrumCAMpl/TFQ4/5kLM= github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ= github.com/dimchansky/utfbom v1.1.0 h1:FcM3g+nofKgUteL8dm/UpdRXNC9KmADgTpLKsu0TRo4= github.com/dimchansky/utfbom v1.1.0/go.mod h1:rO41eb7gLfo8SF1jd9F8HplJm1Fewwi4mQvIirEdv+8= +github.com/dnaeon/go-vcr v1.2.0 h1:zHCHvJYTMh1N7xnV7zf1m1GPBF9Ad0Jk/whtQ1663qI= +github.com/dnaeon/go-vcr v1.2.0/go.mod h1:R4UdLID7HZT3taECzJs4YgbbH6PIGXB6W/sc5OLb6RQ= +github.com/docopt/docopt-go v0.0.0-20180111231733-ee0de3bc6815/go.mod h1:WwZ+bS3ebgob9U8Nd0kOddGdZWjyMGR8Wziv+TBNwSE= 
+github.com/elazarl/goproxy v0.0.0-20180725130230-947c36da3153/go.mod h1:/Zj4wYkgs4iZTTu3o/KG3Itv/qCCa8VVMlb3i9OVuzc= +github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs= +github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4= +github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c= +github.com/evanphx/json-patch v4.9.0+incompatible/go.mod h1:50XU6AFN0ol/bzJsmQLiYLvXMP4fmwYFNcr97nuDLSk= github.com/fluent/fluent-bit-go v0.0.0-20171103221316-c4a158a6e3a7 h1:mck6KdLX2FTh2/ZD27dK69ehWDZR4hCk+nLf+HvAbDk= github.com/fluent/fluent-bit-go v0.0.0-20171103221316-c4a158a6e3a7/go.mod h1:JVF1Nl3QOPpKTR8xDjhkm0xINYUX0z4XdJvOpIUF+Eo= +github.com/form3tech-oss/jwt-go v3.2.2+incompatible h1:TcekIExNqud5crz4xD2pavyTgWiPvpYe4Xau31I0PRk= +github.com/form3tech-oss/jwt-go v3.2.2+incompatible/go.mod h1:pbq4aXjuKjdthFRnoDwaVPLA+WlJuPGy+QneDUgJi2k= +github.com/fsnotify/fsnotify v1.4.7 h1:IXs+QLmnXW2CcXuY+8Mzv/fWEsPGWxqefPtCP5CnV9I= github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= -github.com/ghodss/yaml v0.0.0-20150909031657-73d445a93680 h1:ZktWZesgun21uEDrwW7iEV1zPCGQldM2atlJZ3TdvVM= -github.com/ghodss/yaml v0.0.0-20150909031657-73d445a93680/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= -github.com/gogo/protobuf v0.0.0-20170330071051-c0656edd0d9e h1:ago6fNuQ6IhszPsXkeU7qRCyfsIX7L67WDybsAPkLl8= -github.com/gogo/protobuf v0.0.0-20170330071051-c0656edd0d9e/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ= -github.com/golang/glog v0.0.0-20141105023935-44145f04b68c h1:CbdkBQ1/PiAo0FYJhQGwASD8wrgNvTdf01g6+O9tNuA= -github.com/golang/glog v0.0.0-20141105023935-44145f04b68c/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q= -github.com/golang/protobuf v1.1.0 h1:0iH4Ffd/meGoXqF2lSAhZHt8X+cPgkfn/cb6Cce5Vpc= -github.com/golang/protobuf v1.1.0/go.mod 
h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= +github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1/go.mod h1:vR7hzQXu2zJy9AVAgeJqvqgH9Q5CA+iKCZ2gyEVpxRU= +github.com/go-gl/glfw/v3.3/glfw v0.0.0-20191125211704-12ad95a8df72/go.mod h1:tQ2UAYgL5IevRw8kRxooKSPJfGvJ9fJQFa0TUsXzTg8= +github.com/go-gl/glfw/v3.3/glfw v0.0.0-20200222043503-6f7a984d4dc4/go.mod h1:tQ2UAYgL5IevRw8kRxooKSPJfGvJ9fJQFa0TUsXzTg8= +github.com/go-logr/logr v0.1.0/go.mod h1:ixOQHD9gLJUVQQ2ZOR7zLEifBX6tGkNJF4QyIY7sIas= +github.com/go-logr/logr v0.4.0 h1:K7/B1jt6fIBQVd4Owv2MqGQClcgf0R266+7C/QjRcLc= +github.com/go-logr/logr v0.4.0/go.mod h1:z6/tIYblkpsD+a4lm/fGIIU9mZ+XfAiaFtq7xTgseGU= +github.com/go-openapi/jsonpointer v0.19.2/go.mod h1:3akKfEdA7DF1sugOqz1dVQHBcuDBPKZGEoHC/NkiQRg= +github.com/go-openapi/jsonpointer v0.19.3/go.mod h1:Pl9vOtqEWErmShwVjC8pYs9cog34VGT37dQOVbmoatg= +github.com/go-openapi/jsonreference v0.19.2/go.mod h1:jMjeRr2HHw6nAVajTXJ4eiUwohSTlpa0o73RUL1owJc= +github.com/go-openapi/jsonreference v0.19.3/go.mod h1:rjx6GuL8TTa9VaixXglHmQmIL98+wF9xc8zWvFonSJ8= +github.com/go-openapi/spec v0.19.3/go.mod h1:FpwSN1ksY1eteniUU7X0N/BgJ7a4WvBFVA8Lj9mJglo= +github.com/go-openapi/swag v0.19.2/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk= +github.com/go-openapi/swag v0.19.5/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk= +github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q= +github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q= +github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q= +github.com/golang/groupcache v0.0.0-20190702054246-869f871628b6/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= +github.com/golang/groupcache v0.0.0-20191227052852-215e87163ea7/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= +github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= +github.com/golang/mock 
v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= +github.com/golang/mock v1.2.0/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= +github.com/golang/mock v1.3.1/go.mod h1:sBzyDLLjw3U8JLTeZvSv8jJB+tU5PVekmnlKIyFUx0Y= +github.com/golang/mock v1.4.0/go.mod h1:UOMv5ysSaYNkG+OFQykRIcU/QvvxJf3p21QfJ2Bt3cw= +github.com/golang/mock v1.4.1 h1:ocYkMQY5RrXTYgXl7ICpV0IXwlEQGwKIsery4gyXa1U= +github.com/golang/mock v1.4.1/go.mod h1:UOMv5ysSaYNkG+OFQykRIcU/QvvxJf3p21QfJ2Bt3cw= github.com/golang/protobuf v1.2.0 h1:P3YflyNX/ehuJFLhxviNdFxQPkGK5cDcApsge1SqnvM= github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= -github.com/google/btree v0.0.0-20160524151835-7d79101e329e h1:JHB7F/4TJCrYBW8+GZO8VkWDj1jxcWuCl6uxKODiyi4= -github.com/google/btree v0.0.0-20160524151835-7d79101e329e/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= -github.com/google/gofuzz v0.0.0-20161122191042-44d81051d367 h1:ScAXWS+TR6MZKex+7Z8rneuSJH+FSDqd6ocQyl+ZHo4= -github.com/google/gofuzz v0.0.0-20161122191042-44d81051d367/go.mod h1:HP5RmnzzSNb993RKQDq4+1A4ia9nllfqcQFTQJedwGI= +github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= +github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= +github.com/golang/protobuf v1.3.3/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw= +github.com/golang/protobuf v1.3.4/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw= +github.com/golang/protobuf v1.4.0-rc.1/go.mod h1:ceaxUfeHdC40wWswd/P6IGgMaK3YpKi5j83Wpe3EHw8= +github.com/golang/protobuf v1.4.0-rc.1.0.20200221234624-67d41d38c208/go.mod h1:xKAWHe0F5eneWXFV3EuXVDTCmh+JuBKY0li0aMyXATA= +github.com/golang/protobuf v1.4.0-rc.2/go.mod h1:LlEzMj4AhA7rCAGe4KMBDvJI+AwstrUpVNzEA03Pprs= +github.com/golang/protobuf v1.4.0-rc.4.0.20200313231945-b860323f09d0/go.mod h1:WU3c8KckQ9AFe+yFwt9sWVRKCVIyN9cPHBJSNnbL67w= +github.com/golang/protobuf v1.4.0/go.mod h1:jodUvKwWbYaEsadDk5Fwe5c77LiNKVO9IDvqG2KuDX0= 
+github.com/golang/protobuf v1.4.1/go.mod h1:U8fpvMrcmy5pZrNK1lt4xCsGvpyWQ/VVv6QDs8UjoX8= +github.com/golang/protobuf v1.4.3 h1:JjCZWpVbqXDqFVmTfYWEVTMIYrL/NPdPSCHPJ0T/raM= +github.com/golang/protobuf v1.4.3/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI= +github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= +github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= +github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M= +github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU= +github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU= +github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= +github.com/google/go-cmp v0.5.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= +github.com/google/go-cmp v0.5.2 h1:X2ev0eStA3AbceY54o37/0PQ/UWqKEiiO2dKL5OPaFM= +github.com/google/go-cmp v0.5.2/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= +github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= +github.com/google/gofuzz v1.1.0 h1:Hsa8mG0dQ46ij8Sl2AYJDUv1oA9/d6Vk+3LG99Oe02g= +github.com/google/gofuzz v1.1.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= +github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs= +github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc= +github.com/google/pprof v0.0.0-20190515194954-54271f7e092f/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc= +github.com/google/pprof v0.0.0-20191218002539-d4f498aebedc/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM= +github.com/google/pprof v0.0.0-20200212024743-f11f1df84d12/go.mod h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM= +github.com/google/pprof v0.0.0-20200229191704-1ebb73c60ed3/go.mod 
h1:ZgVRPoUq/hfqzAqh7sHMqb3I9Rq5C59dIz2SbBwJ4eM= +github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI= github.com/google/uuid v1.1.1 h1:Gkbcsh/GbpXz7lPftLA3P6TYMwjCLYm83jiFQZF/3gY= github.com/google/uuid v1.1.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= -github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d h1:7XGaL1e6bYS1yIonGp9761ExpPPV1ui0SAC59Yube9k= -github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d/go.mod h1:sJBsCZ4ayReDTBIg8b9dl28c5xFWyhBTVRp3pOg5EKY= -github.com/gregjones/httpcache v0.0.0-20170728041850-787624de3eb7 h1:6TSoaYExHper8PYsJu23GWVNOyYRCSnIFyxKgLSZ54w= -github.com/gregjones/httpcache v0.0.0-20170728041850-787624de3eb7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA= +github.com/google/uuid v1.1.2 h1:EVhdT+1Kseyi1/pUmXKaFxYsDNy9RQYkMWRH68J/W7Y= +github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= +github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg= +github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk= +github.com/googleapis/gnostic v0.4.1 h1:DLJCy1n/vrD4HPjOvYcT8aYQXpPIzoRZONaYwyycI+I= +github.com/googleapis/gnostic v0.4.1/go.mod h1:LRhVm6pbyptWbWbuZ38d1eyptfvIytN3ir6b65WBswg= +github.com/gorilla/websocket v1.4.2/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE= +github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA= +github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= +github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= +github.com/hpcloud/tail v1.0.0 h1:nfCOvKYfkgYP8hkirhJocXT2+zOD8yUNjXaWfTlyFKI= github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU= -github.com/json-iterator/go v0.0.0-20180612202835-f2b4162afba3 h1:/UewZcckqhvnnS0C6r3Sher2hSEbVmM6Ogpcjen08+Y= 
-github.com/json-iterator/go v0.0.0-20180612202835-f2b4162afba3/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU= +github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc= +github.com/imdario/mergo v0.3.5/go.mod h1:2EnlNZ0deacrJVfApfmtdGgDfMuh/nq6Ok1EcJh5FfA= +github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU= +github.com/json-iterator/go v1.1.10 h1:Kz6Cvnvv2wGdaG/V8yMvfkmNiXq9Ya2KUv4rouJJr68= +github.com/json-iterator/go v1.1.10/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= +github.com/jstemmer/go-junit-report v0.0.0-20190106144839-af01ea7f8024/go.mod h1:6v2b51hI/fHJwM22ozAgKL4VKDeJcHhJFhtBdhmNjmU= +github.com/jstemmer/go-junit-report v0.9.1/go.mod h1:Brl9GWCQeLvo8nXZwPNNblvFj/XSXhF0NWZEnDohbsk= +github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8= +github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck= github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= +github.com/kr/pretty v0.2.0/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI= github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= +github.com/kr/pty v1.1.5/go.mod h1:9r2w37qlBe7rQ6e1fg1S/9xpWHSnaqNdHD3WcMdbPDA= github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= +github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= +github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= +github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0SNc= github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw= +github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= +github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= 
github.com/mattn/go-ieproxy v0.0.0-20190610004146-91bb50d98149 h1:HfxbT6/JcvIljmERptWhwa8XzP7H3T+Z2N26gTsaDaA= github.com/mattn/go-ieproxy v0.0.0-20190610004146-91bb50d98149/go.mod h1:31jz6HNzdxOmlERGGEc4v/dMssOfmp2p5bT/okiKFFc= -github.com/microsoft/ApplicationInsights-Go v0.4.2 h1:LCv4NtCpXpsUF6ZUzZdpVG2x4RwebY7tiJUb25uYXiM= -github.com/microsoft/ApplicationInsights-Go v0.4.2/go.mod h1:DupRHRNoeuH4j8Yv3nux9/IXo3HZ0kO5A1ykNK4vR2E= github.com/microsoft/ApplicationInsights-Go v0.4.3 h1:gBuy5rM3o6Zo69QTkq1Ens8wx6sVf+mpgMjjfayiRcw= github.com/microsoft/ApplicationInsights-Go v0.4.3/go.mod h1:ih0t3h84PdzV1qGeUs89o9wL8eCuwf24M7TZp/nyqXk= github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y= github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0= +github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= +github.com/moby/spdystream v0.2.0/go.mod h1:f7i0iNDQJ059oMTcWxx8MA/zKFIuD/lY+0GqbN2Wy8c= +github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= -github.com/modern-go/reflect2 v0.0.0-20180320133207-05fbef0ca5da h1:ZQGIPjr1iTtUPXZFk8WShqb5G+Qg65VHFLtSvmHh+Mw= -github.com/modern-go/reflect2 v0.0.0-20180320133207-05fbef0ca5da/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= +github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= +github.com/modern-go/reflect2 v1.0.1 h1:9f412s+6RmYXLWZSEzVVgPGK7C2PphHj5RJrvfx9AWI= +github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= +github.com/modocache/gover v0.0.0-20171022184752-b58185e213c5/go.mod 
h1:caMODM3PzxT8aQXRPkAt8xlV/e7d7w8GM5g0fa5F0D8= +github.com/munnerz/goautoneg v0.0.0-20120707110453-a547fc61f48d/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ= +github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw= +github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e h1:fD57ERR4JtEqsWbfPhv4DMiApHyliiK5xCTNVSPiaAs= +github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno= +github.com/onsi/ginkgo v0.0.0-20170829012221-11459a886d9c/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= github.com/onsi/ginkgo v1.6.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= github.com/onsi/ginkgo v1.8.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= +github.com/onsi/ginkgo v1.11.0 h1:JAKSXpt1YjtLA7YpPiqO9ss6sNXEsPfSGdwN0UHqzrw= +github.com/onsi/ginkgo v1.11.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= +github.com/onsi/gomega v0.0.0-20170829124025-dcabb60a477c/go.mod h1:C1qb7wdrVGGVU+Z6iS04AVkA3Q65CEZX59MT0QO5uiA= github.com/onsi/gomega v1.5.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY= +github.com/onsi/gomega v1.7.0 h1:XPnZz8VVBHjVsy1vzJmRwIcSwiUO+JFfrv/xGiigmME= +github.com/onsi/gomega v1.7.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY= github.com/peterbourgon/diskv v2.0.1+incompatible h1:UBdAOUP5p4RWqPBg048CAvpKN+vxiaj6gdUUzhl4XmI= github.com/peterbourgon/diskv v2.0.1+incompatible/go.mod h1:uqqh8zWWbv1HBMNONnaR/tNboyR3/BZd58JJSHlUSCU= -github.com/philhofer/fwd v1.0.0 h1:UbZqGr5Y38ApvM/V/jEljVxwocdweyH+vmYvRPBnbqQ= -github.com/philhofer/fwd v1.0.0/go.mod h1:gk3iGcWd9+svBvR0sR+KPcfE+RNWozjowpeBVG3ZVNU= +github.com/philhofer/fwd v1.1.1 h1:GdGcTjf5RNAxwS4QLsiMzJYj5KEvPJD3Abr261yRQXQ= +github.com/philhofer/fwd v1.1.1/go.mod h1:gk3iGcWd9+svBvR0sR+KPcfE+RNWozjowpeBVG3ZVNU= +github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4= github.com/pkg/errors v0.9.1/go.mod 
h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= +github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= +github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA= +github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4= +github.com/satori/go.uuid v1.2.0 h1:0uYX9dsZ2yD7q2RtLRtPSdGDWzjeM3TbMJP9utgA0ww= github.com/satori/go.uuid v1.2.0/go.mod h1:dA0hQrYB0VpLJoorglMZABFdXlWrHn1NEOzdhQKdks0= -github.com/satori/go.uuid v1.2.1-0.20181028125025-b2ce2384e17b h1:gQZ0qzfKHQIybLANtM3mBXNUtOfsCFXeTsnBqCsx1KM= -github.com/satori/go.uuid v1.2.1-0.20181028125025-b2ce2384e17b/go.mod h1:dA0hQrYB0VpLJoorglMZABFdXlWrHn1NEOzdhQKdks0= +github.com/spf13/afero v1.2.2/go.mod h1:9ZxEEn6pIJ8Rxe320qSDBk6AsU0r9pR7Q4OcevTdifk= +github.com/spf13/pflag v0.0.0-20170130214245-9ff6c6923cff/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= +github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA= +github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= +github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= +github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE= +github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= +github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= +github.com/stretchr/testify v1.6.1 h1:hDPOHmpOpP40lSULcqw7IrRb/u7w6RpDC9399XyoNd0= +github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= +github.com/tedsuo/ifrit v0.0.0-20180802180643-bea94bb476cc h1:LUUe4cdABGrIJAhl1P1ZpWY76AwukVszFdwkVFVLwIk= github.com/tedsuo/ifrit v0.0.0-20180802180643-bea94bb476cc/go.mod h1:eyZnKCc955uh98WQvzOm0dgAeLnf2O0Rz0LPoC5ze+0= github.com/tinylib/msgp v1.1.2 
h1:gWmO7n0Ys2RBEb7GPYB9Ujq8Mk5p2U08lRnmMcGy6BQ= github.com/tinylib/msgp v1.1.2/go.mod h1:+d+yLhGm8mzTaHzB+wgMYrodPfmZrzkirds8fDWklFE= github.com/ugorji/go v1.1.2-0.20180813092308-00b869d2f4a5 h1:JRe7Bc0YQq+x7Bm3p/LIBIb4aopsdr3H0KRKRI8g6oY= github.com/ugorji/go v1.1.2-0.20180813092308-00b869d2f4a5/go.mod h1:hnLbHMwcvSihnDhEfx2/BzKp2xb0Y+ErdfYcrs9tkJQ= -golang.org/x/crypto v0.0.0-20180222182404-49796115aa4b h1:/GxqO8gbyb+sNnviFY2IIMrtm8vGg6NEJDft68wJY/g= -golang.org/x/crypto v0.0.0-20180222182404-49796115aa4b/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= +github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= +github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= +go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU= +go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8= +go.opencensus.io v0.22.2/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw= +go.opencensus.io v0.22.3/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw= golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2 h1:VklqNMn3ovrHsnt90PveolxSbWFaJdECFbxSq0Mqo2M= golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= +golang.org/x/crypto v0.0.0-20190510104115-cbcb75029529/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= +golang.org/x/crypto v0.0.0-20190605123033-f99c8df09eb5/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= +golang.org/x/crypto v0.0.0-20190611184440-5c40567a22f8/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= +golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= golang.org/x/crypto v0.0.0-20191206172530-e9b2fee46413 h1:ULYEB3JvPRE/IfO+9uO7vKV/xzVTO7XPAwm8xbf4w2g= golang.org/x/crypto v0.0.0-20191206172530-e9b2fee46413/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto= -golang.org/x/crypto v0.0.0-20200220183623-bac4c82f6975 
h1:/Tl7pH94bvbAAHBdZJT947M/+gp0+CqQXDtMRC0fseo= -golang.org/x/crypto v0.0.0-20200220183623-bac4c82f6975/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto= -golang.org/x/crypto v0.0.0-20201216223049-8b5274cf687f h1:aZp0e2vLN4MToVqnjNEYEtrEA8RH8U8FN1CU7JgqsPU= -golang.org/x/crypto v0.0.0-20201216223049-8b5274cf687f/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I= -golang.org/x/net v0.0.0-20170809000501-1c05540f6879 h1:0rFa7EaCGdQPmZVbo9F7MNF65b8dyzS6EUnXjs9Cllk= -golang.org/x/net v0.0.0-20170809000501-1c05540f6879/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= +golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto= +golang.org/x/crypto v0.0.0-20201002170205-7f63de1d35b0/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto= +golang.org/x/crypto v0.0.0-20210220033148-5ea612d1eb83 h1:/ZScEX8SfEmUGRHs0gxpqteO5nfNW6axyZbBdw9A12g= +golang.org/x/crypto v0.0.0-20210220033148-5ea612d1eb83/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I= +golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= +golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= +golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8= +golang.org/x/exp v0.0.0-20190829153037-c13cbed26979/go.mod h1:86+5VVa7VpoJ4kLfm080zCjGlMRFzhUhsZKEZO7MGek= +golang.org/x/exp v0.0.0-20191030013958-a1ab85dbe136/go.mod h1:JXzH8nQsPlswgeRAPE3MuO9GYsAcnJvJ4vnMwN/5qkY= +golang.org/x/exp v0.0.0-20191129062945-2f5052295587/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4= +golang.org/x/exp v0.0.0-20191227195350-da58074b4299/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4= +golang.org/x/exp v0.0.0-20200119233911-0405dc783f0a/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4= +golang.org/x/exp v0.0.0-20200207192155-f17229e696bd/go.mod h1:J/WKrq2StrnmMY6+EHIKF9dgMWnmCNThgcyBT1FY9mM= 
+golang.org/x/exp v0.0.0-20200224162631-6cc2880d07d6/go.mod h1:3jZMyOhIsHpP37uCMkUooju7aAi5cS1Q23tOzKc+0MU= +golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js= +golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0= +golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= +golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU= +golang.org/x/lint v0.0.0-20190301231843-5614ed5bae6f/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= +golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= +golang.org/x/lint v0.0.0-20190409202823-959b441ac422/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= +golang.org/x/lint v0.0.0-20190909230951-414d861bb4ac/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= +golang.org/x/lint v0.0.0-20190930215403-16217165b5de/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= +golang.org/x/lint v0.0.0-20191125180803-fdd1cda4f05f/go.mod h1:5qLYkcX4OjUUV8bRuDixDT3tpyyb+LUpUlRWLxfhWrs= +golang.org/x/lint v0.0.0-20200130185559-910be7a94367/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY= +golang.org/x/lint v0.0.0-20200302205851-738671d3881b/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY= +golang.org/x/mobile v0.0.0-20190312151609-d3739f865fa6/go.mod h1:z+o9i4GpDbdi3rU15maQ/Ox0txvL9dWGYEHz965HBQE= +golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028/go.mod h1:E/iHnbuqvinMTCcRqshq8CkpyQDoeVncDDYHnLhea+o= +golang.org/x/mod v0.0.0-20190513183733-4bf6d317e70e/go.mod h1:mXi4GBBbnImb6dmsKGUJ2LatrhH/nqhxcFungHvyanc= +golang.org/x/mod v0.1.0/go.mod h1:0QHyrYULN0/3qlju5TqG8bIK38QM8yzMo5ekMj3DlcY= +golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg= +golang.org/x/mod v0.1.1-0.20191107180719-034126e5016b/go.mod 
h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg= +golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= +golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= +golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= +golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= +golang.org/x/net v0.0.0-20190108225652-1e06a53dbb7e/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= +golang.org/x/net v0.0.0-20190213061140-3a22650c66bd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= +golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= -golang.org/x/net v0.0.0-20200421231249-e086a090c8fd h1:QPwSajcTUrFriMF1nJ3XzgoqakqQEsnZf9LdXdi2nkI= -golang.org/x/net v0.0.0-20200421231249-e086a090c8fd/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A= +golang.org/x/net v0.0.0-20190501004415-9ce7a6920f09/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= +golang.org/x/net v0.0.0-20190503192946-f4e77d36d62c/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= +golang.org/x/net v0.0.0-20190603091049-60506f45cf65/go.mod h1:HSz+uSET+XFnRR8LxR5pz3Of3rY3CfYBVs4xY44aLks= +golang.org/x/net v0.0.0-20190613194153-d28f0bde5980/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20190724013045-ca1201d0de80/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20190827160401-ba9fcec4b297/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20191209160850-c0dbc17a3553/go.mod 
h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200114155413-6afb5195e5aa/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200202094626-16171245cfb2/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200222125558-5a598a2470a0/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200301022130-244492dfa37a/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= +golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A= +golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU= +golang.org/x/net v0.0.0-20210224082022-3d97a244fca7 h1:OgUuv8lsRpBibGNbSizVwKWlysjaNzmC9gYMhPVfqFM= +golang.org/x/net v0.0.0-20210224082022-3d97a244fca7/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg= +golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= +golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= +golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= +golang.org/x/oauth2 v0.0.0-20191202225959-858c2ad4c8b6/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= +golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d h1:TzXSXBo42m9gQenoE3b9BGiEpg5IG2JkU5FkPIawgtw= +golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= -golang.org/x/sys v0.0.0-20171031081856-95c657629925 h1:nCH33NboKIsT4HoXBsXTWX8ul303HxWgkc5s2Ezwacg= -golang.org/x/sys v0.0.0-20171031081856-95c657629925/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 
+golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20190227155943-e225da77a7e6/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= +golang.org/x/sys v0.0.0-20190312061237-fead79001313/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190502145724-3ef323f4f1fd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190507160741-ecd444e8653b/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190606165138-5da285871e9c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190616124812-15dcb6c0061f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190624142023-c5567b49c5d0/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20190726091711-fc99dfbffb4e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20191001151750-bb3f8db39f24/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys 
v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20191204072324-ce4227a45e2e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20191228213918-04cbcbbfeed8/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200113162924-86b910548bc1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200122134326-e047566fdf82/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200202164722-d101bd2416d5/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200212091648-12a6c2dcc1e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200223170610-d5e6a3e2c0ae/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200302150141-5c8b2ff67527/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd h1:xhmwyvizuTgC2qz7ZlMluP20uW+C3Rm0FD/WLDX8884= golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= +golang.org/x/sys v0.0.0-20210225134936-a50acf3fe073 h1:8qxJSnu+7dRq6upnbntrmriWByIakBuct5OM/MdQC1M= +golang.org/x/sys v0.0.0-20210225134936-a50acf3fe073/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/term v0.0.0-20201117132131-f5c789dd3221 h1:/ZHdbVpdR/jk3g30/d4yUL0JU9kksj8+F/bnQUVLGDM= golang.org/x/term v0.0.0-20201117132131-f5c789dd3221/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw= -golang.org/x/text v0.0.0-20170810154203-b19bf474d317 h1:WKW+OPdYPlvOTVGHuMfjnIC6yY2SI93yFB0pZ7giBmQ= -golang.org/x/text v0.0.0-20170810154203-b19bf474d317/go.mod 
h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= +golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= +golang.org/x/term v0.0.0-20210220032956-6a3ed077a48d h1:SZxvLBoTP5yHO3Frd4z4vrF+DBX9vMVanchswa69toE= +golang.org/x/term v0.0.0-20210220032956-6a3ed077a48d/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= +golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= -golang.org/x/time v0.0.0-20161028155119-f51c12702a4d h1:TnM+PKb3ylGmZvyPXmo9m/wktg7Jn/a/fNmr33HSj8g= -golang.org/x/time v0.0.0-20161028155119-f51c12702a4d/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= +golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk= +golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= +golang.org/x/text v0.3.4 h1:0YWbFKbhXG/wIiuHDSKpS0Iy7FSA+u45VtBMfQcFTTc= +golang.org/x/text v0.3.4/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= +golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/time v0.0.0-20210220033141-f8bda1e9f3ba h1:O8mE0/t419eoIwhTFpKVkHiTs/Igowgfkj25AcZrtiE= +golang.org/x/time v0.0.0-20210220033141-f8bda1e9f3ba/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= +golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod 
h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= +golang.org/x/tools v0.0.0-20190226205152-f727befe758c/go.mod h1:9Yl7xja0Znq3iFh3HoIrodX9oNMXvdceNzlUR8zjMvY= +golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= +golang.org/x/tools v0.0.0-20190312151545-0bb0c0a6e846/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= +golang.org/x/tools v0.0.0-20190312170243-e65039ee4138/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= +golang.org/x/tools v0.0.0-20190425150028-36563e24a262/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q= +golang.org/x/tools v0.0.0-20190506145303-2d16b83fe98c/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q= +golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q= +golang.org/x/tools v0.0.0-20190606124116-d0a3d012864b/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= +golang.org/x/tools v0.0.0-20190614205625-5aca471b1d59/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= +golang.org/x/tools v0.0.0-20190621195816-6e04913cbbac/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= +golang.org/x/tools v0.0.0-20190628153133-6cdbf07be9d0/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= +golang.org/x/tools v0.0.0-20190816200558-6889da9d5479/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20190911174233-4f2ddba30aff/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191012152004-8de300cfc20a/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191113191852-77e3bb0ad9e7/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191115202509-3a792d9c32b2/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191125144606-a911d9008d1f/go.mod 
h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191130070609-6e064ea0cf2d/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= +golang.org/x/tools v0.0.0-20191216173652-a0e659d51361/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20191227053925-7b8e75db28f4/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200117161641-43d50277825c/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200122220014-bf1340f18c4a/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200130002326-2f3ba24bd6e7/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200204074204-1cc6d1ef6c74/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200207183749-b753a1ba74fa/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200212150539-ea181f53ac56/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200224181240-023911ca70b2/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28= +golang.org/x/tools v0.0.0-20200304193943-95d2e580d8eb/go.mod h1:o4KQGtdN14AW+yjsvvwRTJJuXz8XRtIHtEnmAXLyFUw= +golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE= +golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA= +golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= +golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= +golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= +golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 h1:go1bK/D/BFZV2I8cIQd1NKEZ+0owSTG1fDTci4IqFcE= +golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod 
h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= +google.golang.org/api v0.4.0/go.mod h1:8k5glujaEP+g9n7WNsDg8QP6cUVNI86fCNMcbazEtwE= +google.golang.org/api v0.7.0/go.mod h1:WtwebWUNSVBH/HAw79HIFXZNqEvBhG+Ra+ax0hx3E3M= +google.golang.org/api v0.8.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg= +google.golang.org/api v0.9.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg= +google.golang.org/api v0.13.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI= +google.golang.org/api v0.14.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI= +google.golang.org/api v0.15.0/go.mod h1:iLdEw5Ide6rF15KTC1Kkl0iskquN2gFfn9o9XIsbkAI= +google.golang.org/api v0.17.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE= +google.golang.org/api v0.18.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE= +google.golang.org/api v0.20.0/go.mod h1:BwFmGc8tA3vsd7r/7kR8DY7iEEGSU04BFxCo5jP/sfE= +google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM= +google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4= +google.golang.org/appengine v1.5.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4= +google.golang.org/appengine v1.6.1/go.mod h1:i06prIuMbXzDqacNJfV5OdTW448YApPu5ww/cMBSeb0= +google.golang.org/appengine v1.6.5 h1:tycE03LOZYQNhDpS27tcQdAzLCVMaj7QT2SXxebnpCM= +google.golang.org/appengine v1.6.5/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc= +google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc= +google.golang.org/genproto v0.0.0-20190307195333-5fe7a883aa19/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= +google.golang.org/genproto v0.0.0-20190418145605-e7d98fc518a7/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= +google.golang.org/genproto v0.0.0-20190425155659-357c62f0e4bb/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= +google.golang.org/genproto v0.0.0-20190502173448-54afdca5d873/go.mod 
h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= +google.golang.org/genproto v0.0.0-20190801165951-fa694d86fc64/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc= +google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc= +google.golang.org/genproto v0.0.0-20190911173649-1774047e7e51/go.mod h1:IbNlFCBrqXvoKpeg0TB2l7cyZUmoaFKYIwrEpbDKLA8= +google.golang.org/genproto v0.0.0-20191108220845-16a3f7862a1a/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20191115194625-c23dd37a84c9/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20191216164720-4f79533eabd1/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20191230161307-f3c370f40bfb/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20200115191322-ca5a22157cba/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20200122232147-0452cf42e150/go.mod h1:n3cpQtvxv34hfy77yVDNjmbRyujviMdxYliBSkLhpCc= +google.golang.org/genproto v0.0.0-20200204135345-fa8e72b47b90/go.mod h1:GmwEX6Z4W5gMy59cAlVYjN9JhxgbQH6Gn+gFDQe2lzA= +google.golang.org/genproto v0.0.0-20200212174721-66ed5ce911ce/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c= +google.golang.org/genproto v0.0.0-20200224152610-e50cd9704f63/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c= +google.golang.org/genproto v0.0.0-20200305110556-506484158171/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c= +google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013/go.mod h1:NbSheEEYHJ7i3ixzK3sjbqSGDJWnxyFXZblF3eUsNvo= +google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c= +google.golang.org/grpc v1.20.1/go.mod h1:10oTOabMzJvdu6/UiuZezV6QK5dSlG84ov/aaiqXj38= +google.golang.org/grpc v1.21.1/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM= +google.golang.org/grpc v1.23.0/go.mod 
h1:Y5yQAOtifL1yxbo5wqy6BxZv8vAUGQwXBOALyacEbxg= +google.golang.org/grpc v1.26.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk= +google.golang.org/grpc v1.27.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk= +google.golang.org/grpc v1.27.1/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk= +google.golang.org/protobuf v0.0.0-20200109180630-ec00e32a8dfd/go.mod h1:DFci5gLYBciE7Vtevhsrf46CRTquxDuWsQurQQe4oz8= +google.golang.org/protobuf v0.0.0-20200221191635-4d8936d0db64/go.mod h1:kwYJMbMJ01Woi6D6+Kah6886xMZcty6N08ah7+eCXa0= +google.golang.org/protobuf v0.0.0-20200228230310-ab0ca4ff8a60/go.mod h1:cfTl7dwQJ+fmap5saPgwCLgHXTUD7jkjRqWcaiX5VyM= +google.golang.org/protobuf v1.20.1-0.20200309200217-e05f789c0967/go.mod h1:A+miEFZTKqfCUM6K7xSMQL9OKL/b6hQv+e19PK+JZNE= +google.golang.org/protobuf v1.21.0/go.mod h1:47Nbq4nVaFHyn7ilMalzfO3qCViNmqZ2kzikPIcrTAo= +google.golang.org/protobuf v1.22.0/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU= +google.golang.org/protobuf v1.23.0/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU= +google.golang.org/protobuf v1.23.1-0.20200526195155-81db48ad09cc/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU= +google.golang.org/protobuf v1.25.0 h1:Ejskq+SyPohKW+1uil0JJMtmHCgJPJ/qWTxr8qp+R4c= +google.golang.org/protobuf v1.25.0/go.mod h1:9JNX74DMeImyA3h4bdi1ymwjUzf21/xIlbajtzgsN7c= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= +gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= +gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f h1:BLraFXnmrev5lT+xlilqcH8XK9/i0At2xKjWk4p6zsU= +gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 
+gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI= +gopkg.in/fsnotify.v1 v1.4.7 h1:xOHLXZwVvI9hhs+cLKq5+I5onOuwQLhQwiu63xxlHs4= gopkg.in/fsnotify.v1 v1.4.7/go.mod h1:Tz8NjZHkW78fSQdbUxIjBTcgA1z1m8ZHf0WmKUhAMys= -gopkg.in/inf.v0 v0.9.0 h1:3zYtXIO92bvsdS3ggAdA8Gb4Azj0YU+TVY1uGYNFA8o= -gopkg.in/inf.v0 v0.9.0/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw= +gopkg.in/inf.v0 v0.9.1 h1:73M5CoZyi3ZLMOyDlQh031Cx6N9NDJ2Vvfl76EDAgDc= +gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw= gopkg.in/natefinch/lumberjack.v2 v2.0.0-20170531160350-a96e63847dc3 h1:AFxeG48hTWHhDTQDk/m2gorfVHUEa9vo3tp3D7TzwjI= gopkg.in/natefinch/lumberjack.v2 v2.0.0-20170531160350-a96e63847dc3/go.mod h1:l0ndWWf7gzL7RNwBG7wST/UCcT4T24xpD6X8LsfU/+k= +gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 h1:uRGJdciOHaEIrze2W8Q3AKkepLTh2hOroT7a+7czfdQ= gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw= -gopkg.in/yaml.v2 v2.0.0-20170721113624-670d4cfef054 h1:ROF+R/wHHruzF40n5DfPv2jwm7rCJwvs8fz+RTZWjLE= -gopkg.in/yaml.v2 v2.0.0-20170721113624-670d4cfef054/go.mod h1:JAlM8MvJe8wmxCU4Bli9HhUf9+ttbYbLASfIpnQbh74= gopkg.in/yaml.v2 v2.2.1 h1:mUhvW9EsL+naU5Q3cakzfE91YhliOondGd6ZrsDBHQE= gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= -k8s.io/api v0.0.0-20180628040859-072894a440bd h1:HzgYeLDS1jLxw8DGr68KJh9cdQ5iZJizG0HZWstIhfQ= -k8s.io/api v0.0.0-20180628040859-072894a440bd/go.mod h1:iuAfoD4hCxJ8Onx9kaTIt30j7jUFS00AXQi6QMi99vA= -k8s.io/apimachinery v0.0.0-20180621070125-103fd098999d h1:MZjlsu9igBoVPZkXpIGoxI6EonqNsXXZU7hhvfQLkd4= -k8s.io/apimachinery v0.0.0-20180621070125-103fd098999d/go.mod h1:ccL7Eh7zubPUSh9A3USN90/OzHNSVN6zxzde07TDCL0= -k8s.io/client-go v8.0.0+incompatible h1:tTI4hRmb1DRMl4fG6Vclfdi6nTM82oIrTT7HfitmxC4= -k8s.io/client-go v8.0.0+incompatible/go.mod h1:7vJpHMYJwNQCWgzmNV+VYUl1zCObLyodBc8nIyt8L5s= +gopkg.in/yaml.v2 v2.2.2/go.mod 
h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= +gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= +gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY= +gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ= +gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c h1:dUUwHk2QECo/6vqA44rthZ8ie2QXMNeKRTHCNY2nXvo= +gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= +honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= +honnef.co/go/tools v0.0.0-20190106161140-3f1c8253044a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= +honnef.co/go/tools v0.0.0-20190418001031-e561f6794a2a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= +honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= +honnef.co/go/tools v0.0.1-2019.2.3/go.mod h1:a3bituU0lyd329TUQxRnasdCoJDkEUEAqEt0JzvZhAg= +honnef.co/go/tools v0.0.1-2020.1.3/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9vFzvIQ3k= +k8s.io/api v0.21.0 h1:gu5iGF4V6tfVCQ/R+8Hc0h7H1JuEhzyEi9S4R5LM8+Y= +k8s.io/api v0.21.0/go.mod h1:+YbrhBBGgsxbF6o6Kj4KJPJnBmAKuXDeS3E18bgHNVU= +k8s.io/apimachinery v0.21.0 h1:3Fx+41if+IRavNcKOz09FwEXDBG6ORh6iMsTSelhkMA= +k8s.io/apimachinery v0.21.0/go.mod h1:jbreFvJo3ov9rj7eWT7+sYiRx+qZuCYXwWT1bcDswPY= +k8s.io/client-go v0.21.0 h1:n0zzzJsAQmJngpC0IhgFcApZyoGXPrDIAD601HD09ag= +k8s.io/client-go v0.21.0/go.mod h1:nNBytTF9qPFDEhoqgEPaarobC8QPae13bElIVHzIglA= +k8s.io/gengo v0.0.0-20200413195148-3a45101e95ac/go.mod h1:ezvh/TsK7cY6rbqRK0oQQ8IAqLxYwwyPxAX1Pzy0ii0= +k8s.io/klog/v2 v2.0.0/go.mod h1:PBfzABfn139FHAV07az/IF9Wp1bkk3vpT2XSJ76fSDE= +k8s.io/klog/v2 v2.8.0 h1:Q3gmuM9hKEjefWFFYF0Mat+YyFJvsUyYuwyNNJ5C9Ts= +k8s.io/klog/v2 v2.8.0/go.mod h1:hy9LJ/NvuK+iVyP4Ehqva4HxZG/oXyIS3n3Jmire4Ec= +k8s.io/kube-openapi v0.0.0-20210305001622-591a79e4bda7/go.mod 
h1:wXW5VT87nVfh/iLV8FpR2uDvrFyomxbtb1KivDbvPTE= +k8s.io/utils v0.0.0-20201110183641-67b214c5f920 h1:CbnUZsM497iRC5QMVkHwyl8s2tB3g7yaSHkYPkpgelw= +k8s.io/utils v0.0.0-20201110183641-67b214c5f920/go.mod h1:jPW/WVKK9YHAvNhRxK0md/EJ228hCsBRufyofKtW8HA= +rsc.io/binaryregexp v0.2.0/go.mod h1:qTv7/COck+e2FymRvadv62gMdZztPaShugOCi3I+8D8= +rsc.io/quote/v3 v3.1.0/go.mod h1:yEA65RcK8LyAZtP9Kv3t0HmxON59tX3rD+tICJqUlj0= +rsc.io/sampler v1.3.0/go.mod h1:T1hPZKmBbMNahiBKFy5HrXp6adAjACjK9JXDnKaTXpA= +sigs.k8s.io/structured-merge-diff/v4 v4.0.2/go.mod h1:bJZC9H9iH24zzfZ/41RGcq60oK1F7G282QMXDPYydCw= +sigs.k8s.io/structured-merge-diff/v4 v4.1.0 h1:C4r9BgJ98vrKnnVCjwCSXcWjWe0NKcUQkmzDXZXGwH8= +sigs.k8s.io/structured-merge-diff/v4 v4.1.0/go.mod h1:bJZC9H9iH24zzfZ/41RGcq60oK1F7G282QMXDPYydCw= +sigs.k8s.io/yaml v1.2.0 h1:kr/MCeFWJWTwyaHoR9c8EjH9OumOmoF9YGiZd7lFm/Q= +sigs.k8s.io/yaml v1.2.0/go.mod h1:yfXDCHCao9+ENCvLSE62v9VSji2MKu5jeNfTrofGhJc= diff --git a/source/plugins/go/src/ingestion_token_utils.go b/source/plugins/go/src/ingestion_token_utils.go new file mode 100644 index 000000000..c96685042 --- /dev/null +++ b/source/plugins/go/src/ingestion_token_utils.go @@ -0,0 +1,516 @@ +package main + +import ( + "encoding/json" + "errors" + "fmt" + "io/ioutil" + "net/http" + "net/url" + "os" + "regexp" + "strconv" + "strings" + "time" +) + +const IMDSTokenPathForWindows = "c:/etc/imds-access-token/token" // only used in windows +const AMCSAgentConfigAPIVersion = "2020-08-01-preview" +const AMCSIngestionTokenAPIVersion = "2020-04-01-preview" +const MaxRetries = 3 + +var IMDSToken string +var IMDSTokenExpiration int64 + +var ConfigurationId string +var ChannelId string + +var IngestionAuthToken string +var IngestionAuthTokenExpiration int64 + +type IMDSResponse struct { + AccessToken string `json:"access_token"` + ClientID string `json:"client_id"` + ExpiresIn string `json:"expires_in"` + ExpiresOn string `json:"expires_on"` + ExtExpiresIn string `json:"ext_expires_in"` + NotBefore string 
`json:"not_before"` + Resource string `json:"resource"` + TokenType string `json:"token_type"` +} + +type AgentConfiguration struct { + Configurations []struct { + Configurationid string `json:"configurationId"` + Etag string `json:"eTag"` + Op string `json:"op"` + Content struct { + Datasources []struct { + Configuration struct { + Extensionname string `json:"extensionName"` + } `json:"configuration"` + ID string `json:"id"` + Kind string `json:"kind"` + Streams []struct { + Stream string `json:"stream"` + Solution string `json:"solution"` + Extensionoutputstream string `json:"extensionOutputStream"` + } `json:"streams"` + Sendtochannels []string `json:"sendToChannels"` + } `json:"dataSources"` + Channels []struct { + Endpoint string `json:"endpoint"` + ID string `json:"id"` + Protocol string `json:"protocol"` + } `json:"channels"` + Extensionconfigurations struct { + Containerinsights []struct { + ID string `json:"id"` + Originids []string `json:"originIds"` + Outputstreams struct { + LinuxPerfBlob string `json:"LINUX_PERF_BLOB"` + ContainerInventoryBlob string `json:"CONTAINER_INVENTORY_BLOB"` + ContainerLogBlob string `json:"CONTAINER_LOG_BLOB"` + ContainerinsightsContainerlogv2 string `json:"CONTAINERINSIGHTS_CONTAINERLOGV2"` + ContainerNodeInventoryBlob string `json:"CONTAINER_NODE_INVENTORY_BLOB"` + KubeEventsBlob string `json:"KUBE_EVENTS_BLOB"` + KubeHealthBlob string `json:"KUBE_HEALTH_BLOB"` + KubeMonAgentEventsBlob string `json:"KUBE_MON_AGENT_EVENTS_BLOB"` + KubeNodeInventoryBlob string `json:"KUBE_NODE_INVENTORY_BLOB"` + KubePodInventoryBlob string `json:"KUBE_POD_INVENTORY_BLOB"` + KubePvInventoryBlob string `json:"KUBE_PV_INVENTORY_BLOB"` + KubeServicesBlob string `json:"KUBE_SERVICES_BLOB"` + InsightsMetricsBlob string `json:"INSIGHTS_METRICS_BLOB"` + } `json:"outputStreams"` + } `json:"ContainerInsights"` + } `json:"extensionConfigurations"` + } `json:"content"` + } `json:"configurations"` +} + +type IngestionTokenResponse struct { + 
Configurationid string `json:"configurationId"` + Ingestionauthtoken string `json:"ingestionAuthToken"` +} + +func getAccessTokenFromIMDS() (string, int64, error) { + Log("Info getAccessTokenFromIMDS: start") + useIMDSTokenProxyEndPoint := os.Getenv("USE_IMDS_TOKEN_PROXY_END_POINT") + imdsAccessToken := "" + var responseBytes []byte + var err error + + if (useIMDSTokenProxyEndPoint != "" && strings.Compare(strings.ToLower(useIMDSTokenProxyEndPoint), "true") == 0) { + Log("Info Reading IMDS Access Token from IMDS Token proxy endpoint") + mcsEndpoint := os.Getenv("MCS_ENDPOINT") + msi_endpoint_string := fmt.Sprintf("http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://%s/", mcsEndpoint) + var msi_endpoint *url.URL + msi_endpoint, err := url.Parse(msi_endpoint_string) + if err != nil { + Log("getAccessTokenFromIMDS: Error creating IMDS endpoint URL: %s", err.Error()) + return imdsAccessToken, 0, err + } + req, err := http.NewRequest("GET", msi_endpoint.String(), nil) + if err != nil { + Log("getAccessTokenFromIMDS: Error creating HTTP request: %s", err.Error()) + return imdsAccessToken, 0, err + } + req.Header.Add("Metadata", "true") + + //IMDS endpoint nonroutable endpoint and requests doesnt go through proxy hence using dedicated http client + httpClient := &http.Client{Timeout: 30 * time.Second} + + // Call managed services for Azure resources token endpoint + var resp *http.Response = nil + IsSuccess := false + for retryCount := 0; retryCount < MaxRetries; retryCount++ { + resp, err = httpClient.Do(req) + if err != nil { + message := fmt.Sprintf("getAccessTokenFromIMDS: Error calling token endpoint: %s, retryCount: %d", err.Error(), retryCount) + Log(message) + SendException(message) + continue + } + + if resp != nil && resp.Body != nil { + defer resp.Body.Close() + } + + Log("getAccessTokenFromIMDS: IMDS Response Status: %d, retryCount: %d", resp.StatusCode, retryCount) + if IsRetriableError(resp.StatusCode) { + message := 
fmt.Sprintf("getAccessTokenFromIMDS: IMDS Request failed with an error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + retryDelay := time.Duration((retryCount + 1) * 100) * time.Millisecond + if resp.StatusCode == 429 { + if resp != nil && resp.Header.Get("Retry-After") != "" { + after, err := strconv.ParseInt(resp.Header.Get("Retry-After"), 10, 64) + if err == nil && after > 0 { + retryDelay = time.Duration(after) * time.Second + } + } + } + time.Sleep(retryDelay) + continue + } else if resp.StatusCode != 200 { + message := fmt.Sprintf("getAccessTokenFromIMDS: IMDS Request failed with nonretryable error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + SendException(message) + return imdsAccessToken, 0, err + } + IsSuccess = true + break // call succeeded, don't retry any more + } + if !IsSuccess || resp == nil || resp.Body == nil { + Log("getAccessTokenFromIMDS: IMDS Request ran out of retries") + return imdsAccessToken, 0, err + } + + // Pull out response body + responseBytes, err = ioutil.ReadAll(resp.Body) + if err != nil { + Log("getAccessTokenFromIMDS: Error reading response body: %s", err.Error()) + return imdsAccessToken, 0, err + } + + } else { + Log("Info Reading IMDS Access Token from file : %s", IMDSTokenPathForWindows) + if _, err = os.Stat(IMDSTokenPathForWindows); os.IsNotExist(err) { + Log("getAccessTokenFromIMDS: IMDS token file doesn't exist: %s", err.Error()) + return imdsAccessToken, 0, err + } + // retry in case we read the token file while it is being written + for retryCount := 0; retryCount < MaxRetries; retryCount++ { + responseBytes, err = ioutil.ReadFile(IMDSTokenPathForWindows) + if err != nil { + Log("getAccessTokenFromIMDS: Could not read IMDS token from file: %s, retryCount: %d", err.Error(), retryCount) + time.Sleep(time.Duration((retryCount + 1) * 100) * time.Millisecond) + continue + } + break + } + } + + if responseBytes == nil { +
Log("getAccessTokenFromIMDS: Error responseBytes is nil") + return imdsAccessToken, 0, err + } + + // Unmarshall response body into struct + var imdsResponse IMDSResponse + err = json.Unmarshal(responseBytes, &imdsResponse) + if err != nil { + Log("getAccessTokenFromIMDS: Error unmarshalling the response: %s", err.Error()) + return imdsAccessToken, 0, err + } + imdsAccessToken = imdsResponse.AccessToken + + expiration, err := strconv.ParseInt(imdsResponse.ExpiresOn, 10, 64) + if err != nil { + Log("getAccessTokenFromIMDS: Error parsing ExpiresOn field from IMDS response: %s", err.Error()) + return imdsAccessToken, 0, err + } + Log("Info getAccessTokenFromIMDS: end") + return imdsAccessToken, expiration, nil +} + +func getAgentConfiguration(imdsAccessToken string) (configurationId string, channelId string, err error) { + Log("Info getAgentConfiguration: start") + configurationId = "" + channelId = "" + var amcs_endpoint *url.URL + osType := os.Getenv("OS_TYPE") + resourceId := os.Getenv("AKS_RESOURCE_ID") + resourceRegion := os.Getenv("AKS_REGION") + mcsEndpoint := os.Getenv("MCS_ENDPOINT") + amcs_endpoint_string := fmt.Sprintf("https://%s.handler.control.%s%s/agentConfigurations?platform=%s&api-version=%s", resourceRegion, mcsEndpoint, resourceId, osType, AMCSAgentConfigAPIVersion) + amcs_endpoint, err = url.Parse(amcs_endpoint_string) + if err != nil { + Log("getAgentConfiguration: Error creating AMCS endpoint URL: %s", err.Error()) + return configurationId, channelId, err + } + + var bearer = "Bearer " + imdsAccessToken + // Create a new request using http + req, err := http.NewRequest("GET", amcs_endpoint.String(), nil) + if err != nil { + message := fmt.Sprintf("getAgentConfiguration: Error creating HTTP request for AMCS endpoint: %s", err.Error()) + Log(message) + return configurationId, channelId, err + } + req.Header.Set("Authorization", bearer) + + var resp *http.Response = nil + IsSuccess := false + for retryCount := 0; retryCount < MaxRetries; 
retryCount++ { + resp, err = HTTPClient.Do(req) + if err != nil { + message := fmt.Sprintf("getAgentConfiguration: Error calling AMCS endpoint: %s", err.Error()) + Log(message) + SendException(message) + continue + } + if resp != nil && resp.Body != nil { + defer resp.Body.Close() + } + Log("getAgentConfiguration Response Status: %d", resp.StatusCode) + if IsRetriableError(resp.StatusCode) { + message := fmt.Sprintf("getAgentConfiguration: Request failed with an error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + retryDelay := time.Duration((retryCount + 1) * 100) * time.Millisecond + if resp.StatusCode == 429 { + if resp != nil && resp.Header.Get("Retry-After") != "" { + after, err := strconv.ParseInt(resp.Header.Get("Retry-After"), 10, 64) + if err == nil && after > 0 { + retryDelay = time.Duration(after) * time.Second + } + } + } + time.Sleep(retryDelay) + continue + } else if resp.StatusCode != 200 { + message := fmt.Sprintf("getAgentConfiguration: Request failed with nonretryable error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + SendException(message) + return configurationId, channelId, err + } + IsSuccess = true + break // call succeeded, don't retry any more + } + if !IsSuccess || resp == nil || resp.Body == nil { + message := "getAgentConfiguration Request ran out of retries" + Log(message) + SendException(message) + return configurationId, channelId, err + } + responseBytes, err := ioutil.ReadAll(resp.Body) + if err != nil { + Log("getAgentConfiguration: Error reading response body from AMCS API call: %s", err.Error()) + return configurationId, channelId, err + } + + // Unmarshal response body into struct + var agentConfiguration AgentConfiguration + err = json.Unmarshal(responseBytes, &agentConfiguration) + if err != nil { + message := fmt.Sprintf("getAgentConfiguration: Error unmarshalling the response: %s", err.Error()) + Log(message) + SendException(message) + return
configurationId, channelId, err + } + + if len(agentConfiguration.Configurations) == 0 { + message := "getAgentConfiguration: Received empty agentConfiguration.Configurations array" + Log(message) + SendException(message) + return configurationId, channelId, err + } + + if len(agentConfiguration.Configurations[0].Content.Channels) == 0 { + message := "getAgentConfiguration: Received empty agentConfiguration.Configurations[0].Content.Channels" + Log(message) + SendException(message) + return configurationId, channelId, err + } + + configurationId = agentConfiguration.Configurations[0].Configurationid + channelId = agentConfiguration.Configurations[0].Content.Channels[0].ID + + Log("getAgentConfiguration: obtained configurationId: %s, channelId: %s", configurationId, channelId) + Log("Info getAgentConfiguration: end") + + return configurationId, channelId, nil +} + +func getIngestionAuthToken(imdsAccessToken string, configurationId string, channelId string) (ingestionAuthToken string, refreshInterval int64, err error) { + Log("Info getIngestionAuthToken: start") + ingestionAuthToken = "" + refreshInterval = 0 + var amcs_endpoint *url.URL + osType := os.Getenv("OS_TYPE") + resourceId := os.Getenv("AKS_RESOURCE_ID") + resourceRegion := os.Getenv("AKS_REGION") + mcsEndpoint := os.Getenv("MCS_ENDPOINT") + amcs_endpoint_string := fmt.Sprintf("https://%s.handler.control.%s%s/agentConfigurations/%s/channels/%s/issueIngestionToken?platform=%s&api-version=%s", resourceRegion, mcsEndpoint, resourceId, configurationId, channelId, osType, AMCSIngestionTokenAPIVersion) + amcs_endpoint, err = url.Parse(amcs_endpoint_string) + if err != nil { + Log("getIngestionAuthToken: Error creating AMCS endpoint URL: %s", err.Error()) + return ingestionAuthToken, refreshInterval, err + } + + var bearer = "Bearer " + imdsAccessToken + // Create a new request using http + req, err := http.NewRequest("GET", amcs_endpoint.String(), nil) + if err != nil { + Log("getIngestionAuthToken: Error 
creating HTTP request for AMCS endpoint: %s", err.Error()) + return ingestionAuthToken, refreshInterval, err + } + + // add authorization header to the req + req.Header.Add("Authorization", bearer) + + var resp *http.Response = nil + IsSuccess := false + for retryCount := 0; retryCount < MaxRetries; retryCount++ { + // Call managed services for Azure resources token endpoint + resp, err = HTTPClient.Do(req) + if err != nil { + message := fmt.Sprintf("getIngestionAuthToken: Error calling AMCS endpoint for ingestion auth token: %s", err.Error()) + Log(message) + SendException(message) + resp = nil + continue + } + + if resp != nil && resp.Body != nil { + defer resp.Body.Close() + } + + Log("getIngestionAuthToken Response Status: %d", resp.StatusCode) + if IsRetriableError(resp.StatusCode) { + message := fmt.Sprintf("getIngestionAuthToken: Request failed with an error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + retryDelay := time.Duration((retryCount + 1) * 100) * time.Millisecond + if resp.StatusCode == 429 { + if resp != nil && resp.Header.Get("Retry-After") != "" { + after, err := strconv.ParseInt(resp.Header.Get("Retry-After"), 10, 64) + if err == nil && after > 0 { + retryDelay = time.Duration(after) * time.Second + } + } + } + time.Sleep(retryDelay) + continue + } else if resp.StatusCode != 200 { + message := fmt.Sprintf("getIngestionAuthToken: Request failed with nonretryable error code: %d, retryCount: %d", resp.StatusCode, retryCount) + Log(message) + SendException(message) + return ingestionAuthToken, refreshInterval, err + } + IsSuccess = true + break + } + + if !IsSuccess || resp == nil || resp.Body == nil { + message := "getIngestionAuthToken: ran out of retries calling AMCS for ingestion token" + Log(message) + SendException(message) + return ingestionAuthToken, refreshInterval, err + } + + // Pull out response body + responseBytes, err := ioutil.ReadAll(resp.Body) + if err != nil { + Log("getIngestionAuthToken: Error reading
response body from AMCS Ingestion API call: %s", err.Error()) + return ingestionAuthToken, refreshInterval, err + } + + // Unmarshal response body into struct + var ingestionTokenResponse IngestionTokenResponse + err = json.Unmarshal(responseBytes, &ingestionTokenResponse) + if err != nil { + Log("getIngestionAuthToken: Error unmarshalling the response: %s", err.Error()) + return ingestionAuthToken, refreshInterval, err + } + + ingestionAuthToken = ingestionTokenResponse.Ingestionauthtoken + + refreshInterval, err = getTokenRefreshIntervalFromAmcsResponse(resp.Header) + if err != nil { + Log("getIngestionAuthToken: Error: failed to parse max-age response header") + return ingestionAuthToken, refreshInterval, err + } + Log("getIngestionAuthToken: refresh interval %d seconds", refreshInterval) + + Log("Info getIngestionAuthToken: end") + return ingestionAuthToken, refreshInterval, nil +} + +var cacheControlHeaderRegex = regexp.MustCompile(`max-age=([0-9]+)`) + +func getTokenRefreshIntervalFromAmcsResponse(header http.Header) (refreshInterval int64, err error) { + cacheControlHeader, valueInMap := header["Cache-Control"] + if !valueInMap { + return 0, errors.New("getTokenRefreshIntervalFromAmcsResponse: Cache-Control not in passed header") + } + + for _, entry := range cacheControlHeader { + match := cacheControlHeaderRegex.FindStringSubmatch(entry) + if len(match) == 2 { + interval := 0 + interval, err = strconv.Atoi(match[1]) + if err != nil { + Log("getTokenRefreshIntervalFromAmcsResponse: error getting timeout from auth token.
Header: " + strings.Join(cacheControlHeader, ",")) + return 0, err + } + refreshInterval = int64(interval) + return refreshInterval, nil + } + } + + return 0, errors.New("getTokenRefreshIntervalFromAmcsResponse: didn't find max-age in response header") +} + +func refreshIngestionAuthToken() { + for ; true; <-IngestionAuthTokenRefreshTicker.C { + if IMDSToken == "" || IMDSTokenExpiration <= (time.Now().Unix() + 60 * 60) { // token valid 24 hrs and refresh token 1 hr before expiry + imdsToken, imdsTokenExpiry, err := getAccessTokenFromIMDS() + if err != nil { + message := fmt.Sprintf("refreshIngestionAuthToken: Error on getAccessTokenFromIMDS %s \n", err.Error()) + Log(message) + SendException(message) + } else { + IMDSToken = imdsToken + IMDSTokenExpiration = imdsTokenExpiry + } + } + if IMDSToken == "" { + message := "refreshIngestionAuthToken: IMDSToken is empty" + Log(message) + SendException(message) + continue + } + var err error + // ignore agent configuration expiring, the configuration and channel IDs will never change (without creating an agent restart) + if ConfigurationId == "" || ChannelId == "" { + ConfigurationId, ChannelId, err = getAgentConfiguration(IMDSToken) + if err != nil { + message := fmt.Sprintf("refreshIngestionAuthToken: Error getAgentConfiguration %s \n", err.Error()) + Log(message) + SendException(message) + continue + } + } + if IMDSToken == "" || ConfigurationId == "" || ChannelId == "" { + message := "refreshIngestionAuthToken: IMDSToken or ConfigurationId or ChannelId empty" + Log(message) + SendException(message) + continue + } + ingestionAuthToken, refreshIntervalInSeconds, err := getIngestionAuthToken(IMDSToken, ConfigurationId, ChannelId) + if err != nil { + message := fmt.Sprintf("refreshIngestionAuthToken: Error getIngestionAuthToken %s \n", err.Error()) + Log(message) + SendException(message) + continue + } + IngestionAuthTokenUpdateMutex.Lock() + ODSIngestionAuthToken = ingestionAuthToken + 
IngestionAuthTokenUpdateMutex.Unlock() + if refreshIntervalInSeconds > 0 && refreshIntervalInSeconds != defaultIngestionAuthTokenRefreshIntervalSeconds { + //TODO - use Reset which is better when go version upgraded to 1.15 or up rather Stop() and NewTicker + //IngestionAuthTokenRefreshTicker.Reset(time.Second * time.Duration(refreshIntervalInSeconds)) + IngestionAuthTokenRefreshTicker.Stop() + IngestionAuthTokenRefreshTicker = time.NewTicker(time.Second * time.Duration(refreshIntervalInSeconds)) + } + } +} + +func IsRetriableError(httpStatusCode int) bool { + retryableStatusCodes := [5]int{408, 429, 502, 503, 504} + for _, code := range retryableStatusCodes { + if code == httpStatusCode { + return true + } + } + return false +} diff --git a/source/plugins/go/src/oms.go b/source/plugins/go/src/oms.go index 5f18728b7..91a5b4b40 100644 --- a/source/plugins/go/src/oms.go +++ b/source/plugins/go/src/oms.go @@ -22,6 +22,7 @@ import ( "github.com/tinylib/msgp/msgp" lumberjack "gopkg.in/natefinch/lumberjack.v2" + "Docker-Provider/source/plugins/go/src/extension" "github.com/Azure/azure-kusto-go/kusto/ingest" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" @@ -88,6 +89,7 @@ const IPName = "ContainerInsights" const defaultContainerInventoryRefreshInterval = 60 const kubeMonAgentConfigEventFlushInterval = 60 +const defaultIngestionAuthTokenRefreshIntervalSeconds = 3600 //Eventsource name in mdsd const MdsdContainerLogSourceName = "ContainerLogSource" @@ -106,6 +108,11 @@ const ContainerLogsV1Route = "v1" //container logs schema (v2=ContainerLogsV2 table in LA, anything else ContainerLogs table in LA. 
This is applicable only if Container logs route is NOT ADX) const ContainerLogV2SchemaVersion = "v2" +//env variable for AAD MSI Auth mode +const AADMSIAuthMode = "AAD_MSI_AUTH_MODE" + +// Tag prefix of mdsd output streamid for AMA in MSI auth mode +const MdsdOutputStreamIdTagPrefix = "dcr-" //env variable to container type const ContainerTypeEnv = "CONTAINER_TYPE" @@ -169,6 +176,8 @@ var ( IsWindows bool // container type ContainerType string + // flag to check whether LA AAD MSI Auth Enabled or not + IsAADMSIAuthMode bool ) var ( @@ -194,6 +203,10 @@ var ( EventHashUpdateMutex = &sync.Mutex{} // parent context used by ADX uploader ParentContext = context.Background() + // IngestionAuthTokenUpdateMutex read and write mutex access for ODSIngestionAuthToken + IngestionAuthTokenUpdateMutex = &sync.Mutex{} + // ODSIngestionAuthToken for windows agent AAD MSI Auth + ODSIngestionAuthToken string ) var ( @@ -201,6 +214,8 @@ var ( ContainerImageNameRefreshTicker *time.Ticker // KubeMonAgentConfigEventsSendTicker to send config events every hour KubeMonAgentConfigEventsSendTicker *time.Ticker + // IngestionAuthTokenRefreshTicker to refresh ingestion token + IngestionAuthTokenRefreshTicker *time.Ticker ) var ( @@ -402,7 +417,9 @@ func updateContainerImageNameMaps() { listOptions := metav1.ListOptions{} listOptions.FieldSelector = fmt.Sprintf("spec.nodeName=%s", Computer) - pods, err := ClientSet.CoreV1().Pods("").List(listOptions) + + // Context was added as a parameter, but we want the same behavior as before: see https://pkg.go.dev/context#TODO + pods, err := ClientSet.CoreV1().Pods("").List(context.TODO(), listOptions) if err != nil { message := fmt.Sprintf("Error getting pods %s\nIt is ok to log here and continue, because the logs will be missing image and Name, but the logs will still have the containerID", err.Error()) @@ -703,6 +720,10 @@ func flushKubeMonAgentEventRecords() { } } if (IsWindows == false && len(msgPackEntries) > 0) { //for linux, mdsd route + if 
IsAADMSIAuthMode == true && strings.HasPrefix(MdsdKubeMonAgentEventsTagName, MdsdOutputStreamIdTagPrefix) == false { + Log("Info::mdsd::obtaining output stream id for data type: %s", KubeMonAgentEventDataType) + MdsdKubeMonAgentEventsTagName = extension.GetInstance(FLBLogger, ContainerType).GetOutputStreamId(KubeMonAgentEventDataType) + } Log("Info::mdsd:: using mdsdsource name for KubeMonAgentEvents: %s", MdsdKubeMonAgentEventsTagName) msgpBytes := convertMsgPackEntriesToMsgpBytes(MdsdKubeMonAgentEventsTagName, msgPackEntries) if MdsdKubeMonMsgpUnixSocketClient == nil { @@ -760,6 +781,16 @@ func flushKubeMonAgentEventRecords() { req.Header.Set("x-ms-AzureResourceId", ResourceID) } + if IsAADMSIAuthMode == true { + IngestionAuthTokenUpdateMutex.Lock() + ingestionAuthToken := ODSIngestionAuthToken + IngestionAuthTokenUpdateMutex.Unlock() + if ingestionAuthToken == "" { + Log("Error::ODS Ingestion Auth Token is empty. Please check error log.") + } + req.Header.Set("Authorization", "Bearer "+ingestionAuthToken) + } + resp, err := HTTPClient.Do(req) elapsed = time.Since(start) @@ -905,6 +936,10 @@ func PostTelegrafMetricsToLA(telegrafRecords []map[interface{}]interface{}) int } } if (len(msgPackEntries) > 0) { + if IsAADMSIAuthMode == true && (strings.HasPrefix(MdsdInsightsMetricsTagName, MdsdOutputStreamIdTagPrefix) == false) { + Log("Info::mdsd::obtaining output stream id for InsightsMetricsDataType since Log Analytics AAD MSI Auth Enabled") + MdsdInsightsMetricsTagName = extension.GetInstance(FLBLogger, ContainerType).GetOutputStreamId(InsightsMetricsDataType) + } msgpBytes := convertMsgPackEntriesToMsgpBytes(MdsdInsightsMetricsTagName, msgPackEntries) if MdsdInsightsMetricsMsgpUnixSocketClient == nil { Log("Error::mdsd::mdsd connection does not exist. re-connecting ...") @@ -926,6 +961,7 @@ func PostTelegrafMetricsToLA(telegrafRecords []map[interface{}]interface{}) int if er != nil { Log("Error::mdsd::Failed to write to mdsd %d records after %s. Will retry ... 
error : %s", len(msgPackEntries), elapsed, er.Error()) + UpdateNumTelegrafMetricsSentTelemetry(0, 1, 0) if MdsdInsightsMetricsMsgpUnixSocketClient != nil { MdsdInsightsMetricsMsgpUnixSocketClient.Close() MdsdInsightsMetricsMsgpUnixSocketClient = nil @@ -937,6 +973,7 @@ func PostTelegrafMetricsToLA(telegrafRecords []map[interface{}]interface{}) int return output.FLB_RETRY } else { numTelegrafMetricsRecords := len(msgPackEntries) + UpdateNumTelegrafMetricsSentTelemetry(numTelegrafMetricsRecords, 0, 0) Log("Success::mdsd::Successfully flushed %d telegraf metrics records that was %d bytes to mdsd in %s ", numTelegrafMetricsRecords, bts, elapsed) } } @@ -979,6 +1016,18 @@ func PostTelegrafMetricsToLA(telegrafRecords []map[interface{}]interface{}) int if ResourceCentric == true { req.Header.Set("x-ms-AzureResourceId", ResourceID) } + if IsAADMSIAuthMode == true { + IngestionAuthTokenUpdateMutex.Lock() + ingestionAuthToken := ODSIngestionAuthToken + IngestionAuthTokenUpdateMutex.Unlock() + if ingestionAuthToken == "" { + message := "Error::ODS Ingestion Auth Token is empty. Please check error log." 
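Each flush path copies `ODSIngestionAuthToken` into a local under `IngestionAuthTokenUpdateMutex` before setting the Authorization header, so the token refresh goroutine never races a reader mid-request. The pattern, reduced to a hypothetical self-contained type (`tokenStore` is not in the patch):

```go
package main

import (
	"fmt"
	"sync"
)

// tokenStore sketches the pattern: the refresher replaces the token under a
// mutex, and each flush path copies it to a local before building the request,
// so the shared string is never read while a refresh is writing it.
type tokenStore struct {
	mu    sync.Mutex
	token string
}

// Set replaces the token; called by the refresh goroutine.
func (s *tokenStore) Set(t string) {
	s.mu.Lock()
	s.token = t
	s.mu.Unlock()
}

// Get returns a snapshot copy; called by flush paths before each POST.
func (s *tokenStore) Get() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.token
}

func main() {
	var s tokenStore
	s.Set("example-token") // hypothetical value
	fmt.Println("Authorization: Bearer " + s.Get())
}
```

Because the token is a plain string, copying under the lock is enough; the HTTP call itself then runs without holding the mutex.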
+ Log(message) + return output.FLB_RETRY + } + // add authorization header to the req + req.Header.Set("Authorization", "Bearer "+ingestionAuthToken) + } start := time.Now() resp, err := HTTPClient.Do(req) @@ -1184,6 +1233,16 @@ func PostDataHelper(tailPluginRecords []map[interface{}]interface{}) int { if len(msgPackEntries) > 0 && ContainerLogsRouteV2 == true { //flush to mdsd + if IsAADMSIAuthMode == true && strings.HasPrefix(MdsdContainerLogTagName, MdsdOutputStreamIdTagPrefix) == false { + Log("Info::mdsd::obtaining output stream id") + if ContainerLogSchemaV2 == true { + MdsdContainerLogTagName = extension.GetInstance(FLBLogger, ContainerType).GetOutputStreamId(ContainerLogV2DataType) + } else { + MdsdContainerLogTagName = extension.GetInstance(FLBLogger, ContainerType).GetOutputStreamId(ContainerLogDataType) + } + Log("Info::mdsd:: using mdsdsource name: %s", MdsdContainerLogTagName) + } + fluentForward := MsgPackForward{ Tag: MdsdContainerLogTagName, Entries: msgPackEntries, @@ -1285,7 +1344,7 @@ func PostDataHelper(tailPluginRecords []map[interface{}]interface{}) int { //ADXFlushMutex.Lock() //defer ADXFlushMutex.Unlock() //MultiJSON support is not there yet - if ingestionErr := ADXIngestor.FromReader(ctx, r, ingest.IngestionMappingRef("ContainerLogV2Mapping", ingest.JSON), ingest.FileFormat(ingest.JSON)); ingestionErr != nil { + if _, ingestionErr := ADXIngestor.FromReader(ctx, r, ingest.IngestionMappingRef("ContainerLogV2Mapping", ingest.JSON), ingest.FileFormat(ingest.JSON)); ingestionErr != nil { Log("Error when streaming to ADX Ingestion: %s", ingestionErr.Error()) //ADXIngestor = nil //not required as per ADX team. 
Will keep it to indicate that we tried this approach @@ -1343,6 +1402,18 @@ func PostDataHelper(tailPluginRecords []map[interface{}]interface{}) int { req.Header.Set("x-ms-AzureResourceId", ResourceID) } + if IsAADMSIAuthMode == true { + IngestionAuthTokenUpdateMutex.Lock() + ingestionAuthToken := ODSIngestionAuthToken + IngestionAuthTokenUpdateMutex.Unlock() + if ingestionAuthToken == "" { + Log("Error::ODS Ingestion Auth Token is empty. Please check error log.") + return output.FLB_RETRY + } + // add authorization header to the req + req.Header.Set("Authorization", "Bearer "+ingestionAuthToken) + } + resp, err := HTTPClient.Do(req) elapsed = time.Since(start) @@ -1440,7 +1511,6 @@ func GetContainerIDK8sNamespacePodNameFromFileName(filename string) (string, str // InitializePlugin reads and populates plugin configuration func InitializePlugin(pluginConfPath string, agentVersion string) { - go func() { isTest := os.Getenv("ISTEST") if strings.Compare(strings.ToLower(strings.TrimSpace(isTest)), "true") == 0 { @@ -1541,6 +1611,11 @@ func InitializePlugin(pluginConfPath string, agentVersion string) { } Log("OMSEndpoint %s", OMSEndpoint) + IsAADMSIAuthMode = false + if strings.Compare(strings.ToLower(os.Getenv(AADMSIAuthMode)), "true") == 0 { + IsAADMSIAuthMode = true + Log("AAD MSI Auth Mode Configured") + } ResourceID = os.Getenv(envAKSResourceID) if len(ResourceID) > 0 { @@ -1713,4 +1788,10 @@ func InitializePlugin(pluginConfPath string, agentVersion string) { MdsdInsightsMetricsTagName = MdsdInsightsMetricsSourceName MdsdKubeMonAgentEventsTagName = MdsdKubeMonAgentEventsSourceName -} \ No newline at end of file + Log("ContainerLogsRouteADX: %v, IsWindows: %v, IsAADMSIAuthMode = %v \n", ContainerLogsRouteADX, IsWindows, IsAADMSIAuthMode) + if !ContainerLogsRouteADX && IsWindows && IsAADMSIAuthMode { + Log("defaultIngestionAuthTokenRefreshIntervalSeconds = %d \n", defaultIngestionAuthTokenRefreshIntervalSeconds) + IngestionAuthTokenRefreshTicker = 
time.NewTicker(time.Second * time.Duration(defaultIngestionAuthTokenRefreshIntervalSeconds)) + go refreshIngestionAuthToken() + } +} diff --git a/source/plugins/go/src/telemetry.go b/source/plugins/go/src/telemetry.go index 4750b4624..31818dbb3 100644 --- a/source/plugins/go/src/telemetry.go +++ b/source/plugins/go/src/telemetry.go @@ -145,8 +145,8 @@ func SendContainerLogPluginMetrics(telemetryPushIntervalProperty string) { ContainerLogTelemetryMutex.Unlock() if strings.Compare(strings.ToLower(os.Getenv("CONTROLLER_TYPE")), "daemonset") == 0 { + telemetryDimensions := make(map[string]string) if strings.Compare(strings.ToLower(os.Getenv("CONTAINER_TYPE")), "prometheussidecar") == 0 { - telemetryDimensions := make(map[string]string) telemetryDimensions["CustomPromMonitorPods"] = promMonitorPods if promMonitorPodsNamespaceLength > 0 { telemetryDimensions["CustomPromMonitorPodsNamespaceLength"] = strconv.Itoa(promMonitorPodsNamespaceLength) @@ -161,10 +161,30 @@ func SendContainerLogPluginMetrics(telemetryPushIntervalProperty string) { telemetryDimensions["OsmNamespaceCount"] = strconv.Itoa(osmNamespaceCount) } + telemetryDimensions["PromFbitChunkSize"] = os.Getenv("AZMON_SIDECAR_FBIT_CHUNK_SIZE") + telemetryDimensions["PromFbitBufferSize"] = os.Getenv("AZMON_SIDECAR_FBIT_BUFFER_SIZE") + telemetryDimensions["PromFbitMemBufLimit"] = os.Getenv("AZMON_SIDECAR_FBIT_MEM_BUF_LIMIT") + SendEvent(eventNameCustomPrometheusSidecarHeartbeat, telemetryDimensions) } else { - SendEvent(eventNameDaemonSetHeartbeat, make(map[string]string)) + fbitFlushIntervalSecs := os.Getenv("FBIT_SERVICE_FLUSH_INTERVAL") + if fbitFlushIntervalSecs != "" { + telemetryDimensions["FbitServiceFlushIntervalSecs"] = fbitFlushIntervalSecs + } + fbitTailBufferChunkSizeMBs := os.Getenv("FBIT_TAIL_BUFFER_CHUNK_SIZE") + if fbitTailBufferChunkSizeMBs != "" { + telemetryDimensions["FbitBufferChunkSizeMBs"] = fbitTailBufferChunkSizeMBs + } + fbitTailBufferMaxSizeMBs := os.Getenv("FBIT_TAIL_BUFFER_MAX_SIZE") + 
if fbitTailBufferMaxSizeMBs != "" { + telemetryDimensions["FbitBufferMaxSizeMBs"] = fbitTailBufferMaxSizeMBs + } + fbitTailMemBufLimitMBs := os.Getenv("FBIT_TAIL_MEM_BUF_LIMIT") + if fbitTailMemBufLimitMBs != "" { + telemetryDimensions["FbitMemBufLimitSizeMBs"] = fbitTailMemBufLimitMBs + } + SendEvent(eventNameDaemonSetHeartbeat, telemetryDimensions) flushRateMetric := appinsights.NewMetricTelemetry(metricNameAvgFlushRate, flushRate) TelemetryClient.Track(flushRateMetric) logRateMetric := appinsights.NewMetricTelemetry(metricNameAvgLogGenerationRate, logRate) diff --git a/source/plugins/go/src/utils.go b/source/plugins/go/src/utils.go index 3fe5c6d0e..73c8cf6d3 100644 --- a/source/plugins/go/src/utils.go +++ b/source/plugins/go/src/utils.go @@ -12,8 +12,8 @@ import ( "net/url" "os" "strings" - "time" - + "time" + "github.com/Azure/azure-kusto-go/kusto" "github.com/Azure/azure-kusto-go/kusto/ingest" "github.com/Azure/go-autorest/autorest/azure/auth" @@ -63,27 +63,32 @@ func ReadConfiguration(filename string) (map[string]string, error) { // CreateHTTPClient used to create the client for sending post requests to OMSEndpoint func CreateHTTPClient() { - certFilePath := PluginConfiguration["cert_file_path"] - keyFilePath := PluginConfiguration["key_file_path"] - if IsWindows == false { - certFilePath = fmt.Sprintf(certFilePath, WorkspaceID) - keyFilePath = fmt.Sprintf(keyFilePath, WorkspaceID) - } - cert, err := tls.LoadX509KeyPair(certFilePath, keyFilePath) - if err != nil { - message := fmt.Sprintf("Error when loading cert %s", err.Error()) - SendException(message) - time.Sleep(30 * time.Second) - Log(message) - log.Fatalf("Error when loading cert %s", err.Error()) - } + var transport *http.Transport + if IsAADMSIAuthMode { + transport = &http.Transport{} + } else { + certFilePath := PluginConfiguration["cert_file_path"] + keyFilePath := PluginConfiguration["key_file_path"] + if IsWindows == false { + certFilePath = fmt.Sprintf(certFilePath, WorkspaceID) + keyFilePath 
= fmt.Sprintf(keyFilePath, WorkspaceID) + } + cert, err := tls.LoadX509KeyPair(certFilePath, keyFilePath) + if err != nil { + message := fmt.Sprintf("Error when loading cert %s", err.Error()) + SendException(message) + time.Sleep(30 * time.Second) + Log(message) + log.Fatalf("Error when loading cert %s", err.Error()) + } - tlsConfig := &tls.Config{ - Certificates: []tls.Certificate{cert}, - } + tlsConfig := &tls.Config{ + Certificates: []tls.Certificate{cert}, + } - tlsConfig.BuildNameToCertificate() - transport := &http.Transport{TLSClientConfig: tlsConfig} + tlsConfig.BuildNameToCertificate() + transport = &http.Transport{TLSClientConfig: tlsConfig} + } // set the proxy if the proxy configured if ProxyEndpoint != "" { proxyEndpointUrl, err := url.Parse(ProxyEndpoint) @@ -100,7 +105,7 @@ func CreateHTTPClient() { HTTPClient = http.Client{ Transport: transport, Timeout: 30 * time.Second, - } + } Log("Successfully created HTTP Client") } @@ -118,57 +123,57 @@ func ToString(s interface{}) string { //mdsdSocketClient to write msgp messages func CreateMDSDClient(dataType DataType, containerType string) { - mdsdfluentSocket := "/var/run/mdsd/default_fluent.socket" + mdsdfluentSocket := "/var/run/mdsd/default_fluent.socket" if containerType != "" && strings.Compare(strings.ToLower(containerType), "prometheussidecar") == 0 { - mdsdfluentSocket = fmt.Sprintf("/var/run/mdsd-%s/default_fluent.socket", containerType) - } + mdsdfluentSocket = fmt.Sprintf("/var/run/mdsd-%s/default_fluent.socket", containerType) + } switch dataType { - case ContainerLogV2: - if MdsdMsgpUnixSocketClient != nil { - MdsdMsgpUnixSocketClient.Close() - MdsdMsgpUnixSocketClient = nil - } - /*conn, err := fluent.New(fluent.Config{FluentNetwork:"unix", - FluentSocketPath:"/var/run/mdsd/default_fluent.socket", - WriteTimeout: 5 * time.Second, - RequestAck: true}) */ - conn, err := net.DialTimeout("unix", - mdsdfluentSocket, 10*time.Second) - if err != nil { - Log("Error::mdsd::Unable to open MDSD msgp 
socket connection for ContainerLogV2 %s", err.Error()) - //log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) - } else { - Log("Successfully created MDSD msgp socket connection for ContainerLogV2: %s", mdsdfluentSocket) - MdsdMsgpUnixSocketClient = conn - } - case KubeMonAgentEvents: - if MdsdKubeMonMsgpUnixSocketClient != nil { - MdsdKubeMonMsgpUnixSocketClient.Close() - MdsdKubeMonMsgpUnixSocketClient = nil - } - conn, err := net.DialTimeout("unix", - mdsdfluentSocket, 10*time.Second) - if err != nil { - Log("Error::mdsd::Unable to open MDSD msgp socket connection for KubeMon events %s", err.Error()) - //log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) - } else { - Log("Successfully created MDSD msgp socket connection for KubeMon events:%s", mdsdfluentSocket) - MdsdKubeMonMsgpUnixSocketClient = conn - } - case InsightsMetrics: - if MdsdInsightsMetricsMsgpUnixSocketClient != nil { - MdsdInsightsMetricsMsgpUnixSocketClient.Close() - MdsdInsightsMetricsMsgpUnixSocketClient = nil - } - conn, err := net.DialTimeout("unix", - mdsdfluentSocket, 10*time.Second) - if err != nil { - Log("Error::mdsd::Unable to open MDSD msgp socket connection for insights metrics %s", err.Error()) - //log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) - } else { - Log("Successfully created MDSD msgp socket connection for Insights metrics %s", mdsdfluentSocket) - MdsdInsightsMetricsMsgpUnixSocketClient = conn - } + case ContainerLogV2: + if MdsdMsgpUnixSocketClient != nil { + MdsdMsgpUnixSocketClient.Close() + MdsdMsgpUnixSocketClient = nil + } + /*conn, err := fluent.New(fluent.Config{FluentNetwork:"unix", + FluentSocketPath:"/var/run/mdsd/default_fluent.socket", + WriteTimeout: 5 * time.Second, + RequestAck: true}) */ + conn, err := net.DialTimeout("unix", + mdsdfluentSocket, 10*time.Second) + if err != nil { + Log("Error::mdsd::Unable to open MDSD msgp socket connection for ContainerLogV2 %s", err.Error()) + 
//log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) + } else { + Log("Successfully created MDSD msgp socket connection for ContainerLogV2: %s", mdsdfluentSocket) + MdsdMsgpUnixSocketClient = conn + } + case KubeMonAgentEvents: + if MdsdKubeMonMsgpUnixSocketClient != nil { + MdsdKubeMonMsgpUnixSocketClient.Close() + MdsdKubeMonMsgpUnixSocketClient = nil + } + conn, err := net.DialTimeout("unix", + mdsdfluentSocket, 10*time.Second) + if err != nil { + Log("Error::mdsd::Unable to open MDSD msgp socket connection for KubeMon events %s", err.Error()) + //log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) + } else { + Log("Successfully created MDSD msgp socket connection for KubeMon events:%s", mdsdfluentSocket) + MdsdKubeMonMsgpUnixSocketClient = conn + } + case InsightsMetrics: + if MdsdInsightsMetricsMsgpUnixSocketClient != nil { + MdsdInsightsMetricsMsgpUnixSocketClient.Close() + MdsdInsightsMetricsMsgpUnixSocketClient = nil + } + conn, err := net.DialTimeout("unix", + mdsdfluentSocket, 10*time.Second) + if err != nil { + Log("Error::mdsd::Unable to open MDSD msgp socket connection for insights metrics %s", err.Error()) + //log.Fatalf("Unable to open MDSD msgp socket connection %s", err.Error()) + } else { + Log("Successfully created MDSD msgp socket connection for Insights metrics %s", mdsdfluentSocket) + MdsdInsightsMetricsMsgpUnixSocketClient = conn + } } } @@ -197,11 +202,15 @@ func CreateADXClient() { } func ReadFileContents(fullPathToFileName string) (string, error) { + return ReadFileContentsImpl(fullPathToFileName, ioutil.ReadFile) +} + +func ReadFileContentsImpl(fullPathToFileName string, readfilefunc func(string) ([]byte, error)) (string, error) { fullPathToFileName = strings.TrimSpace(fullPathToFileName) if len(fullPathToFileName) == 0 { return "", errors.New("ReadFileContents::filename is empty") } - content, err := ioutil.ReadFile(fullPathToFileName) //no need to close + content, err := 
readfilefunc(fullPathToFileName) //no need to close if err != nil { return "", errors.New("ReadFileContents::Unable to open file " + fullPathToFileName) } else { @@ -223,7 +232,6 @@ func isValidUrl(uri string) bool { func convertMsgPackEntriesToMsgpBytes(fluentForwardTag string, msgPackEntries []MsgPackEntry) []byte { var msgpBytes []byte - fluentForward := MsgPackForward{ Tag: fluentForwardTag, Entries: msgPackEntries, @@ -234,7 +242,7 @@ func convertMsgPackEntriesToMsgpBytes(fluentForwardTag string, msgPackEntries [] msgpSize += 1 + msgp.Int64Size + msgp.GuessSize(fluentForward.Entries[i].Record) } - //allocate buffer for msgp message + //allocate buffer for msgp message msgpBytes = msgp.Require(nil, msgpSize) //construct the stream @@ -247,6 +255,6 @@ func convertMsgPackEntriesToMsgpBytes(fluentForwardTag string, msgPackEntries [] msgpBytes = msgp.AppendInt64(msgpBytes, batchTime) msgpBytes = msgp.AppendMapStrStr(msgpBytes, fluentForward.Entries[entry].Record) } - - return msgpBytes + + return msgpBytes } diff --git a/source/plugins/go/src/utils_test.go b/source/plugins/go/src/utils_test.go new file mode 100644 index 000000000..ab61ce751 --- /dev/null +++ b/source/plugins/go/src/utils_test.go @@ -0,0 +1,79 @@ +package main + +import ( + "errors" + "testing" +) + +func Test_isValidUrl(t *testing.T) { + type test_struct struct { + isValid bool + url string + } + + tests := []test_struct{ + {true, "https://www.microsoft.com"}, + {true, "http://abc.xyz"}, + {true, "https://www.microsoft.com/tests"}, + {false, "()"}, + {false, "https//www.microsoft.com"}, + {false, "https:/www.microsoft.com"}, + {false, "https:/www.microsoft.com*"}, + {false, ""}, + } + + for _, tt := range tests { + t.Run(tt.url, func(t *testing.T) { + got := isValidUrl(tt.url) + if got != tt.isValid { + t.Errorf("isValidUrl(%s) = %t, want %t", tt.url, got, tt.isValid) + } + }) + } +} + +func Test_ReadFileContents(t *testing.T) { + type mock_struct struct { + expectedFilePath string + fileContents 
[]byte + err error + } + type test_struct struct { + testname string + calledFilePath string + subcall_spec mock_struct + output string + err bool + } + + tests := []test_struct{ + {"normal", "foobar.txt", mock_struct{"foobar.txt", []byte("asdf"), nil}, "asdf", false}, + {"extra whitespace", "foobar.txt ", mock_struct{"foobar.txt", []byte("asdf \t"), nil}, "asdf", false}, + {"empty filename", "", mock_struct{"", []byte(""), nil}, "", true}, + {"file doesn't exist", "asdf.txt", mock_struct{"asdf", []byte(""), errors.New("this error doesn't matter much")}, "", true}, + } + + for _, tt := range tests { + t.Run(tt.testname, func(t *testing.T) { + + readfileFunc := func(filename string) ([]byte, error) { + if filename == tt.subcall_spec.expectedFilePath { + return tt.subcall_spec.fileContents, nil + } + return []byte(""), errors.New("file not found") + } + + got, err := ReadFileContentsImpl(tt.calledFilePath, readfileFunc) + + if got != tt.output || tt.err != (err != nil) { + t.Errorf("ReadFileContents(%v) = (%v, %v), want (%v, %v)", tt.calledFilePath, got, err, tt.output, tt.err) + if got != tt.output { + t.Errorf("output strings are not equal") + } + if tt.err != (err != nil) { + t.Errorf("errors are not equal") + } + } + }) + } +} diff --git a/source/plugins/ruby/ApplicationInsightsUtility.rb b/source/plugins/ruby/ApplicationInsightsUtility.rb index 31f9503cd..7691304a6 100644 --- a/source/plugins/ruby/ApplicationInsightsUtility.rb +++ b/source/plugins/ruby/ApplicationInsightsUtility.rb @@ -21,6 +21,8 @@ class ApplicationInsightsUtility @@EnvApplicationInsightsEndpoint = "APPLICATIONINSIGHTS_ENDPOINT" @@EnvControllerType = "CONTROLLER_TYPE" @@EnvContainerRuntime = "CONTAINER_RUNTIME" + @@EnvAADMSIAuthMode = "AAD_MSI_AUTH_MODE" + @@isWindows = false @@hostName = (OMS::Common.get_hostname) @@os_type = ENV["OS_TYPE"] @@ -82,7 +84,12 @@ def initializeUtility() isProxyConfigured = false $log.info("proxy is not configured") end - + aadAuthMSIMode =
ENV[@@EnvAADMSIAuthMode] + if !aadAuthMSIMode.nil? && !aadAuthMSIMode.empty? && aadAuthMSIMode.downcase == "true".downcase + @@CustomProperties["aadAuthMSIMode"] = "true" + else + @@CustomProperties["aadAuthMSIMode"] = "false" + end #Check if telemetry is turned off telemetryOffSwitch = ENV["DISABLE_TELEMETRY"] if telemetryOffSwitch && !telemetryOffSwitch.nil? && !telemetryOffSwitch.empty? && telemetryOffSwitch.downcase == "true".downcase @@ -236,6 +243,9 @@ def sendTelemetry(pluginName, properties) getContainerRuntimeInfo() end @@CustomProperties["Computer"] = properties["Computer"] + if !properties["addonTokenAdapterImageTag"].nil? && !properties["addonTokenAdapterImageTag"].empty? + @@CustomProperties["addonTokenAdapterImageTag"] = properties["addonTokenAdapterImageTag"] + end sendHeartBeatEvent(pluginName) sendLastProcessedContainerInventoryCountMetric(pluginName, properties) rescue => errorStr diff --git a/source/plugins/ruby/CAdvisorMetricsAPIClient.rb b/source/plugins/ruby/CAdvisorMetricsAPIClient.rb index 10720752d..63f43eaf1 100644 --- a/source/plugins/ruby/CAdvisorMetricsAPIClient.rb +++ b/source/plugins/ruby/CAdvisorMetricsAPIClient.rb @@ -40,9 +40,9 @@ class CAdvisorMetricsAPIClient @os_type = ENV["OS_TYPE"] if !@os_type.nil? && !@os_type.empty? 
&& @os_type.strip.casecmp("windows") == 0 - @LogPath = "/etc/omsagentwindows/kubernetes_perf_log.txt" + @LogPath = Constants::WINDOWS_LOG_PATH + "kubernetes_perf_log.txt" else - @LogPath = "/var/opt/microsoft/docker-cimprov/log/kubernetes_perf_log.txt" + @LogPath = Constants::LINUX_LOG_PATH + "kubernetes_perf_log.txt" end @Log = Logger.new(@LogPath, 2, 10 * 1048576) #keep last 2 files, max log file size = 10M # @@rxBytesLast = nil @@ -81,6 +81,11 @@ def getSummaryStatsFromCAdvisor(winNode) return getResponse(winNode, relativeUri) end + def getCongifzCAdvisor(winNode: nil) + relativeUri = "/configz" + return getResponse(winNode, relativeUri) + end + def getAllMetricsCAdvisor(winNode: nil) relativeUri = "/metrics/cadvisor" return getResponse(winNode, relativeUri) @@ -230,17 +235,17 @@ def getContainerCpuMetricItems(metricJSON, hostName, cpuMetricNameToCollect, met metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_CONTAINER metricItem["InstanceName"] = clusterId + "/" + podUid + "/" + containerName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json - metricItems.push(metricItem) - + metricItems.push(metricItem) + #Telemetry about agent performance begin # we can only do this much now. Ideally would like to use the docker image repository to find our pods/containers @@ -272,7 +277,7 @@ def getContainerCpuMetricItems(metricJSON, hostName, cpuMetricNameToCollect, met end #telemetry about containerlog Routing for daemonset telemetryProps["containerLogsRoute"] = @containerLogsRoute - + #telemetry about health model if (!@hmEnabled.nil? && !@hmEnabled.empty?) 
telemetryProps["hmEnabled"] = @hmEnabled @@ -521,13 +526,13 @@ def getContainerCpuMetricItemRate(metricJSON, hostName, cpuMetricNameToCollect, containerName = container["name"] metricValue = container["cpu"][cpuMetricNameToCollect] metricTime = metricPollTime #container["cpu"]["time"] - + metricItem = {} metricItem["Timestamp"] = metricTime metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_CONTAINER metricItem["InstanceName"] = clusterId + "/" + podUid + "/" + containerName - + metricItem["json_Collections"] = [] metricCollection = {} metricCollection["CounterName"] = metricNametoReturn @@ -562,9 +567,9 @@ def getContainerCpuMetricItemRate(metricJSON, hostName, cpuMetricNameToCollect, end metricCollection["Value"] = metricValue - - metricCollections = [] - metricCollections.push(metricCollection) + + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json metricItems.push(metricItem) #Telemetry about agent performance @@ -651,16 +656,16 @@ def getContainerMemoryMetricItems(metricJSON, hostName, memoryMetricNameToCollec metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_CONTAINER metricItem["InstanceName"] = clusterId + "/" + podUid + "/" + containerName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json - metricItems.push(metricItem) + metricItems.push(metricItem) #Telemetry about agent performance begin @@ -704,21 +709,21 @@ def getNodeMetricItem(metricJSON, hostName, metricCategory, metricNameToCollect, if !node[metricCategory].nil? 
metricValue = node[metricCategory][metricNameToCollect] metricTime = metricPollTime #node[metricCategory]["time"] - + metricItem["Timestamp"] = metricTime metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_NODE metricItem["InstanceName"] = clusterId + "/" + nodeName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) - metricItem["json_Collections"] = metricCollections.to_json + metricCollections = [] + metricCollections.push(metricCollection) + metricItem["json_Collections"] = metricCollections.to_json end rescue => error @Log.warn("getNodeMetricItem failed: #{error} for metric #{metricNameToCollect}") @@ -821,19 +826,19 @@ def getNodeMetricItemRate(metricJSON, hostName, metricCategory, metricNameToColl end end end - + metricItem["Timestamp"] = metricTime metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_NODE metricItem["InstanceName"] = clusterId + "/" + nodeName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json end rescue => error @@ -856,21 +861,21 @@ def getNodeLastRebootTimeMetric(metricJSON, hostName, metricNametoReturn, metric metricValue = node["startTime"] metricTime = metricPollTime #Time.now.utc.iso8601 #2018-01-30T19:36:14Z - + metricItem["Timestamp"] = metricTime metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_NODE metricItem["InstanceName"] = clusterId + "/" + nodeName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn #Read it from /proc/uptime 
metricCollection["Value"] = DateTime.parse(metricTime).to_time.to_i - IO.read("/proc/uptime").split[0].to_f metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json rescue => error @Log.warn("getNodeLastRebootTimeMetric failed: #{error} ") @@ -899,14 +904,14 @@ def getContainerStartTimeMetricItems(metricJSON, hostName, metricNametoReturn, m metricItem["Host"] = hostName metricItem["ObjectName"] = Constants::OBJECT_NAME_K8S_CONTAINER metricItem["InstanceName"] = clusterId + "/" + podUid + "/" + containerName - + metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = DateTime.parse(metricValue).to_time.to_i metricItem["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricItem["json_Collections"] = metricCollections.to_json metricItems.push(metricItem) end diff --git a/source/plugins/ruby/CustomMetricsUtils.rb b/source/plugins/ruby/CustomMetricsUtils.rb index 220313e6b..fd9290b78 100644 --- a/source/plugins/ruby/CustomMetricsUtils.rb +++ b/source/plugins/ruby/CustomMetricsUtils.rb @@ -13,8 +13,8 @@ def check_custom_metrics_availability if aks_region.to_s.empty? || aks_resource_id.to_s.empty? return false # This will also take care of AKS-Engine Scenario. AKS_REGION/AKS_RESOURCE_ID is not set for AKS-Engine. 
Only ACS_RESOURCE_NAME is set end - - return aks_cloud_environment.to_s.downcase == 'public' + + return aks_cloud_environment.to_s.downcase == 'azurepubliccloud' end end end \ No newline at end of file diff --git a/source/plugins/ruby/KubernetesApiClient.rb b/source/plugins/ruby/KubernetesApiClient.rb index 4b50e20d8..4afb3d961 100644 --- a/source/plugins/ruby/KubernetesApiClient.rb +++ b/source/plugins/ruby/KubernetesApiClient.rb @@ -25,11 +25,12 @@ class KubernetesApiClient #@@IsValidRunningNode = nil #@@IsLinuxCluster = nil @@KubeSystemNamespace = "kube-system" + @os_type = ENV["OS_TYPE"] if !@os_type.nil? && !@os_type.empty? && @os_type.strip.casecmp("windows") == 0 - @LogPath = "/etc/omsagentwindows/kubernetes_client_log.txt" + @LogPath = Constants::WINDOWS_LOG_PATH + "kubernetes_client_log.txt" else - @LogPath = "/var/opt/microsoft/docker-cimprov/log/kubernetes_client_log.txt" + @LogPath = Constants::LINUX_LOG_PATH + "kubernetes_client_log.txt" end @Log = Logger.new(@LogPath, 2, 10 * 1048576) #keep last 2 files, max log file size = 10M @@TokenFileName = "/var/run/secrets/kubernetes.io/serviceaccount/token" @@ -87,42 +88,42 @@ def getTokenStr end end - def getClusterRegion - if ENV["AKS_REGION"] - return ENV["AKS_REGION"] + def getClusterRegion(env=ENV) + if env["AKS_REGION"] + return env["AKS_REGION"] else @Log.warn ("Kubernetes environment variable not set AKS_REGION. Unable to get cluster region.") return nil end end - def getResourceUri(resource, api_group) + def getResourceUri(resource, api_group, env=ENV) begin - if ENV["KUBERNETES_SERVICE_HOST"] && ENV["KUBERNETES_PORT_443_TCP_PORT"] + if env["KUBERNETES_SERVICE_HOST"] && env["KUBERNETES_PORT_443_TCP_PORT"] if api_group.nil? 
- return "https://#{ENV["KUBERNETES_SERVICE_HOST"]}:#{ENV["KUBERNETES_PORT_443_TCP_PORT"]}/api/" + @@ApiVersion + "/" + resource + return "https://#{env["KUBERNETES_SERVICE_HOST"]}:#{env["KUBERNETES_PORT_443_TCP_PORT"]}/api/" + @@ApiVersion + "/" + resource elsif api_group == @@ApiGroupApps - return "https://#{ENV["KUBERNETES_SERVICE_HOST"]}:#{ENV["KUBERNETES_PORT_443_TCP_PORT"]}/apis/apps/" + @@ApiVersionApps + "/" + resource + return "https://#{env["KUBERNETES_SERVICE_HOST"]}:#{env["KUBERNETES_PORT_443_TCP_PORT"]}/apis/apps/" + @@ApiVersionApps + "/" + resource elsif api_group == @@ApiGroupHPA - return "https://#{ENV["KUBERNETES_SERVICE_HOST"]}:#{ENV["KUBERNETES_PORT_443_TCP_PORT"]}/apis/" + @@ApiGroupHPA + "/" + @@ApiVersionHPA + "/" + resource + return "https://#{env["KUBERNETES_SERVICE_HOST"]}:#{env["KUBERNETES_PORT_443_TCP_PORT"]}/apis/" + @@ApiGroupHPA + "/" + @@ApiVersionHPA + "/" + resource end else - @Log.warn ("Kubernetes environment variable not set KUBERNETES_SERVICE_HOST: #{ENV["KUBERNETES_SERVICE_HOST"]} KUBERNETES_PORT_443_TCP_PORT: #{ENV["KUBERNETES_PORT_443_TCP_PORT"]}. Unable to form resourceUri") + @Log.warn ("Kubernetes environment variable not set KUBERNETES_SERVICE_HOST: #{env["KUBERNETES_SERVICE_HOST"]} KUBERNETES_PORT_443_TCP_PORT: #{env["KUBERNETES_PORT_443_TCP_PORT"]}. Unable to form resourceUri") return nil end end end - def getClusterName + def getClusterName(env=ENV) return @@ClusterName if !@@ClusterName.nil? @@ClusterName = "None" begin #try getting resource ID for aks - cluster = ENV["AKS_RESOURCE_ID"] + cluster = env["AKS_RESOURCE_ID"] if cluster && !cluster.nil? && !cluster.empty? @@ClusterName = cluster.split("/").last else - cluster = ENV["ACS_RESOURCE_NAME"] + cluster = env["ACS_RESOURCE_NAME"] if cluster && !cluster.nil? && !cluster.empty? @@ClusterName = cluster else @@ -147,7 +148,7 @@ def getClusterName return @@ClusterName end - def getClusterId + def getClusterId(env=ENV) return @@ClusterId if !@@ClusterId.nil? 
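The hunks above rework `getClusterRegion`, `getResourceUri`, `getClusterName`, and `getClusterId` to take an `env = ENV` default argument, so production code keeps reading the process environment while unit tests can pass a plain Hash instead of stubbing `ENV` globally. A minimal sketch of that injection pattern — `cluster_name_from` is an illustrative stand-in, not the plugin's actual method:

```ruby
# Sketch of the env-injection pattern introduced in KubernetesApiClient:
# default to the real ENV, but accept any Hash-like object in tests.
def cluster_name_from(env = ENV)
  cluster = env["AKS_RESOURCE_ID"]
  if cluster && !cluster.empty?
    cluster.split("/").last             # last segment of the ARM resource id
  else
    env["ACS_RESOURCE_NAME"] || "None"  # ACS/on-prem fallback, as in the hunk
  end
end

# Production call site: cluster_name_from() reads the process environment.
# Test call site: inject a fake environment as an ordinary Hash.
fake_env = { "AKS_RESOURCE_ID" => "/subscriptions/s/resourceGroups/rg/providers/Microsoft.ContainerService/managedClusters/mycluster" }
puts cluster_name_from(fake_env)   # => mycluster
```

Because `ENV` itself responds to `[]`, the zero-argument call path is unchanged; only tests opt into the injected Hash.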
#By default initialize ClusterId to ClusterName. # In ACS/On-prem, we need to figure out how we can generate ClusterId @@ -155,7 +156,7 @@ def getClusterId # e.g. md5 digest is 128 bits = 32 character in hex. Get first 16 and get a guid, and the next 16 to get resource id @@ClusterId = getClusterName begin - cluster = ENV["AKS_RESOURCE_ID"] + cluster = env["AKS_RESOURCE_ID"] if cluster && !cluster.nil? && !cluster.empty? @@ClusterId = cluster end @@ -455,19 +456,19 @@ def getContainerResourceRequestsAndLimits(pod, metricCategory, metricNameToColle metricCollection = {} metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue - + metricProps["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricProps["json_Collections"] = metricCollections.to_json - metricItems.push(metricProps) + metricItems.push(metricProps) #No container level limit for the given metric, so default to node level limit else nodeMetricsHashKey = clusterId + "/" + nodeName + "_" + "allocatable" + "_" + metricNameToCollect if (metricCategory == "limits" && @@NodeMetrics.has_key?(nodeMetricsHashKey)) metricValue = @@NodeMetrics[nodeMetricsHashKey] #@Log.info("Limits not set for container #{clusterId + "/" + podUid + "/" + containerName} using node level limits: #{nodeMetricsHashKey}=#{metricValue} ") - + metricProps = {} metricProps["Timestamp"] = metricTime metricProps["Host"] = nodeName @@ -480,10 +481,10 @@ def getContainerResourceRequestsAndLimits(pod, metricCategory, metricNameToColle metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricProps["json_Collections"] = [] - metricCollections = [] - metricCollections.push(metricCollection) + metricCollections = [] + metricCollections.push(metricCollection) metricProps["json_Collections"] = metricCollections.to_json - metricItems.push(metricProps) + 
metricItems.push(metricProps) end end end @@ -614,11 +615,11 @@ def parseNodeLimitsFromNodeItem(node, metricCategory, metricNameToCollect, metri metricCollection["CounterName"] = metricNametoReturn metricCollection["Value"] = metricValue metricCollections = [] - metricCollections.push(metricCollection) - + metricCollections.push(metricCollection) + metricItem["json_Collections"] = [] metricItem["json_Collections"] = metricCollections.to_json - + #push node level metrics to a inmem hash so that we can use it looking up at container level. #Currently if container level cpu & memory limits are not defined we default to node level limits @@NodeMetrics[clusterId + "/" + node["metadata"]["name"] + "_" + metricCategory + "_" + metricNameToCollect] = metricValue @@ -777,13 +778,13 @@ def getResourcesAndContinuationToken(uri, api_group: nil) return continuationToken, resourceInventory end #getResourcesAndContinuationToken - def getKubeAPIServerUrl + def getKubeAPIServerUrl(env=ENV) apiServerUrl = nil begin - if ENV["KUBERNETES_SERVICE_HOST"] && ENV["KUBERNETES_PORT_443_TCP_PORT"] - apiServerUrl = "https://#{ENV["KUBERNETES_SERVICE_HOST"]}:#{ENV["KUBERNETES_PORT_443_TCP_PORT"]}" + if env["KUBERNETES_SERVICE_HOST"] && env["KUBERNETES_PORT_443_TCP_PORT"] + apiServerUrl = "https://#{env["KUBERNETES_SERVICE_HOST"]}:#{env["KUBERNETES_PORT_443_TCP_PORT"]}" else - @Log.warn "Kubernetes environment variable not set KUBERNETES_SERVICE_HOST: #{ENV["KUBERNETES_SERVICE_HOST"]} KUBERNETES_PORT_443_TCP_PORT: #{ENV["KUBERNETES_PORT_443_TCP_PORT"]}. Unable to form resourceUri" + @Log.warn "Kubernetes environment variable not set KUBERNETES_SERVICE_HOST: #{env["KUBERNETES_SERVICE_HOST"]} KUBERNETES_PORT_443_TCP_PORT: #{env["KUBERNETES_PORT_443_TCP_PORT"]}. 
Unable to form resourceUri" end rescue => errorStr @Log.warn "KubernetesApiClient::getKubeAPIServerUrl:Failed #{errorStr}" diff --git a/source/plugins/ruby/MdmMetricsGenerator.rb b/source/plugins/ruby/MdmMetricsGenerator.rb index 73cf19fac..0858990da 100644 --- a/source/plugins/ruby/MdmMetricsGenerator.rb +++ b/source/plugins/ruby/MdmMetricsGenerator.rb @@ -37,6 +37,12 @@ class MdmMetricsGenerator Constants::MEMORY_WORKING_SET_BYTES => Constants::MDM_NODE_MEMORY_WORKING_SET_PERCENTAGE, } + @@node_metric_name_metric_allocatable_percentage_name_hash = { + Constants::CPU_USAGE_MILLI_CORES => Constants::MDM_NODE_CPU_USAGE_ALLOCATABLE_PERCENTAGE, + Constants::MEMORY_RSS_BYTES => Constants::MDM_NODE_MEMORY_RSS_ALLOCATABLE_PERCENTAGE, + Constants::MEMORY_WORKING_SET_BYTES => Constants::MDM_NODE_MEMORY_WORKING_SET_ALLOCATABLE_PERCENTAGE, + } + @@container_metric_name_metric_percentage_name_hash = { Constants::CPU_USAGE_MILLI_CORES => Constants::MDM_CONTAINER_CPU_UTILIZATION_METRIC, Constants::CPU_USAGE_NANO_CORES => Constants::MDM_CONTAINER_CPU_UTILIZATION_METRIC, @@ -526,7 +532,7 @@ def getContainerResourceUtilizationThresholds return metric_threshold_hash end - def getNodeResourceMetricRecords(record, metric_name, metric_value, percentage_metric_value) + def getNodeResourceMetricRecords(record, metric_name, metric_value, percentage_metric_value, allocatable_percentage_metric_value) records = [] begin custommetricrecord = MdmAlertTemplates::Node_resource_metrics_template % { @@ -554,6 +560,20 @@ def getNodeResourceMetricRecords(record, metric_name, metric_value, percentage_m } records.push(Yajl::Parser.parse(StringIO.new(additional_record))) end + + if !allocatable_percentage_metric_value.nil? 
+ additional_record = MdmAlertTemplates::Node_resource_metrics_template % { + timestamp: record["Timestamp"], + metricName: @@node_metric_name_metric_allocatable_percentage_name_hash[metric_name], + hostvalue: record["Host"], + objectnamevalue: record["ObjectName"], + instancenamevalue: record["InstanceName"], + metricminvalue: allocatable_percentage_metric_value, + metricmaxvalue: allocatable_percentage_metric_value, + metricsumvalue: allocatable_percentage_metric_value, + } + records.push(Yajl::Parser.parse(StringIO.new(additional_record))) + end rescue => errorStr @log.info "Error in getNodeResourceMetricRecords: #{errorStr}" ApplicationInsightsUtility.sendExceptionTelemetry(errorStr) diff --git a/source/plugins/ruby/constants.rb b/source/plugins/ruby/constants.rb index c037c99f6..7c3e858dd 100644 --- a/source/plugins/ruby/constants.rb +++ b/source/plugins/ruby/constants.rb @@ -60,6 +60,9 @@ class Constants MDM_NODE_CPU_USAGE_PERCENTAGE = "cpuUsagePercentage" MDM_NODE_MEMORY_RSS_PERCENTAGE = "memoryRssPercentage" MDM_NODE_MEMORY_WORKING_SET_PERCENTAGE = "memoryWorkingSetPercentage" + MDM_NODE_CPU_USAGE_ALLOCATABLE_PERCENTAGE = "cpuUsageAllocatablePercentage" + MDM_NODE_MEMORY_RSS_ALLOCATABLE_PERCENTAGE = "memoryRssAllocatablePercentage" + MDM_NODE_MEMORY_WORKING_SET_ALLOCATABLE_PERCENTAGE = "memoryWorkingSetAllocatablePercentage" CONTAINER_TERMINATED_RECENTLY_IN_MINUTES = 5 OBJECT_NAME_K8S_CONTAINER = "K8SContainer" @@ -103,5 +106,30 @@ class Constants #Pod Statuses POD_STATUS_TERMINATING = "Terminating" - + # Data type ids + CONTAINER_INVENTORY_DATA_TYPE = "CONTAINER_INVENTORY_BLOB" + CONTAINER_NODE_INVENTORY_DATA_TYPE = "CONTAINER_NODE_INVENTORY_BLOB" + PERF_DATA_TYPE = "LINUX_PERF_BLOB" + INSIGHTS_METRICS_DATA_TYPE = "INSIGHTS_METRICS_BLOB" + KUBE_SERVICES_DATA_TYPE = "KUBE_SERVICES_BLOB" + KUBE_POD_INVENTORY_DATA_TYPE = "KUBE_POD_INVENTORY_BLOB" + KUBE_NODE_INVENTORY_DATA_TYPE = "KUBE_NODE_INVENTORY_BLOB" + KUBE_PV_INVENTORY_DATA_TYPE = 
"KUBE_PV_INVENTORY_BLOB" + KUBE_EVENTS_DATA_TYPE = "KUBE_EVENTS_BLOB" + KUBE_MON_AGENT_EVENTS_DATA_TYPE = "KUBE_MON_AGENT_EVENTS_BLOB" + KUBE_HEALTH_DATA_TYPE = "KUBE_HEALTH_BLOB" + CONTAINERLOGV2_DATA_TYPE = "CONTAINERINSIGHTS_CONTAINERLOGV2" + CONTAINERLOG_DATA_TYPE = "CONTAINER_LOG_BLOB" + + #ContainerInsights Extension (AMCS) + CI_EXTENSION_NAME = "ContainerInsights" + CI_EXTENSION_VERSION = "1" + #Current CI extension config size is ~5KB and going with 20KB to handle any future scenarios + CI_EXTENSION_CONFIG_MAX_BYTES = 20480 + ONEAGENT_FLUENT_SOCKET_NAME = "/var/run/mdsd/default_fluent.socket" + #Tag prefix for output stream + EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX = "dcr-" + + LINUX_LOG_PATH = $in_unit_test.nil? ? "/var/opt/microsoft/docker-cimprov/log/" : "./" + WINDOWS_LOG_PATH = $in_unit_test.nil? ? "/etc/omsagentwindows/" : "./" end diff --git a/source/plugins/ruby/extension.rb b/source/plugins/ruby/extension.rb new file mode 100644 index 000000000..78236fe15 --- /dev/null +++ b/source/plugins/ruby/extension.rb @@ -0,0 +1,77 @@ +require "socket" +require "msgpack" +require "securerandom" +require "singleton" +require_relative "omslog" +require_relative "constants" +require_relative "ApplicationInsightsUtility" + + +class Extension + include Singleton + + def initialize + @cache = {} + @cache_lock = Mutex.new + $log.info("Extension::initialize complete") + end + + def get_output_stream_id(datatypeId) + @cache_lock.synchronize { + if @cache.has_key?(datatypeId) + return @cache[datatypeId] + else + @cache = get_config() + return @cache[datatypeId] + end + } + end + + private + def get_config() + extConfig = Hash.new + $log.info("Extension::get_config start ...") + begin + clientSocket = UNIXSocket.open(Constants::ONEAGENT_FLUENT_SOCKET_NAME) + requestId = SecureRandom.uuid.to_s + requestBodyJSON = { "Request" => "AgentTaggedData", "RequestId" => requestId, "Tag" => Constants::CI_EXTENSION_NAME, "Version" => Constants::CI_EXTENSION_VERSION }.to_json + 
$log.info("Extension::get_config::sending request with request body: #{requestBodyJSON}") + requestBodyMsgPack = requestBodyJSON.to_msgpack + clientSocket.write(requestBodyMsgPack) + clientSocket.flush + $log.info("reading the response from fluent socket: #{Constants::ONEAGENT_FLUENT_SOCKET_NAME}") + resp = clientSocket.recv(Constants::CI_EXTENSION_CONFIG_MAX_BYTES) + if !resp.nil? && !resp.empty? + $log.info("Extension::get_config::successfully read the extension config from fluentsocket and number of bytes read is #{resp.length}") + respJSON = JSON.parse(resp) + taggedData = respJSON["TaggedData"] + if !taggedData.nil? && !taggedData.empty? + taggedAgentData = JSON.parse(taggedData) + extensionConfigurations = taggedAgentData["extensionConfigurations"] + if !extensionConfigurations.nil? && !extensionConfigurations.empty? + extensionConfigurations.each do |extensionConfig| + outputStreams = extensionConfig["outputStreams"] + if !outputStreams.nil? && !outputStreams.empty? + outputStreams.each do |datatypeId, streamId| + $log.info("Extension::get_config datatypeId:#{datatypeId}, streamId: #{streamId}") + extConfig[datatypeId] = streamId + end + else + $log.warn("Extension::get_config::received outputStreams is either nil or empty") + end + end + else + $log.warn("Extension::get_config::received extensionConfigurations from fluentsocket is either nil or empty") + end + end + end + rescue => errorStr + $log.warn("Extension::get_config failed: #{errorStr}") + ApplicationInsightsUtility.sendExceptionTelemetry(errorStr) + ensure + clientSocket.close unless clientSocket.nil? + end + $log.info("Extension::get_config complete ...") + return extConfig + end +end diff --git a/source/plugins/ruby/extension_utils.rb b/source/plugins/ruby/extension_utils.rb new file mode 100644 index 000000000..5d439c6b2 --- /dev/null +++ b/source/plugins/ruby/extension_utils.rb @@ -0,0 +1,27 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. 
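`Extension#get_config` above writes a msgpack-encoded `AgentTaggedData` request over the mdsd Unix socket, then unwraps a doubly nested reply: the outer JSON carries `TaggedData` as a string, which itself parses into `extensionConfigurations`, each mapping datatype ids to output stream ids. A sketch of just that parsing step, with a hand-built reply standing in for the socket read — the field names follow the hunk, but the sample stream id is invented:

```ruby
require "json"

# Walk the nested reply the way Extension#get_config does:
# TaggedData is itself a JSON string, and each extensionConfiguration
# maps datatype ids to output stream ids.
def stream_ids_from(resp_json)
  ext_config = {}
  tagged = JSON.parse(resp_json)["TaggedData"]
  return ext_config if tagged.nil? || tagged.empty?
  JSON.parse(tagged).fetch("extensionConfigurations", []).each do |cfg|
    (cfg["outputStreams"] || {}).each { |dtype, sid| ext_config[dtype] = sid }
  end
  ext_config
end

# Invented sample reply in the shape the hunk expects.
sample = {
  "TaggedData" => {
    "extensionConfigurations" => [
      { "outputStreams" => { "CONTAINER_LOG_BLOB" => "dcr-00000000" } },
    ],
  }.to_json,
}.to_json

puts stream_ids_from(sample)["CONTAINER_LOG_BLOB"]   # => dcr-00000000
```

The `dcr-` prefix on stream ids is what `EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX` in constants.rb later keys off when tagging output streams.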
+#!/usr/local/bin/ruby +# frozen_string_literal: true + +require_relative "extension" + +class ExtensionUtils + class << self + def getOutputStreamId(dataType) + outputStreamId = "" + begin + if !dataType.nil? && !dataType.empty? + outputStreamId = Extension.instance.get_output_stream_id(dataType) + $log.info("ExtensionUtils::getOutputStreamId: got streamid: #{outputStreamId} for datatype: #{dataType}") + else + $log.warn("ExtensionUtils::getOutputStreamId: dataType shouldnt be nil or empty") + end + rescue => errorStr + $log.warn("ExtensionUtils::getOutputStreamId: failed with an exception: #{errorStr}") + end + return outputStreamId + end + def isAADMSIAuthMode() + return !ENV["AAD_MSI_AUTH_MODE"].nil? && !ENV["AAD_MSI_AUTH_MODE"].empty? && ENV["AAD_MSI_AUTH_MODE"].downcase == "true" + end + end +end diff --git a/source/plugins/ruby/filter_cadvisor2mdm.rb b/source/plugins/ruby/filter_cadvisor2mdm.rb index 9c6b661b0..4ed0d5bde 100644 --- a/source/plugins/ruby/filter_cadvisor2mdm.rb +++ b/source/plugins/ruby/filter_cadvisor2mdm.rb @@ -66,8 +66,10 @@ def start # initialize cpu and memory limit if @process_incoming_stream @cpu_capacity = 0.0 + @cpu_allocatable = 0.0 @memory_capacity = 0.0 - ensure_cpu_memory_capacity_set + @memory_allocatable = 0.0 + ensure_cpu_memory_capacity_and_allocatable_set @containerCpuLimitHash = {} @containerMemoryLimitHash = {} @containerResourceDimensionHash = {} @@ -158,16 +160,17 @@ def filter(tag, time, record) begin if @process_incoming_stream - # Check if insights metrics for PV metrics + # Check if insights metrics for PV metrics if record["Name"] == Constants::PV_USED_BYTES return filterPVInsightsMetrics(record) end object_name = record["ObjectName"] counter_name = JSON.parse(record["json_Collections"])[0]["CounterName"] - + percentage_metric_value = 0.0 - metric_value = JSON.parse(record["json_Collections"])[0]["Value"] + allocatable_percentage_metric_value = 0.0 + metric_value = JSON.parse(record["json_Collections"])[0]["Value"] 
         if object_name == Constants::OBJECT_NAME_K8S_NODE && @metrics_to_collect_hash.key?(counter_name.downcase)
           # Compute and send % CPU and Memory
@@ -176,39 +179,62 @@ def filter(tag, time, record)
             metric_value /= 1000000 #cadvisor record is in nanocores. Convert to mc
             if @@controller_type.downcase == "replicaset"
               target_node_cpu_capacity_mc = @NodeCache.cpu.get_capacity(record["Host"]) / 1000000
+              target_node_cpu_allocatable_mc = 0.0 # We do not need this value in the replicaset
             else
               target_node_cpu_capacity_mc = @cpu_capacity
+              target_node_cpu_allocatable_mc = @cpu_allocatable
             end
-            @log.info "Metric_value: #{metric_value} CPU Capacity #{target_node_cpu_capacity_mc}"
+            @log.info "Metric_value: #{metric_value} CPU Capacity #{target_node_cpu_capacity_mc} CPU Allocatable #{target_node_cpu_allocatable_mc} "
             if target_node_cpu_capacity_mc != 0.0
               percentage_metric_value = (metric_value) * 100 / target_node_cpu_capacity_mc
             end
+            if target_node_cpu_allocatable_mc != 0.0
+              allocatable_percentage_metric_value = (metric_value) * 100 / target_node_cpu_allocatable_mc
+            else
+              allocatable_percentage_metric_value = 0.0
+            end
           end
           if counter_name.start_with?("memory")
             metric_name = counter_name
             if @@controller_type.downcase == "replicaset"
               target_node_mem_capacity = @NodeCache.mem.get_capacity(record["Host"])
+              target_node_mem_allocatable = 0.0 # We do not need this value in the replicaset
             else
               target_node_mem_capacity = @memory_capacity
+              target_node_mem_allocatable = @memory_allocatable
             end
-            @log.info "Metric_value: #{metric_value} Memory Capacity #{target_node_mem_capacity}"
+
+            @log.info "Metric_value: #{metric_value} Memory Capacity #{target_node_mem_capacity} Memory Allocatable #{target_node_mem_allocatable}"
             if target_node_mem_capacity != 0.0
               percentage_metric_value = metric_value * 100 / target_node_mem_capacity
             end
-          end
-          @log.info "percentage_metric_value for metric: #{metric_name} for instance: #{record["Host"]} percentage: #{percentage_metric_value}"
-          # do some sanity checking. Do we want this?
-          if percentage_metric_value > 100.0 or percentage_metric_value < 0.0
+            if target_node_mem_allocatable != 0.0
+              allocatable_percentage_metric_value = metric_value * 100 / target_node_mem_allocatable
+            else
+              allocatable_percentage_metric_value = 0.0
+            end
+          end
+          @log.info "percentage_metric_value for metric: #{metric_name} for instance: #{record["Host"]} percentage: #{percentage_metric_value} allocatable_percentage: #{allocatable_percentage_metric_value}"
+
+          # do some sanity checking.
+          if percentage_metric_value > 100.0
            telemetryProperties = {}
            telemetryProperties["Computer"] = record["Host"]
            telemetryProperties["MetricName"] = metric_name
            telemetryProperties["MetricPercentageValue"] = percentage_metric_value
            ApplicationInsightsUtility.sendCustomEvent("ErrorPercentageOutOfBounds", telemetryProperties)
           end
+          if allocatable_percentage_metric_value > 100.0
+           telemetryProperties = {}
+           telemetryProperties["Computer"] = record["Host"]
+           telemetryProperties["MetricName"] = metric_name
+           telemetryProperties["MetricAllocatablePercentageValue"] = allocatable_percentage_metric_value
+           ApplicationInsightsUtility.sendCustomEvent("ErrorPercentageOutOfBounds", telemetryProperties)
+          end
-          return MdmMetricsGenerator.getNodeResourceMetricRecords(record, metric_name, metric_value, percentage_metric_value)
+          return MdmMetricsGenerator.getNodeResourceMetricRecords(record, metric_name, metric_value, percentage_metric_value, allocatable_percentage_metric_value)
         elsif object_name == Constants::OBJECT_NAME_K8S_CONTAINER && @metrics_to_collect_hash.key?(counter_name.downcase)
           instanceName = record["InstanceName"]
           metricName = counter_name
@@ -304,13 +330,20 @@ def filterPVInsightsMetrics(record)
     end
   end

-  def ensure_cpu_memory_capacity_set
-    if @cpu_capacity != 0.0 && @memory_capacity != 0.0
-      @log.info "CPU And Memory Capacity are already set"
+  def ensure_cpu_memory_capacity_and_allocatable_set
+    @@controller_type = ENV["CONTROLLER_TYPE"]
+
+    if @cpu_capacity != 0.0 && @memory_capacity != 0.0 && @@controller_type.downcase == "replicaset"
+      @log.info "CPU And Memory Capacity are already set and their values are as follows @cpu_capacity : #{@cpu_capacity}, @memory_capacity: #{@memory_capacity}"
+      return
+    end
+
+    if @@controller_type.downcase == "daemonset" && @cpu_capacity != 0.0 && @memory_capacity != 0.0 && @cpu_allocatable != 0.0 && @memory_allocatable != 0.0
+      @log.info "CPU And Memory Capacity are already set and their values are as follows @cpu_capacity : #{@cpu_capacity}, @memory_capacity: #{@memory_capacity}"
+      @log.info "CPU And Memory Allocatable are already set and their values are as follows @cpu_allocatable : #{@cpu_allocatable}, @memory_allocatable: #{@memory_allocatable}"
       return
     end
-    @@controller_type = ENV["CONTROLLER_TYPE"]
     if @@controller_type.downcase == "replicaset"
       @log.info "ensure_cpu_memory_capacity_set @cpu_capacity #{@cpu_capacity} @memory_capacity #{@memory_capacity}"
@@ -323,7 +356,7 @@ def ensure_cpu_memory_capacity_set
       end
       if !nodeInventory.nil?
         cpu_capacity_json = KubernetesApiClient.parseNodeLimits(nodeInventory, "capacity", "cpu", "cpuCapacityNanoCores")
-        if !cpu_capacity_json.nil?
+        if !cpu_capacity_json.nil?
           metricVal = JSON.parse(cpu_capacity_json[0]["json_Collections"])[0]["Value"]
           if !metricVal.to_s.nil?
             @cpu_capacity = metricVal
@@ -333,8 +366,8 @@ def ensure_cpu_memory_capacity_set
           @log.info "Error getting cpu_capacity"
         end
         memory_capacity_json = KubernetesApiClient.parseNodeLimits(nodeInventory, "capacity", "memory", "memoryCapacityBytes")
-        if !memory_capacity_json.nil?
-          metricVal = JSON.parse(cpu_capacity_json[0]["json_Collections"])[0]["Value"]
+        if !memory_capacity_json.nil?
+          metricVal = JSON.parse(memory_capacity_json[0]["json_Collections"])[0]["Value"]
           if !metricVal.to_s.nil?
            @memory_capacity = metricVal
             @log.info "Memory Limit #{@memory_capacity}"
@@ -354,13 +387,24 @@ def ensure_cpu_memory_capacity_set
         # cpu_capacity and memory_capacity keep initialized value of 0.0
         @log.error "Error getting capacity_from_kubelet: cpu_capacity and memory_capacity"
       end
+
+      allocatable_from_kubelet = KubeletUtils.get_node_allocatable(@cpu_capacity, @memory_capacity)
+
+      # Error handling in case /configz endpoint fails
+      if !allocatable_from_kubelet.nil? && allocatable_from_kubelet.length > 1
+        @cpu_allocatable = allocatable_from_kubelet[0]
+        @memory_allocatable = allocatable_from_kubelet[1]
+      else
+        # cpu_allocatable and memory_allocatable keep initialized value of 0.0
+        @log.error "Error getting allocatable_from_kubelet: cpu_allocatable and memory_allocatable"
+      end
     end
   end

   def filter_stream(tag, es)
     new_es = Fluent::MultiEventStream.new
     begin
-      ensure_cpu_memory_capacity_set
+      ensure_cpu_memory_capacity_and_allocatable_set
       # Getting container limits hash
       if @process_incoming_stream
         @containerCpuLimitHash, @containerMemoryLimitHash, @containerResourceDimensionHash = KubeletUtils.get_all_container_limits
diff --git a/source/plugins/ruby/filter_health_model_builder.rb b/source/plugins/ruby/filter_health_model_builder.rb
index d491f17c2..4c6bcb1c1 100644
--- a/source/plugins/ruby/filter_health_model_builder.rb
+++ b/source/plugins/ruby/filter_health_model_builder.rb
@@ -4,11 +4,12 @@

 require 'fluent/plugin/filter'

-module Fluent::Plugin
+module Fluent::Plugin
+  require_relative 'extension_utils'
   require 'logger'
   require 'yajl/json_gem'
   Dir[File.join(__dir__, './health', '*.rb')].each { |file| require file }
-
+
   class FilterHealthModelBuilder < Filter
     include HealthModel
@@ -22,7 +23,6 @@ class FilterHealthModelBuilder < Filter
     attr_reader :buffer, :model_builder, :health_model_definition, :monitor_factory, :state_finalizers, :monitor_set, :model_builder, :hierarchy_builder, :resources, :kube_api_down_handler, :provider, :reducer, :state, :generator, :telemetry
-
     @@cluster_id = KubernetesApiClient.getClusterId
     @@token_file_path = "/var/run/secrets/kubernetes.io/serviceaccount/token"
     @@cert_file_path = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
@@ -56,7 +56,6 @@ def initialize
         deserialized_state_info = @cluster_health_state.get_state
         @state.initialize_state(deserialized_state_info)
       end
-
     rescue => e
       ApplicationInsightsUtility.sendExceptionTelemetry(e, {"FeatureArea" => "Health"})
     end
@@ -90,7 +89,14 @@ def filter_stream(tag, es)
       end
       begin
         new_es = Fluent::MultiEventStream.new
-        time = Time.now
+        time = Time.now
+        if ExtensionUtils.isAADMSIAuthMode()
+          $log.info("filter_health_model_builder::enumerate: AAD AUTH MSI MODE")
+          if @rewrite_tag.nil? || !@rewrite_tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+            @rewrite_tag = ExtensionUtils.getOutputStreamId(Constants::KUBE_HEALTH_DATA_TYPE)
+          end
+          $log.info("filter_health_model_builder::filter_stream: using tag -#{@rewrite_tag} @ #{Time.now.utc.iso8601}")
+        end
         if tag.start_with?("kubehealth.DaemonSet.Node")
           node_records = []
@@ -222,7 +228,6 @@ def filter_stream(tag, es)

         @log.info "after optimizing health signals all_monitors.size #{all_monitors.size}"
-
         # for each key in monitor.keys,
         # get the state from health_monitor_state
         # generate the record to send
@@ -245,7 +250,7 @@ def filter_stream(tag, es)
               @cluster_new_state = new_state
             end
           end
-        end
+        end
         new_es.add(emit_time, record)
         }
@@ -261,7 +266,7 @@ def filter_stream(tag, es)
         @telemetry.send
         # return an empty event stream, else the match will throw a NoMethodError
         return Fluent::MultiEventStream.new
-      elsif tag.start_with?(@rewrite_tag)
+      elsif tag.start_with?(@rewrite_tag)
         # this filter also acts as a pass through as we are rewriting the tag and emitting to the fluent stream
         es
       else
@@ -273,6 +278,6 @@ def filter_stream(tag, es)
       @log.warn "Message: #{e.message} Backtrace: #{e.backtrace}"
       return nil
     end
-  end
+  end
 end
 end
diff --git a/source/plugins/ruby/in_cadvisor_perf.rb b/source/plugins/ruby/in_cadvisor_perf.rb
index b3f9bd08b..862e88e44 100644
--- a/source/plugins/ruby/in_cadvisor_perf.rb
+++ b/source/plugins/ruby/in_cadvisor_perf.rb
@@ -20,7 +20,8 @@ def initialize
     require_relative "CAdvisorMetricsAPIClient"
     require_relative "oms_common"
     require_relative "omslog"
-    require_relative "constants"
+    require_relative "constants"
+    require_relative "extension_utils"
   end

   config_param :run_interval, :time, :default => 60
@@ -61,13 +62,24 @@ def enumerate()
       batchTime = currentTime.utc.iso8601
       @@istestvar = ENV["ISTEST"]
       begin
-        eventStream = Fluent::MultiEventStream.new
+        eventStream = Fluent::MultiEventStream.new
         insightsMetricsEventStream = Fluent::MultiEventStream.new
         metricData = CAdvisorMetricsAPIClient.getMetrics(winNode: nil, metricTime: batchTime )
-        metricData.each do |record|
-          eventStream.add(time, record) if record
-        end
-
+        metricData.each do |record|
+          eventStream.add(time, record) if record
+        end
+
+        if ExtensionUtils.isAADMSIAuthMode()
+          $log.info("in_cadvisor_perf::enumerate: AAD AUTH MSI MODE")
+          if @tag.nil? || !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+            @tag = ExtensionUtils.getOutputStreamId(Constants::PERF_DATA_TYPE)
+          end
+          if @insightsmetricstag.nil? || !@insightsmetricstag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+            @insightsmetricstag = ExtensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE)
+          end
+          $log.info("in_cadvisor_perf::enumerate: using perf tag -#{@tag} @ #{Time.now.utc.iso8601}")
+          $log.info("in_cadvisor_perf::enumerate: using insightsmetrics tag -#{@insightsmetricstag} @ #{Time.now.utc.iso8601}")
+        end
         router.emit_stream(@tag, eventStream) if eventStream
         router.emit_stream(@mdmtag, eventStream) if eventStream
         router.emit_stream(@containerhealthtag, eventStream) if eventStream
@@ -136,6 +148,6 @@ def run_periodic
       @mutex.lock
     end
     @mutex.unlock
-  end
+  end
 end # CAdvisor_Perf_Input
 end # module
diff --git a/source/plugins/ruby/in_containerinventory.rb b/source/plugins/ruby/in_containerinventory.rb
index eebf422d6..c8ffe7d05 100644
--- a/source/plugins/ruby/in_containerinventory.rb
+++ b/source/plugins/ruby/in_containerinventory.rb
@@ -7,17 +7,19 @@
 module Fluent::Plugin
   class Container_Inventory_Input < Input
     Fluent::Plugin.register_input("containerinventory", self)
-    @@PluginName = "ContainerInventory"
+    @@PluginName = "ContainerInventory"

     def initialize
       super
       require "yajl/json_gem"
-      require "time"
+      require "time"
       require_relative "ContainerInventoryState"
       require_relative "ApplicationInsightsUtility"
       require_relative "omslog"
       require_relative "CAdvisorMetricsAPIClient"
-      require_relative "kubernetes_container_inventory"
+      require_relative "kubernetes_container_inventory"
+      require_relative "extension_utils"
+      @addonTokenAdapterImageTag = ""
     end

     config_param :run_interval, :time, :default => 60
@@ -47,21 +49,28 @@ def shutdown
         @thread.join
         super # This super must be at the end of shutdown method
       end
-    end
-
+    end
+
     def enumerate
-      currentTime = Time.now
+      currentTime = Time.now
       batchTime = currentTime.utc.iso8601
       emitTime = Fluent::Engine.now
       containerInventory = Array.new
       eventStream = Fluent::MultiEventStream.new
       hostName = ""
-      $log.info("in_container_inventory::enumerate : Begin processing @ #{Time.now.utc.iso8601}")
+      $log.info("in_container_inventory::enumerate : Begin processing @ #{Time.now.utc.iso8601}")
+      if ExtensionUtils.isAADMSIAuthMode()
+        $log.info("in_container_inventory::enumerate: AAD AUTH MSI MODE")
+        if @tag.nil? || !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @tag = ExtensionUtils.getOutputStreamId(Constants::CONTAINER_INVENTORY_DATA_TYPE)
+        end
+        $log.info("in_container_inventory::enumerate: using tag -#{@tag} @ #{Time.now.utc.iso8601}")
+      end
       begin
         containerRuntimeEnv = ENV["CONTAINER_RUNTIME"]
         $log.info("in_container_inventory::enumerate : container runtime : #{containerRuntimeEnv}")
         clusterCollectEnvironmentVar = ENV["AZMON_CLUSTER_COLLECT_ENV_VAR"]
-        $log.info("in_container_inventory::enumerate : using cadvisor apis")
+        $log.info("in_container_inventory::enumerate : using cadvisor apis")
         containerIds = Array.new
         response = CAdvisorMetricsAPIClient.getPodsFromCAdvisor(winNode: nil)
         if !response.nil? && !response.body.nil?
@@ -74,12 +83,21 @@ def enumerate
                 if hostName.empty? && !containerRecord["Computer"].empty?
                   hostName = containerRecord["Computer"]
                 end
+                if @addonTokenAdapterImageTag.empty? && ExtensionUtils.isAADMSIAuthMode()
+                  if !containerRecord["ElementName"].nil? && !containerRecord["ElementName"].empty? &&
+                     containerRecord["ElementName"].include?("_kube-system_") &&
+                     containerRecord["ElementName"].include?("addon-token-adapter_omsagent")
+                    if !containerRecord["ImageTag"].nil? && !containerRecord["ImageTag"].empty?
+                      @addonTokenAdapterImageTag = containerRecord["ImageTag"]
+                    end
+                  end
+                end
                 containerIds.push containerRecord["InstanceID"]
                 containerInventory.push containerRecord
-              end
+              end
             end
-          end
-        end
+          end
+        end
         # Update the state for deleted containers
         deletedContainers = ContainerInventoryState.getDeletedContainers(containerIds)
         if !deletedContainers.nil? && !deletedContainers.empty?
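The same AAD MSI stream-id lookup recurs in every plugin this patch touches (filter_health_model_builder, in_cadvisor_perf, in_containerinventory, and the inputs that follow): when MSI auth mode is on, the legacy fluentd tag is swapped once for the extension output stream id and then left alone on later iterations. A minimal standalone sketch of that pattern — the `STREAM_ID_PREFIX` value and the stubbed `get_output_stream_id` below are illustrative stand-ins, not the agent's real `Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX` or `Extension` socket lookup:

```ruby
# Illustrative prefix; the agent's real value lives in Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX.
STREAM_ID_PREFIX = "dataflow:"

# Stand-in for ExtensionUtils.getOutputStreamId: the real method asks the
# extension service for the stream id bound to this data type.
def get_output_stream_id(data_type)
  "#{STREAM_ID_PREFIX}#{data_type.downcase}-stream"
end

# Mirrors ExtensionUtils.isAADMSIAuthMode: the env var must be present,
# non-empty, and case-insensitively equal to "true".
def aad_msi_auth_mode?(env = ENV)
  !env["AAD_MSI_AUTH_MODE"].nil? && !env["AAD_MSI_AUTH_MODE"].empty? && env["AAD_MSI_AUTH_MODE"].downcase == "true"
end

# Swap a legacy tag for the extension stream id exactly once: a tag that
# already carries the prefix is returned unchanged, so repeated enumerate()
# calls do not re-query the extension service.
def resolve_tag(tag, data_type, env = ENV)
  return tag unless aad_msi_auth_mode?(env)
  if tag.nil? || !tag.start_with?(STREAM_ID_PREFIX)
    tag = get_output_stream_id(data_type)
  end
  tag
end
```

In the patch itself this check runs at the top of each `enumerate`/`filter_stream`, so the first pass rewrites `@tag` and every later pass takes the cheap `start_with?` fast path.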
@@ -87,13 +105,13 @@ def enumerate
             container = ContainerInventoryState.readContainerState(deletedContainer)
             if !container.nil?
               container.each { |k, v| container[k] = v }
-              container["State"] = "Deleted"
+              container["State"] = "Deleted"
               KubernetesContainerInventory.deleteCGroupCacheEntryForDeletedContainer(container["InstanceID"])
               containerInventory.push container
             end
           end
-        end
-        containerInventory.each do |record|
+        end
+        containerInventory.each do |record|
           eventStream.add(emitTime, record) if record
         end
         router.emit_stream(@tag, eventStream) if eventStream
@@ -109,6 +127,9 @@ def enumerate
             telemetryProperties = {}
             telemetryProperties["Computer"] = hostName
             telemetryProperties["ContainerCount"] = containerInventory.length
+            if !@addonTokenAdapterImageTag.empty?
+              telemetryProperties["addonTokenAdapterImageTag"] = @addonTokenAdapterImageTag
+            end
             ApplicationInsightsUtility.sendTelemetry(@@PluginName, telemetryProperties)
           end
         rescue => errorStr
@@ -148,6 +169,6 @@ def run_periodic
         @mutex.lock
       end
       @mutex.unlock
-    end
+    end
   end # Container_Inventory_Input
 end # module
diff --git a/source/plugins/ruby/in_kube_events.rb b/source/plugins/ruby/in_kube_events.rb
index 6f65dab92..deeae6e14 100644
--- a/source/plugins/ruby/in_kube_events.rb
+++ b/source/plugins/ruby/in_kube_events.rb
@@ -3,7 +3,7 @@

 require 'fluent/plugin/input'

-module Fluent::Plugin
+module Fluent::Plugin
   class Kube_Event_Input < Input
     Fluent::Plugin.register_input("kube_events", self)
     @@KubeEventsStateFile = "/var/opt/microsoft/docker-cimprov/state/KubeEventQueryState.yaml"
@@ -18,6 +18,7 @@ def initialize
       require_relative "oms_common"
       require_relative "omslog"
       require_relative "ApplicationInsightsUtility"
+      require_relative "extension_utils"

       # refer tomlparser-agent-config for defaults
       # this configurable via configmap
@@ -37,7 +38,7 @@ def configure(conf)
       super
     end

-    def start
+    def start
       if @run_interval
         super
         if !ENV["EVENTS_CHUNK_SIZE"].nil? && !ENV["EVENTS_CHUNK_SIZE"].empty? && ENV["EVENTS_CHUNK_SIZE"].to_i > 0
@@ -84,8 +85,15 @@ def enumerate
       batchTime = currentTime.utc.iso8601
       eventQueryState = getEventQueryState
       newEventQueryState = []
-      @eventsCount = 0
-
+      @eventsCount = 0
+
+      if ExtensionUtils.isAADMSIAuthMode()
+        $log.info("in_kube_events::enumerate: AAD AUTH MSI MODE")
+        if @tag.nil? || !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @tag = ExtensionUtils.getOutputStreamId(Constants::KUBE_EVENTS_DATA_TYPE)
+        end
+        $log.info("in_kube_events::enumerate: using kubeevents tag -#{@tag} @ #{Time.now.utc.iso8601}")
+      end
       # Initializing continuation token to nil
       continuationToken = nil
       $log.info("in_kube_events::enumerate : Getting events from Kube API @ #{Time.now.utc.iso8601}")
@@ -131,8 +139,8 @@ def enumerate
     end # end enumerate

     def parse_and_emit_records(events, eventQueryState, newEventQueryState, batchTime = Time.utc.iso8601)
-      currentTime = Time.now
-      emitTime = Fluent::Engine.now
+      currentTime = Time.now
+      emitTime = Fluent::Engine.now
       @@istestvar = ENV["ISTEST"]
       begin
         eventStream = Fluent::MultiEventStream.new
@@ -166,7 +174,7 @@ def parse_and_emit_records(events, eventQueryState, newEventQueryState, batchTim
           record["Count"] = items["count"]
           record["Computer"] = nodeName
           record["ClusterName"] = KubernetesApiClient.getClusterName
-          record["ClusterId"] = KubernetesApiClient.getClusterId
+          record["ClusterId"] = KubernetesApiClient.getClusterId
           eventStream.add(emitTime, record) if record
           @eventsCount += 1
         end
diff --git a/source/plugins/ruby/in_kube_nodes.rb b/source/plugins/ruby/in_kube_nodes.rb
index ebfa903fd..cb52243a0 100644
--- a/source/plugins/ruby/in_kube_nodes.rb
+++ b/source/plugins/ruby/in_kube_nodes.rb
@@ -6,27 +6,14 @@
 module Fluent::Plugin
   class Kube_nodeInventory_Input < Input
     Fluent::Plugin.register_input("kube_nodes", self)
-
-    @@configMapMountPath = "/etc/config/settings/log-data-collection-settings"
-    @@promConfigMountPath = "/etc/config/settings/prometheus-data-collection-settings"
-    @@osmConfigMountPath = "/etc/config/osm-settings/osm-metric-collection-configuration"
-    @@AzStackCloudFileName = "/etc/kubernetes/host/azurestackcloud.json"
-
-
-    @@rsPromInterval = ENV["TELEMETRY_RS_PROM_INTERVAL"]
-    @@rsPromFieldPassCount = ENV["TELEMETRY_RS_PROM_FIELDPASS_LENGTH"]
-    @@rsPromFieldDropCount = ENV["TELEMETRY_RS_PROM_FIELDDROP_LENGTH"]
-    @@rsPromK8sServiceCount = ENV["TELEMETRY_RS_PROM_K8S_SERVICES_LENGTH"]
-    @@rsPromUrlCount = ENV["TELEMETRY_RS_PROM_URLS_LENGTH"]
-    @@rsPromMonitorPods = ENV["TELEMETRY_RS_PROM_MONITOR_PODS"]
-    @@rsPromMonitorPodsNamespaceLength = ENV["TELEMETRY_RS_PROM_MONITOR_PODS_NS_LENGTH"]
-    @@rsPromMonitorPodsLabelSelectorLength = ENV["TELEMETRY_RS_PROM_LABEL_SELECTOR_LENGTH"]
-    @@rsPromMonitorPodsFieldSelectorLength = ENV["TELEMETRY_RS_PROM_FIELD_SELECTOR_LENGTH"]
-    @@collectAllKubeEvents = ENV["AZMON_CLUSTER_COLLECT_ALL_KUBE_EVENTS"]
-    @@osmNamespaceCount = ENV["TELEMETRY_OSM_CONFIGURATION_NAMESPACES_COUNT"]
-
-    def initialize
-      super
+
+    def initialize (kubernetesApiClient=nil,
+                    applicationInsightsUtility=nil,
+                    extensionUtils=nil,
+                    env=nil,
+                    telemetry_flush_interval=nil)
+      super()
+
       require "yaml"
       require "yajl/json_gem"
       require "yajl"
@@ -35,11 +22,37 @@ def initialize
       require_relative "KubernetesApiClient"
       require_relative "ApplicationInsightsUtility"
       require_relative "oms_common"
-      require_relative "omslog"
-
-      @ContainerNodeInventoryTag = "oneagent.containerInsights.CONTAINER_NODE_INVENTORY_BLOB"
-      @insightsMetricsTag = "oneagent.containerInsights.INSIGHTS_METRICS_BLOB"
-      @MDMKubeNodeInventoryTag = "mdm.kubenodeinventory"
+      require_relative "omslog"
+      require_relative "extension_utils"
+
+      @kubernetesApiClient = kubernetesApiClient == nil ? KubernetesApiClient : kubernetesApiClient
+      @applicationInsightsUtility = applicationInsightsUtility == nil ? ApplicationInsightsUtility : applicationInsightsUtility
+      @extensionUtils = extensionUtils == nil ? ExtensionUtils : extensionUtils
+      @env = env == nil ? ENV : env
+      @TELEMETRY_FLUSH_INTERVAL_IN_MINUTES = telemetry_flush_interval == nil ? Constants::TELEMETRY_FLUSH_INTERVAL_IN_MINUTES : telemetry_flush_interval
+
+      # these defines were previously at class scope. Moving them into the constructor so that they can be set by unit tests
+      @@configMapMountPath = "/etc/config/settings/log-data-collection-settings"
+      @@promConfigMountPath = "/etc/config/settings/prometheus-data-collection-settings"
+      @@osmConfigMountPath = "/etc/config/osm-settings/osm-metric-collection-configuration"
+      @@AzStackCloudFileName = "/etc/kubernetes/host/azurestackcloud.json"
+
+
+      @@rsPromInterval = @env["TELEMETRY_RS_PROM_INTERVAL"]
+      @@rsPromFieldPassCount = @env["TELEMETRY_RS_PROM_FIELDPASS_LENGTH"]
+      @@rsPromFieldDropCount = @env["TELEMETRY_RS_PROM_FIELDDROP_LENGTH"]
+      @@rsPromK8sServiceCount = @env["TELEMETRY_RS_PROM_K8S_SERVICES_LENGTH"]
+      @@rsPromUrlCount = @env["TELEMETRY_RS_PROM_URLS_LENGTH"]
+      @@rsPromMonitorPods = @env["TELEMETRY_RS_PROM_MONITOR_PODS"]
+      @@rsPromMonitorPodsNamespaceLength = @env["TELEMETRY_RS_PROM_MONITOR_PODS_NS_LENGTH"]
+      @@rsPromMonitorPodsLabelSelectorLength = @env["TELEMETRY_RS_PROM_LABEL_SELECTOR_LENGTH"]
+      @@rsPromMonitorPodsFieldSelectorLength = @env["TELEMETRY_RS_PROM_FIELD_SELECTOR_LENGTH"]
+      @@collectAllKubeEvents = @env["AZMON_CLUSTER_COLLECT_ALL_KUBE_EVENTS"]
+      @@osmNamespaceCount = @env["TELEMETRY_OSM_CONFIGURATION_NAMESPACES_COUNT"]
+
+      @ContainerNodeInventoryTag = "oneagent.containerInsights.CONTAINER_NODE_INVENTORY_BLOB"
+      @insightsMetricsTag = "oneagent.containerInsights.INSIGHTS_METRICS_BLOB"
+      @MDMKubeNodeInventoryTag = "mdm.kubenodeinventory"
       @kubeperfTag = "oneagent.containerInsights.LINUX_PERF_BLOB"

       # refer tomlparser-agent-config for the defaults
@@ -60,11 +73,11 @@ def configure(conf)
       super
     end

-    def start
+    def start
       if @run_interval
         super
-        if !ENV["NODES_CHUNK_SIZE"].nil? && !ENV["NODES_CHUNK_SIZE"].empty? && ENV["NODES_CHUNK_SIZE"].to_i > 0
-          @NODES_CHUNK_SIZE = ENV["NODES_CHUNK_SIZE"].to_i
+        if !@env["NODES_CHUNK_SIZE"].nil? && !@env["NODES_CHUNK_SIZE"].empty? && @env["NODES_CHUNK_SIZE"].to_i > 0
+          @NODES_CHUNK_SIZE = @env["NODES_CHUNK_SIZE"].to_i
         else
           # this shouldnt happen just setting default here as safe guard
           $log.warn("in_kube_nodes::start: setting to default value since got NODES_CHUNK_SIZE nil or empty")
@@ -72,8 +85,8 @@ def start
         end
         $log.info("in_kube_nodes::start : NODES_CHUNK_SIZE @ #{@NODES_CHUNK_SIZE}")

-        if !ENV["NODES_EMIT_STREAM_BATCH_SIZE"].nil? && !ENV["NODES_EMIT_STREAM_BATCH_SIZE"].empty? && ENV["NODES_EMIT_STREAM_BATCH_SIZE"].to_i > 0
-          @NODES_EMIT_STREAM_BATCH_SIZE = ENV["NODES_EMIT_STREAM_BATCH_SIZE"].to_i
+        if !@env["NODES_EMIT_STREAM_BATCH_SIZE"].nil? && !@env["NODES_EMIT_STREAM_BATCH_SIZE"].empty? && @env["NODES_EMIT_STREAM_BATCH_SIZE"].to_i > 0
+          @NODES_EMIT_STREAM_BATCH_SIZE = @env["NODES_EMIT_STREAM_BATCH_SIZE"].to_i
         else
           # this shouldnt happen just setting default here as safe guard
           $log.warn("in_kube_nodes::start: setting to default value since got NODES_EMIT_STREAM_BATCH_SIZE nil or empty")
@@ -109,15 +122,35 @@ def enumerate
       @nodesAPIE2ELatencyMs = 0
       @nodeInventoryE2EProcessingLatencyMs = 0
-      nodeInventoryStartTime = (Time.now.to_f * 1000).to_i
-
+      nodeInventoryStartTime = (Time.now.to_f * 1000).to_i
+
+      if @extensionUtils.isAADMSIAuthMode()
+        $log.info("in_kube_nodes::enumerate: AAD AUTH MSI MODE")
+        if @kubeperfTag.nil? || !@kubeperfTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @kubeperfTag = @extensionUtils.getOutputStreamId(Constants::PERF_DATA_TYPE)
+        end
+        if @insightsMetricsTag.nil? || !@insightsMetricsTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @insightsMetricsTag = @extensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE)
+        end
+        if @ContainerNodeInventoryTag.nil? || !@ContainerNodeInventoryTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @ContainerNodeInventoryTag = @extensionUtils.getOutputStreamId(Constants::CONTAINER_NODE_INVENTORY_DATA_TYPE)
+        end
+        if @tag.nil? || !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @tag = @extensionUtils.getOutputStreamId(Constants::KUBE_NODE_INVENTORY_DATA_TYPE)
+        end
+        $log.info("in_kube_nodes::enumerate: using perf tag -#{@kubeperfTag} @ #{Time.now.utc.iso8601}")
+        $log.info("in_kube_nodes::enumerate: using insightsmetrics tag -#{@insightsMetricsTag} @ #{Time.now.utc.iso8601}")
+        $log.info("in_kube_nodes::enumerate: using containernodeinventory tag -#{@ContainerNodeInventoryTag} @ #{Time.now.utc.iso8601}")
+        $log.info("in_kube_nodes::enumerate: using kubenodeinventory tag -#{@tag} @ #{Time.now.utc.iso8601}")
+      end
       nodesAPIChunkStartTime = (Time.now.to_f * 1000).to_i
       # Initializing continuation token to nil
       continuationToken = nil
       $log.info("in_kube_nodes::enumerate : Getting nodes from Kube API @ #{Time.now.utc.iso8601}")
+      # KubernetesApiClient.getNodesResourceUri is a pure function, so call it from the actual module instead of from the mock
       resourceUri = KubernetesApiClient.getNodesResourceUri("nodes?limit=#{@NODES_CHUNK_SIZE}")
-      continuationToken, nodeInventory = KubernetesApiClient.getResourcesAndContinuationToken(resourceUri)
+      continuationToken, nodeInventory = @kubernetesApiClient.getResourcesAndContinuationToken(resourceUri)
       $log.info("in_kube_nodes::enumerate : Done getting nodes from Kube API @ #{Time.now.utc.iso8601}")
       nodesAPIChunkEndTime = (Time.now.to_f * 1000).to_i
       @nodesAPIE2ELatencyMs = (nodesAPIChunkEndTime - nodesAPIChunkStartTime)
@@ -131,7 +164,7 @@ def enumerate
       #If we receive a continuation token, make calls, process and flush data until we have processed all data
       while (!continuationToken.nil? && !continuationToken.empty?)
        nodesAPIChunkStartTime = (Time.now.to_f * 1000).to_i
-        continuationToken, nodeInventory = KubernetesApiClient.getResourcesAndContinuationToken(resourceUri + "&continue=#{continuationToken}")
+        continuationToken, nodeInventory = @kubernetesApiClient.getResourcesAndContinuationToken(resourceUri + "&continue=#{continuationToken}")
         nodesAPIChunkEndTime = (Time.now.to_f * 1000).to_i
         @nodesAPIE2ELatencyMs = @nodesAPIE2ELatencyMs + (nodesAPIChunkEndTime - nodesAPIChunkStartTime)
         if (!nodeInventory.nil? && !nodeInventory.empty? && nodeInventory.key?("items") && !nodeInventory["items"].nil? && !nodeInventory["items"].empty?)
@@ -145,9 +178,9 @@ def enumerate
       @nodeInventoryE2EProcessingLatencyMs = ((Time.now.to_f * 1000).to_i - nodeInventoryStartTime)
       timeDifference = (DateTime.now.to_time.to_i - @@nodeInventoryLatencyTelemetryTimeTracker).abs
       timeDifferenceInMinutes = timeDifference / 60
-      if (timeDifferenceInMinutes >= Constants::TELEMETRY_FLUSH_INTERVAL_IN_MINUTES)
-        ApplicationInsightsUtility.sendMetricTelemetry("NodeInventoryE2EProcessingLatencyMs", @nodeInventoryE2EProcessingLatencyMs, {})
-        ApplicationInsightsUtility.sendMetricTelemetry("NodesAPIE2ELatencyMs", @nodesAPIE2ELatencyMs, {})
+      if (timeDifferenceInMinutes >= @TELEMETRY_FLUSH_INTERVAL_IN_MINUTES)
+        @applicationInsightsUtility.sendMetricTelemetry("NodeInventoryE2EProcessingLatencyMs", @nodeInventoryE2EProcessingLatencyMs, {})
+        @applicationInsightsUtility.sendMetricTelemetry("NodesAPIE2ELatencyMs", @nodesAPIE2ELatencyMs, {})
         @@nodeInventoryLatencyTelemetryTimeTracker = DateTime.now.to_time.to_i
       end
       # Setting this to nil so that we dont hold memory until GC kicks in
@@ -155,25 +188,25 @@ def enumerate
     rescue => errorStr
       $log.warn "in_kube_nodes::enumerate:Failed in enumerate: #{errorStr}"
       $log.debug_backtrace(errorStr.backtrace)
-      ApplicationInsightsUtility.sendExceptionTelemetry(errorStr)
+      @applicationInsightsUtility.sendExceptionTelemetry(errorStr)
     end
   end # end enumerate

   def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
     begin
-      currentTime = Time.now
-      emitTime = Fluent::Engine.now
+      currentTime = Time.now
+      emitTime = Fluent::Engine.now
       telemetrySent = false
       eventStream = Fluent::MultiEventStream.new
       containerNodeInventoryEventStream = Fluent::MultiEventStream.new
       insightsMetricsEventStream = Fluent::MultiEventStream.new
-      kubePerfEventStream = Fluent::MultiEventStream.new
-      @@istestvar = ENV["ISTEST"]
+      kubePerfEventStream = Fluent::MultiEventStream.new
+      @@istestvar = @env["ISTEST"]
       #get node inventory
       nodeInventory["items"].each do |item|
         # node inventory
         nodeInventoryRecord = getNodeInventoryRecord(item, batchTime)
-        eventStream.add(emitTime, nodeInventoryRecord) if nodeInventoryRecord
+        eventStream.add(emitTime, nodeInventoryRecord) if nodeInventoryRecord
         if @NODES_EMIT_STREAM_BATCH_SIZE > 0 && eventStream.count >= @NODES_EMIT_STREAM_BATCH_SIZE
           $log.info("in_kube_node::parse_and_emit_records: number of node inventory records emitted #{@NODES_EMIT_STREAM_BATCH_SIZE} @ #{Time.now.utc.iso8601}")
           router.emit_stream(@tag, eventStream) if eventStream
@@ -186,7 +219,7 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
         end

         # container node inventory
-        containerNodeInventoryRecord = getContainerNodeInventoryRecord(item, batchTime)
+        containerNodeInventoryRecord = getContainerNodeInventoryRecord(item, batchTime)
         containerNodeInventoryEventStream.add(emitTime, containerNodeInventoryRecord) if containerNodeInventoryRecord

         if @NODES_EMIT_STREAM_BATCH_SIZE > 0 && containerNodeInventoryEventStream.count >= @NODES_EMIT_STREAM_BATCH_SIZE
@@ -235,7 +268,7 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
             @NodeCache.mem.set_capacity(nodeMetricRecord["Host"], metricVal)
           end
         end
-        nodeMetricRecords.each do |metricRecord|
+        nodeMetricRecords.each do |metricRecord|
           kubePerfEventStream.add(emitTime, metricRecord) if metricRecord
         end
         if @NODES_EMIT_STREAM_BATCH_SIZE > 0 && kubePerfEventStream.count >= @NODES_EMIT_STREAM_BATCH_SIZE
@@ -265,7 +298,7 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
           if !insightsMetricsRecord.nil? && !insightsMetricsRecord.empty?
             nodeGPUInsightsMetricsRecords.push(insightsMetricsRecord)
           end
-          nodeGPUInsightsMetricsRecords.each do |insightsMetricsRecord|
+          nodeGPUInsightsMetricsRecords.each do |insightsMetricsRecord|
            insightsMetricsEventStream.add(emitTime, insightsMetricsRecord) if insightsMetricsRecord
          end
          if @NODES_EMIT_STREAM_BATCH_SIZE > 0 && insightsMetricsEventStream.count >= @NODES_EMIT_STREAM_BATCH_SIZE
@@ -279,49 +312,79 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
         # Adding telemetry to send node telemetry every 10 minutes
         timeDifference = (DateTime.now.to_time.to_i - @@nodeTelemetryTimeTracker).abs
         timeDifferenceInMinutes = timeDifference / 60
-        if (timeDifferenceInMinutes >= Constants::TELEMETRY_FLUSH_INTERVAL_IN_MINUTES)
-          properties = getNodeTelemetryProps(item)
-          properties["KubernetesProviderID"] = nodeInventoryRecord["KubernetesProviderID"]
-          capacityInfo = item["status"]["capacity"]
-
-          ApplicationInsightsUtility.sendMetricTelemetry("NodeMemory", capacityInfo["memory"], properties)
+        if (timeDifferenceInMinutes >= @TELEMETRY_FLUSH_INTERVAL_IN_MINUTES)
           begin
-            if (!capacityInfo["nvidia.com/gpu"].nil?) && (!capacityInfo["nvidia.com/gpu"].empty?)
-              properties["nvigpus"] = capacityInfo["nvidia.com/gpu"]
+            properties = getNodeTelemetryProps(item)
+            properties["KubernetesProviderID"] = nodeInventoryRecord["KubernetesProviderID"]
+            capacityInfo = item["status"]["capacity"]
+
+            ApplicationInsightsUtility.sendMetricTelemetry("NodeMemory", capacityInfo["memory"], properties)
+            begin
+              if (!capacityInfo["nvidia.com/gpu"].nil?) && (!capacityInfo["nvidia.com/gpu"].empty?)
+                properties["nvigpus"] = capacityInfo["nvidia.com/gpu"]
+              end
+
+              if (!capacityInfo["amd.com/gpu"].nil?) && (!capacityInfo["amd.com/gpu"].empty?)
+                properties["amdgpus"] = capacityInfo["amd.com/gpu"]
+              end
+            rescue => errorStr
+              $log.warn "Failed in getting GPU telemetry in_kube_nodes : #{errorStr}"
+              $log.debug_backtrace(errorStr.backtrace)
+              ApplicationInsightsUtility.sendExceptionTelemetry(errorStr)
             end
-            if (!capacityInfo["amd.com/gpu"].nil?) && (!capacityInfo["amd.com/gpu"].empty?)
-              properties["amdgpus"] = capacityInfo["amd.com/gpu"]
+            # Telemetry for data collection config for replicaset
+            if (File.file?(@@configMapMountPath))
+              properties["collectAllKubeEvents"] = @@collectAllKubeEvents
             end
-          rescue => errorStr
-            $log.warn "Failed in getting GPU telemetry in_kube_nodes : #{errorStr}"
-            $log.debug_backtrace(errorStr.backtrace)
-            ApplicationInsightsUtility.sendExceptionTelemetry(errorStr)
-          end
-          # Telemetry for data collection config for replicaset
-          if (File.file?(@@configMapMountPath))
-            properties["collectAllKubeEvents"] = @@collectAllKubeEvents
-          end
+            #telemetry about prometheus metric collections settings for replicaset
+            if (File.file?(@@promConfigMountPath))
+              properties["rsPromInt"] = @@rsPromInterval
+              properties["rsPromFPC"] = @@rsPromFieldPassCount
+              properties["rsPromFDC"] = @@rsPromFieldDropCount
+              properties["rsPromServ"] = @@rsPromK8sServiceCount
+              properties["rsPromUrl"] = @@rsPromUrlCount
+              properties["rsPromMonPods"] = @@rsPromMonitorPods
+              properties["rsPromMonPodsNs"] = @@rsPromMonitorPodsNamespaceLength
+              properties["rsPromMonPodsLabelSelectorLength"] = @@rsPromMonitorPodsLabelSelectorLength
+              properties["rsPromMonPodsFieldSelectorLength"] = @@rsPromMonitorPodsFieldSelectorLength
+            end
+            # telemetry about osm metric settings for replicaset
+            if (File.file?(@@osmConfigMountPath))
+              properties["osmNamespaceCount"] = @@osmNamespaceCount
+            end
+            ApplicationInsightsUtility.sendMetricTelemetry("NodeCoreCapacity", capacityInfo["cpu"], properties)
+            telemetrySent = true
-          #telemetry about prometheus metric collections settings for replicaset
-          if (File.file?(@@promConfigMountPath))
-            properties["rsPromInt"] = @@rsPromInterval
-            properties["rsPromFPC"] = @@rsPromFieldPassCount
-            properties["rsPromFDC"] = @@rsPromFieldDropCount
-            properties["rsPromServ"] = @@rsPromK8sServiceCount
-            properties["rsPromUrl"] = @@rsPromUrlCount
-            properties["rsPromMonPods"] = @@rsPromMonitorPods
-            properties["rsPromMonPodsNs"] = @@rsPromMonitorPodsNamespaceLength
-            properties["rsPromMonPodsLabelSelectorLength"] = @@rsPromMonitorPodsLabelSelectorLength
-            properties["rsPromMonPodsFieldSelectorLength"] = @@rsPromMonitorPodsFieldSelectorLength
-          end
-          # telemetry about osm metric settings for replicaset
-          if (File.file?(@@osmConfigMountPath))
-            properties["osmNamespaceCount"] = @@osmNamespaceCount
+            # Telemetry for data collection config for replicaset
+            if (File.file?(@@configMapMountPath))
+              properties["collectAllKubeEvents"] = @@collectAllKubeEvents
+            end
+
+            #telemetry about prometheus metric collections settings for replicaset
+            if (File.file?(@@promConfigMountPath))
+              properties["rsPromInt"] = @@rsPromInterval
+              properties["rsPromFPC"] = @@rsPromFieldPassCount
+              properties["rsPromFDC"] = @@rsPromFieldDropCount
+              properties["rsPromServ"] = @@rsPromK8sServiceCount
+              properties["rsPromUrl"] = @@rsPromUrlCount
+              properties["rsPromMonPods"] = @@rsPromMonitorPods
+              properties["rsPromMonPodsNs"] = @@rsPromMonitorPodsNamespaceLength
+              properties["rsPromMonPodsLabelSelectorLength"] = @@rsPromMonitorPodsLabelSelectorLength
+              properties["rsPromMonPodsFieldSelectorLength"] = @@rsPromMonitorPodsFieldSelectorLength
+            end
+            # telemetry about osm metric settings for replicaset
+            if (File.file?(@@osmConfigMountPath))
+              properties["osmNamespaceCount"] = @@osmNamespaceCount
+            end
+            @applicationInsightsUtility.sendMetricTelemetry("NodeCoreCapacity", capacityInfo["cpu"], properties)
+            telemetrySent = true
+          rescue => errorStr
+            $log.warn "Failed in getting telemetry in_kube_nodes : #{errorStr}"
+            $log.debug_backtrace(errorStr.backtrace)
+            @applicationInsightsUtility.sendExceptionTelemetry(errorStr)
           end
-          ApplicationInsightsUtility.sendMetricTelemetry("NodeCoreCapacity", capacityInfo["cpu"], properties)
-          telemetrySent = true
         end
       end
       if telemetrySent == true
@@ -335,7 +398,7 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
       if (!@@istestvar.nil? && !@@istestvar.empty? && @@istestvar.casecmp("true") == 0)
         $log.info("kubeNodeInventoryEmitStreamSuccess @ #{Time.now.utc.iso8601}")
       end
-      eventStream = nil
+      eventStream = nil
     end
     if containerNodeInventoryEventStream.count > 0
       $log.info("in_kube_node::parse_and_emit_records: number of container node inventory records emitted #{containerNodeInventoryEventStream.count} @ #{Time.now.utc.iso8601}")
@@ -365,7 +428,7 @@ def parse_and_emit_records(nodeInventory, batchTime = Time.utc.iso8601)
     rescue => errorStr
       $log.warn "Failed to retrieve node inventory: #{errorStr}"
       $log.debug_backtrace(errorStr.backtrace)
-      ApplicationInsightsUtility.sendExceptionTelemetry(errorStr)
+      @applicationInsightsUtility.sendExceptionTelemetry(errorStr)
     end
     $log.info "in_kube_nodes::parse_and_emit_records:End #{Time.now.utc.iso8601}"
   end
@@ -394,7 +457,7 @@ def run_periodic
       $log.info("in_kube_nodes::run_periodic.enumerate.end #{Time.now.utc.iso8601}")
     rescue => errorStr
       $log.warn "in_kube_nodes::run_periodic: enumerate Failed to retrieve node inventory: #{errorStr}"
-      ApplicationInsightsUtility.sendExceptionTelemetry(errorStr)
+      @applicationInsightsUtility.sendExceptionTelemetry(errorStr)
     end
   end
   @mutex.lock
@@ -408,8 +471,8 @@ def getNodeInventoryRecord(item, batchTime = Time.utc.iso8601)
     begin
       record["CollectionTime"] = batchTime #This is the time that is mapped to become TimeGenerated
       record["Computer"] = item["metadata"]["name"]
-      record["ClusterName"] = KubernetesApiClient.getClusterName
-      record["ClusterId"] = KubernetesApiClient.getClusterId
+      record["ClusterName"] = @kubernetesApiClient.getClusterName
+      record["ClusterId"] =
@kubernetesApiClient.getClusterId record["CreationTimeStamp"] = item["metadata"]["creationTimestamp"] record["Labels"] = [item["metadata"]["labels"]] record["Status"] = "" @@ -507,7 +570,7 @@ def getNodeTelemetryProps(item) $log.warn "in_kube_nodes::getContainerNodeIngetNodeTelemetryPropsventoryRecord:Failed: #{errorStr}" end return properties - end + end end # Kube_Node_Input class NodeStatsCache # inner class for caching implementation (CPU and memory caching is handled the exact same way, so logic to do so is moved to a private inner class) @@ -578,5 +641,5 @@ def cpu() def mem() return @@memCache end - end + end end # module diff --git a/source/plugins/ruby/in_kube_nodes_test.rb b/source/plugins/ruby/in_kube_nodes_test.rb new file mode 100644 index 000000000..8f4984c6c --- /dev/null +++ b/source/plugins/ruby/in_kube_nodes_test.rb @@ -0,0 +1,171 @@ +require 'minitest/autorun' + +require 'fluent/test' +require 'fluent/test/driver/input' +require 'fluent/test/helpers' + +require_relative 'in_kube_nodes.rb' + +class InKubeNodesTests < Minitest::Test + include Fluent::Test::Helpers + + def setup + Fluent::Test.setup + end + + def create_driver(conf = {}, kubernetesApiClient=nil, applicationInsightsUtility=nil, extensionUtils=nil, env=nil, telemetry_flush_interval=nil) + Fluent::Test::Driver::Input.new(Fluent::Plugin::Kube_nodeInventory_Input.new(kubernetesApiClient=kubernetesApiClient, + applicationInsightsUtility=applicationInsightsUtility, + extensionUtils=extensionUtils, + env=env)).configure(conf) + end + + # Collection time of scraped data will always be different. 
Overwrite it in any records returned by in_kube_nodes.rb + def overwrite_collection_time(data) + if data.key?("CollectionTime") + data["CollectionTime"] = "~CollectionTime~" + end + if data.key?("Timestamp") + data["Timestamp"] = "~Timestamp~" + end + return data + end + + def test_basic_single_node + kubeApiClient = Minitest::Mock.new + appInsightsUtil = Minitest::Mock.new + extensionUtils = Minitest::Mock.new + env = {} + env["NODES_CHUNK_SIZE"] = "200" + + kubeApiClient.expect(:==, false, [nil]) + appInsightsUtil.expect(:==, false, [nil]) + extensionUtils.expect(:==, false, [nil]) + + # isAADMSIAuthMode() is called multiple times and we don't really care how many times it is called. This is the same as mocking + # but it doesn't track how many times isAADMSIAuthMode is called + def extensionUtils.isAADMSIAuthMode + false + end + + nodes_api_response = eval(File.open("test/unit-tests/canned-api-responses/kube-nodes.txt").read) + kubeApiClient.expect(:getResourcesAndContinuationToken, [nil, nodes_api_response], ["nodes?limit=200"]) + kubeApiClient.expect(:getClusterName, "/cluster-name") + kubeApiClient.expect(:getClusterId, "/cluster-id") + + config = "run_interval 999999999" # only run once + + d = create_driver(config, kubernetesApiClient=kubeApiClient, applicationInsightsUtility=appInsightsUtil, extensionUtils=extensionUtils, env=env) + d.instance.start + d.instance.enumerate + d.run(timeout: 99999) # Input plugins decide when to run, so we have to give it enough time to run + + + expected_responses = { ["oneagent.containerInsights.KUBE_NODE_INVENTORY_BLOB", overwrite_collection_time({"CollectionTime"=>"2021-08-17T20:24:18Z", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux",
"failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", "kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", "kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"aks-nodepool1-24816391-vmss000000", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", "storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"})] => true, + ["mdm.kubenodeinventory", overwrite_collection_time({"CollectionTime"=>"2021-08-17T20:24:18Z", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux", "failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", "kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", "kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"aks-nodepool1-24816391-vmss000000", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", 
"storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"})] => true, + ["oneagent.containerInsights.CONTAINER_NODE_INVENTORY_BLOB", overwrite_collection_time({"CollectionTime"=>"2021-08-17T20:24:18Z", "Computer"=>"aks-nodepool1-24816391-vmss000000", "OperatingSystem"=>"Ubuntu 18.04.5 LTS", "DockerVersion"=>"containerd://1.4.4+azure"})] => true, + ["oneagent.containerInsights.LINUX_PERF_BLOB", overwrite_collection_time({"Timestamp"=>"2021-08-17T20:24:18Z", "Host"=>"aks-nodepool1-24816391-vmss000000", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ObjectName"=>"K8SNode", "InstanceName"=>"None/aks-nodepool1-24816391-vmss000000", "json_Collections"=>"[{\"CounterName\":\"cpuAllocatableNanoCores\",\"Value\":1900000000.0}]"})] => true, + ["oneagent.containerInsights.LINUX_PERF_BLOB", overwrite_collection_time({"Timestamp"=>"2021-08-17T20:24:18Z", "Host"=>"aks-nodepool1-24816391-vmss000000", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ObjectName"=>"K8SNode", "InstanceName"=>"None/aks-nodepool1-24816391-vmss000000", "json_Collections"=>"[{\"CounterName\":\"memoryAllocatableBytes\",\"Value\":4787511296.0}]"})] => true, + ["oneagent.containerInsights.LINUX_PERF_BLOB", overwrite_collection_time({"Timestamp"=>"2021-08-17T20:24:18Z", "Host"=>"aks-nodepool1-24816391-vmss000000", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ObjectName"=>"K8SNode", "InstanceName"=>"None/aks-nodepool1-24816391-vmss000000", "json_Collections"=>"[{\"CounterName\":\"cpuCapacityNanoCores\",\"Value\":2000000000.0}]"})] => true, + ["oneagent.containerInsights.LINUX_PERF_BLOB", overwrite_collection_time({"Timestamp"=>"2021-08-17T20:24:18Z", "Host"=>"aks-nodepool1-24816391-vmss000000", "Computer"=>"aks-nodepool1-24816391-vmss000000", "ObjectName"=>"K8SNode", 
"InstanceName"=>"None/aks-nodepool1-24816391-vmss000000", "json_Collections"=>"[{\"CounterName\":\"memoryCapacityBytes\",\"Value\":7291510784.0}]"})] => true} + + d.events.each do |tag, time, record| + cleaned_record = overwrite_collection_time record + if expected_responses.key?([tag, cleaned_record]) + expected_responses[[tag, cleaned_record]] = true + else + assert(false, "got unexpected record") + end + end + + expected_responses.each do |key, val| + assert(val, "expected record not emitted: #{key}") + end + + # make sure all mocked methods were called the expected number of times + kubeApiClient.verify + appInsightsUtil.verify + extensionUtils.verify + end + + # Sometimes customer tooling creates invalid node specs in the Kube API server (its happened more than once). + # This test makes sure that it doesn't creash the entire input plugin and other nodes are still collected + def test_malformed_node_spec + kubeApiClient = Minitest::Mock.new + appInsightsUtil = Minitest::Mock.new + extensionUtils = Minitest::Mock.new + env = {} + env["NODES_CHUNK_SIZE"] = "200" + + kubeApiClient.expect(:==, false, [nil]) + appInsightsUtil.expect(:==, false, [nil]) + extensionUtils.expect(:==, false, [nil]) + + # isAADMSIAuthMode() is called multiple times and we don't really care how many time it is called. This is the same as mocking + # but it doesn't track how many times isAADMSIAuthMode is called + def extensionUtils.isAADMSIAuthMode + false + end + + # Set up the KubernetesApiClient Mock. Note: most of the functions in KubernetesApiClient are pure (access no + # state other than their arguments), so there is no need to mock them (this test file would be far longer and + # more brittle). Instead, in_kube_nodes bypasses the mock and directly calls these functions in KubernetesApiClient. + # Ideally the pure functions in KubernetesApiClient would be refactored into their own file to reduce confusion. 
+ nodes_api_response = eval(File.open("test/unit-tests/canned-api-responses/kube-nodes-malformed.txt").read) + kubeApiClient.expect(:getResourcesAndContinuationToken, [nil, nodes_api_response], ["nodes?limit=200"]) + kubeApiClient.expect(:getClusterName, "/cluster-name") + kubeApiClient.expect(:getClusterName, "/cluster-name") + kubeApiClient.expect(:getClusterId, "/cluster-id") + kubeApiClient.expect(:getClusterId, "/cluster-id") + + def appInsightsUtil.sendExceptionTelemetry(exception) + if exception.to_s != "undefined method `[]' for nil:NilClass" + raise "an unexpected exception has occurred" + end + end + + # This test doesn't care if metric telemetry is sent properly. Looking for an unnecessary value would make it needlessly rigid + def appInsightsUtil.sendMetricTelemetry(a, b, c) + end + + config = "run_interval 999999999" # only run once + + d = create_driver(config, kubernetesApiClient=kubeApiClient, applicationInsightsUtility=appInsightsUtil, extensionUtils=extensionUtils, env=env, telemetry_flush_interval=0) + d.instance.start + + d.instance.enumerate + d.run(timeout: 99999) #TODO: is this necessary? 
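Both tests verify emissions with the same checklist pattern: every expected `(tag, record)` pair starts as `false` in a hash, each emitted event flips its entry to `true`, and a final pass asserts nothing was left unticked. A stripped-down sketch of that pattern (the tags and records here are placeholders, not the plugin's real output):

```ruby
# checklist: expected (tag, record) pairs, all unseen so far
expected = {
  ["tag.a", { "Computer" => "node-1" }] => false,
  ["tag.b", { "Computer" => "node-1" }] => false,
}

# events the plugin under test would have emitted
emitted = [["tag.a", { "Computer" => "node-1" }],
           ["tag.b", { "Computer" => "node-1" }]]

# tick off each emitted event that was expected; unexpected events are ignored
# (the malformed-node test below deliberately tolerates partial extra data)
emitted.each do |tag, record|
  key = [tag, record]
  expected[key] = true if expected.key?(key)
end

# every expected record must have been emitted
missing = expected.select { |_key, seen| !seen }.keys
```

The advantage over comparing whole event arrays is order-independence: fluentd makes no ordering guarantee across streams, and the checklist only cares that each record showed up.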
+ + expected_responses = { + ["oneagent.containerInsights.KUBE_NODE_INVENTORY_BLOB", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"correct-node", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux", "failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", "kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", "kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"correct-node", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", "storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"}] => false, + ["mdm.kubenodeinventory", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"correct-node", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux", "failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", "kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", 
"kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"correct-node", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", "storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"}] => false, + ["oneagent.containerInsights.CONTAINER_NODE_INVENTORY_BLOB", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"correct-node", "OperatingSystem"=>"Ubuntu 18.04.5 LTS", "DockerVersion"=>"containerd://1.4.4+azure"}] => false, + ["oneagent.containerInsights.LINUX_PERF_BLOB", {"Timestamp"=>"~Timestamp~", "Host"=>"correct-node", "Computer"=>"correct-node", "ObjectName"=>"K8SNode", "InstanceName"=>"None/correct-node", "json_Collections"=>"[{\"CounterName\":\"cpuAllocatableNanoCores\",\"Value\":1000000.0}]"}] => false, + ["oneagent.containerInsights.LINUX_PERF_BLOB", {"Timestamp"=>"~Timestamp~", "Host"=>"correct-node", "Computer"=>"correct-node", "ObjectName"=>"K8SNode", "InstanceName"=>"None/correct-node", "json_Collections"=>"[{\"CounterName\":\"memoryAllocatableBytes\",\"Value\":444.0}]"}] => false, + ["oneagent.containerInsights.LINUX_PERF_BLOB", {"Timestamp"=>"~Timestamp~", "Host"=>"correct-node", "Computer"=>"correct-node", "ObjectName"=>"K8SNode", "InstanceName"=>"None/correct-node", "json_Collections"=>"[{\"CounterName\":\"cpuCapacityNanoCores\",\"Value\":2000000.0}]"}] => false, + ["oneagent.containerInsights.LINUX_PERF_BLOB", {"Timestamp"=>"~Timestamp~", "Host"=>"correct-node", "Computer"=>"correct-node", "ObjectName"=>"K8SNode", "InstanceName"=>"None/correct-node", "json_Collections"=>"[{\"CounterName\":\"memoryCapacityBytes\",\"Value\":555.0}]"}] 
=> false, + + # these records are for the malformed node (it doesn't have limits or requests set so there are no PERF records) + ["oneagent.containerInsights.KUBE_NODE_INVENTORY_BLOB", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"malformed-node", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux", "failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", "kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", "kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"malformed-node", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", "storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"}] => false, + ["mdm.kubenodeinventory", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"malformed-node", "ClusterName"=>"/cluster-name", "ClusterId"=>"/cluster-id", "CreationTimeStamp"=>"2021-07-21T23:40:14Z", "Labels"=>[{"agentpool"=>"nodepool1", "beta.kubernetes.io/arch"=>"amd64", "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", "beta.kubernetes.io/os"=>"linux", "failure-domain.beta.kubernetes.io/region"=>"westus2", "failure-domain.beta.kubernetes.io/zone"=>"0", "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", 
"kubernetes.azure.com/mode"=>"system", "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", "kubernetes.azure.com/os-sku"=>"Ubuntu", "kubernetes.azure.com/role"=>"agent", "kubernetes.io/arch"=>"amd64", "kubernetes.io/hostname"=>"malformed-node", "kubernetes.io/os"=>"linux", "kubernetes.io/role"=>"agent", "node-role.kubernetes.io/agent"=>"", "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", "storageprofile"=>"managed", "storagetier"=>"Premium_LRS", "topology.kubernetes.io/region"=>"westus2", "topology.kubernetes.io/zone"=>"0"}], "Status"=>"Ready", "KubernetesProviderID"=>"azure", "LastTransitionTimeReady"=>"2021-07-21T23:40:24Z", "KubeletVersion"=>"v1.19.11", "KubeProxyVersion"=>"v1.19.11"}] => false, + ["oneagent.containerInsights.CONTAINER_NODE_INVENTORY_BLOB", {"CollectionTime"=>"~CollectionTime~", "Computer"=>"malformed-node", "OperatingSystem"=>"Ubuntu 18.04.5 LTS", "DockerVersion"=>"containerd://1.4.4+azure"}] => false + } + + d.events.each do |tag, time, record| + cleaned_record = overwrite_collection_time record + if expected_responses.key?([tag, cleaned_record]) + expected_responses[[tag, cleaned_record]] = true + end + # don't do anything if an unexpected record was emitted. Since the node spec is malformed, there will be some partial data. 
+ # we care more that the non-malformed data is still emitted + end + + expected_responses.each do |key, val| + assert(val, "expected record not emitted: #{key}") + end + + kubeApiClient.verify + appInsightsUtil.verify + extensionUtils.verify + end +end diff --git a/source/plugins/ruby/in_kube_podinventory.rb b/source/plugins/ruby/in_kube_podinventory.rb index 5598602cd..5a33ef790 100644 --- a/source/plugins/ruby/in_kube_podinventory.rb +++ b/source/plugins/ruby/in_kube_podinventory.rb @@ -11,7 +11,6 @@ class Kube_PodInventory_Input < Input @@MDMKubePodInventoryTag = "mdm.kubepodinventory" @@hostName = (OMS::Common.get_hostname) - def initialize super @@ -27,6 +26,7 @@ def initialize require_relative "oms_common" require_relative "omslog" require_relative "constants" + require_relative "extension_utils" # refer tomlparser-agent-config for updating defaults # this configurable via configmap @@ -39,12 +39,12 @@ def initialize @winContainerCount = 0 @controllerData = {} @podInventoryE2EProcessingLatencyMs = 0 - @podsAPIE2ELatencyMs = 0 - + @podsAPIE2ELatencyMs = 0 + @kubeperfTag = "oneagent.containerInsights.LINUX_PERF_BLOB" @kubeservicesTag = "oneagent.containerInsights.KUBE_SERVICES_BLOB" @containerInventoryTag = "oneagent.containerInsights.CONTAINER_INVENTORY_BLOB" - @insightsMetricsTag = "oneagent.containerInsights.INSIGHTS_METRICS_BLOB" + @insightsMetricsTag = "oneagent.containerInsights.INSIGHTS_METRICS_BLOB" end config_param :run_interval, :time, :default => 60 @@ -55,7 +55,7 @@ def configure(conf) @inventoryToMdmConvertor = Inventory2MdmConvertor.new() end - def start + def start if @run_interval super if !ENV["PODS_CHUNK_SIZE"].nil? && !ENV["PODS_CHUNK_SIZE"].empty? 
&& ENV["PODS_CHUNK_SIZE"].to_i > 0 @@ -107,7 +107,30 @@ def enumerate(podList = nil) batchTime = currentTime.utc.iso8601 serviceRecords = [] @podInventoryE2EProcessingLatencyMs = 0 - podInventoryStartTime = (Time.now.to_f * 1000).to_i + podInventoryStartTime = (Time.now.to_f * 1000).to_i + if ExtensionUtils.isAADMSIAuthMode() + $log.info("in_kube_podinventory::enumerate: AAD AUTH MSI MODE") + if @kubeperfTag.nil? || !@kubeperfTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @kubeperfTag = ExtensionUtils.getOutputStreamId(Constants::PERF_DATA_TYPE) + end + if @kubeservicesTag.nil? || !@kubeservicesTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @kubeservicesTag = ExtensionUtils.getOutputStreamId(Constants::KUBE_SERVICES_DATA_TYPE) + end + if @containerInventoryTag.nil? || !@containerInventoryTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @containerInventoryTag = ExtensionUtils.getOutputStreamId(Constants::CONTAINER_INVENTORY_DATA_TYPE) + end + if @insightsMetricsTag.nil? || !@insightsMetricsTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @insightsMetricsTag = ExtensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE) + end + if @tag.nil? 
|| !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @tag = ExtensionUtils.getOutputStreamId(Constants::KUBE_POD_INVENTORY_DATA_TYPE) + end + $log.info("in_kube_podinventory::enumerate: using perf tag -#{@kubeperfTag} @ #{Time.now.utc.iso8601}") + $log.info("in_kube_podinventory::enumerate: using kubeservices tag -#{@kubeservicesTag} @ #{Time.now.utc.iso8601}") + $log.info("in_kube_podinventory::enumerate: using containerinventory tag -#{@containerInventoryTag} @ #{Time.now.utc.iso8601}") + $log.info("in_kube_podinventory::enumerate: using insightsmetrics tag -#{@insightsMetricsTag} @ #{Time.now.utc.iso8601}") + $log.info("in_kube_podinventory::enumerate: using kubepodinventory tag -#{@tag} @ #{Time.now.utc.iso8601}") + end # Get services first so that we don't need to make a call for every chunk $log.info("in_kube_podinventory::enumerate : Getting services from Kube API @ #{Time.now.utc.iso8601}") @@ -197,8 +220,8 @@ def enumerate(podList = nil) end def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batchTime = Time.utc.iso8601) - currentTime = Time.now - emitTime = Fluent::Engine.now + currentTime = Time.now + emitTime = Fluent::Engine.now #batchTime = currentTime.utc.iso8601 eventStream = Fluent::MultiEventStream.new containerInventoryStream = Fluent::MultiEventStream.new @@ -214,8 +237,8 @@ def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batc podInventoryRecords = getPodInventoryRecords(item, serviceRecords, batchTime) podInventoryRecords.each do |record| if !record.nil? 
- eventStream.add(emitTime, record) if record - @inventoryToMdmConvertor.process_pod_inventory_record(record) + eventStream.add(emitTime, record) if record + @inventoryToMdmConvertor.process_pod_inventory_record(record) end end # Setting this flag to true so that we can send ContainerInventory records for containers @@ -232,7 +255,7 @@ def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batc # Send container inventory records for containers on windows nodes @winContainerCount += containerInventoryRecords.length containerInventoryRecords.each do |cirecord| - if !cirecord.nil? + if !cirecord.nil? containerInventoryStream.add(emitTime, cirecord) if cirecord end end @@ -255,7 +278,7 @@ def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batc containerMetricDataItems.concat(KubernetesApiClient.getContainerResourceRequestsAndLimits(item, "limits", "cpu", "cpuLimitNanoCores", batchTime)) containerMetricDataItems.concat(KubernetesApiClient.getContainerResourceRequestsAndLimits(item, "limits", "memory", "memoryLimitBytes", batchTime)) - containerMetricDataItems.each do |record| + containerMetricDataItems.each do |record| kubePerfEventStream.add(emitTime, record) if record end @@ -274,7 +297,7 @@ def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batc containerGPUInsightsMetricsDataItems.concat(KubernetesApiClient.getContainerResourceRequestsAndLimitsAsInsightsMetrics(item, "limits", "nvidia.com/gpu", "containerGpuLimits", batchTime)) containerGPUInsightsMetricsDataItems.concat(KubernetesApiClient.getContainerResourceRequestsAndLimitsAsInsightsMetrics(item, "requests", "amd.com/gpu", "containerGpuRequests", batchTime)) containerGPUInsightsMetricsDataItems.concat(KubernetesApiClient.getContainerResourceRequestsAndLimitsAsInsightsMetrics(item, "limits", "amd.com/gpu", "containerGpuLimits", batchTime)) - containerGPUInsightsMetricsDataItems.each do |insightsMetricsRecord| + 
containerGPUInsightsMetricsDataItems.each do |insightsMetricsRecord| insightsMetricsEventStream.add(emitTime, insightsMetricsRecord) if insightsMetricsRecord end @@ -341,7 +364,7 @@ def parse_and_emit_records(podInventory, serviceRecords, continuationToken, batc if !kubeServiceRecord.nil? # adding before emit to reduce memory foot print kubeServiceRecord["ClusterId"] = KubernetesApiClient.getClusterId - kubeServiceRecord["ClusterName"] = KubernetesApiClient.getClusterName + kubeServiceRecord["ClusterName"] = KubernetesApiClient.getClusterName kubeServicesEventStream.add(emitTime, kubeServiceRecord) if kubeServiceRecord if @PODS_EMIT_STREAM_BATCH_SIZE > 0 && kubeServicesEventStream.count >= @PODS_EMIT_STREAM_BATCH_SIZE $log.info("in_kube_podinventory::parse_and_emit_records: number of service records emitted #{@PODS_EMIT_STREAM_BATCH_SIZE} @ #{Time.now.utc.iso8601}") @@ -648,6 +671,6 @@ def getServiceNameFromLabels(namespace, labels, serviceRecords) ApplicationInsightsUtility.sendExceptionTelemetry(errorStr) end return serviceName - end + end end # Kube_Pod_Input end # module diff --git a/source/plugins/ruby/in_kube_pvinventory.rb b/source/plugins/ruby/in_kube_pvinventory.rb index 40eebac8a..6af3c280f 100644 --- a/source/plugins/ruby/in_kube_pvinventory.rb +++ b/source/plugins/ruby/in_kube_pvinventory.rb @@ -20,6 +20,7 @@ def initialize require_relative "oms_common" require_relative "omslog" require_relative "constants" + require_relative "extension_utils" # Response size is around 1500 bytes per PV @PV_CHUNK_SIZE = "5000" @@ -33,7 +34,7 @@ def configure(conf) super end - def start + def start if @run_interval super @finished = false @@ -61,7 +62,13 @@ def enumerate telemetryFlush = false @pvTypeToCountHash = {} currentTime = Time.now - batchTime = currentTime.utc.iso8601 + batchTime = currentTime.utc.iso8601 + if ExtensionUtils.isAADMSIAuthMode() + $log.info("in_kube_pvinventory::enumerate: AAD AUTH MSI MODE") + if @tag.nil? 
|| !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @tag = ExtensionUtils.getOutputStreamId(Constants::KUBE_PV_INVENTORY_DATA_TYPE) + end + end continuationToken = nil $log.info("in_kube_pvinventory::enumerate : Getting PVs from Kube API @ #{Time.now.utc.iso8601}") @@ -93,7 +100,6 @@ def enumerate if (timeDifferenceInMinutes >= Constants::TELEMETRY_FLUSH_INTERVAL_IN_MINUTES) telemetryFlush = true end - # Flush AppInsights telemetry once all the processing is done if telemetryFlush == true telemetryProperties = {} @@ -110,8 +116,8 @@ def enumerate end # end enumerate def parse_and_emit_records(pvInventory, batchTime = Time.utc.iso8601) - currentTime = Time.now - emitTime = Fluent::Engine.now + currentTime = Time.now + emitTime = Fluent::Engine.now eventStream = Fluent::MultiEventStream.new @@istestvar = ENV["ISTEST"] begin @@ -152,8 +158,8 @@ def parse_and_emit_records(pvInventory, batchTime = Time.utc.iso8601) end records.each do |record| - if !record.nil? - eventStream.add(emitTime, record) + if !record.nil? + eventStream.add(emitTime, record) end end @@ -191,7 +197,6 @@ def getTypeInfo(item) begin if !item["spec"].nil? (Constants::PV_TYPES).each do |pvType| - # PV is this type if !item["spec"][pvType].nil? 
@@ -252,6 +257,6 @@ def run_periodic @mutex.lock end @mutex.unlock - end + end end # Kube_PVInventory_Input end # module diff --git a/source/plugins/ruby/in_kubestate_deployments.rb b/source/plugins/ruby/in_kubestate_deployments.rb index 182c3ffc1..0b563a890 100644 --- a/source/plugins/ruby/in_kubestate_deployments.rb +++ b/source/plugins/ruby/in_kubestate_deployments.rb @@ -22,6 +22,7 @@ def initialize require_relative "omslog" require_relative "ApplicationInsightsUtility" require_relative "constants" + require_relative "extension_utils" # refer tomlparser-agent-config for defaults # this configurable via configmap @@ -44,7 +45,7 @@ def configure(conf) super end - def start + def start if @run_interval super if !ENV["DEPLOYMENTS_CHUNK_SIZE"].nil? && !ENV["DEPLOYMENTS_CHUNK_SIZE"].empty? && ENV["DEPLOYMENTS_CHUNK_SIZE"].to_i > 0 @@ -55,11 +56,11 @@ def start @DEPLOYMENTS_CHUNK_SIZE = 500 end $log.info("in_kubestate_deployments::start : DEPLOYMENTS_CHUNK_SIZE @ #{@DEPLOYMENTS_CHUNK_SIZE}") - + @finished = false @condition = ConditionVariable.new @mutex = Mutex.new - @thread = Thread.new(&method(:run_periodic)) + @thread = Thread.new(&method(:run_periodic)) end end @@ -81,8 +82,14 @@ def enumerate batchTime = currentTime.utc.iso8601 #set the running total for this batch to 0 - @deploymentsRunningTotal = 0 - + @deploymentsRunningTotal = 0 + + if ExtensionUtils.isAADMSIAuthMode() + $log.info("in_kubestate_deployments::enumerate: AAD AUTH MSI MODE") + if @tag.nil? 
|| !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @tag = ExtensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE) + end + end # Initializing continuation token to nil continuationToken = nil $log.info("in_kubestate_deployments::enumerate : Getting deployments from Kube API @ #{Time.now.utc.iso8601}") @@ -186,7 +193,7 @@ def parse_and_emit_records(deployments, batchTime = Time.utc.iso8601) end time = Fluent::Engine.now - metricItems.each do |insightsMetricsRecord| + metricItems.each do |insightsMetricsRecord| insightsMetricsEventStream.add(time, insightsMetricsRecord) if insightsMetricsRecord end @@ -233,6 +240,6 @@ def run_periodic @mutex.lock end @mutex.unlock - end + end end end diff --git a/source/plugins/ruby/in_kubestate_hpa.rb b/source/plugins/ruby/in_kubestate_hpa.rb index 8f60bfb72..178f7944f 100644 --- a/source/plugins/ruby/in_kubestate_hpa.rb +++ b/source/plugins/ruby/in_kubestate_hpa.rb @@ -18,7 +18,8 @@ def initialize require_relative "oms_common" require_relative "omslog" require_relative "ApplicationInsightsUtility" - require_relative "constants" + require_relative "constants" + require_relative "extension_utils" # refer tomlparser-agent-config for defaults # this configurable via configmap @@ -41,7 +42,7 @@ def configure(conf) super end - def start + def start if @run_interval super if !ENV["HPA_CHUNK_SIZE"].nil? && !ENV["HPA_CHUNK_SIZE"].empty? && ENV["HPA_CHUNK_SIZE"].to_i > 0 @@ -78,7 +79,14 @@ def enumerate batchTime = currentTime.utc.iso8601 @hpaCount = 0 - + + if ExtensionUtils.isAADMSIAuthMode() + $log.info("in_kubestate_hpa::enumerate: AAD AUTH MSI MODE") + if @tag.nil? 
|| !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @tag = ExtensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE) + end + $log.info("in_kubestate_hpa::enumerate: using tag -#{@tag} @ #{Time.now.utc.iso8601}") + end # Initializing continuation token to nil continuationToken = nil $log.info("in_kubestate_hpa::enumerate : Getting HPAs from Kube API @ #{Time.now.utc.iso8601}") @@ -186,7 +194,7 @@ def parse_and_emit_records(hpas, batchTime = Time.utc.iso8601) end time = Fluent::Engine.now - metricItems.each do |insightsMetricsRecord| + metricItems.each do |insightsMetricsRecord| insightsMetricsEventStream.add(time, insightsMetricsRecord) if insightsMetricsRecord end @@ -231,6 +239,6 @@ def run_periodic @mutex.lock end @mutex.unlock - end + end end end diff --git a/source/plugins/ruby/in_win_cadvisor_perf.rb b/source/plugins/ruby/in_win_cadvisor_perf.rb index 9ab2474b1..dd462fdf2 100644 --- a/source/plugins/ruby/in_win_cadvisor_perf.rb +++ b/source/plugins/ruby/in_win_cadvisor_perf.rb @@ -20,6 +20,7 @@ def initialize require_relative "oms_common" require_relative "omslog" require_relative "constants" + require_relative "extension_utils" @insightsMetricsTag = "oneagent.containerInsights.INSIGHTS_METRICS_BLOB" end @@ -58,6 +59,17 @@ def enumerate() timeDifference = (DateTime.now.to_time.to_i - @@winNodeQueryTimeTracker).abs timeDifferenceInMinutes = timeDifference / 60 @@istestvar = ENV["ISTEST"] + if ExtensionUtils.isAADMSIAuthMode() + $log.info("in_win_cadvisor_perf::enumerate: AAD AUTH MSI MODE") + if @tag.nil? || !@tag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX) + @tag = ExtensionUtils.getOutputStreamId(Constants::PERF_DATA_TYPE) + end + if @insightsMetricsTag.nil? 
|| !@insightsMetricsTag.start_with?(Constants::EXTENSION_OUTPUT_STREAM_ID_TAG_PREFIX)
+          @insightsMetricsTag = ExtensionUtils.getOutputStreamId(Constants::INSIGHTS_METRICS_DATA_TYPE)
+        end
+        $log.info("in_win_cadvisor_perf::enumerate: using perf tag -#{@tag} @ #{Time.now.utc.iso8601}")
+        $log.info("in_win_cadvisor_perf::enumerate: using insightsmetrics tag -#{@insightsMetricsTag} @ #{Time.now.utc.iso8601}")
+      end

       #Resetting this cache so that it is populated with the current set of containers with every call
       CAdvisorMetricsAPIClient.resetWinContainerIdCache()
diff --git a/source/plugins/ruby/kubelet_utils.rb b/source/plugins/ruby/kubelet_utils.rb
index 22bc87c0e..e31407b54 100644
--- a/source/plugins/ruby/kubelet_utils.rb
+++ b/source/plugins/ruby/kubelet_utils.rb
@@ -41,6 +41,114 @@ def get_node_capacity
     end
   end

+  def get_node_allocatable(cpu_capacity, memory_capacity)
+    begin
+      if cpu_capacity == 0.0 || memory_capacity == 0.0
+        @log.error "kubelet_utils.rb::get_node_allocatable - cpu_capacity or memory_capacity values not set. Hence we cannot calculate allocatable values"
+      end
+
+      cpu_allocatable = 1.0
+      memory_allocatable = 1.0
+
+      allocatable_response = CAdvisorMetricsAPIClient.getCongifzCAdvisor(winNode: nil)
+      parsed_response = JSON.parse(allocatable_response.body)
+
+      begin
+        kubereserved_cpu = parsed_response["kubeletconfig"]["kubeReserved"]["cpu"]
+        if kubereserved_cpu.nil? || kubereserved_cpu == ""
+          kubereserved_cpu = "0"
+        end
+        @log.info "get_node_allocatable::kubereserved_cpu #{kubereserved_cpu}"
+      rescue => errorStr
+        @log.error "Error in get_node_allocatable::kubereserved_cpu: #{errorStr}"
+        kubereserved_cpu = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::kubereserved_cpu: #{errorStr}")
+      end
+
+      begin
+        kubereserved_memory = parsed_response["kubeletconfig"]["kubeReserved"]["memory"]
+        if kubereserved_memory.nil? || kubereserved_memory == ""
+          kubereserved_memory = "0"
+        end
+        @log.info "get_node_allocatable::kubereserved_memory #{kubereserved_memory}"
+      rescue => errorStr
+        @log.error "Error in get_node_allocatable::kubereserved_memory: #{errorStr}"
+        kubereserved_memory = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::kubereserved_memory: #{errorStr}")
+      end
+      begin
+        systemReserved_cpu = parsed_response["kubeletconfig"]["systemReserved"]["cpu"]
+        if systemReserved_cpu.nil? || systemReserved_cpu == ""
+          systemReserved_cpu = "0"
+        end
+        @log.info "get_node_allocatable::systemReserved_cpu #{systemReserved_cpu}"
+      rescue => errorStr
+        # this will likely always reach this condition for AKS ~ only applicable for hybrid + MDM combination
+        @log.error "Error in get_node_allocatable::systemReserved_cpu: #{errorStr}"
+        systemReserved_cpu = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::systemReserved_cpu: #{errorStr}")
+      end
+
+      begin
+        explicitlyReserved_cpu = parsed_response["kubeletconfig"]["reservedCPUs"]
+        if explicitlyReserved_cpu.nil? || explicitlyReserved_cpu == ""
+          explicitlyReserved_cpu = "0"
+        end
+        @log.info "get_node_allocatable::explicitlyReserved_cpu #{explicitlyReserved_cpu}"
+      rescue => errorStr
+        # this will likely always reach this condition for AKS ~ only applicable for hybrid + MDM combination
+        @log.error "Error in get_node_allocatable::explicitlyReserved_cpu: #{errorStr}"
+        explicitlyReserved_cpu = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::explicitlyReserved_cpu: #{errorStr}")
+      end
+
+      begin
+        systemReserved_memory = parsed_response["kubeletconfig"]["systemReserved"]["memory"]
+        if systemReserved_memory.nil? || systemReserved_memory == ""
+          systemReserved_memory = "0"
+        end
+        @log.info "get_node_allocatable::systemReserved_memory #{systemReserved_memory}"
+      rescue => errorStr
+        @log.error "Error in get_node_allocatable::systemReserved_memory: #{errorStr}"
+        systemReserved_memory = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::systemReserved_memory: #{errorStr}")
+      end
+
+      begin
+        evictionHard_memory = parsed_response["kubeletconfig"]["evictionHard"]["memory.available"]
+        if evictionHard_memory.nil? || evictionHard_memory == ""
+          evictionHard_memory = "0"
+        end
+        @log.info "get_node_allocatable::evictionHard_memory #{evictionHard_memory}"
+      rescue => errorStr
+        @log.error "Error in get_node_allocatable::evictionHard_memory: #{errorStr}"
+        evictionHard_memory = "0"
+        ApplicationInsightsUtility.sendExceptionTelemetry("Error in get_node_allocatable::evictionHard_memory: #{errorStr}")
+      end
+
+      # do calculation in nanocore since that's what KubernetesApiClient.getMetricNumericValue expects
+      cpu_capacity_number = cpu_capacity.to_i * 1000.0 ** 2
+      # subtract to get allocatable.
Formula : Allocatable = Capacity - ( kube reserved + system reserved + eviction threshold ) + # https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable + if KubernetesApiClient.getMetricNumericValue("cpu", explicitlyReserved_cpu) > 0 + cpu_allocatable = cpu_capacity_number - KubernetesApiClient.getMetricNumericValue("cpu", explicitlyReserved_cpu) + else + cpu_allocatable = cpu_capacity_number - (KubernetesApiClient.getMetricNumericValue("cpu", kubereserved_cpu) + KubernetesApiClient.getMetricNumericValue("cpu", systemReserved_cpu)) + end + # convert back to units similar to what we get for capacity + cpu_allocatable = cpu_allocatable / (1000.0 ** 2) + @log.info "CPU Allocatable #{cpu_allocatable}" + + memory_allocatable = memory_capacity - (KubernetesApiClient.getMetricNumericValue("memory", kubereserved_memory) + KubernetesApiClient.getMetricNumericValue("memory", systemReserved_memory) + KubernetesApiClient.getMetricNumericValue("memory", evictionHard_memory)) + @log.info "Memory Allocatable #{memory_allocatable}" + + return [cpu_allocatable, memory_allocatable] + rescue => errorStr + @log.info "Error get_node_allocatable: #{errorStr}" + ApplicationInsightsUtility.sendExceptionTelemetry(errorStr) + end + end + def get_all_container_limits begin @log.info "in get_all_container_limits..." 
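The `get_node_allocatable` change above implements the upstream formula Allocatable = Capacity - (kube reserved + system reserved + eviction threshold). A minimal standalone Ruby sketch of the CPU side of that arithmetic, kept in nanocores as the patch does — note `parse_cpu` is a hypothetical stand-in for `KubernetesApiClient.getMetricNumericValue("cpu", ...)`, and the reservation values are illustrative, not from the patch:

```ruby
# Hypothetical stand-in for KubernetesApiClient.getMetricNumericValue("cpu", v):
# converts a Kubernetes CPU quantity string to nanocores.
def parse_cpu(quantity)
  return quantity.to_f * 1_000_000 if quantity.end_with?("m") # millicores -> nanocores
  quantity.to_f * 1_000_000_000                               # whole cores -> nanocores
end

# Allocatable CPU = capacity - (kube-reserved + system-reserved), all in nanocores.
def allocatable_cpu_nanocores(capacity_cores, kube_reserved, system_reserved)
  capacity_nanocores = capacity_cores * 1_000_000_000.0
  capacity_nanocores - (parse_cpu(kube_reserved) + parse_cpu(system_reserved))
end

# Example: 4-core node with 100m kube-reserved and 100m system-reserved
allocatable = allocatable_cpu_nanocores(4, "100m", "100m")
```

Note the patch prefers `reservedCPUs` (explicitly reserved cores) when it is set, and only falls back to the kube-reserved + system-reserved sum otherwise.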
diff --git a/source/plugins/ruby/oms_common.rb b/source/plugins/ruby/oms_common.rb new file mode 100644 index 000000000..c10cb8638 --- /dev/null +++ b/source/plugins/ruby/oms_common.rb @@ -0,0 +1,143 @@ +module OMS + + MSDockerCImprovHostnameFilePath = '/var/opt/microsoft/docker-cimprov/state/containerhostname' + IPV6_REGEX = '\h{4}:\h{4}:\h{4}:\h{4}:\h{4}:\h{4}:\h{4}:\h{4}' + IPV4_Approximate_REGEX = '\d+\.\d+\.\d+\.\d+' + + class RetryRequestException < Exception + # Throw this exception to tell the fluentd engine to retry and + # inform the output plugin that it is indeed retryable + end + + class Common + require 'socket' + require_relative 'omslog' + + @@Hostname = nil + @@HostnameFilePath = MSDockerCImprovHostnameFilePath + + + class << self + + # Internal methods + # (left public for easy testing, though protected may be better later) + + def clean_hostname_string(hnBuffer) + return "" if hnBuffer.nil? # So give the rest of the program a string to deal with. + hostname_buffer = hnBuffer.strip + return hostname_buffer + end + + def has_designated_hostnamefile? + return false if @@HostnameFilePath.nil? + return false unless @@HostnameFilePath =~ /\w/ + return false unless File.exist?(@@HostnameFilePath) + return true + end + + def is_dot_separated_string?(hnBuffer) + return true if /[^.]+\.[^.]+/ =~ hnBuffer + return false + end + + def is_hostname_compliant?(hnBuffer) + # RFC 2181: + # Size limit is 1 to 63 octets, so probably bytesize is appropriate method. + return false if hnBuffer.nil? + return false if /\./ =~ hnBuffer # Hostname by definition may not contain a dot. + return false if /:/ =~ hnBuffer # Hostname by definition may not contain a colon. 
+ return false unless 1 <= hnBuffer.bytesize && hnBuffer.bytesize <= 63 + return true + end + + def is_like_ipv4_string?(hnBuffer) + return false unless /\A#{IPV4_Approximate_REGEX}\z/ =~ hnBuffer + qwa = hnBuffer.split('.') + return false unless qwa.length == 4 + return false if qwa[0].to_i == 0 + qwa.each do |quadwordstring| + bi = quadwordstring.to_i + # This may need more detail if 255 octets are sometimes allowed, but I don't think so. + return false unless 0 <= bi and bi < 255 + end + return true + end + + def is_like_ipv6_string?(hnBuffer) + return true if /\A#{IPV6_REGEX}\z/ =~ hnBuffer + return false + end + + def look_for_socket_class_host_address + hostname_buffer = nil + + begin + hostname_buffer = Socket.gethostname + rescue => error + OMS::Log.error_once("Unable to get the Host Name using socket facility: #{error}") + return + end + @@Hostname = clean_hostname_string(hostname_buffer) + + return # Thwart accidental return to force correct use. + end + + def look_in_designated_hostnamefile + # Issue: + # When omsagent runs inside a container, gethostname returns the hostname of the container (random name) + # not the actual machine hostname. + # One way to solve this problem is to set the container hostname same as machine name, but this is not + # possible when host-machine is a private VM inside a cluster. + # Solution: + # Share/mount ‘/etc/hostname’ as '/var/opt/microsoft/omsagent/state/containername' with container and + # omsagent will read hostname from shared file. + hostname_buffer = nil + + unless File.readable?(@@HostnameFilePath) + OMS::Log.warn_once("File '#{@@HostnameFilePath}' exists but is not readable.") + return + end + + begin + hostname_buffer = File.read(@@HostnameFilePath) + rescue => error + OMS::Log.warn_once("Unable to read the hostname from #{@@HostnameFilePath}: #{error}") + end + @@Hostname = clean_hostname_string(hostname_buffer) + return # Thwart accidental return to force correct use. 
+ end + + def validate_hostname_equivalent(hnBuffer) + # RFC 1123 and 2181 + # Note that for now we are limiting the earlier maximum of 63 for fqdn labels and thus + # hostnames UNTIL we are assured azure will allow 255, as specified in RFC 1123, or + # we are otherwise instructed. + rfcl = "RFCs 1123, 2181 with hostname range of {1,63} octets for non-root item." + return if is_hostname_compliant?(hnBuffer) + return if is_like_ipv4_string?(hnBuffer) + return if is_like_ipv6_string?(hnBuffer) + msg = "Hostname '#{hnBuffer}' not compliant (#{rfcl}). Not IP Address Either." + OMS::Log.warn_once(msg) + raise NameError, msg + end + + # End of Internal methods + + def get_hostname(ignoreOldValue = false) + if not is_hostname_compliant?(@@Hostname) or ignoreOldValue then + + look_in_designated_hostnamefile if has_designated_hostnamefile? + + look_for_socket_class_host_address unless is_hostname_compliant?(@@Hostname) + end + + begin + validate_hostname_equivalent(@@Hostname) + rescue => error + OMS::Log.warn_once("Hostname '#{@@Hostname}' found, but did NOT validate as compliant. #{error}. 
Using anyway.") + end + return @@Hostname + end + end # Class methods + end # class Common +end # module OMS diff --git a/source/plugins/ruby/omslog.rb b/source/plugins/ruby/omslog.rb new file mode 100644 index 000000000..b65bf947c --- /dev/null +++ b/source/plugins/ruby/omslog.rb @@ -0,0 +1,50 @@ +module OMS + class Log + require 'set' + require 'digest' + + @@error_proc = Proc.new {|message| $log.error message } + @@warn_proc = Proc.new {|message| $log.warn message } + @@info_proc = Proc.new {|message| $log.info message } + @@debug_proc = Proc.new {|message| $log.debug message } + + @@logged_hashes = Set.new + + class << self + def error_once(message, tag=nil) + log_once(@@error_proc, @@debug_proc, message, tag) + end + + def warn_once(message, tag=nil) + log_once(@@warn_proc, @@debug_proc, message, tag) + end + + def info_once(message, tag=nil) + log_once(@@info_proc, @@debug_proc, message, tag) + end + + def log_once(first_loglevel_proc, next_loglevel_proc, message, tag=nil) + # Will log a message once with the first procedure and subsequently with the second + # This allows repeated messages to be ignored by having the second logging function at a lower log level + # An optional tag can be used as the message key + + if tag == nil + tag = message + end + + md5_digest = Digest::MD5.new + tag_hash = md5_digest.update(tag).base64digest + res = @@logged_hashes.add?(tag_hash) + + if res == nil + # The hash was already in the set + next_loglevel_proc.call(message) + else + # First time we see this hash + first_loglevel_proc.call(message) + end + end + end # Class methods + + end # Class Log +end # Module OMS diff --git a/source/plugins/ruby/out_mdm.rb b/source/plugins/ruby/out_mdm.rb index 8e80fb753..82d6e07db 100644 --- a/source/plugins/ruby/out_mdm.rb +++ b/source/plugins/ruby/out_mdm.rb @@ -21,6 +21,9 @@ def initialize require_relative "proxy_utils" @@token_resource_url = "https://monitoring.azure.com/" + # AAD auth supported only in public cloud and handle other 
clouds when enabled + # this is unified new token audience for LA AAD MSI auth & metrics + @@token_resource_audience = "https://monitor.azure.com/" @@grant_type = "client_credentials" @@azure_json_path = "/etc/kubernetes/host/azure.json" @@post_request_url_template = "https://%{aks_region}.monitoring.azure.com%{aks_resource_id}/metrics" @@ -28,6 +31,8 @@ def initialize # msiEndpoint is the well known endpoint for getting MSI authentications tokens @@msi_endpoint_template = "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=%{user_assigned_client_id}&resource=%{resource}" + # IMDS msiEndpoint for AAD MSI Auth is the proxy endpoint which serves the MSI auth tokens with resource claim + @@imds_msi_endpoint_template = "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=%{resource}" @@user_assigned_client_id = ENV["USER_ASSIGNED_IDENTITY_CLIENT_ID"] @@plugin_name = "AKSCustomMetricsMDM" @@ -46,6 +51,7 @@ def initialize @last_telemetry_sent_time = nil # Setting useMsi to false by default @useMsi = false + @isAADMSIAuth = false @metrics_flushed_count = 0 @cluster_identity = nil @@ -124,7 +130,14 @@ def start @parsed_token_uri = URI.parse(aad_token_url) else @useMsi = true - msi_endpoint = @@msi_endpoint_template % { user_assigned_client_id: @@user_assigned_client_id, resource: @@token_resource_url } + if !@@user_assigned_client_id.nil? && !@@user_assigned_client_id.empty? + msi_endpoint = @@msi_endpoint_template % { user_assigned_client_id: @@user_assigned_client_id, resource: @@token_resource_url } + else + # in case of aad msi auth user_assigned_client_id will be empty + @log.info "using aad msi auth" + @isAADMSIAuth = true + msi_endpoint = @@imds_msi_endpoint_template % { resource: @@token_resource_audience } + end @parsed_token_uri = URI.parse(msi_endpoint) end @@ -148,8 +161,14 @@ def get_access_token @log.info "Refreshing access token for out_mdm plugin.."
if (!!@useMsi) - @log.info "Using msi to get the token to post MDM data" - ApplicationInsightsUtility.sendCustomEvent("AKSCustomMetricsMDMToken-MSI", {}) + properties = {} + if (!!@isAADMSIAuth) + @log.info "Using aad msi auth to get the token to post MDM data" + properties["aadAuthMSIMode"] = "true" + else + @log.info "Using msi to get the token to post MDM data" + end + ApplicationInsightsUtility.sendCustomEvent("AKSCustomMetricsMDMToken-MSI", properties) @log.info "Opening TCP connection" http_access_token = Net::HTTP.start(@parsed_token_uri.host, @parsed_token_uri.port, :use_ssl => false) # http_access_token.use_ssl = false @@ -320,7 +339,7 @@ def send_to_mdm(post_body) ApplicationInsightsUtility.sendCustomEvent("AKSCustomMetricsMDMSendSuccessful", {}) @last_telemetry_sent_time = Time.now end - rescue Net::HTTPClientException => e # see https://docs.ruby-lang.org/en/2.6.0/NEWS.html about deprecating HTTPServerException and adding HTTPClientException + rescue Net::HTTPClientException => e # see https://docs.ruby-lang.org/en/2.6.0/NEWS.html about deprecating HTTPServerException and adding HTTPClientException if !response.nil? && !response.body.nil? 
#body will have actual error + @log.info "Failed to Post Metrics to MDM : #{e} Response.body: #{response.body}" else diff --git a/test/e2e/conformance.yaml b/test/e2e/conformance.yaml new file mode 100644 index 000000000..ff790e690 --- /dev/null +++ b/test/e2e/conformance.yaml @@ -0,0 +1,15 @@ +sonobuoy-config: + driver: Job + plugin-name: azure-arc-ci-conformance + result-format: junit +spec: + image: mcr.microsoft.com/azuremonitor/containerinsights/cidev:ciconftest08142021 + imagePullPolicy: Always + name: plugin + resources: {} + volumes: + - name: results + emptyDir: {} + volumeMounts: + - mountPath: /tmp/results + name: results diff --git a/test/e2e/e2e-tests.yaml b/test/e2e/e2e-tests.yaml index 06dfa1fb0..25817be12 100644 --- a/test/e2e/e2e-tests.yaml +++ b/test/e2e/e2e-tests.yaml @@ -68,7 +68,7 @@ data: containers: [] restartPolicy: Never serviceAccountName: sonobuoy-serviceaccount - nodeSelector: + nodeSelector: kubernetes.io/os: linux tolerations: - effect: NoSchedule @@ -84,8 +84,11 @@ data: result-format: junit spec: env: + # this should be false if the test environment is non ARC K8s for example AKS + - name: IS_NON_ARC_K8S_TEST_ENVIRONMENT + value: "true" # Update values of CLIENT_ID, CLIENT_SECRET of the service principal which has permission to query LA ad Metrics API - # Update value of TENANT_ID corresponding your Azure Service principal + # Update value of TENANT_ID corresponding to your Azure Service principal - name: CLIENT_ID value: "SP_CLIENT_ID_VALUE" - name: CLIENT_SECRET @@ -93,15 +96,15 @@ data: - name: TENANT_ID value: "SP_TENANT_ID_VALUE" - name: DEFAULT_QUERY_TIME_INTERVAL_IN_MINUTES - value: "10" + value: "10" - name: DEFAULT_METRICS_QUERY_TIME_INTERVAL_IN_MINUTES - value: "10" + value: "10" - name: AGENT_POD_EXPECTED_RESTART_COUNT - value: "0" + value: "0" - name: AZURE_CLOUD - value: "AZURE_PUBLIC_CLOUD" - # image tag should be updated if new tests being added after this image - image: 
mcr.microsoft.com/azuremonitor/containerinsights/cidev:ciagenttest02152021 + value: "AZURE_PUBLIC_CLOUD" + # image tag should be updated if new tests being added after this image + image: mcr.microsoft.com/azuremonitor/containerinsights/cidev:ciconftest08142021 imagePullPolicy: IfNotPresent name: plugin resources: {} @@ -144,7 +147,7 @@ spec: name: output-volume restartPolicy: Never serviceAccountName: sonobuoy-serviceaccount - nodeSelector: + nodeSelector: kubernetes.io/os: linux tolerations: - key: "kubernetes.io/e2e-evict-taint-key" diff --git a/test/e2e/src/common/constants.py b/test/e2e/src/common/constants.py index 770964cb5..392b10554 100644 --- a/test/e2e/src/common/constants.py +++ b/test/e2e/src/common/constants.py @@ -40,6 +40,8 @@ TIMEOUT = 300 +# WAIT TIME BEFORE READING THE AGENT LOGS +AGENT_WAIT_TIME_SECS = "180" # Azure Monitor for Container Extension related AGENT_RESOURCES_NAMESPACE = 'kube-system' AGENT_DEPLOYMENT_NAME = 'omsagent-rs' @@ -47,7 +49,9 @@ AGENT_WIN_DAEMONSET_NAME = 'omsagent-win' AGENT_DEPLOYMENT_PODS_LABEL_SELECTOR = 'rsName=omsagent-rs' -AGENT_DAEMON_SET_PODS_LABEL_SELECTOR = 'component=oms-agent' +AGENT_DAEMON_SET_PODS_LABEL_SELECTOR = 'dsName=omsagent-ds' +AGENT_DAEMON_SET_PODS_LABEL_SELECTOR_NON_ARC = 'component=oms-agent' +AGENT_FLUENTD_LOG_PATH = '/var/opt/microsoft/docker-cimprov/log/fluentd.log' AGENT_OMSAGENT_LOG_PATH = '/var/opt/microsoft/omsagent/log/omsagent.log' AGENT_REPLICASET_WORKFLOWS = ["kubePodInventoryEmitStreamSuccess", "kubeNodeInventoryEmitStreamSuccess"] diff --git a/test/e2e/src/core/Dockerfile b/test/e2e/src/core/Dockerfile index 9f85bdf4c..cd85aee40 100644 --- a/test/e2e/src/core/Dockerfile +++ b/test/e2e/src/core/Dockerfile @@ -1,11 +1,26 @@ FROM python:3.6 -RUN pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org pytest pytest-xdist filelock requests kubernetes adal msrestazure +RUN pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org pytest pytest-xdist 
filelock requests kubernetes adal msrestazure RUN curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash \ && helm version +RUN apt-get update && apt-get -y upgrade && \ + apt-get -f -y install curl apt-transport-https lsb-release gnupg python3-pip python-pip && \ + curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > /etc/apt/trusted.gpg.d/microsoft.asc.gpg && \ + CLI_REPO=$(lsb_release -cs) && \ + echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ ${CLI_REPO} main" \ + > /etc/apt/sources.list.d/azure-cli.list && \ + apt-get update && \ + apt-get install -y azure-cli && \ + rm -rf /var/lib/apt/lists/* + +RUN python3 -m pip install junit_xml + +COPY --from=lachlanevenson/k8s-kubectl:v1.20.5 /usr/local/bin/kubectl /usr/local/bin/kubectl + COPY ./core/e2e_tests.sh / +COPY ./core/setup_failure_handler.py / COPY ./core/pytest.ini /e2etests/ COPY ./core/conftest.py /e2etests/ COPY ./core/helper.py /e2etests/ diff --git a/test/e2e/src/core/conftest.py b/test/e2e/src/core/conftest.py index e659d5189..9fe34952c 100644 --- a/test/e2e/src/core/conftest.py +++ b/test/e2e/src/core/conftest.py @@ -22,42 +22,48 @@ def env_dict(): create_results_dir('/tmp/results') # Setting some environment variables - env_dict['SETUP_LOG_FILE'] = '/tmp/results/setup' + env_dict['SETUP_LOG_FILE'] = '/tmp/results/setup' env_dict['TEST_AGENT_LOG_FILE'] = '/tmp/results/containerinsights' env_dict['NUM_TESTS_COMPLETED'] = 0 - + print("Starting setup...") append_result_output("Starting setup...\n", env_dict['SETUP_LOG_FILE']) - + # Collecting environment variables env_dict['TENANT_ID'] = os.getenv('TENANT_ID') env_dict['CLIENT_ID'] = os.getenv('CLIENT_ID') env_dict['CLIENT_SECRET'] = os.getenv('CLIENT_SECRET') - + env_dict['IS_NON_ARC_K8S_TEST_ENVIRONMENT'] = os.getenv('IS_NON_ARC_K8S_TEST_ENVIRONMENT') + # released agent for Arc K8s still uses omsagent and when we rollout the agent with mdsd + # this shouldn't be set after agent 
rollout with mdsd + env_dict['USING_OMSAGENT_BASE_AGENT'] = os.getenv('USING_OMSAGENT_BASE_AGENT') + + waitTimeInterval = int(os.getenv('AGENT_WAIT_TIME_SECS')) if os.getenv('AGENT_WAIT_TIME_SECS') else constants.AGENT_WAIT_TIME_SECS + env_dict['AGENT_WAIT_TIME_SECS'] = waitTimeInterval # get default query time interval for log analytics queries queryTimeInterval = int(os.getenv('DEFAULT_QUERY_TIME_INTERVAL_IN_MINUTES')) if os.getenv('DEFAULT_QUERY_TIME_INTERVAL_IN_MINUTES') else constants.DEFAULT_QUERY_TIME_INTERVAL_IN_MINUTES # add minute suffix since this format required for LA queries env_dict['DEFAULT_QUERY_TIME_INTERVAL_IN_MINUTES'] = str(queryTimeInterval) + "m" - + # get default query time interval for metrics queries env_dict['DEFAULT_METRICS_QUERY_TIME_INTERVAL_IN_MINUTES'] = int(os.getenv('DEFAULT_METRICS_QUERY_TIME_INTERVAL_IN_MINUTES')) if os.getenv('DEFAULT_METRICS_QUERY_TIME_INTERVAL_IN_MINUTES') else constants.DEFAULT_METRICS_QUERY_TIME_INTERVAL_IN_MINUTES - - - # expected agent pod restart count + + + # expected agent pod restart count env_dict['AGENT_POD_EXPECTED_RESTART_COUNT'] = int(os.getenv('AGENT_POD_EXPECTED_RESTART_COUNT')) if os.getenv('AGENT_POD_EXPECTED_RESTART_COUNT') else constants.AGENT_POD_EXPECTED_RESTART_COUNT # default to azure public cloud if AZURE_CLOUD not specified env_dict['AZURE_ENDPOINTS'] = constants.AZURE_CLOUD_DICT.get(os.getenv('AZURE_CLOUD')) if os.getenv('AZURE_CLOUD') else constants.AZURE_PUBLIC_CLOUD_ENDPOINTS - + if not env_dict.get('TENANT_ID'): pytest.fail('ERROR: variable TENANT_ID is required.') - + if not env_dict.get('CLIENT_ID'): pytest.fail('ERROR: variable CLIENT_ID is required.') - + if not env_dict.get('CLIENT_SECRET'): pytest.fail('ERROR: variable CLIENT_SECRET is required.') - + print("Setup Complete.") append_result_output("Setup Complete.\n", env_dict['SETUP_LOG_FILE']) @@ -66,22 +72,21 @@ def env_dict(): else: with Path.open(my_file, "rb") as f: env_dict = pickle.load(f) - + yield env_dict - + 
my_file = Path("env.pkl") with FileLock(str(my_file) + ".lock"): with Path.open(my_file, "rb") as f: env_dict = pickle.load(f) env_dict['NUM_TESTS_COMPLETED'] = 1 + env_dict.get('NUM_TESTS_COMPLETED') - if env_dict['NUM_TESTS_COMPLETED'] == int(os.getenv('NUM_TESTS')): + if env_dict['NUM_TESTS_COMPLETED'] == int(os.getenv('NUM_TESTS')): # Checking if cleanup is required. if os.getenv('SKIP_CLEANUP'): return print('Starting cleanup...') append_result_output("Starting Cleanup...\n", env_dict['SETUP_LOG_FILE']) - print("Cleanup Complete.") append_result_output("Cleanup Complete.\n", env_dict['SETUP_LOG_FILE']) return diff --git a/test/e2e/src/core/e2e_tests.sh b/test/e2e/src/core/e2e_tests.sh index 3bfafdce9..dd9d93073 100644 --- a/test/e2e/src/core/e2e_tests.sh +++ b/test/e2e/src/core/e2e_tests.sh @@ -1,7 +1,158 @@ -#!/bin/sh +#!/bin/bash +set -x results_dir="${RESULTS_DIR:-/tmp/results}" +waitForResourcesReady() { + ready=false + max_retries=60 + sleep_seconds=10 + NAMESPACE=$1 + RESOURCETYPE=$2 + RESOURCE=$3 + # if resource not specified, set to --all + if [ -z $RESOURCE ]; then + RESOURCE="--all" + fi + for i in $(seq 1 $max_retries) + do + if [[ ! 
$(kubectl wait --for=condition=Ready ${RESOURCETYPE} ${RESOURCE} --namespace ${NAMESPACE}) ]]; then + echo "waiting for the resource:${RESOURCE} of the type:${RESOURCETYPE} in namespace:${NAMESPACE} to be ready state, iteration:${i}" + sleep ${sleep_seconds} + else + echo "resource:${RESOURCE} of the type:${RESOURCETYPE} in namespace:${NAMESPACE} in ready state" + ready=true + break + fi + done + + echo "waitForResourcesReady state: $ready" +} + + +waitForArcK8sClusterCreated() { + connectivityState=false + max_retries=60 + sleep_seconds=10 + for i in $(seq 1 $max_retries) + do + echo "iteration: ${i}, clustername: ${CLUSTER_NAME}, resourcegroup: ${RESOURCE_GROUP}" + clusterState=$(az connectedk8s show --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --query connectivityStatus -o json) + clusterState=$(echo $clusterState | tr -d '"' | tr -d '"\r\n') + echo "cluster current state: ${clusterState}" + if [ ! -z "$clusterState" ]; then + if [[ ("${clusterState}" == "Connected") || ("${clusterState}" == "Connecting") ]]; then + connectivityState=true + break + fi + fi + sleep ${sleep_seconds} + done + echo "Arc K8s cluster connectivityState: $connectivityState" +} + +waitForCIExtensionInstalled() { + installedState=false + max_retries=60 + sleep_seconds=10 + for i in $(seq 1 $max_retries) + do + echo "iteration: ${i}, clustername: ${CLUSTER_NAME}, resourcegroup: ${RESOURCE_GROUP}" + installState=$(az k8s-extension show --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --cluster-type connectedClusters --name azuremonitor-containers --query installState -o json) + installState=$(echo $installState | tr -d '"' | tr -d '"\r\n') + echo "extension install state: ${installState}" + if [ ! 
-z "$installState" ]; then + if [ "${installState}" == "Installed" ]; then + installedState=true + break + fi + fi + sleep ${sleep_seconds} + done + echo "container insights extension installedState: $installedState" +} + +validateCommonParameters() { + if [ -z $TENANT_ID ]; then + echo "ERROR: parameter TENANT_ID is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi + if [ -z $CLIENT_ID ]; then + echo "ERROR: parameter CLIENT_ID is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi + + if [ -z $CLIENT_SECRET ]; then + echo "ERROR: parameter CLIENT_SECRET is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi +} + +validateArcConfTestParameters() { + if [ -z $SUBSCRIPTION_ID ]; then + echo "ERROR: parameter SUBSCRIPTION_ID is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi + + if [ -z $RESOURCE_GROUP ]; then + echo "ERROR: parameter RESOURCE_GROUP is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi + + if [ -z $CLUSTER_NAME ]; then + echo "ERROR: parameter CLUSTER_NAME is required." > ${results_dir}/error + python3 setup_failure_handler.py + fi +} + +addArcConnectedK8sExtension() { + echo "adding Arc K8s connectedk8s extension" + az extension add --name connectedk8s 2> ${results_dir}/error || python3 setup_failure_handler.py +} + +addArcK8sCLIExtension() { + echo "adding Arc K8s k8s-extension extension" + az extension add --name k8s-extension +} + +createArcCIExtension() { + echo "creating extension type: Microsoft.AzureMonitor.Containers" + basicparameters="--cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --cluster-type connectedClusters --extension-type Microsoft.AzureMonitor.Containers --scope cluster --name azuremonitor-containers" + if [ ! -z "$CI_ARC_RELEASE_TRAIN" ]; then + basicparameters="$basicparameters --release-train $CI_ARC_RELEASE_TRAIN" + fi + if [ ! 
-z "$CI_ARC_VERSION" ]; then + basicparameters="$basicparameters --version $CI_ARC_VERSION" + fi + + az k8s-extension create $basicparameters --configuration-settings omsagent.ISTEST=true +} + +showArcCIExtension() { + echo "arc ci extension status" + az k8s-extension show --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --cluster-type connectedClusters --name azuremonitor-containers +} + +deleteArcCIExtension() { + az k8s-extension delete --name azuremonitor-containers \ + --cluster-type connectedClusters \ + --cluster-name $CLUSTER_NAME \ + --resource-group $RESOURCE_GROUP --yes +} + +login_to_azure() { + # Login with service principal + echo "login to azure using the SP creds" + az login --service-principal \ + -u ${CLIENT_ID} \ + -p ${CLIENT_SECRET} \ + --tenant ${TENANT_ID} 2> ${results_dir}/error || python3 setup_failure_handler.py + + echo "setting subscription: ${SUBSCRIPTION_ID} as default subscription" + az account set -s $SUBSCRIPTION_ID +} + + # saveResults prepares the results for handoff to the Sonobuoy worker. # See: https://github.com/vmware-tanzu/sonobuoy/blob/master/docs/plugins.md saveResults() { @@ -17,6 +168,50 @@ saveResults() { # Ensure that we tell the Sonobuoy worker we are done regardless of results. 
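The wait helpers in this patch (waitForResourcesReady, waitForArcK8sClusterCreated, waitForCIExtensionInstalled) all follow the same poll-until-ready loop: probe, sleep, retry up to a cap. A minimal standalone sketch of that pattern, assuming a hypothetical generic `retry` helper that is not part of this patch:

```shell
#!/bin/sh
# Hypothetical generic form of the poll loop used by the wait helpers above:
# run the given command up to max_retries times, sleeping between attempts.
retry() {
  max_retries=$1
  sleep_seconds=$2
  shift 2
  for i in $(seq 1 "$max_retries"); do
    if "$@"; then
      echo "succeeded on iteration: $i"
      return 0
    fi
    sleep "$sleep_seconds"
  done
  echo "gave up after $max_retries attempts"
  return 1
}

# Example usage; a real caller might probe the extension state instead:
#   retry 60 10 is_ci_extension_installed
retry 3 0 true
```

Each helper in the patch inlines this loop with its own probe (`kubectl wait`, `az connectedk8s show`, `az k8s-extension show`) and its own success predicate.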
trap saveResults EXIT +# validate common params +validateCommonParameters + +IS_ARC_K8S_ENV="true" +if [ -z "$IS_NON_ARC_K8S_TEST_ENVIRONMENT" ]; then + echo "arc k8s environment" +else + if [ "$IS_NON_ARC_K8S_TEST_ENVIRONMENT" = "true" ]; then + IS_ARC_K8S_ENV="false" + echo "non arc k8s environment" + fi +fi + +if [ "$IS_ARC_K8S_ENV" = "false" ]; then + echo "skipping install of the Arc K8s Container Insights extension since the test environment is non-Arc K8s" +else + # validate params + validateArcConfTestParameters + + # login to azure + login_to_azure + + # add arc k8s connectedk8s extension + addArcConnectedK8sExtension + + # wait for arc k8s pods to be in ready state + waitForResourcesReady azure-arc pods + + # wait for Arc K8s cluster to be created + waitForArcK8sClusterCreated + + # add CLI extension + addArcK8sCLIExtension + + # add ARC K8s container insights extension + createArcCIExtension + + # show the ci extension status + showArcCIExtension + + # wait for extension state to be Installed + waitForCIExtensionInstalled +fi + # The variable 'TEST_LIST' should be provided if we want to run specific tests.
If not provided, all tests are run NUM_PROCESS=$(pytest /e2etests/ --collect-only -k "$TEST_NAME_LIST" -m "$TEST_MARKER_LIST" | grep ""NodeList", + "apiVersion"=>"v1", + "metadata"=>{ + "selfLink"=>"/api/v1/nodes", + "resourceVersion"=>"5974879" + }, + "items"=>[ + { + "metadata"=>{ + "name"=>"malformed-node", + "selfLink"=>"/api/v1/nodes/malformed-node", + "uid"=>"fe073f0a-e6bf-4d68-b4e5-ffaa42b91528", + "resourceVersion"=>"5974522", + "creationTimestamp"=>"2021-07-21T23:40:14Z", + "labels"=>{ + "agentpool"=>"nodepool1", + "beta.kubernetes.io/arch"=>"amd64", + "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "beta.kubernetes.io/os"=>"linux", + "failure-domain.beta.kubernetes.io/region"=>"westus2", + "failure-domain.beta.kubernetes.io/zone"=>"0", + "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", + "kubernetes.azure.com/mode"=>"system", + "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", + "kubernetes.azure.com/os-sku"=>"Ubuntu", + "kubernetes.azure.com/role"=>"agent", + "kubernetes.io/arch"=>"amd64", + "kubernetes.io/hostname"=>"malformed-node", + "kubernetes.io/os"=>"linux", + "kubernetes.io/role"=>"agent", + "node-role.kubernetes.io/agent"=>"", + "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "storageprofile"=>"managed", + "storagetier"=>"Premium_LRS", + "topology.kubernetes.io/region"=>"westus2", + "topology.kubernetes.io/zone"=>"0" + }, + "annotations"=>{ + "node.alpha.kubernetes.io/ttl"=>"0", + "volumes.kubernetes.io/controller-managed-attach-detach"=>"true" + }, + "managedFields"=>[ + { + "manager"=>"kube-controller-manager", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:20Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "f:node.alpha.kubernetes.io/ttl"=>{} + } + } + } + }, + { + "manager"=>"kubelet", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:24Z", + "fieldsType"=>"FieldsV1", + 
"fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "."=>{}, + "f:volumes.kubernetes.io/controller-managed-attach-detach"=>{} + }, + "f:labels"=>{ + "."=>{}, + "f:agentpool"=>{}, + "f:beta.kubernetes.io/arch"=>{}, + "f:beta.kubernetes.io/instance-type"=>{}, + "f:beta.kubernetes.io/os"=>{}, + "f:failure-domain.beta.kubernetes.io/region"=>{}, + "f:failure-domain.beta.kubernetes.io/zone"=>{}, + "f:kubernetes.azure.com/cluster"=>{}, + "f:kubernetes.azure.com/mode"=>{}, + "f:kubernetes.azure.com/node-image-version"=>{}, + "f:kubernetes.azure.com/os-sku"=>{}, + "f:kubernetes.azure.com/role"=>{}, + "f:kubernetes.io/arch"=>{}, + "f:kubernetes.io/hostname"=>{}, + "f:kubernetes.io/os"=>{}, + "f:node.kubernetes.io/instance-type"=>{}, + "f:storageprofile"=>{}, + "f:storagetier"=>{}, + "f:topology.kubernetes.io/region"=>{}, + "f:topology.kubernetes.io/zone"=>{} + } + }, + "f:spec"=>{ + "f:providerID"=>{} + }, + "f:status"=>{ + "f:addresses"=>{ + "."=>{}, + "k:{\"type\":\"Hostname\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"InternalIP\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + } + }, + "f:allocatable"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:capacity"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:conditions"=>{ + "."=>{}, + "k:{\"type\":\"DiskPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"MemoryPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PIDPressure\"}"=>{ + "."=>{}, + 
"f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"Ready\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + }, + "f:config"=>{}, + "f:daemonEndpoints"=>{ + "f:kubeletEndpoint"=>{ + "f:Port"=>{} + } + }, + "f:images"=>{}, + "f:nodeInfo"=>{ + "f:architecture"=>{}, + "f:bootID"=>{}, + "f:containerRuntimeVersion"=>{}, + "f:kernelVersion"=>{}, + "f:kubeProxyVersion"=>{}, + "f:kubeletVersion"=>{}, + "f:machineID"=>{}, + "f:operatingSystem"=>{}, + "f:osImage"=>{}, + "f:systemUUID"=>{} + } + } + } + }, + { + "manager"=>"kubectl-label", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:53Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:labels"=>{ + "f:kubernetes.io/role"=>{}, + "f:node-role.kubernetes.io/agent"=>{} + } + } + } + }, + { + "manager"=>"node-problem-detector", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-08-10T18:10:02Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:status"=>{ + "f:conditions"=>{ + "k:{\"type\":\"ContainerRuntimeProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FilesystemCorruptionProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FreezeScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentContainerdRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentDockerRestart\"}"=>{ 
+ "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentKubeletRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentUnregisterNetDevice\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KernelDeadlock\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KubeletProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PreemptScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"ReadonlyFilesystem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RebootScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RedeployScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"TerminateScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + } + } + } + } + ] + }, + "spec"=>{ + 
"providerID"=>"azure:///subscriptions/3b875bf3-0eec-4d8c-bdee-25c7ccc1f130/resourceGroups/mc_davidaks16_davidaks16_westus2/providers/Microsoft.Compute/virtualMachineScaleSets/aks-nodepool1-24816391-vmss/virtualMachines/0" + }, + "status"=>{ + "conditions"=>[ + { + "type"=>"FrequentDockerRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentDockerRestart", + "message"=>"docker is functioning properly" + }, + { + "type"=>"FilesystemCorruptionProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsOK", + "message"=>"Filesystem is healthy" + }, + { + "type"=>"KernelDeadlock", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KernelHasNoDeadlock", + "message"=>"kernel has no deadlock" + }, + { + "type"=>"FrequentContainerdRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentContainerdRestart", + "message"=>"containerd is functioning properly" + }, + { + "type"=>"FreezeScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-11T23:25:04Z", + "reason"=>"NoFreezeScheduled", + "message"=>"VM has no scheduled Freeze event" + }, + { + "type"=>"FrequentUnregisterNetDevice", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentUnregisterNetDevice", + "message"=>"node is functioning properly" + }, + { + "type"=>"TerminateScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoTerminateScheduled", + "message"=>"VM has no scheduled Terminate event" + }, + { + "type"=>"ReadonlyFilesystem", + 
"status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsNotReadOnly", + "message"=>"Filesystem is not read-only" + }, + { + "type"=>"RedeployScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRedeployScheduled", + "message"=>"VM has no scheduled Redeploy event" + }, + { + "type"=>"KubeletProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KubeletIsUp", + "message"=>"kubelet service is up" + }, + { + "type"=>"PreemptScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:11:11Z", + "reason"=>"NoPreemptScheduled", + "message"=>"VM has no scheduled Preempt event" + }, + { + "type"=>"RebootScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRebootScheduled", + "message"=>"VM has no scheduled Reboot event" + }, + { + "type"=>"ContainerRuntimeProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"ContainerRuntimeIsUp", + "message"=>"container runtime service is up" + }, + { + "type"=>"FrequentKubeletRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentKubeletRestart", + "message"=>"kubelet is functioning properly" + }, + { + "type"=>"MemoryPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientMemory", + "message"=>"kubelet has sufficient memory available" + }, + { + "type"=>"DiskPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + 
"lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasNoDiskPressure", + "message"=>"kubelet has no disk pressure" + }, + { + "type"=>"PIDPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientPID", + "message"=>"kubelet has sufficient PID available" + }, + { + "type"=>"Ready", + "status"=>"True", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:24Z", + "reason"=>"KubeletReady", + "message"=>"kubelet is posting ready status. AppArmor enabled" + } + ], + "addresses"=>[ + { + "type"=>"Hostname", + "address"=>"malformed-node" + }, + { + "type"=>"InternalIP", + "address"=>"10.240.0.4" + } + ], + "daemonEndpoints"=>{ + "kubeletEndpoint"=>{ + "Port"=>10250 + } + }, + "nodeInfo"=>{ + "machineID"=>"17a654260e2c4a9bb3a3eb4b4188e4b4", + "systemUUID"=>"7ff599e4-909e-4950-a044-ff8613af3af9", + "bootID"=>"02bb865b-a469-43cd-8b0b-5ceb4ecd80b0", + "kernelVersion"=>"5.4.0-1051-azure", + "osImage"=>"Ubuntu 18.04.5 LTS", + "containerRuntimeVersion"=>"containerd://1.4.4+azure", + "kubeletVersion"=>"v1.19.11", + "kubeProxyVersion"=>"v1.19.11", + "operatingSystem"=>"linux", + "architecture"=>"amd64" + }, + "images"=>[ + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1" + ], + "sizeBytes"=>331689060 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021" + ], + "sizeBytes"=>330099815 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021-hotfix" + ], + "sizeBytes"=>271471426 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021" + ], + "sizeBytes"=>269703297 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03262021" + ], + "sizeBytes"=>264732875 + }, + { + "names"=>[ + 
"mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.19.0" + ], + "sizeBytes"=>166352383 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210623.2" + ], + "sizeBytes"=>147750148 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210524.1" + ], + "sizeBytes"=>146446618 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210427.1" + ], + "sizeBytes"=>136242776 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.5" + ], + "sizeBytes"=>101794833 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.47.0" + ], + "sizeBytes"=>101445696 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/autoscaler/cluster-proportional-autoscaler:1.3.0_v0.0.5" + ], + "sizeBytes"=>101194562 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210623.2" + ], + "sizeBytes"=>96125176 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210524.1" + ], + "sizeBytes"=>95879501 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/exechealthz:1.2_v0.0.5" + ], + "sizeBytes"=>94348102 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.2" + ], + "sizeBytes"=>93537927 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/acc/sgx-attestation:2.0" + ], + "sizeBytes"=>91841669 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.4.0" + ], + "sizeBytes"=>91324193 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.2.0" + ], + "sizeBytes"=>89103171 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.0.1-rc3" + ], + "sizeBytes"=>86839805 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0" + ], + "sizeBytes"=>86488586 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210427.1" + ], + "sizeBytes"=>86120048 + }, + { + "names"=>[ + 
"mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0" + ], + "sizeBytes"=>81252495 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" + ], + "sizeBytes"=>79586703 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.4.0" + ], + "sizeBytes"=>78795016 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.2.0" + ], + "sizeBytes"=>76527179 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.1.8" + ], + "sizeBytes"=>75025803 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.2_hotfix" + ], + "sizeBytes"=>73533889 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.3.1" + ], + "sizeBytes"=>72242894 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.8" + ], + "sizeBytes"=>70622822 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/nvidia/k8s-device-plugin:v0.9.0" + ], + "sizeBytes"=>67291599 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.1" + ], + "sizeBytes"=>66415836 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc7" + ], + "sizeBytes"=>65965658 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.1" + ], + "sizeBytes"=>64123775 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.3" + ], + "sizeBytes"=>63581323 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8" + ], + "sizeBytes"=>63154716 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.2" + ], + "sizeBytes"=>61626312 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.18.1" + ], + "sizeBytes"=>60500885 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.17.2" + ], + "sizeBytes"=>58419768 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8_hotfix", + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8post2" + 
], + "sizeBytes"=>56368756 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:282543237a1aa3f407656290f454b7068a92e1abe2156082c750d5abfbcad90c", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.2" + ], + "sizeBytes"=>56310724 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.19.0" + ], + "sizeBytes"=>55228749 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.1" + ], + "sizeBytes"=>54692048 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc3" + ], + "sizeBytes"=>50803639 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.19" + ], + "sizeBytes"=>49759361 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.5" + ], + "sizeBytes"=>49704644 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.21" + ], + "sizeBytes"=>49372390 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:a64d3538b72905b07356881314755b02db3675ff47ee2bcc49dd7be856e285d5", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526" + ], + "sizeBytes"=>49322942 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.4" + ], + "sizeBytes"=>48108311 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kubernetes-dashboard:v1.10.1" + ], + "sizeBytes"=>44907744 + } + ], + "config"=>{} + } + }, + { + "metadata"=>{ + "name"=>"correct-node", + "selfLink"=>"/api/v1/nodes/correct-node", + "uid"=>"fe073f0a-e6bf-4d68-b4e5-ffaa42b91528", + "resourceVersion"=>"5974522", + "creationTimestamp"=>"2021-07-21T23:40:14Z", + "labels"=>{ + "agentpool"=>"nodepool1", + "beta.kubernetes.io/arch"=>"amd64", + "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "beta.kubernetes.io/os"=>"linux", + "failure-domain.beta.kubernetes.io/region"=>"westus2", + "failure-domain.beta.kubernetes.io/zone"=>"0", + 
"kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", + "kubernetes.azure.com/mode"=>"system", + "kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", + "kubernetes.azure.com/os-sku"=>"Ubuntu", + "kubernetes.azure.com/role"=>"agent", + "kubernetes.io/arch"=>"amd64", + "kubernetes.io/hostname"=>"correct-node", + "kubernetes.io/os"=>"linux", + "kubernetes.io/role"=>"agent", + "node-role.kubernetes.io/agent"=>"", + "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "storageprofile"=>"managed", + "storagetier"=>"Premium_LRS", + "topology.kubernetes.io/region"=>"westus2", + "topology.kubernetes.io/zone"=>"0" + }, + "annotations"=>{ + "node.alpha.kubernetes.io/ttl"=>"0", + "volumes.kubernetes.io/controller-managed-attach-detach"=>"true" + }, + "managedFields"=>[ + { + "manager"=>"kube-controller-manager", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:20Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "f:node.alpha.kubernetes.io/ttl"=>{} + } + } + } + }, + { + "manager"=>"kubelet", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:24Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "."=>{}, + "f:volumes.kubernetes.io/controller-managed-attach-detach"=>{} + }, + "f:labels"=>{ + "."=>{}, + "f:agentpool"=>{}, + "f:beta.kubernetes.io/arch"=>{}, + "f:beta.kubernetes.io/instance-type"=>{}, + "f:beta.kubernetes.io/os"=>{}, + "f:failure-domain.beta.kubernetes.io/region"=>{}, + "f:failure-domain.beta.kubernetes.io/zone"=>{}, + "f:kubernetes.azure.com/cluster"=>{}, + "f:kubernetes.azure.com/mode"=>{}, + "f:kubernetes.azure.com/node-image-version"=>{}, + "f:kubernetes.azure.com/os-sku"=>{}, + "f:kubernetes.azure.com/role"=>{}, + "f:kubernetes.io/arch"=>{}, + "f:kubernetes.io/hostname"=>{}, + "f:kubernetes.io/os"=>{}, + "f:node.kubernetes.io/instance-type"=>{}, + "f:storageprofile"=>{}, + 
"f:storagetier"=>{}, + "f:topology.kubernetes.io/region"=>{}, + "f:topology.kubernetes.io/zone"=>{} + } + }, + "f:spec"=>{ + "f:providerID"=>{} + }, + "f:status"=>{ + "f:addresses"=>{ + "."=>{}, + "k:{\"type\":\"Hostname\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"InternalIP\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + } + }, + "f:allocatable"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:capacity"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:conditions"=>{ + "."=>{}, + "k:{\"type\":\"DiskPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"MemoryPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PIDPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"Ready\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + }, + "f:config"=>{}, + "f:daemonEndpoints"=>{ + "f:kubeletEndpoint"=>{ + "f:Port"=>{} + } + }, + "f:images"=>{}, + "f:nodeInfo"=>{ + "f:architecture"=>{}, + "f:bootID"=>{}, + "f:containerRuntimeVersion"=>{}, + "f:kernelVersion"=>{}, + "f:kubeProxyVersion"=>{}, + "f:kubeletVersion"=>{}, + "f:machineID"=>{}, + "f:operatingSystem"=>{}, + "f:osImage"=>{}, + "f:systemUUID"=>{} + } + } + } + }, + { + "manager"=>"kubectl-label", + "operation"=>"Update", + 
"apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:53Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:labels"=>{ + "f:kubernetes.io/role"=>{}, + "f:node-role.kubernetes.io/agent"=>{} + } + } + } + }, + { + "manager"=>"node-problem-detector", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-08-10T18:10:02Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:status"=>{ + "f:conditions"=>{ + "k:{\"type\":\"ContainerRuntimeProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FilesystemCorruptionProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FreezeScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentContainerdRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentDockerRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentKubeletRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentUnregisterNetDevice\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KernelDeadlock\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KubeletProblem\"}"=>{ + 
"."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PreemptScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"ReadonlyFilesystem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RebootScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RedeployScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"TerminateScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + } + } + } + } + ] + }, + "spec"=>{ + "providerID"=>"azure:///subscriptions/3b875bf3-0eec-4d8c-bdee-25c7ccc1f130/resourceGroups/mc_davidaks16_davidaks16_westus2/providers/Microsoft.Compute/virtualMachineScaleSets/aks-nodepool1-24816391-vmss/virtualMachines/0" + }, + "status"=>{ + "capacity"=>{ + "attachable-volumes-azure-disk"=>"8", + "cpu"=>"2m", + "ephemeral-storage"=>"666", + "hugepages-1Gi"=>"0", + "hugepages-2Mi"=>"0", + "memory"=>"555", + "pods"=>"30" + }, + "allocatable"=>{ + "attachable-volumes-azure-disk"=>"8", + "cpu"=>"1m", + "ephemeral-storage"=>"333", + "hugepages-1Gi"=>"0", + "hugepages-2Mi"=>"0", + "memory"=>"444", + "pods"=>"30" + }, + "conditions"=>[ + { + "type"=>"FrequentDockerRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentDockerRestart", + "message"=>"docker is functioning properly" + }, + 
{ + "type"=>"FilesystemCorruptionProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsOK", + "message"=>"Filesystem is healthy" + }, + { + "type"=>"KernelDeadlock", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KernelHasNoDeadlock", + "message"=>"kernel has no deadlock" + }, + { + "type"=>"FrequentContainerdRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentContainerdRestart", + "message"=>"containerd is functioning properly" + }, + { + "type"=>"FreezeScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-11T23:25:04Z", + "reason"=>"NoFreezeScheduled", + "message"=>"VM has no scheduled Freeze event" + }, + { + "type"=>"FrequentUnregisterNetDevice", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentUnregisterNetDevice", + "message"=>"node is functioning properly" + }, + { + "type"=>"TerminateScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoTerminateScheduled", + "message"=>"VM has no scheduled Terminate event" + }, + { + "type"=>"ReadonlyFilesystem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsNotReadOnly", + "message"=>"Filesystem is not read-only" + }, + { + "type"=>"RedeployScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRedeployScheduled", + "message"=>"VM has no scheduled Redeploy event" + }, + { + "type"=>"KubeletProblem", + "status"=>"False", + 
"lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KubeletIsUp", + "message"=>"kubelet service is up" + }, + { + "type"=>"PreemptScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:11:11Z", + "reason"=>"NoPreemptScheduled", + "message"=>"VM has no scheduled Preempt event" + }, + { + "type"=>"RebootScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRebootScheduled", + "message"=>"VM has no scheduled Reboot event" + }, + { + "type"=>"ContainerRuntimeProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"ContainerRuntimeIsUp", + "message"=>"container runtime service is up" + }, + { + "type"=>"FrequentKubeletRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentKubeletRestart", + "message"=>"kubelet is functioning properly" + }, + { + "type"=>"MemoryPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientMemory", + "message"=>"kubelet has sufficient memory available" + }, + { + "type"=>"DiskPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasNoDiskPressure", + "message"=>"kubelet has no disk pressure" + }, + { + "type"=>"PIDPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientPID", + "message"=>"kubelet has sufficient PID available" + }, + { + "type"=>"Ready", + "status"=>"True", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:24Z", + 
"reason"=>"KubeletReady", + "message"=>"kubelet is posting ready status. AppArmor enabled" + } + ], + "addresses"=>[ + { + "type"=>"Hostname", + "address"=>"correct-node" + }, + { + "type"=>"InternalIP", + "address"=>"10.240.0.4" + } + ], + "daemonEndpoints"=>{ + "kubeletEndpoint"=>{ + "Port"=>10250 + } + }, + "nodeInfo"=>{ + "machineID"=>"17a654260e2c4a9bb3a3eb4b4188e4b4", + "systemUUID"=>"7ff599e4-909e-4950-a044-ff8613af3af9", + "bootID"=>"02bb865b-a469-43cd-8b0b-5ceb4ecd80b0", + "kernelVersion"=>"5.4.0-1051-azure", + "osImage"=>"Ubuntu 18.04.5 LTS", + "containerRuntimeVersion"=>"containerd://1.4.4+azure", + "kubeletVersion"=>"v1.19.11", + "kubeProxyVersion"=>"v1.19.11", + "operatingSystem"=>"linux", + "architecture"=>"amd64" + }, + "images"=>[ + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1" + ], + "sizeBytes"=>331689060 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021" + ], + "sizeBytes"=>330099815 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021-hotfix" + ], + "sizeBytes"=>271471426 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021" + ], + "sizeBytes"=>269703297 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03262021" + ], + "sizeBytes"=>264732875 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.19.0" + ], + "sizeBytes"=>166352383 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210623.2" + ], + "sizeBytes"=>147750148 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210524.1" + ], + "sizeBytes"=>146446618 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210427.1" + ], + "sizeBytes"=>136242776 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.5" + ], + "sizeBytes"=>101794833 + }, + { + "names"=>[ + 
"mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.47.0" + ], + "sizeBytes"=>101445696 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/autoscaler/cluster-proportional-autoscaler:1.3.0_v0.0.5" + ], + "sizeBytes"=>101194562 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210623.2" + ], + "sizeBytes"=>96125176 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210524.1" + ], + "sizeBytes"=>95879501 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/exechealthz:1.2_v0.0.5" + ], + "sizeBytes"=>94348102 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.2" + ], + "sizeBytes"=>93537927 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/acc/sgx-attestation:2.0" + ], + "sizeBytes"=>91841669 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.4.0" + ], + "sizeBytes"=>91324193 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.2.0" + ], + "sizeBytes"=>89103171 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.0.1-rc3" + ], + "sizeBytes"=>86839805 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0" + ], + "sizeBytes"=>86488586 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210427.1" + ], + "sizeBytes"=>86120048 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0" + ], + "sizeBytes"=>81252495 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" + ], + "sizeBytes"=>79586703 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.4.0" + ], + "sizeBytes"=>78795016 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.2.0" + ], + "sizeBytes"=>76527179 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.1.8" + ], + "sizeBytes"=>75025803 + }, + { + "names"=>[ + 
"mcr.microsoft.com/containernetworking/azure-npm:v1.2.2_hotfix" + ], + "sizeBytes"=>73533889 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.3.1" + ], + "sizeBytes"=>72242894 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.8" + ], + "sizeBytes"=>70622822 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/nvidia/k8s-device-plugin:v0.9.0" + ], + "sizeBytes"=>67291599 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.1" + ], + "sizeBytes"=>66415836 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc7" + ], + "sizeBytes"=>65965658 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.1" + ], + "sizeBytes"=>64123775 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.3" + ], + "sizeBytes"=>63581323 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8" + ], + "sizeBytes"=>63154716 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.2" + ], + "sizeBytes"=>61626312 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.18.1" + ], + "sizeBytes"=>60500885 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.17.2" + ], + "sizeBytes"=>58419768 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8_hotfix", + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8post2" + ], + "sizeBytes"=>56368756 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:282543237a1aa3f407656290f454b7068a92e1abe2156082c750d5abfbcad90c", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.2" + ], + "sizeBytes"=>56310724 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.19.0" + ], + "sizeBytes"=>55228749 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.1" + ], + "sizeBytes"=>54692048 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc3" + ], + 
"sizeBytes"=>50803639 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.19" + ], + "sizeBytes"=>49759361 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.5" + ], + "sizeBytes"=>49704644 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.21" + ], + "sizeBytes"=>49372390 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:a64d3538b72905b07356881314755b02db3675ff47ee2bcc49dd7be856e285d5", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526" + ], + "sizeBytes"=>49322942 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.4" + ], + "sizeBytes"=>48108311 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kubernetes-dashboard:v1.10.1" + ], + "sizeBytes"=>44907744 + } + ], + "config"=>{} + } + } + ] +} \ No newline at end of file diff --git a/test/unit-tests/canned-api-responses/kube-nodes.txt b/test/unit-tests/canned-api-responses/kube-nodes.txt new file mode 100644 index 000000000..ed411c2e5 --- /dev/null +++ b/test/unit-tests/canned-api-responses/kube-nodes.txt @@ -0,0 +1,851 @@ +{ + "kind"=>"NodeList", + "apiVersion"=>"v1", + "metadata"=>{ + "selfLink"=>"/api/v1/nodes", + "resourceVersion"=>"5974879" + }, + "items"=>[ + { + "metadata"=>{ + "name"=>"aks-nodepool1-24816391-vmss000000", + "selfLink"=>"/api/v1/nodes/aks-nodepool1-24816391-vmss000000", + "uid"=>"fe073f0a-e6bf-4d68-b4e5-ffaa42b91528", + "resourceVersion"=>"5974522", + "creationTimestamp"=>"2021-07-21T23:40:14Z", + "labels"=>{ + "agentpool"=>"nodepool1", + "beta.kubernetes.io/arch"=>"amd64", + "beta.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "beta.kubernetes.io/os"=>"linux", + "failure-domain.beta.kubernetes.io/region"=>"westus2", + "failure-domain.beta.kubernetes.io/zone"=>"0", + "kubernetes.azure.com/cluster"=>"MC_davidaks16_davidaks16_westus2", + "kubernetes.azure.com/mode"=>"system", + 
"kubernetes.azure.com/node-image-version"=>"AKSUbuntu-1804gen2containerd-2021.07.03", + "kubernetes.azure.com/os-sku"=>"Ubuntu", + "kubernetes.azure.com/role"=>"agent", + "kubernetes.io/arch"=>"amd64", + "kubernetes.io/hostname"=>"aks-nodepool1-24816391-vmss000000", + "kubernetes.io/os"=>"linux", + "kubernetes.io/role"=>"agent", + "node-role.kubernetes.io/agent"=>"", + "node.kubernetes.io/instance-type"=>"Standard_DS2_v2", + "storageprofile"=>"managed", + "storagetier"=>"Premium_LRS", + "topology.kubernetes.io/region"=>"westus2", + "topology.kubernetes.io/zone"=>"0" + }, + "annotations"=>{ + "node.alpha.kubernetes.io/ttl"=>"0", + "volumes.kubernetes.io/controller-managed-attach-detach"=>"true" + }, + "managedFields"=>[ + { + "manager"=>"kube-controller-manager", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:20Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "f:node.alpha.kubernetes.io/ttl"=>{} + } + } + } + }, + { + "manager"=>"kubelet", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:24Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + "f:annotations"=>{ + "."=>{}, + "f:volumes.kubernetes.io/controller-managed-attach-detach"=>{} + }, + "f:labels"=>{ + "."=>{}, + "f:agentpool"=>{}, + "f:beta.kubernetes.io/arch"=>{}, + "f:beta.kubernetes.io/instance-type"=>{}, + "f:beta.kubernetes.io/os"=>{}, + "f:failure-domain.beta.kubernetes.io/region"=>{}, + "f:failure-domain.beta.kubernetes.io/zone"=>{}, + "f:kubernetes.azure.com/cluster"=>{}, + "f:kubernetes.azure.com/mode"=>{}, + "f:kubernetes.azure.com/node-image-version"=>{}, + "f:kubernetes.azure.com/os-sku"=>{}, + "f:kubernetes.azure.com/role"=>{}, + "f:kubernetes.io/arch"=>{}, + "f:kubernetes.io/hostname"=>{}, + "f:kubernetes.io/os"=>{}, + "f:node.kubernetes.io/instance-type"=>{}, + "f:storageprofile"=>{}, + "f:storagetier"=>{}, + "f:topology.kubernetes.io/region"=>{}, + "f:topology.kubernetes.io/zone"=>{} + } 
+ }, + "f:spec"=>{ + "f:providerID"=>{} + }, + "f:status"=>{ + "f:addresses"=>{ + "."=>{}, + "k:{\"type\":\"Hostname\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"InternalIP\"}"=>{ + "."=>{}, + "f:address"=>{}, + "f:type"=>{} + } + }, + "f:allocatable"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:capacity"=>{ + "."=>{}, + "f:attachable-volumes-azure-disk"=>{}, + "f:cpu"=>{}, + "f:ephemeral-storage"=>{}, + "f:hugepages-1Gi"=>{}, + "f:hugepages-2Mi"=>{}, + "f:memory"=>{}, + "f:pods"=>{} + }, + "f:conditions"=>{ + "."=>{}, + "k:{\"type\":\"DiskPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"MemoryPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PIDPressure\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"Ready\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + }, + "f:config"=>{}, + "f:daemonEndpoints"=>{ + "f:kubeletEndpoint"=>{ + "f:Port"=>{} + } + }, + "f:images"=>{}, + "f:nodeInfo"=>{ + "f:architecture"=>{}, + "f:bootID"=>{}, + "f:containerRuntimeVersion"=>{}, + "f:kernelVersion"=>{}, + "f:kubeProxyVersion"=>{}, + "f:kubeletVersion"=>{}, + "f:machineID"=>{}, + "f:operatingSystem"=>{}, + "f:osImage"=>{}, + "f:systemUUID"=>{} + } + } + } + }, + { + "manager"=>"kubectl-label", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-07-21T23:40:53Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:metadata"=>{ + 
"f:labels"=>{ + "f:kubernetes.io/role"=>{}, + "f:node-role.kubernetes.io/agent"=>{} + } + } + } + }, + { + "manager"=>"node-problem-detector", + "operation"=>"Update", + "apiVersion"=>"v1", + "time"=>"2021-08-10T18:10:02Z", + "fieldsType"=>"FieldsV1", + "fieldsV1"=>{ + "f:status"=>{ + "f:conditions"=>{ + "k:{\"type\":\"ContainerRuntimeProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FilesystemCorruptionProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FreezeScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentContainerdRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentDockerRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentKubeletRestart\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"FrequentUnregisterNetDevice\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KernelDeadlock\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"KubeletProblem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + 
"f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"PreemptScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"ReadonlyFilesystem\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RebootScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"RedeployScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + }, + "k:{\"type\":\"TerminateScheduled\"}"=>{ + "."=>{}, + "f:lastHeartbeatTime"=>{}, + "f:lastTransitionTime"=>{}, + "f:message"=>{}, + "f:reason"=>{}, + "f:status"=>{}, + "f:type"=>{} + } + } + } + } + } + ] + }, + "spec"=>{ + "providerID"=>"azure:///subscriptions/3b875bf3-0eec-4d8c-bdee-25c7ccc1f130/resourceGroups/mc_davidaks16_davidaks16_westus2/providers/Microsoft.Compute/virtualMachineScaleSets/aks-nodepool1-24816391-vmss/virtualMachines/0" + }, + "status"=>{ + "capacity"=>{ + "attachable-volumes-azure-disk"=>"8", + "cpu"=>"2", + "ephemeral-storage"=>"129900528Ki", + "hugepages-1Gi"=>"0", + "hugepages-2Mi"=>"0", + "memory"=>"7120616Ki", + "pods"=>"30" + }, + "allocatable"=>{ + "attachable-volumes-azure-disk"=>"8", + "cpu"=>"1900m", + "ephemeral-storage"=>"119716326407", + "hugepages-1Gi"=>"0", + "hugepages-2Mi"=>"0", + "memory"=>"4675304Ki", + "pods"=>"30" + }, + "conditions"=>[ + { + "type"=>"FrequentDockerRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentDockerRestart", + "message"=>"docker is functioning properly" + }, + { + "type"=>"FilesystemCorruptionProblem", + "status"=>"False", + 
"lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsOK", + "message"=>"Filesystem is healthy" + }, + { + "type"=>"KernelDeadlock", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KernelHasNoDeadlock", + "message"=>"kernel has no deadlock" + }, + { + "type"=>"FrequentContainerdRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentContainerdRestart", + "message"=>"containerd is functioning properly" + }, + { + "type"=>"FreezeScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-11T23:25:04Z", + "reason"=>"NoFreezeScheduled", + "message"=>"VM has no scheduled Freeze event" + }, + { + "type"=>"FrequentUnregisterNetDevice", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentUnregisterNetDevice", + "message"=>"node is functioning properly" + }, + { + "type"=>"TerminateScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoTerminateScheduled", + "message"=>"VM has no scheduled Terminate event" + }, + { + "type"=>"ReadonlyFilesystem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"FilesystemIsNotReadOnly", + "message"=>"Filesystem is not read-only" + }, + { + "type"=>"RedeployScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRedeployScheduled", + "message"=>"VM has no scheduled Redeploy event" + }, + { + "type"=>"KubeletProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + 
"lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"KubeletIsUp", + "message"=>"kubelet service is up" + }, + { + "type"=>"PreemptScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:11:11Z", + "reason"=>"NoPreemptScheduled", + "message"=>"VM has no scheduled Preempt event" + }, + { + "type"=>"RebootScheduled", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoRebootScheduled", + "message"=>"VM has no scheduled Reboot event" + }, + { + "type"=>"ContainerRuntimeProblem", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"ContainerRuntimeIsUp", + "message"=>"container runtime service is up" + }, + { + "type"=>"FrequentKubeletRestart", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:25:56Z", + "lastTransitionTime"=>"2021-08-10T18:10:01Z", + "reason"=>"NoFrequentKubeletRestart", + "message"=>"kubelet is functioning properly" + }, + { + "type"=>"MemoryPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientMemory", + "message"=>"kubelet has sufficient memory available" + }, + { + "type"=>"DiskPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasNoDiskPressure", + "message"=>"kubelet has no disk pressure" + }, + { + "type"=>"PIDPressure", + "status"=>"False", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:14Z", + "reason"=>"KubeletHasSufficientPID", + "message"=>"kubelet has sufficient PID available" + }, + { + "type"=>"Ready", + "status"=>"True", + "lastHeartbeatTime"=>"2021-08-17T19:28:21Z", + "lastTransitionTime"=>"2021-07-21T23:40:24Z", + "reason"=>"KubeletReady", + "message"=>"kubelet is 
posting ready status. AppArmor enabled" + } + ], + "addresses"=>[ + { + "type"=>"Hostname", + "address"=>"aks-nodepool1-24816391-vmss000000" + }, + { + "type"=>"InternalIP", + "address"=>"10.240.0.4" + } + ], + "daemonEndpoints"=>{ + "kubeletEndpoint"=>{ + "Port"=>10250 + } + }, + "nodeInfo"=>{ + "machineID"=>"17a654260e2c4a9bb3a3eb4b4188e4b4", + "systemUUID"=>"7ff599e4-909e-4950-a044-ff8613af3af9", + "bootID"=>"02bb865b-a469-43cd-8b0b-5ceb4ecd80b0", + "kernelVersion"=>"5.4.0-1051-azure", + "osImage"=>"Ubuntu 18.04.5 LTS", + "containerRuntimeVersion"=>"containerd://1.4.4+azure", + "kubeletVersion"=>"v1.19.11", + "kubeProxyVersion"=>"v1.19.11", + "operatingSystem"=>"linux", + "architecture"=>"amd64" + }, + "images"=>[ + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1" + ], + "sizeBytes"=>331689060 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021" + ], + "sizeBytes"=>330099815 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021-hotfix" + ], + "sizeBytes"=>271471426 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021" + ], + "sizeBytes"=>269703297 + }, + { + "names"=>[ + "mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03262021" + ], + "sizeBytes"=>264732875 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.19.0" + ], + "sizeBytes"=>166352383 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210623.2" + ], + "sizeBytes"=>147750148 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210524.1" + ], + "sizeBytes"=>146446618 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/hcp-tunnel-front:master.210427.1" + ], + "sizeBytes"=>136242776 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.5" + ], + "sizeBytes"=>101794833 + }, + { + "names"=>[ + 
"mcr.microsoft.com/oss/kubernetes/ingress/nginx-ingress-controller:0.47.0" + ], + "sizeBytes"=>101445696 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/autoscaler/cluster-proportional-autoscaler:1.3.0_v0.0.5" + ], + "sizeBytes"=>101194562 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210623.2" + ], + "sizeBytes"=>96125176 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210524.1" + ], + "sizeBytes"=>95879501 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/exechealthz:1.2_v0.0.5" + ], + "sizeBytes"=>94348102 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.8.9.2" + ], + "sizeBytes"=>93537927 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/acc/sgx-attestation:2.0" + ], + "sizeBytes"=>91841669 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.4.0" + ], + "sizeBytes"=>91324193 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.2.0" + ], + "sizeBytes"=>89103171 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.0.1-rc3" + ], + "sizeBytes"=>86839805 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0" + ], + "sizeBytes"=>86488586 + }, + { + "names"=>[ + "mcr.microsoft.com/aks/hcp/tunnel-openvpn:master.210427.1" + ], + "sizeBytes"=>86120048 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0" + ], + "sizeBytes"=>81252495 + }, + { + "names"=>[ + "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" + ], + "sizeBytes"=>79586703 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.4.0" + ], + "sizeBytes"=>78795016 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.2.0" + ], + "sizeBytes"=>76527179 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.1.8" + ], + "sizeBytes"=>75025803 + }, + { + "names"=>[ + 
"mcr.microsoft.com/containernetworking/azure-npm:v1.2.2_hotfix" + ], + "sizeBytes"=>73533889 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.3.1" + ], + "sizeBytes"=>72242894 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.8" + ], + "sizeBytes"=>70622822 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/nvidia/k8s-device-plugin:v0.9.0" + ], + "sizeBytes"=>67291599 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.1" + ], + "sizeBytes"=>66415836 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc7" + ], + "sizeBytes"=>65965658 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/azure-npm:v1.2.1" + ], + "sizeBytes"=>64123775 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.3" + ], + "sizeBytes"=>63581323 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8" + ], + "sizeBytes"=>63154716 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/cni:v3.8.9.2" + ], + "sizeBytes"=>61626312 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.18.1" + ], + "sizeBytes"=>60500885 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.17.2" + ], + "sizeBytes"=>58419768 + }, + { + "names"=>[ + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8_hotfix", + "mcr.microsoft.com/containernetworking/networkmonitor:v1.1.8post2" + ], + "sizeBytes"=>56368756 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:282543237a1aa3f407656290f454b7068a92e1abe2156082c750d5abfbcad90c", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.2" + ], + "sizeBytes"=>56310724 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/calico/node:v3.19.0" + ], + "sizeBytes"=>55228749 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526.1" + ], + "sizeBytes"=>54692048 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/dashboard:v2.0.0-rc3" + ], + 
"sizeBytes"=>50803639 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.19" + ], + "sizeBytes"=>49759361 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.5" + ], + "sizeBytes"=>49704644 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.0.21" + ], + "sizeBytes"=>49372390 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kube-proxy@sha256:a64d3538b72905b07356881314755b02db3675ff47ee2bcc49dd7be856e285d5", + "mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.19.11-hotfix.20210526" + ], + "sizeBytes"=>49322942 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.7.4" + ], + "sizeBytes"=>48108311 + }, + { + "names"=>[ + "mcr.microsoft.com/oss/kubernetes/kubernetes-dashboard:v1.10.1" + ], + "sizeBytes"=>44907744 + } + ], + "config"=>{} + } + } + ] +} \ No newline at end of file diff --git a/test/unit-tests/run_go_tests.sh b/test/unit-tests/run_go_tests.sh new file mode 100755 index 000000000..7036531fd --- /dev/null +++ b/test/unit-tests/run_go_tests.sh @@ -0,0 +1,12 @@ +set -e + +OLD_PATH=$(pwd) +SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )" +cd $SCRIPTPATH/../../source/plugins/go/src +echo "# Runnign go generate" +go generate + +echo "# Running go test ." +go test . 
+ +cd $OLD_PATH diff --git a/test/unit-tests/run_ruby_tests.sh b/test/unit-tests/run_ruby_tests.sh new file mode 100755 index 000000000..824346eee --- /dev/null +++ b/test/unit-tests/run_ruby_tests.sh @@ -0,0 +1,13 @@ +# this script will exit with an error if any commands exit with an error +set -e + +# NOTE: to run a specific test (instead of all) use the following arguments: --name test_name +# ex: run_ruby_tests.sh --name test_basic_single_node + +OLD_PATH=$(pwd) +SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )" +# cd $SCRIPTPATH/../../source/plugins/ruby +echo "# Running ruby $SCRIPTPATH/test_driver.rb $1 $2" +ruby $SCRIPTPATH/test_driver.rb $1 $2 + +cd $OLD_PATH diff --git a/test/unit-tests/test_driver.rb b/test/unit-tests/test_driver.rb new file mode 100644 index 000000000..32687cc99 --- /dev/null +++ b/test/unit-tests/test_driver.rb @@ -0,0 +1,13 @@ +$in_unit_test = true + +script_path = __dir__ +# go to the base directory of the repository +Dir.chdir(File.join(__dir__, "../..")) + +Dir.glob(File.join(script_path, "../../source/plugins/ruby/*_test.rb")) do |filename| + require_relative filename +end + +Dir.glob(File.join(script_path, "../../build/linux/installer/scripts/*_test.rb")) do |filename| + require_relative filename +end