* add docker.version file * move installer files from pal directory * update configure file * wip * clean up makefile and installer datafiles * fix ai import casing * re add data files * update makefile * update make file * remove super project git references * remove legacy host agent code * merge windows code * add docs, troubleshoot and alerting * move windows code to source dir * refactor the scripts * remove unnecessary files * fix path certificategenerator.zip * bring windows code back * get latest changes from ci_feature * remove redundant files * fix liveness probe path issue * rename the dir names * clean up * clean up * omi dependencies * remove redundant files * update release notes * merge latest changes for 05222020 * removed weird whitespaces * update readme with windows version * update readme with windows agent build instructions * reorganize scripts * minor update * update readme * re-oragnize the code structure and add build script for windows agent * update readme * updates to readme and clean up * fix build errors * fix build errors * fix build errors * update readme * update main and setup scripts * fix path issue in windows docker file * readme updates * windows agent path issue * resolve ai package name conflict * rename things * rename file names to have uniform casing * get ridoff glide files * add version file * merge latest chanegs in ci_feature branch * do go get as part of the build * doc updates * read me update * update docker file * reorganize the windows code * reorganize windows code * update gitignore to not include files under windows * clean up files * update readme file * update to use version info for windows agent * merge Makefile.common into Makefile * fix build issues * fix build error * add sudo for go commands * remove go get from makefile * update to use go1.14.1 * fix bug in windows Makefile script * fix build error * remove weird char in makefile.ps1 * readme update * fix windows build issue * read me update * version 
update to synch with ciprod05262020 release * add omsagent-ai-res-id yaml * build with -buildmode=c-shared for out_oms.so for windows agent * final yaml updates * use version info for linux go so file * fix build error * readme updates * update readme * add shell script for installing for linux agent pre-requisites * take latest release notes * readme update * disable monitoring addon script * pr feedback * pr feedback * update disable addon script * clean up code * clean up commented code * pr feedback * wip * wip * refactor scripts to managed * wip * cleanup scripts * update for aks * add more validation * Wip * update powershell script * add disable monitoring powershell script * fix bugs with disable script * fix bugs in ps scripts * update readme * re-arrange source code * re-arrange test code * update readme * update path * move go code under src dir * update build dependencies to work on wsl * update readme with code of conduct * final readme updates Co-authored-by: root <[email protected]>
Showing 318 changed files with 15,462 additions and 22,758 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,9 @@ | ||
require 'rake/testtask' | ||
require "rake/testtask" | ||
|
||
task default: "test" | ||
|
||
Rake::TestTask.new do |task| | ||
task.libs << "test" | ||
task.pattern = './test/code/plugin/health/*_spec.rb' | ||
task.warning = false | ||
end | ||
task.libs << "test" | ||
task.pattern = "./test/unit-tests/plugins/health/*_spec.rb" | ||
task.warning = false | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
``` | ||
let endDateTime = now(); | ||
let startDateTime = ago(1h); | ||
let trendBinSize = 1m; | ||
let clusterName = 'YOURCLUSTERNAME'; // not referenced below; add "| where ClusterName == clusterName" to limit results to one cluster | ||
KubeNodeInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| distinct ClusterName, Computer, TimeGenerated | ||
| summarize ClusterSnapshotCount = count() by bin(TimeGenerated, trendBinSize), ClusterName, Computer | ||
| join hint.strategy=broadcast kind=inner ( | ||
KubeNodeInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| summarize TotalCount = count(), ReadyCount = sumif(1, Status contains ('Ready')) | ||
by ClusterName, Computer, bin(TimeGenerated, trendBinSize) | ||
| extend NotReadyCount = TotalCount - ReadyCount | ||
) on ClusterName, Computer, TimeGenerated | ||
| project TimeGenerated, | ||
ClusterName, | ||
Computer, | ||
ReadyCount = todouble(ReadyCount) / ClusterSnapshotCount, | ||
NotReadyCount = todouble(NotReadyCount) / ClusterSnapshotCount | ||
| order by ClusterName asc, Computer asc, TimeGenerated desc | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
``` | ||
let endDateTime = now(); | ||
let startDateTime = ago(1h); | ||
let trendBinSize = 1m; | ||
let clusterName = 'YOURCLUSTERNAME'; //can remove references for this from the query to show data for all clusters | ||
KubeNodeInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| where ClusterName == clusterName | ||
| distinct ClusterName, TimeGenerated | ||
| summarize ClusterSnapshotCount = count() by Timestamp = bin(TimeGenerated, trendBinSize), ClusterName | ||
| join hint.strategy=broadcast ( | ||
KubeNodeInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| summarize TotalCount = count(), ReadyCount = sumif(1, Status contains ('Ready')) | ||
by ClusterName, Timestamp = bin(TimeGenerated, trendBinSize) | ||
| extend NotReadyCount = TotalCount - ReadyCount | ||
) on ClusterName, Timestamp | ||
| project Timestamp, | ||
ReadyCount = todouble(ReadyCount) / ClusterSnapshotCount, | ||
NotReadyCount = todouble(NotReadyCount) / ClusterSnapshotCount | ||
| render timechart | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
``` | ||
let endDateTime = now(); | ||
let startDateTime = ago(1h); | ||
let trendBinSize = 1m; | ||
let clusterName = 'YOURCLUSTERNAME'; | ||
KubePodInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| where ClusterName == clusterName | ||
| distinct ClusterName, TimeGenerated | ||
| summarize ClusterSnapshotCount = count() by bin(TimeGenerated, trendBinSize), ClusterName | ||
| join hint.strategy=broadcast ( | ||
KubePodInventory | ||
| where TimeGenerated < endDateTime | ||
| where TimeGenerated >= startDateTime | ||
| distinct ClusterName, Computer, PodUid, TimeGenerated, PodStatus | ||
| summarize TotalCount = count(), | ||
PendingCount = sumif(1, PodStatus =~ 'Pending'), | ||
RunningCount = sumif(1, PodStatus =~ 'Running'), | ||
SucceededCount = sumif(1, PodStatus =~ 'Succeeded'), | ||
FailedCount = sumif(1, PodStatus =~ 'Failed') | ||
by ClusterName, bin(TimeGenerated, trendBinSize) | ||
) on ClusterName, TimeGenerated | ||
| extend UnknownCount = TotalCount - PendingCount - RunningCount - SucceededCount - FailedCount | ||
| project TimeGenerated, | ||
TotalCount = todouble(TotalCount) / ClusterSnapshotCount, | ||
PendingCount = todouble(PendingCount) / ClusterSnapshotCount, | ||
RunningCount = todouble(RunningCount) / ClusterSnapshotCount, | ||
SucceededCount = todouble(SucceededCount) / ClusterSnapshotCount, | ||
FailedCount = todouble(FailedCount) / ClusterSnapshotCount, | ||
UnknownCount = todouble(UnknownCount) / ClusterSnapshotCount | ||
| summarize AggregatedValue = avg(PendingCount) by bin(TimeGenerated, trendBinSize) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# How to set up alerts for performance problems in Azure Monitor for containers | ||
|
||
Azure Monitor for containers monitors the performance of container workloads deployed to either Azure Container Instances or managed Kubernetes clusters hosted on Azure Kubernetes Service (AKS). To be alerted on performance problems, you first need to create alert rules using Kusto queries. This article explains how to create alert rules and provides sample alerting queries. | ||
|
||
### How to create alert rules | ||
For step-by-step instructions on creating alert rules, see [the Azure Monitor documentation](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-alerts#create-alert-rule). | ||
|
||
### Alerting situations (Queries): | ||
- [Node CPU and memory utilization exceeds your defined threshold](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-alerts#resource-utilization-log-search-queries) | ||
- [Pod CPU or memory utilization within a controller exceeds your defined threshold as compared to the set limit](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-alerts#resource-utilization-log-search-queries) | ||
- ["NotReady" Status Node counts](NotReadyQuery.md) | ||
- [Pod phase counts (Failed, Pending, Unknown, Running, Succeeded)](PendingPodCount.md) | ||
|
||
#### *Note on the queries* | ||
- Set `clusterName` to the name of your cluster: | ||
```let clusterName = 'YOURCLUSTERNAME';``` | ||
|
||
- *Alert by Pod Phases:* To alert on a specific pod phase such as Pending, Failed, or Unknown, modify the last line of the query in [Pod phase counts](PendingPodCount.md). | ||
For example, to alert on `FailedCount`: | ||
```| summarize AggregatedValue = avg(FailedCount) by bin(TimeGenerated, trendBinSize) ``` | ||
|
||
- *View in Chart*: To see the query results as a chart, go to Log Analytics and replace the last line, which starts with ```| summarize ...```, with ```| render timechart```. You can also change the start time and bin size by modifying the following: | ||
``` | ||
let startDateTime = startofday(ago(14d)); | ||
let trendBinSize = 1d; | ||
``` |
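The three edits described in the notes above (setting the cluster name, switching the alerted pod phase, and swapping the last line for a chart) can also be applied to the query text by a small script. The following is a minimal sketch, not part of this repository: the function name `customize_query` and the `PHASE_COLUMNS` mapping are hypothetical, and it assumes the query text ends with a `| summarize ...` line and contains the `'YOURCLUSTERNAME'` placeholder, as the pod phase query in `PendingPodCount.md` does.

```python
# Hypothetical helper (not part of the repository): applies the edits
# described in the notes to the text of the pod phase query.

# Pod phase -> column name projected by the pod phase query.
PHASE_COLUMNS = {
    "Pending": "PendingCount",
    "Running": "RunningCount",
    "Succeeded": "SucceededCount",
    "Failed": "FailedCount",
    "Unknown": "UnknownCount",
}


def customize_query(query, cluster_name, phase="Pending", chart=False):
    """Return a copy of `query` with the cluster name filled in and the
    final '| summarize' line replaced for the chosen phase or a chart."""
    if phase not in PHASE_COLUMNS:
        raise ValueError("unknown pod phase: " + phase)
    if "'" in cluster_name:
        # Keep the sketch simple: no escaping of quotes in string literals.
        raise ValueError("cluster name must not contain a single quote")

    lines = query.rstrip().splitlines()
    if not lines or not lines[-1].lstrip().startswith("| summarize"):
        raise ValueError("expected the query to end with a '| summarize' line")

    if chart:
        # View in Chart: replace the last line with a render operator.
        lines[-1] = "| render timechart"
    else:
        # Alert by Pod Phases: average the column for the chosen phase.
        lines[-1] = (
            "| summarize AggregatedValue = avg(" + PHASE_COLUMNS[phase] + ") "
            "by bin(TimeGenerated, trendBinSize)"
        )

    # Fill in the cluster name placeholder used by the sample queries.
    return "\n".join(lines).replace("'YOURCLUSTERNAME'", "'" + cluster_name + "'")
```

For example, `customize_query(text, "my-aks", phase="Failed")` would produce the `FailedCount` variant shown above, where `text` holds the contents of the query and `my-aks` is a placeholder cluster name. The script only edits text; the resulting query still has to be pasted into Log Analytics or an alert rule.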