Hey! You're here because you want to show your worth as a Site Reliability Engineer (a.k.a. SRE). You know what? I'm really happy you're here. If you want to learn more about what an SRE is, we recommend reading the books published by Google.
Let's go!
You have to fork this repository and complete the following challenges in your own GitHub account. Feel free to solve whichever challenges you want. If you have any doubts, don't hesitate to open an issue to ask about any challenge.
There are 6 basic challenges and 3 extra challenges. We recommend finishing the basic ones; tackle the extras only if you want to demonstrate more.
- Every challenge must have a SOLUTION.md in its directory.
- SOLUTION.md describes how you obtained the result: the commands you executed and a short explanation (if necessary).
And... that's all. The first step is to clone the repository and read it calmly.
NOTE: Go to the challenge-1 directory.
We've found a sample.log file with 3360 lines, but we need some info. Can you help us?
- Count all lines with a 500 HTTP code.
- Count all GET requests from yoko to the /rrhh location that were OK (200).
- How many requests go to /?
- Count all lines without a 5XX HTTP code.
- Replace every 503 HTTP code with 500. How many requests have a 500 HTTP code now?
NOTE: Create the challenge-2 directory.
We would like to get some info about the server. Can you help us? Someone told us about the sysstat package.
- Check the distribution.
- Check CPU usage.
- Check RAM usage. Can you explain the difference between the free, used, shared and available stats?
- List block devices and file system disk usage.
- Obtain the listening TCP and UDP ports.
- Get only the PIDs of the top 10 processes by CPU usage.
- List all PIDs that have /dev/null open.
NOTE: Create the challenge-3 directory.
We would like to use the challenge-2 commands through a simple menu (developed as a bash script). In my mind, the -h (help) option prints this:
Usage: myscript [options..]
Myscript description
Myscript options:
-d, --disk check disk stats
-c, --cpu check cpu stats
-p, --ports check listen ports
-r, --ram check ram stats
-o, --overview top 10 process with more CPU usage.
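One possible skeleton for such a menu is sketched below. The help text is copied from the expected output above; the command behind each option (lsblk, df, mpstat, ss, free, ps) is an assumption about a typical Linux host with procps and sysstat installed, not the expected solution.

```shell
#!/usr/bin/env bash
# Sketch of the menu script. The command chosen for each option is an
# assumption; swap in whatever you used for challenge-2.

usage() {
  cat <<'EOF'
Usage: myscript [options..]
Myscript description
Myscript options:
-d, --disk check disk stats
-c, --cpu check cpu stats
-p, --ports check listen ports
-r, --ram check ram stats
-o, --overview top 10 process with more CPU usage.
EOF
}

main() {
  # With no arguments, show the help text.
  if [ "$#" -eq 0 ]; then
    usage
    return 0
  fi
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -h|--help)     usage ;;
      -d|--disk)     lsblk; df -h ;;
      -c|--cpu)      mpstat 1 1 ;;   # mpstat ships with the sysstat package
      -p|--ports)    ss -tuln ;;     # listening TCP and UDP sockets, numeric
      -r|--ram)      free -h ;;
      -o|--overview) ps -eo pid --sort=-pcpu | head -n 11 ;;  # header + 10 PIDs
      *)
        echo "Unknown option: $1" >&2
        usage >&2
        return 1
        ;;
    esac
    shift
  done
}

main "$@"
```

Looping over the arguments lets several options be combined in one call, for example `./myscript -d -r`.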
NOTE: Go to the challenge-4 directory.
We have the server.py code and we want to containerize this HTTP server (with docker). Can you give us the Dockerfile? Ah! Can you check that everything is running? Our technical team told us we need to make a request with the Challenge: intelygenz.com header. Can you give us the result that the server prints?
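The request itself could be sketched with curl as follows. The function name check_server and the default URL http://localhost:8080/ are hypothetical; use whatever address and port your container actually publishes.

```shell
#!/usr/bin/env bash
# check_server [URL]: request the server with the header the team asked for.
# ASSUMPTION: the default URL is a placeholder, not taken from server.py.
check_server() {
  local url="${1:-http://localhost:8080/}"
  curl --silent --show-error --header 'Challenge: intelygenz.com' "$url"
}
```

Call it once the container is up, for example `check_server http://localhost:8080/`, and paste the response into SOLUTION.md.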
NOTE: Go to the challenge-5 directory.
Oh, no! I don't know what is happening with this binary! Can you help me? Whenever I execute the binary, it tells me Ooooh, what's wrong? :(. How can we fix it? We expected the Congrats! :) message.
NOTE: Go to the challenge-6 directory.
NOTE 2: We recommend using a virtual machine with Debian (or your favorite flavour).
You found a playbook, but it is incomplete. Can you develop Ansible tasks to deploy challenge-4?
- Add the server to the inventory.
- Install docker.
- Build the image from the Dockerfile (challenge-4).
- Deploy the image on the server.
- Check that the HTTP server is running and responds properly.
- Save the output of the ansible-playbook execution in the ansible.log file and upload it.
- Group tasks with tags.
We have some modules to help you solve it:
NOTE: Create the challenge-extra-1 directory.
- Use the kreuzwerker/docker and hashicorp/http providers to replicate challenge-6 with Terraform.
- Upload all files when you have finished the task.
NOTE: Go to the challenge-extra-2 directory.
Prepare environment:
Get info:
- Get all namespaces.
- Get all pods from all namespaces.
- Get all resources from all namespaces.
- Get all services from the intelygenz namespace.
- Get all deployments from tools.
- Get the image from the nginx deployment in the intelygenz namespace.
- Create a port-forward to access the nginx pod in the intelygenz namespace.
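The queries above map fairly directly onto kubectl subcommands. The sketch below assumes kubectl is already pointed at the challenge cluster and that tools is a namespace. The function names, the local port 8080 and the container port 80 are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes kubectl is configured for the challenge cluster.

cluster_info() {
  kubectl get namespaces
  kubectl get pods --all-namespaces
  # "get all" lists the common resource kinds, not every API resource.
  kubectl get all --all-namespaces
  kubectl get services --namespace intelygenz
  # ASSUMPTION: "tools" names a namespace.
  kubectl get deployments --namespace tools
}

nginx_image() {
  # Print the image(s) used by the nginx deployment's pod template.
  kubectl get deployment nginx --namespace intelygenz \
    --output jsonpath='{.spec.template.spec.containers[*].image}'
}

forward_nginx() {
  # $1: pod name, e.g. from "kubectl get pods --namespace intelygenz".
  # ASSUMPTION: local port 8080 and container port 80 are placeholders.
  kubectl port-forward --namespace intelygenz "pod/$1" 8080:80
}
```

Running `cluster_info`, `nginx_image` and then `forward_nginx <pod-name>` in turn gives output that can be pasted into SOLUTION.md.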
NOTE: Create the challenge-extra-3 directory.