Improving litmus-docs by some characteristics and experiment creation #250

Open
wants to merge 12 commits into
base: master
10 changes: 10 additions & 0 deletions website/docs/troubleshooting.md
@@ -48,6 +48,16 @@ You need to Provide the correct socket path. By default in Portal `CONTAINER_RUN
If your container runtime is `containerd`, change `CONTAINER_RUNTIME` to `containerd` and `SOCKET_PATH` to `/var/run/containerd/containerd.sock`.
You can find these settings in the tune faults part of the tune chaos experiment page.

### The probe only accepts values in `ns, us, ms, m, s, or h`, so why do experiments fail with `must be of type integer`?

In a fault's helper pod you may see the following error logs:

```shell
{"mainLogs":"W1003 08:59:55.273647 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.\n2023/10/03 08:59:55 Error Creating Resource : ChaosEngine.litmuschaos.io 'pod-network-loss-h6srhrls' is invalid: [spec.experiments[0].spec.probe[0].runProperties.interval: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.interval in body must be of type integer: 'string', spec.experiments[0].spec.probe[0].runProperties.probeTimeout: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.probeTimeout in body must be of type integer: 'string']\n"}
```

This issue is caused by outdated CRDs. To fix it, delete all Litmus CRD resources and reinstall the CRDs for the version that you're using. The CRD definitions can be found in the Chaos Infrastructure YAML file or at this link: `https://github.com/litmuschaos/litmus/blob/master/mkdocs/docs/<chaoscenter-version>/litmus-portal-crds-<chaoscenter-version>.yml`
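
To illustrate the value format involved, the fragment below sketches a probe's `runProperties` with unit-suffixed durations, which outdated CRDs reject with the `must be of type integer` error shown above. The values and the `attempt` field are assumptions for illustration; only `interval` and `probeTimeout` appear in the error log.

```yaml
# Hypothetical values; adjust them to your probe.
runProperties:
  # Unit-suffixed duration strings. CRDs from older releases expect
  # plain integers in these fields and fail validation.
  probeTimeout: 5s
  interval: 2s
  # Assumed field, not shown in the error log above.
  attempt: 1
```

If the error persists after reinstalling the CRDs, it may help to confirm that the installed CRD version matches the ChaosCenter version in use.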

## Chaoshub

### We have installed ChaosCenter successfully but the Litmus ChaosHub is in error state and manual cloning of a Git repository does not work.
55 changes: 54 additions & 1 deletion website/docs/user-guides/create-resilience-probe.md
@@ -6,7 +6,15 @@ sidebar_label: Create a Resilience Probe

## Before you begin

You can learn about the concept of resilience probes [here](../concepts/probes.md) and chaos experiments [here](../concepts/chaos-workflow.md). For this user guide, we will use a HTTP probe.
You can learn about the concept of resilience probes [here](../concepts/probes.md) and chaos experiments [here](../concepts/chaos-workflow.md).

Here are some characteristics of resilience probes:

- **Unique Identifier**: Each Resilience Probe is identified by a unique name. A probe name cannot be reused for a given fault.
- **Deletion Behavior**: Deleting a Resilience Probe disables it from further use but does not remove it from the system, so the probe's history and configuration remain available for reference and analysis.

:::note
Starting from v3.0, every chaos fault requires at least one Resilience Probe. This allows stricter validation of the chaos hypothesis, one that does not depend solely on the chaos fault executing successfully.
:::

For this user guide, we will use an HTTP probe.

## 1. Go to the Resilience Probes section

@@ -41,3 +49,48 @@ Configure the details for the probe you are creating, once completed, click the
The new probe will appear in the list as shown:

<img src={require('../assets/user-guides/resilience-probes/create-probe/step-6.png').default} />




### Annotations for Experiment Configuration

When creating experiments, include a `probeRef` in the annotations to link Resilience Probes with the experiment. This applies whether you create the experiment manually or upload a YAML configuration.

Example YAML manifest:
```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: example-chaos-engine
  namespace: litmus
spec:
  appinfo:
    appns: 'litmus'
    applabel: 'app=nginx'
  chaosServiceAccount: litmus-admin
  monitoring: false
  jobCleanUpPolicy: retain
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION
              value: "30"
            - name: CHAOS_INTERVAL
              value: "10"
            - name: FORCE
              value: "true"
  annotationCheck: 'true'
  components:
    - name: runner
      value: "go"
```
> **Note:** Add essential annotations, such as `annotationCheck: 'true'`, in the experiment's spec section to connect the Resilience Probe with the experiment and enable validation of the experiment configuration. Customize the YAML manifest according to your specific experiment requirements and configuration.

1. **Identify Probe to Associate**: Determine the Resilience Probe that you want to associate with the experiment.

2. **Add probeRef in Annotations**: In the experiment YAML configuration, include a `probeRef` field in annotations and specify the name of the Resilience Probe. Ensure that the `probeRef` is correctly formatted and matches the name of the chosen probe.

3. **Validate Annotations**: Before initiating the experiment, validate the experiment YAML configuration to ensure that the `probeRef` is properly included and associated with the Resilience Probe.
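
The steps above can be sketched as a manifest fragment. This is a minimal sketch under stated assumptions, not a definitive format: the probe name `http-probe-example` is hypothetical, and the JSON-encoded list with `name` and `mode` fields is assumed from how ChaosCenter-generated manifests commonly look. Compare it against a manifest generated by your ChaosCenter version before relying on it.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: example-chaos-engine
  namespace: litmus
  annotations:
    # Assumed format: a JSON-encoded list of probe references.
    # "http-probe-example" is a hypothetical name; it must match the
    # name of a Resilience Probe that already exists.
    # "SOT" (start of test) is one of the probe modes.
    probeRef: '[{"name":"http-probe-example","mode":"SOT"}]'
spec:
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-delete
```

Because the annotation value is a string, quoting it with single quotes keeps the inner JSON double quotes intact.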
12 changes: 12 additions & 0 deletions website/docs/user-guides/schedule-experiment.md
@@ -127,6 +127,18 @@ You can select Advanced Options on the Experiment Builder tab to configure the a
<img src={require('../assets/user-guides/injecting-fault/schedule-workflow/advanced-options-experiment-creation.png').default} alt="advanced-options-experiment-creation" />
</figure>

### Experiment Creation

When creating an experiment, you must include a Resilience Probe as part of the setup. This step is now mandatory to ensure accurate chaos injection and monitoring during the experiment. Follow these steps to add the probe to the experiment configuration:

1. **Identify Chaos Injection Points**: Determine the points within your system where chaos will be injected.

2. **Select Resilience Probe**: Choose the Resilience Probe that aligns with your experimentation goals and the type of chaos you want to inject, as discussed above.

3. **Integrate Probe into Experiment YAML**: Add the Resilience Probe configuration to your experiment YAML file. Ensure that the probe is properly configured and referenced within the experiment setup.

4. **Validate Experiment Configuration**: Before initiating the experiment, validate the experiment configuration to ensure that the Resilience Probe is correctly included and configured.
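
As a rough sketch of step 3, the fragment below assumes the inline `probe` form under the fault's `spec`. Probes managed in ChaosCenter may instead be referenced by name through a `probeRef` annotation. The probe name, target URL, and timing values are hypothetical and chosen only for illustration.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: example-chaos-engine   # hypothetical name
  namespace: litmus
spec:
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-delete
      spec:
        probe:
          - name: check-frontend-availability   # hypothetical probe name
            type: httpProbe
            mode: Continuous
            httpProbe/inputs:
              # Hypothetical in-cluster URL of the service under test.
              url: http://frontend.example.svc.cluster.local:80
              method:
                get:
                  criteria: "=="
                  responseCode: "200"
            runProperties:
              # Assumed values, written as unit-suffixed durations.
              probeTimeout: 5s
              interval: 2s
              attempt: 1
```

A `Continuous` HTTP probe like this one would check the endpoint repeatedly for the duration of the fault, which suits an availability hypothesis. Other modes may fit checks that only need to run before or after chaos injection.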

## General options

### Node Selector
10 changes: 10 additions & 0 deletions website/versioned_docs/version-3.0.0/troubleshooting.md
@@ -48,6 +48,16 @@ You need to Provide the correct socket path. By default in Portal `CONTAINER_RUN
If your container runtime is `containerd`, change `CONTAINER_RUNTIME` to `containerd` and `SOCKET_PATH` to `/var/run/containerd/containerd.sock`.
You can find these settings in the tune faults part of the tune chaos experiment page.

### The probe only accepts values in `ns, us, ms, m, s, or h`, so why do experiments fail with `must be of type integer`?

In a fault's helper pod you may see the following error logs:

```shell
{"mainLogs":"W1003 08:59:55.273647 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.\n2023/10/03 08:59:55 Error Creating Resource : ChaosEngine.litmuschaos.io 'pod-network-loss-h6srhrls' is invalid: [spec.experiments[0].spec.probe[0].runProperties.interval: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.interval in body must be of type integer: 'string', spec.experiments[0].spec.probe[0].runProperties.probeTimeout: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.probeTimeout in body must be of type integer: 'string']\n"}
```

This issue is caused by outdated CRDs. To fix it, delete all Litmus CRD resources and reinstall the CRDs for the version that you're using. The CRD definitions can be found in the Chaos Infrastructure YAML file or at this link: `https://github.com/litmuschaos/litmus/blob/master/mkdocs/docs/<chaoscenter-version>/litmus-portal-crds-<chaoscenter-version>.yml`

## Chaoshub

### We have installed ChaosCenter successfully but the Litmus ChaosHub is in error state and manual cloning of a Git repository does not work.