GreenScheduler · abhidg · Apr 25, 2024 · Apr 22, 2024 · Apr 22, 2024 · tlestang
diff --git a/README.md b/README.md
@@ -26,76 +26,19 @@ pip install git+https://github.com/GreenScheduler/cats
 
 ## Documentation
 
-Full documentation is available at [greenscheduler.github.io/cats/](https://greenscheduler.github.io/cats/). The below sections
-demonstrate some capability, for illustration, but please consult
-the documentation for more details.
+Documentation is available at [greenscheduler.github.io/cats/](https://greenscheduler.github.io/cats/).
 
-#### Basic example
+We recommend the
+[quickstart](https://greenscheduler.github.io/cats/quickstart.html#basic-usage)
+if you are new to CATS. CATS can optionally [display carbon footprint
+savings](https://greenscheduler.github.io/cats/quickstart.html#displaying-carbon-footprint-estimates)
+using a [configuration file](cats/config.yml).
 
-You can run `cats` with:
-
-```bash
-cats -d <job_duration> --loc <postcode>
-```
-
-The postcode is optional, and can be pulled from the `config.yml` file or, if that is not present, inferred using the server IP address. Job duration is in minutes, specified as an integer.
-
-The scheduler then calls a function that estimates the best time to start the job given predicted carbon intensity over the next 48 hours. The workflow is the same as for other popular schedulers. Switching to `cats` should be transparent to cluster users.
-
-By default, the optimal time to start the job is shown in a human readable format. This information can be output in a machine readable format by passing `--format=json`. The date format in the machine readable output can be controlled using `--dateformat` which accepts a [strftime(3)](https://manpages.debian.org/stable/manpages-dev/strftime.3.en.html) format date.
-
-
-#### Use with schedulers
-
-You can use CATS with, for example, the ``at`` job scheduler by running:
-
-```bash
-cats -d 5 --loc OX1 --scheduler at --command 'ls'
-```
-This schedules a command (`ls`) that has an expected runtime less than 5 minutes using the at scheduler.
-
-#### Console demonstration
+### Console demonstration
+CATS predicting the optimal start time for the `ls` command in the `OX1` postcode:
 
 ![CATS animated usage example](cats.gif)
 
-#### Displaying carbon footprint estimates
-
-`cats` is able to provide an estimate for the carbon footprint
-reduction resulting from delaying your job.  To enable the footprint
-estimation, you must provide information about the machine in the form
-of a YAML configuration file.  An example is given below:
-
-```yaml
-location: "EH8"
-api: "carbonintensity.org.uk"
-PUE: 1.20 # > 1
-partitions:
-  CPU_partition:
-    type: CPU # CPU or GPU
-    model: "Xeon Gold 6142"
-    TDP: 9.4 # Thermal Design Power in W/core
-  GPU_partition:
-    type: GPU
-    model: "NVIDIA A100-SXM-80GB GPUs"
-    TDP: 300
-    CPU_model: "AMD EPYC 7763"
-    TDP_CPU: 4.4
-```
-
-Use the `--config` option to specify a path to the configuration
-file. If no path is specified, `cats` looks for a file named
-`config.yml` in the current directory.
-
-Additionally, to obtain carbon footprints, job-specific information
-must be provided to `cats` through the `--jobinfo` option.  The
-example below demonstrates running `cats` with footprint estimation
-for a job using 8GB of memory, 2 CPU cores and no GPU:
-
-```bash
-cats -d 120 --config .config/config.yml \
-  --jobinfo cpus=2,gpus=0,memory=8,partition=CPU_partition
-```
-
 ## Contributing
 
 We welcome contributions from the community! If you find a bug or have an idea for a new feature, please open an issue on our GitHub repository or submit a pull request.

diff --git a/cats/__init__.py b/cats/__init__.py
@@ -3,8 +3,9 @@
 import logging
 import subprocess
 import sys
-from argparse import ArgumentParser
+from argparse import ArgumentParser, RawDescriptionHelpFormatter
 from datetime import timedelta
+from pathlib import Path
 from typing import Optional
 
 from .carbonFootprint import Estimates, get_footprint_reduction_estimate
@@ -19,6 +20,10 @@
 SCHEDULER_DATE_FORMAT = {"at": "%Y%m%d%H%M"}
 
 
+def indent_lines(lines, spaces):
+    return "\n".join(" " * spaces + line for line in lines.split("\n"))
+
+
 def parse_arguments():
     """
     Parse command line arguments
@@ -35,9 +40,10 @@ def parse_arguments():
     (gCO2/kWh) of running the calculation now with the carbon intensity at that
     time in the future. To undertake this calculation, cats needs to know the
     predicted duration of the calculation (which you must supply, see `-d`) and
-    your location (which can be inferred from your IP address (but see `-l`). If
-    additional information about the power consumption of your computer is
-    available (see `--jobinfo`) the predicted CO2 usage will be reported.
+    your location, either inferred from your IP address, or passed using `-l`.
+    If additional information about the power consumption of your computer is
+    available and passed to CATS via the `--config` option, the predicted CO2
+    usage will be reported.
 
     To make use of this information, you will need to couple cats with a task
     scheduler of some kind. The command to schedule is specified with the `-c`
@@ -48,24 +54,41 @@ def parse_arguments():
        cats -d 1 --loc RG1 --scheduler=at --command='ls'
     """
 
-    example_text = """
-    Examples\n
-    ********\n
+    config_text = indent_lines(
+        Path(__file__).with_name("config.yml").read_text(), spaces=8
+    )
+    example_text = f"""
+    Examples
+    ********
 
-    Cats can be used to report information on the best time to run a calculation and the amount
-    of CO2. Information about a 90 minute calculation in centeral Oxford can be found by running:
+    CATS can be used to report information on the best time to run a calculation
+    and the amount of CO2. Information about a 90-minute calculation in central
+    Oxford can be found by running:
 
-        cats -d 90 --loc OX1 --jobinfo="cpus=2,gpus=0,memory=8,partition=CPU_partition"
+        cats -d 90 --loc OX1
 
-    The `at` scheduler is available from the command line on  most Linux and MacOS computers,
-    and can be the easest way to use cats to minimise the carbon intensity of calculations on
-    smaller computers. For example, the above calculation can be scheduled by running:
+    The `at` scheduler is available from the command line on most Linux and
+    macOS computers, and can be the easiest way to use cats to minimise the
+    carbon intensity of calculations on smaller computers. For example, the
+    above calculation can be scheduled by running:
 
         cats -d 90 --loc OX1 -s at -c 'mycommand'
+
+    To report carbon footprint, pass the `--config` option to select a
+    configuration file and the `--profile` option to select a profile. An
+    example config file is given below:
+
+{config_text}
+
+    The configuration file is documented in the Quickstart section of the online
+    documentation.
     """
 
     parser = ArgumentParser(
-        prog="cats", description=description_text, epilog=example_text
+        prog="cats",
+        description=description_text,
+        epilog=example_text,
+        formatter_class=RawDescriptionHelpFormatter,
     )
 
     def positive_integer(string):
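
Review note on the `cats/__init__.py` change above: the switch to `RawDescriptionHelpFormatter` matters because argparse's default formatter re-wraps the epilog and would flatten the indented YAML. The sketch below reproduces `indent_lines` from the patch and compares both formatters. `sample_config` and `build_help` are hypothetical stand-ins for illustration, not part of CATS.

```python
from argparse import ArgumentParser, RawDescriptionHelpFormatter


def indent_lines(lines, spaces):
    # Same helper as in the patch: prefix every line (blank ones included).
    return "\n".join(" " * spaces + line for line in lines.split("\n"))


# Hypothetical stand-in for the contents of cats/config.yml.
sample_config = "profiles:\n  my_cpu_only_profile:\n    cpu:\n      nunits: 2"

epilog = "Example config file:\n\n" + indent_lines(sample_config, spaces=8)


def build_help(formatter_class=None):
    # Hypothetical helper: build a bare parser and return its help text.
    kwargs = {"prog": "cats", "epilog": epilog}
    if formatter_class is not None:
        kwargs["formatter_class"] = formatter_class
    return ArgumentParser(**kwargs).format_help()


default_help = build_help()  # epilog is re-wrapped, YAML layout is lost
raw_help = build_help(RawDescriptionHelpFormatter)  # epilog kept verbatim

print(raw_help)
```

Note that `RawDescriptionHelpFormatter` also stops re-wrapping the description, so both `description_text` and `example_text` need to be wrapped by hand, as the patch does.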

diff --git a/cats/config.yml b/cats/config.yml
@@ -0,0 +1,15 @@
+profiles:
+  my_cpu_only_profile:
+    cpu:
+      model: "Xeon Gold 6142"
+      power: 9.4 # in W, per core
+      nunits: 2
+  my_gpu_profile:
+    gpu:
+      model: "NVIDIA A100-SXM-80GB GPUs"
+      power: 300
+      nunits: 2
+    cpu:
+      model: "AMD EPYC 7763"
+      power: 4.4
+      nunits: 1
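
Review note on `cats/config.yml`: the quickstart text touched by this PR states that each profile must contain a `cpu` section, a `gpu` section, or both. A minimal sketch of that check is shown below, using a plain dict that mirrors the file above as a YAML loader would return it. `invalid_profiles` is a hypothetical helper, not a CATS function.

```python
# Plain-dict mirror of cats/config.yml (what a YAML loader would produce).
config = {
    "profiles": {
        "my_cpu_only_profile": {
            "cpu": {"model": "Xeon Gold 6142", "power": 9.4, "nunits": 2},
        },
        "my_gpu_profile": {
            "gpu": {"model": "NVIDIA A100-SXM-80GB GPUs", "power": 300, "nunits": 2},
            "cpu": {"model": "AMD EPYC 7763", "power": 4.4, "nunits": 1},
        },
    }
}


def invalid_profiles(config):
    """Hypothetical check: names of profiles with neither a cpu nor a gpu section."""
    profiles = config.get("profiles") or {}
    return [
        name
        for name, profile in profiles.items()
        if not ({"cpu", "gpu"} & set(profile or {}))
    ]


print(invalid_profiles(config))
```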
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
@@ -79,26 +79,11 @@ file <configuration-file>`.
 You can define an arbitraty number of profiles as subsection of the
 top-level ``profiles`` section:
 
-.. code-block:: yaml
+.. literalinclude:: ../../cats/config.yml
+   :language: yaml
    :caption: *An example provision of machine information by YAML file
              to enable estimation of the carbon footprint reduction.*
 
-   profiles:
-     my_cpu_only_profile:
-       cpu:
-         model: "Xeon Gold 6142"
-         power: 9.4 # in W, per core
-         nunits: 2
-     my_gpu_profile:
-       gpu:
-         model: "NVIDIA A100-SXM-80GB GPUs"
-         power: 300
-         nunits: 2
-       cpu:
-         model: "AMD EPYC 7763"
-         power: 4.4
-         nunits: 1
-
 The name of the profile section is arbitrary, but each profile section
 *must* contain one ``cpu`` section, or one ``gpu`` section, or both.
 Each hardware type (``cpu`` or ``gpu``) section *must* contain the

diff --git a/pyproject.toml b/pyproject.toml
@@ -4,6 +4,7 @@
 
 [tool.setuptools]
   packages = ["cats"]
+  package-data.cats = ["config.yml"]
 
 [project]
   name = "climate-aware-task-scheduler"
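
Review note on the `package-data` entry: `parse_arguments` now reads `config.yml` via `Path(__file__).with_name("config.yml")`, so the file must be installed next to `cats/__init__.py`. The sketch below shows how `with_name` resolves the sibling path, plus one possible way to degrade gracefully if the file is missing. The install path, `read_sibling_text`, and the fallback string are assumptions for illustration and are not part of this patch.

```python
import tempfile
from pathlib import Path, PurePosixPath

# with_name replaces only the final path component.
installed = PurePosixPath("/site-packages/cats/__init__.py")  # hypothetical location
sibling = installed.with_name("config.yml")
print(sibling)


def read_sibling_text(module_file, name, fallback=""):
    """Hypothetical helper: read a file next to module_file, or return fallback."""
    try:
        return Path(module_file).with_name(name).read_text()
    except FileNotFoundError:
        return fallback


with tempfile.TemporaryDirectory() as tmp:
    fake_module = Path(tmp) / "__init__.py"  # stands in for cats/__init__.py
    # Before the data file exists (e.g. package-data not configured).
    missing = read_sibling_text(
        fake_module, "config.yml", fallback="(example config unavailable)"
    )
    # After the data file is shipped alongside the module.
    (Path(tmp) / "config.yml").write_text("profiles: {}\n")
    found = read_sibling_text(fake_module, "config.yml")
```

Without such a fallback, a missing `config.yml` would raise `FileNotFoundError` as soon as `parse_arguments` builds the help text, which is why the `package-data` line belongs in the same change.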