diff --git a/README.md b/README.md
index e9442f78..5014aa1e 100644
--- a/README.md
+++ b/README.md
@@ -40,51 +40,42 @@ It provides detailed insights into model serving performance, offering both a us
 - π **Rich Logs**: Automatically flushed to both terminal and file upon experiment completion.
 - π **Experiment Analyzer**: Generates comprehensive Excel reports with pricing and raw metrics data, plus flexible plot configurations (default 2x4 grid) that visualize key performance metrics including throughput, latency (TTFT, E2E, TPOT), error rates, and RPS across different traffic scenarios and concurrency levels. Supports custom plot layouts and multi-line comparisons.
 
-## How to Start
+## Installation
 
-Please check [User Guide](https://docs.sglang.ai/genai-bench/user-guide/) and [CONTRIBUTING.md](https://docs.sglang.ai/genai-bench/development/contributing/) for how to install and use genai-bench.
+**Quick Start**: Install with `pip install genai-bench`.
+Alternatively, see the [Installation Guide](https://docs.sglang.ai/genai-bench/getting-started/installation) for other options.
 
-## Benchmark Metrics Definition
+## How to Use
 
-This section puts together the standard metrics required for LLM serving performance analysis. We classify metrics to two types: **single-request level metrics**, representing the metrics collected from one request. And **aggregated level metrics**, summarizing the single-request metrics from one run (with specific traffic scenario and num concurrency).
+### Quick Start
 
-**NOTE**:
+1. **Run a benchmark** against your model:
+   ```bash
+   genai-bench benchmark --api-backend openai \
+     --api-base "http://localhost:8080" \
+     --api-key "your-api-key" \
+     --api-model-name "your-model" \
+     --task text-to-text \
+     --max-time-per-run 5 \
+     --max-requests-per-run 100
+   ```
 
-- Each single-request metric includes standard statistics: **percentile**, **min**, **max**, **stddev**, and **mean**.
-- The following metrics cover **input**, **output**, and **end-to-end (e2e)** stages. For *chat* tasks, all stages are relevant for evaluation. For *embedding* tasks, where there is no output stage, output metrics will be set to 0. For details about output metrics collection, please check out `OUTPUT_METRICS_FIELDS` in [metrics.py](genai_bench/metrics/metrics.py).
+2. **Generate Excel reports** from your results:
+   ```bash
+   genai-bench excel --experiment-folder ./experiments/your_experiment \
+     --excel-name results --metric-percentile mean
+   ```
 
-### Single Request Level Metrics
+3. **Create visualizations**:
+   ```bash
+   genai-bench plot --experiments-folder ./experiments \
+     --group-key traffic_scenario --preset 2x4_default
+   ```
 
-The following metrics capture token-level performance for a single request, providing insights into server efficiency for each individual request.
+### Next Steps
 
-| Glossary               | Meaning                                                                                                                                                   | Calculation Formula                                            | Units         |
-|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|---------------|
-| TTFT                   | Time to First Token. Initial response time when the first output token is generated. 
 This is also known as the latency for the input (input) stage. | `TTFT = time_at_first_token - start_time`                      | seconds       |
-| End-to-End Latency     | End-to-End latency. This metric indicates how long it takes from submitting a query to receiving the full response, including network latencies.          | `e2e_latency = end_time - start_time`                          | seconds       |
-| TPOT                   | Time Per Output Token. The average time between two subsequent generated tokens.                                                                          | `TPOT = (e2e_latency - TTFT) / (num_output_tokens - 1)`        | seconds       |
-| Output Latency         | Output latency. This metric indicates how long it takes to receive the full response after the first token is generated. | `output_latency = e2e_latency - TTFT`                           | seconds       |
-| Output Inference Speed | The rate of how many tokens the model can generate per second for a single request.                                                                       | `inference_speed = 1 / TPOT`                                   | tokens/second |
-| Num of Input Tokens    | Number of prompt tokens.                                                                                                                                  | `num_input_tokens = tokenizer.encode(prompt)`                  | tokens        |
-| Num of Output Tokens   | Number of output tokens.                                                                                                                                  | `num_output_tokens = num_completion_tokens`                    | tokens        |
-| Num of Request Tokens  | Total number of tokens processed in one request.                                                                                                          | `num_request_tokens = num_input_tokens + num_output_tokens`    | tokens        |
-| Input Throughput       | The overall throughput of input (input process).                                                                                                          | `input_throughput = num_input_tokens / TTFT`                   | tokens/second |
-| Output Throughput      | The throughput of output (output generation) for a single request.                                                                                        | `output_throughput = (num_output_tokens - 1) / output_latency` | tokens/second |
+For detailed instructions, advanced configuration options, and comprehensive examples, check out the [User Guide](https://docs.sglang.ai/genai-bench/user-guide/).
 
-### Aggregated Metrics
+## Development
 
-This metrics collection summarizes the metrics relevant to a specific traffic load pattern, defined by the traffic scenario and the num of concurrency. It provides insights into server capacity and performance under pressure.
-
-| Glossary                  | Meaning                                                                                                                      | Calculation Formula                                                                         | Units         |
-|---------------------------|------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|---------------|
-| Mean Input Throughput     | The average throughput of how many input tokens can be processed by the model in one run with multiple concurrent requests.  | `mean_input_throughput = sum(input_tokens_for_all_requests) / run_duration`                 | tokens/second |
-| Mean Output Throughput    | The average throughput of how many output tokens can be processed by the model in one run with multiple concurrent requests. | `mean_output_throughput = sum(output_tokens_for_all_requests) / run_duration`               | tokens/second |
-| Total Tokens Throughput   | The average throughput of how many tokens can be processed by the model, including both input and output tokens.             | `mean_total_tokens_throughput = all_requests["total_tokens"]["sum"] / run_duration`         | tokens/second |
-| Total Chars Per Hour[^1]  | The average total characters can be processed by the model per hour.                                                         | `total_chars_per_hour = total_tokens_throughput * dataset_chars_to_token_ratio * 3600`      | Characters    |
-| Requests Per Minute       | The number of requests processed by the model per minute.                                                                    | `num_completed_requests_per_min = num_completed_requests / (end_time - start_time) * 60`    | Requests      |
-| Error Codes to Frequency  | A map that shows the returned error status code to its frequency.                                                            |                                                                                             |               |
-| Error Rate                | The rate of error requests over total requests.                                                                              | `error_rate = num_error_requests / num_requests`                                            |               |
-| Num of Error Requests     | The number of error requests in one load.                                                                                    | 
if requests.status_code != '200': 
 num_error_requests += 1
     |               |
-| Num of Completed Requests | The number of completed requests in one load.                                                                                | if requests.status_code == '200': 
 num_completed_requests += 1
 |               |
-| Num of Requests           | The total number of requests processed for one load.                                                                         | `total_requests = num_completed_requests + num_error_requests`                              |               |
-
-[^1]: *Total Chars Per Hour* is derived from a character-to-token ratio based on sonnet.txt and the model's tokenizer. This metric aids in pricing decisions for an LLM serving solution. For tasks with multi-modal inputs, non-text tokens are converted to an equivalent character count using the same character-to-token ratio.
+If you are interested in contributing to GenAI Bench, see the [Development Guide](https://docs.sglang.ai/genai-bench/development/).
\ No newline at end of file
diff --git a/docs/.config/mkdocs-gh-pages.yml b/docs/.config/mkdocs-gh-pages.yml
index b5fcb9b8..1547b378 100644
--- a/docs/.config/mkdocs-gh-pages.yml
+++ b/docs/.config/mkdocs-gh-pages.yml
@@ -116,14 +116,14 @@ nav:
       - getting-started/index.md
       - Installation: getting-started/installation.md
       - Task Definition: getting-started/task-definition.md
-      - Command Guidelines: getting-started/command-guidelines.md
+      - Entrypoints: getting-started/entrypoints.md
       - Metrics Definition: getting-started/metrics-definition.md
   - User Guide:
       - user-guide/index.md
       - Run Benchmark: user-guide/run-benchmark.md
       - Traffic Scenarios: user-guide/scenario-definition.md
       - Multi-Cloud Authentication: user-guide/multi-cloud-auth-storage.md
-      - Quick Reference: user-guide/multi-cloud-quick-reference.md
+      - Multi-Cloud Quick Reference: user-guide/multi-cloud-quick-reference.md
       - Docker Deployment: user-guide/run-benchmark-using-docker.md
       - Excel Reports: user-guide/generate-excel-sheet.md
       - Visualizations: user-guide/generate-plot.md
diff --git a/docs/.config/mkdocs.yml b/docs/.config/mkdocs.yml
index 4e1c4e52..a1babf80 100644
--- a/docs/.config/mkdocs.yml
+++ b/docs/.config/mkdocs.yml
@@ -123,15 +123,12 @@ nav:
       - Run Benchmark: user-guide/run-benchmark.md
       - Traffic Scenarios: user-guide/scenario-definition.md
       - Multi-Cloud Authentication: user-guide/multi-cloud-auth-storage.md
-      - Quick Reference: user-guide/multi-cloud-quick-reference.md
+      - Multi-Cloud Quick Reference: user-guide/multi-cloud-quick-reference.md
       - Docker Deployment: user-guide/run-benchmark-using-docker.md
       - Excel Reports: user-guide/generate-excel-sheet.md
       - Visualizations: user-guide/generate-plot.md
       - Upload Results: user-guide/upload-benchmark-result.md
-  - Examples:
-      - examples/index.md
   - Development:
       - development/index.md
-      - Contributing: development/contributing.md
-  - API Reference:
-      - api/index.md
+      - Adding New Features: development/adding-new-features.md
+      - API Reference: development/api-reference.md
diff --git a/docs/api/index.md b/docs/api/index.md
deleted file mode 100644
index 51bcaecd..00000000
--- a/docs/api/index.md
+++ /dev/null
@@ -1,98 +0,0 @@
-# API Reference
-
-This section provides detailed API documentation for GenAI Bench components.
-
-!!! info "Coming Soon"
-    Comprehensive API documentation is being developed. In the meantime, please refer to the source code docstrings.
-
-## Core Components
-
-### Authentication
-
-- **UnifiedAuthFactory** - Factory for creating authentication providers
-- **ModelAuthProvider** - Base class for model authentication
-- **StorageAuthProvider** - Base class for storage authentication
-
-### Storage
-
-- **BaseStorage** - Abstract base class for storage implementations
-- **StorageFactory** - Factory for creating storage providers
-
-### CLI
-
-- **option_groups** - Modular CLI option definitions
-- **validation** - Input validation functions
-
-### Metrics
-
-- **AggregatedMetricsCollector** - Collects and aggregates benchmark metrics
-- **RequestMetricsCollector** - Collects per-request metrics
-
-### User Classes
-
-- **BaseUser** - Abstract base class for user implementations
-- **OpenAIUser** - OpenAI API implementation
-- **AWSBedrockUser** - AWS Bedrock implementation
-- **AzureOpenAIUser** - Azure OpenAI implementation
-- **GCPVertexUser** - GCP Vertex AI implementation
-- **OCICohereUser** - OCI Cohere implementation
-
-## Example Usage
-
-### Creating an Authentication Provider
-
-```python
-from genai_bench.auth.unified_factory import UnifiedAuthFactory
-
-# Create OpenAI auth
-auth = UnifiedAuthFactory.create_model_auth(
-    "openai",
-    api_key="sk-..."
-)
-
-# Create AWS Bedrock auth
-auth = UnifiedAuthFactory.create_model_auth(
-    "aws-bedrock",
-    access_key_id="AKIA...",
-    secret_access_key="...",
-    region="us-east-1"
-)
-```
-
-### Creating a Storage Provider
-
-```python
-from genai_bench.auth.unified_factory import UnifiedAuthFactory
-from genai_bench.storage.factory import StorageFactory
-
-# Create storage auth
-storage_auth = UnifiedAuthFactory.create_storage_auth(
-    "aws",
-    profile="default",
-    region="us-east-1"
-)
-
-# Create storage instance
-storage = StorageFactory.create_storage(
-    "aws",
-    storage_auth
-)
-
-# Upload a folder
-storage.upload_folder(
-    "/path/to/results",
-    "my-bucket",
-    prefix="benchmarks/2024"
-)
-```
-
-## Contributing to API Documentation
-
-We welcome contributions to improve our API documentation! If you'd like to help:
-
-1. Add docstrings to undocumented functions
-2. Provide usage examples
-3. Document edge cases and gotchas
-4. Submit a pull request
-
-See our [Contributing Guide](../development/contributing.md) for more details.
\ No newline at end of file
diff --git a/docs/development/contributing.md b/docs/development/adding-new-features.md
similarity index 57%
rename from docs/development/contributing.md
rename to docs/development/adding-new-features.md
index 6dde24eb..af2ba5ae 100644
--- a/docs/development/contributing.md
+++ b/docs/development/adding-new-features.md
@@ -1,88 +1,35 @@
-# Contribution Guideline
+# Adding New Features
 
-Welcome and thank you for your interest in contributing to genai-bench.
+This guide covers how to add new features to GenAI Bench, including model providers, storage providers, and tasks.
 
-## Coding Style Guide
+## Adding a New Model Provider
 
-genai-bench uses python 3.11, and we adhere to [Google Python style guide](https://google.github.io/styleguide/pyguide.html).
+1. Create an auth provider in `genai_bench/auth/` (see the sketch after this list)
+2. Create user class in `genai_bench/user/`
+3. Update `UnifiedAuthFactory`
+4. Add validation in `cli/validation.py`
+5. Write tests
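+
+As a rough illustration of steps 1 and 2, here is a minimal, self-contained sketch. The class and method names (`ModelAuthProviderSketch`, `MyProviderAuth`, `get_headers`) are hypothetical stand-ins, not genai-bench's actual interface; check the real base class in `genai_bench/auth/` before implementing.
+
+```python
+# Hypothetical sketch only -- names and signatures are illustrative, not genai-bench's API.
+from abc import ABC, abstractmethod
+from typing import Dict, Optional
+
+
+class ModelAuthProviderSketch(ABC):
+    """Stand-in for the ModelAuthProvider base class in genai_bench/auth/."""
+
+    @abstractmethod
+    def get_headers(self) -> Dict[str, str]:
+        """Return HTTP headers used to authenticate model requests."""
+
+
+class MyProviderAuth(ModelAuthProviderSketch):
+    """Step 1: auth provider for a hypothetical 'myprovider' backend."""
+
+    def __init__(self, api_key: str, region: Optional[str] = None):
+        self.api_key = api_key
+        self.region = region
+
+    def get_headers(self) -> Dict[str, str]:
+        return {"Authorization": f"Bearer {self.api_key}"}
+```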
 
-We use `make format` to format our code using `isort` and `ruff`. The detailed configuration can be found in
-[pyproject.toml](https://github.com/sgl-project/genai-bench/blob/main/pyproject.toml).
+## Adding a New Storage Provider
 
-## Pull Requests
+1. Create storage auth in `genai_bench/auth/`
+2. Create the storage implementation in `genai_bench/storage/` (sketched after this list)
+3. Update `StorageFactory`
+4. Write tests
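+
+A minimal sketch of step 2 is shown below, using the operation names documented in the [API Reference](api-reference.md) (`upload_file`, `upload_folder`). The class name and signatures are assumptions; the real `BaseStorage` interface in `genai_bench/storage/` may differ.
+
+```python
+# Illustrative sketch only -- signatures are assumed, not genai-bench's actual API.
+from pathlib import Path
+
+
+class MyCloudStorage:
+    """Step 2: storage implementation for a hypothetical provider."""
+
+    def __init__(self, auth):
+        self.auth = auth  # storage auth object created in step 1
+
+    def upload_file(self, local_path: str, bucket: str, key: str) -> None:
+        # The provider SDK upload call would go here; omitted in this sketch.
+        print(f"upload {local_path} -> {bucket}/{key}")
+
+    def upload_folder(self, folder: str, bucket: str, prefix: str = "") -> None:
+        # Upload every file under `folder`, preserving relative paths as object keys.
+        for path in Path(folder).rglob("*"):
+            if path.is_file():
+                key = f"{prefix}/{path.relative_to(folder)}".lstrip("/")
+                self.upload_file(str(path), bucket, key)
+```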
 
-Please follow the PR template, which will be automatically populated when you open a new [Pull Request on GitHub](https://github.com/sgl-project/genai-bench/compare).
-
-### Code Reviews
-
-All submissions, including submissions by project members, require a code review.
-To make the review process as smooth as possible, please:
-
-1. Keep your changes as concise as possible.
-   If your pull request involves multiple unrelated changes, consider splitting it into separate pull requests.
-2. Respond to all comments within a reasonable time frame.
-   If a comment isn't clear,
-   or you disagree with a suggestion, feel free to ask for clarification or discuss the suggestion.
-3. Provide constructive feedback and meaningful comments. Focus on specific improvements
-   and suggestions that can enhance the code quality or functionality. Remember to
-   acknowledge and respect the work the author has already put into the submission.
-
-
-## Setup Development Environment
-
-### `make`
-
-genai-bench utilizes `make` for a lot of useful commands.
-
-If your laptop doesn't have `GNU make` installed, (check this by typing `make --version` in your terminal),
-you can ask our GenerativeAI's chatbot about how to install it in your system.
-
-### `uv`
-
-Install uv with `make uv` or install it from the [official website](https://docs.astral.sh/uv/).
-If installing from the website, create a project venv with `uv venv -p python3.11`.
-
-Once you have `make` and `uv` installed, you can follow the command below to build genai-bench wheel:
-
-```shell
-# check out commands genai-bench supports
-make help
-#activate virtual env managed by uv
-source .venv/bin/activate
-# install dependencies
-make install
-```
-
-You can utilize wheel to install genai-bench.
-
-```shell
-# build a .whl under genai-bench/dist
-make build
-# send the wheel to your remote machine if applies
-rsync --delete -avz ~/genai-bench/dist/<.wheel> @:
-```
-
-On your remote machine, you can simply use the `pip` to install genai-bench.
-
-```shell
-pip install /<.wheel>
-```
-
-# Development Guide: Adding a New Task in `genai-bench`
+## Adding a New Task
 
 This guide explains how to add support for a new task in `genai-bench`. Follow the steps below to ensure consistency and compatibility with the existing codebase.
 
----
-
-## 1. Define the Request and Response in `protocol.py`
+### 1. Define the Request and Response in `protocol.py`
 
-### Steps
+#### Steps
 
 1. Add relevant fields to the appropriate request/response data classes in [`protocol.py`](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/protocol.py)
 2. If the new task involves a new input-output modality, create a new request/response class.
 3. Use existing request/response classes (`UserChatRequest`, `UserEmbeddingRequest`, `UserImageChatRequest`, etc.) if they suffice.
 
-### Example
+#### Example
 
 ```python
 class UserTextToImageRequest(UserRequest):
@@ -92,25 +39,23 @@ class UserTextToImageRequest(UserRequest):
     image_resolution: Tuple[int, int] = Field(..., description="Resolution of the generated images.")
 ```
 
----
-
-## 2. Update or Create a Sampler
+### 2. Update or Create a Sampler
 
-### 2.1 If Input Modality Is Supported by an Existing Sampler
+#### 2.1 If Input Modality Is Supported by an Existing Sampler
 
 1. Check if the current [`TextSampler`](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/sampling/text_sampler.py) or [`ImageSampler`](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/sampling/image_sampler.py) supports the input-modality.
 2. Add request creation logic in the relevant `TextSampler` or `ImageSampler` class.
 3. Refactor the sampler's `_create_request` method to support the new task.
 4. **Tip:** Avoid adding long `if-else` chains for new tasks. Utilize helper methods or design a request creator pattern if needed.
 
-### 2.2 If Input Modality Is Not Supported
+#### 2.2 If Input Modality Is Not Supported
 
 1. Create a new sampler class inheriting from [`BaseSampler`](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/sampling/base_sampler.py).
 2. Define the `sample` method to generate requests for the new task.
 3. Refer to `TextSampler` and `ImageSampler` for implementation patterns.
 4. Add utility functions for data preprocessing or validation specific to the new modality if necessary.
 
-### Example for a New Sampler
+#### Example for a New Sampler
 
 ```python
 class AudioSampler(Sampler):
@@ -129,13 +74,11 @@ class AudioSampler(Sampler):
             raise ValueError(f"Unsupported output_modality: {self.output_modality}")
 ```
 
----
-
-## 3. Add Task Support in the User Class
+### 3. Add Task Support in the User Class
 
 Each `User` corresponds to one API backend, such as [`OpenAIUser`](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/user/openai_user.py) for OpenAI. Users can have multiple tasks, each corresponding to an endpoint.
 
-### Steps
+#### Steps
 
 1. Add the new task to the `supported_tasks` dictionary in the relevant `User` class.
 2. Map the new task to its corresponding function name in the dictionary.
@@ -143,7 +86,7 @@ Each `User` corresponds to one API backend, such as [`OpenAIUser`](https://githu
 4. If the new task uses an existing endpoint, refactor the function to support both tasks without duplicating logic.
 5. **Important:** Avoid creating multiple functions for tasks that use the same endpoint.
 
-### Example
+#### Example
 
 ```python
 class OpenAIUser(BaseUser):
@@ -164,11 +107,9 @@ class OpenAIUser(BaseUser):
         self.send_request(False, endpoint, payload, self.parse_audio_response)
 ```
 
----
-
-## 4. Add Unit Tests
+### 4. Add Unit Tests
 
-### Steps
+#### Steps
 
 1. Add tests for the new task in the appropriate test files.
 2. Include tests for:
@@ -176,12 +117,10 @@ class OpenAIUser(BaseUser):
     - Task validation in the `User` class.
     - End-to-end workflow using the new task.
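+
+For example, a minimal registration test might look like the sketch below. The task name and the way `supported_tasks` is exposed on the user class are assumptions to adapt to your change.
+
+```python
+# Hypothetical pytest sketch -- adjust the task name and assertions to your task.
+from genai_bench.user.openai_user import OpenAIUser
+
+
+def test_new_task_is_registered():
+    # `supported_tasks` is the task-to-function mapping described in step 3;
+    # whether it is a class or instance attribute may differ in the real code.
+    assert "text-to-audio" in OpenAIUser.supported_tasks
+```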
 
----
-
-## 5. Update Documentation
+### 5. Update Documentation
 
-### Steps
+#### Steps
 
 1. Add the new task to the list of supported tasks in the [Task Definition guide](../getting-started/task-definition.md).
 2. Provide sample commands and explain any required configuration changes.
-3. Mention the new task in this contributing guide for future developers.
+3. Mention the new task in this development guide for future developers.
diff --git a/docs/development/api-reference.md b/docs/development/api-reference.md
new file mode 100644
index 00000000..614725e5
--- /dev/null
+++ b/docs/development/api-reference.md
@@ -0,0 +1,492 @@
+# API Reference
+
+This section provides comprehensive API documentation for all GenAI Bench components, organized by functional category.
+
+## Project Structure
+
+```
+genai-bench/
+├── genai_bench/        # Main package
+│   ├── analysis/       # Result analysis and reporting
+│   ├── auth/           # Authentication providers
+│   ├── cli/            # CLI implementation
+│   ├── data/           # Dataset loading and management
+│   ├── distributed/    # Distributed execution
+│   ├── metrics/        # Metrics collection
+│   ├── sampling/       # Data sampling
+│   ├── scenarios/      # Traffic generation scenarios
+│   ├── storage/        # Storage providers
+│   ├── ui/             # User interface components
+│   └── user/           # User implementations
+├── tests/              # Test suite
+└── docs/               # Documentation
+```
+
+## Analysis
+
+Components for analyzing benchmark results and generating reports.
+
+### Data Loading
+
+- **`ExperimentLoader`** - Loads experiment data from files
+- **`load_multiple_experiments()`** - Loads multiple experiment results
+- **`load_one_experiment()`** - Loads single experiment result
+
+### Plot Generation
+
+- **`FlexiblePlotGenerator`** - Generates plots using flexible configuration
+- **`plot_experiment_data_flexible()`** - Generates flexible plots
+
+### Configuration
+
+- **`PlotConfig`** - Configuration for plot generation
+- **`PlotConfigManager`** - Manages plot configurations
+- **`PlotSpec`** - Specification for individual plots
+
+### Report Generation
+
+- **`create_workbook()`** - Creates Excel workbooks from experiment data
+
+### Data Types
+
+- **`ExperimentMetrics`** - Metrics data structure for experiments
+- **`MetricsData`** - Union type for aggregated or individual metrics
+
+## Authentication
+
+Components for handling authentication across different cloud providers and services.
+
+### Base Classes
+
+- **`AuthProvider`** - Base class for authentication providers
+
+### Factories
+
+- **`UnifiedAuthFactory`** - Unified factory for creating authentication providers
+- **`AuthFactory`** - Legacy factory for authentication providers
+
+### Model Authentication Providers
+
+- **`ModelAuthProvider`** - Base class for model endpoint authentication
+- **`OpenAIAuth`** - OpenAI API authentication
+- **`AWSBedrockAuth`** - AWS Bedrock authentication
+- **`AzureOpenAIAuth`** - Azure OpenAI authentication
+- **`GCPVertexAuth`** - GCP Vertex AI authentication
+- **`OCIModelAuthAdapter`** - OCI model authentication adapter
+
+### Storage Authentication Providers
+
+- **`StorageAuthProvider`** - Base class for storage authentication
+- **`AWSS3Auth`** - AWS S3 authentication
+- **`AzureBlobAuth`** - Azure Blob Storage authentication
+- **`GCPStorageAuth`** - GCP Cloud Storage authentication
+- **`GitHubAuth`** - GitHub authentication
+- **`OCIStorageAuthAdapter`** - OCI storage authentication adapter
+
+### OCI Authentication Providers
+
+- **`OCIUserPrincipalAuth`** - OCI user principal authentication
+- **`OCIInstancePrincipalAuth`** - OCI instance principal authentication
+- **`OCISessionAuth`** - OCI session authentication
+- **`OCIOBOTokenAuth`** - OCI on-behalf-of token authentication
+
+## Storage
+
+Components for multi-cloud storage operations.
+
+### Base Classes
+
+- **`BaseStorage`** - Abstract base class for storage providers
+- **`StorageFactory`** - Factory for creating storage providers
+
+### Storage Implementations
+
+- **`AWSS3Storage`** - AWS S3 storage implementation
+- **`AzureBlobStorage`** - Azure Blob Storage implementation
+- **`GCPCloudStorage`** - GCP Cloud Storage implementation
+- **`OCIObjectStorage`** - OCI Object Storage implementation
+- **`GitHubStorage`** - GitHub storage implementation
+
+### OCI Object Storage Components
+
+- **`DataStore`** - Interface for data store operations
+- **`OSDataStore`** - OCI Object Storage data store
+- **`ObjectURI`** - Object URI representation
+
+### Operations
+
+- **File Operations**: `upload_file`, `download_file`, `delete_object`
+- **Folder Operations**: `upload_folder`
+- **Listing**: `list_objects`
+- **Multi-cloud Support**: AWS, Azure, GCP, OCI, GitHub
+
+## CLI
+
+Command-line interface components for user interaction.
+
+### Commands
+
+- **`cli`** - Main CLI entry point
+- **`benchmark`** - Benchmark command
+- **`excel`** - Excel report generation command
+- **`plot`** - Plot generation command
+
+### Option Groups
+
+- **`api_options`** - API-related CLI options
+- **`model_auth_options`** - Model authentication options
+- **`storage_auth_options`** - Storage authentication options
+- **`distributed_locust_options`** - Distributed execution options
+- **`experiment_options`** - Experiment configuration options
+- **`sampling_options`** - Data sampling options
+- **`server_options`** - Server configuration options
+- **`object_storage_options`** - Object storage options
+- **`oci_auth_options`** - OCI-specific authentication options
+
+### Utilities
+
+- **`get_experiment_path()`** - Get experiment file paths
+- **`get_run_params()`** - Extract run parameters
+- **`manage_run_time()`** - Manage run time limits
+- **`validate_tokenizer()`** - Validate tokenizer configuration
+
+### Validation
+
+- **`validate_api_backend()`** - Validate API backend selection
+- **`validate_api_key()`** - Validate API keys
+- **`validate_task()`** - Validate task selection
+- **`validate_dataset_config()`** - Validate dataset configuration
+- **`validate_additional_request_params()`** - Validate request parameters
+
+## Data
+
+Components for loading and managing datasets.
+
+### Configuration
+
+- **`DatasetConfig`** - Configuration for dataset loading
+- **`DatasetSourceConfig`** - Configuration for dataset sources
+
+### Loaders
+
+- **`DatasetLoader`** - Abstract base class for dataset loaders
+- **`TextDatasetLoader`** - Text dataset loader
+- **`ImageDatasetLoader`** - Image dataset loader
+- **`DataLoaderFactory`** - Factory for creating data loaders
+
+### Sources
+
+- **`DatasetSource`** - Abstract base class for dataset sources
+- **`FileDatasetSource`** - Local file dataset source
+- **`HuggingFaceDatasetSource`** - HuggingFace Hub dataset source
+- **`CustomDatasetSource`** - Custom dataset source
+- **`DatasetSourceFactory`** - Factory for creating dataset sources
+
+## Distributed
+
+Components for distributed benchmark execution.
+
+### Core Components
+
+- **`DistributedRunner`** - Manages distributed load test execution
+- **`DistributedConfig`** - Configuration for distributed runs
+- **`MessageHandler`** - Protocol for message handling
+
+### Architecture Features
+
+- Master-worker architecture
+- Message passing between processes
+- Metrics aggregation
+- Process management and cleanup
+
+## Metrics
+
+Components for collecting and analyzing performance metrics.
+
+### Data Structures
+
+- **`RequestLevelMetrics`** - Metrics for individual requests
+- **`AggregatedMetrics`** - Aggregated metrics for entire runs
+- **`MetricStats`** - Statistical metrics (mean, std, percentiles)
+
+### Collectors
+
+- **`AggregatedMetricsCollector`** - Collects and aggregates metrics
+- **`RequestMetricsCollector`** - Collects per-request metrics
+
+### Metric Types
+
+- **Time Metrics**: TTFT (Time to First Token), TPOT (Time Per Output Token), E2E Latency (see the sketch below)
+- **Throughput Metrics**: Input/Output throughput in tokens/second
+- **Token Metrics**: Input/output token counts
+- **Error Metrics**: Error rates and codes
+- **Performance Metrics**: Requests per second, run duration
+
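+To make the time metrics concrete, here is a small standalone sketch that computes them from request timestamps using the formulas in the [metrics definition](../getting-started/metrics-definition.md). It does not use genai-bench's collector classes.
+
+```python
+# Standalone illustration of the request-level time metrics listed above;
+# it mirrors the documented formulas and is not genai-bench's collector code.
+def request_time_metrics(start_time: float, time_at_first_token: float,
+                         end_time: float, num_output_tokens: int) -> dict:
+    ttft = time_at_first_token - start_time        # Time to First Token
+    e2e_latency = end_time - start_time            # End-to-end latency
+    output_latency = e2e_latency - ttft            # time spent generating output
+    tpot = output_latency / max(num_output_tokens - 1, 1)  # Time Per Output Token
+    return {
+        "ttft": ttft,
+        "e2e_latency": e2e_latency,
+        "tpot": tpot,
+        "output_throughput": (
+            (num_output_tokens - 1) / output_latency if output_latency > 0 else 0.0
+        ),
+    }
+
+
+# Example: first token after 0.8 s, full response after 4.0 s, 65 output tokens.
+print(request_time_metrics(0.0, 0.8, 4.0, 65))
+```
+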
+## Sampling
+
+Components for sampling data and creating requests.
+
+### Base Classes
+
+- **`Sampler`** - Abstract base class for samplers
+
+### Sampler Implementations
+
+- **`TextSampler`** - Sampler for text-based tasks
+- **`ImageSampler`** - Sampler for image-based tasks
+
+### Supported Tasks
+
+- **Text Tasks**: text-to-text, text-to-embeddings, text-to-rerank
+- **Image Tasks**: image-text-to-text, image-to-embeddings
+
+### Features
+
+- Automatic task registry
+- Modality-based sampling
+- Dataset integration
+- Request generation
+
+## Scenarios
+
+Components for defining traffic generation scenarios.
+
+### Base Classes
+
+- **`Scenario`** - Abstract base class for scenarios
+
+### Scenario Implementations
+
+- **`DatasetScenario`** - Dataset-based scenario
+- **`NormalDistribution`** - Normal distribution scenario
+- **`DeterministicDistribution`** - Deterministic scenario
+- **`EmbeddingScenario`** - Embedding-specific scenario
+- **`ReRankScenario`** - Re-ranking scenario
+- **`ImageModality`** - Image modality scenario
+
+### Distribution Types
+
+- **`TextDistribution`** - NORMAL, DETERMINISTIC, UNIFORM
+- **`EmbeddingDistribution`** - Embedding-specific distributions
+- **`ReRankDistribution`** - Re-ranking distributions
+- **`MultiModality`** - Multi-modal scenarios
+
+### Features
+
+- String-based scenario parsing
+- Automatic scenario registry
+- Parameter validation
+- Distribution sampling
+
+## UI
+
+Components for user interface and visualization.
+
+### Dashboard Implementations
+
+- **`Dashboard`** - Union type for dashboard implementations
+- **`RichLiveDashboard`** - Rich library-based dashboard
+- **`MinimalDashboard`** - Minimal dashboard for non-UI scenarios
+
+### Layout Functions
+
+- **`create_layout()`** - Creates dashboard layout
+- **`create_metric_panel()`** - Creates metric display panels
+- **`create_progress_bars()`** - Creates progress tracking bars
+
+### Visualization Functions
+
+- **`create_horizontal_colored_bar_chart()`** - Creates histogram charts
+- **`create_scatter_plot()`** - Creates scatter plots
+- **`update_progress()`** - Updates progress displays
+
+### Features
+
+- Real-time metrics visualization (see the sketch below)
+- Progress tracking
+- Interactive charts and histograms
+- Configurable UI components
+
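+As a rough illustration of what these components render, here is a standalone sketch built directly on the [Rich](https://github.com/Textualize/rich) library. It is not genai-bench's actual `RichLiveDashboard`; the panel contents and refresh loop are invented for the example.
+
+```python
+# Standalone Rich sketch of a live metrics panel; not genai-bench's dashboard code.
+import time
+
+from rich.live import Live
+from rich.panel import Panel
+from rich.table import Table
+
+
+def metrics_panel(completed: int, mean_ttft_s: float) -> Panel:
+    grid = Table.grid(padding=(0, 2))
+    grid.add_column()
+    grid.add_column(justify="right")
+    grid.add_row("Completed requests", str(completed))
+    grid.add_row("Mean TTFT (s)", f"{mean_ttft_s:.2f}")
+    return Panel(grid, title="Benchmark progress")
+
+
+with Live(metrics_panel(0, 0.0), refresh_per_second=4) as live:
+    for i in range(1, 6):
+        time.sleep(0.2)  # stand-in for real benchmark work
+        live.update(metrics_panel(i, 0.45))
+```
+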
+## User
+
+Components for interacting with different model APIs.
+
+### Base Classes
+
+- **`BaseUser`** - Abstract base class for user implementations
+
+### User Implementations
+
+- **`OpenAIUser`** - OpenAI API user
+- **`AWSBedrockUser`** - AWS Bedrock user
+- **`AzureOpenAIUser`** - Azure OpenAI user
+- **`GCPVertexUser`** - GCP Vertex AI user
+- **`OCICohereUser`** - OCI Cohere user
+- **`OCIGenAIUser`** - OCI Generative AI user
+- **`CohereUser`** - Cohere API user
+
+### Supported Tasks
+
+Each user implementation supports different combinations of:
+
+- **text-to-text**: Chat and generation tasks
+- **image-text-to-text**: Vision-based chat tasks
+- **text-to-embeddings**: Text embedding generation
+- **image-to-embeddings**: Image embedding generation
+- **text-to-rerank**: Text re-ranking tasks
+
+### Features
+
+- Task-based request handling
+- Metrics collection
+- Error handling
+- Authentication integration
+
+## Example Usage
+
+### Creating an Authentication Provider
+
+```python
+from genai_bench.auth.unified_factory import UnifiedAuthFactory
+
+# Create OpenAI auth
+auth = UnifiedAuthFactory.create_model_auth(
+    "openai",
+    api_key="sk-..."
+)
+
+# Create AWS Bedrock auth
+auth = UnifiedAuthFactory.create_model_auth(
+    "aws-bedrock",
+    access_key_id="AKIA...",
+    secret_access_key="...",
+    region="us-east-1"
+)
+```
+
+### Creating a Storage Provider
+
+```python
+from genai_bench.auth.unified_factory import UnifiedAuthFactory
+from genai_bench.storage.factory import StorageFactory
+
+# Create storage auth
+storage_auth = UnifiedAuthFactory.create_storage_auth(
+    "aws",
+    profile="default",
+    region="us-east-1"
+)
+
+# Create storage instance
+storage = StorageFactory.create_storage(
+    "aws",
+    storage_auth
+)
+
+# Upload a folder
+storage.upload_folder(
+    "/path/to/results",
+    "my-bucket",
+    prefix="benchmarks/2024"
+)
+```
+
+### Loading Datasets
+
+```python
+from genai_bench.data.config import DatasetConfig, DatasetSourceConfig
+from genai_bench.data.loaders.factory import DataLoaderFactory
+
+# Load from HuggingFace Hub
+config = DatasetConfig(
+    source=DatasetSourceConfig(
+        type="huggingface",
+        path="squad",
+        huggingface_kwargs={"split": "train"}
+    ),
+    prompt_column="question"
+)
+data = DataLoaderFactory.load_data_for_task("text-to-text", config)
+
+# Load from local CSV file
+config = DatasetConfig(
+    source=DatasetSourceConfig(
+        type="file",
+        path="/path/to/dataset.csv",
+        file_format="csv"
+    ),
+    prompt_column="text"
+)
+data = DataLoaderFactory.load_data_for_task("text-to-text", config)
+```
+
+### Running Programmatic Benchmarks
+
+```python
+from genai_bench.distributed.runner import DistributedRunner, DistributedConfig
+from genai_bench.ui.dashboard import create_dashboard
+
+# Configure distributed execution
+config = DistributedConfig(
+    num_workers=4,
+    master_host="127.0.0.1",
+    master_port=5557
+)
+
+# Create dashboard
+dashboard = create_dashboard(metrics_time_unit="s")
+
+# Create and setup runner
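+# `environment` is assumed to be the Locust Environment driving this run
+# (created elsewhere; it is not constructed in this snippet).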
+runner = DistributedRunner(environment, config, dashboard)
+runner.setup()
+
+# Update scenario and run benchmark
+runner.update_scenario("N(100,50)")
+runner.update_batch_size(32)
+```
+
+### Analyzing Results
+
+```python
+from genai_bench.analysis.experiment_loader import load_multiple_experiments
+from genai_bench.analysis.flexible_plot_report import FlexiblePlotGenerator
+from genai_bench.analysis.plot_config import PlotConfig, PlotSpec
+
+# Load experiment data
+experiments = load_multiple_experiments(
+    folder_name="/path/to/experiments",
+    filter_criteria={"model": "gpt-4"}
+)
+
+# Create plot configuration
+config = PlotConfig(
+    title="Performance Analysis",
+    plots=[
+        PlotSpec(
+            x_field="concurrency",
+            y_fields=["e2e_latency", "ttft"],
+            plot_type="line",
+            title="Latency vs Concurrency"
+        )
+    ]
+)
+
+# Generate plots
+generator = FlexiblePlotGenerator(config)
+generator.generate_plots(
+    experiments,
+    group_key="traffic_scenario",
+    experiment_folder="/path/to/results"
+)
+```
+
+## Contributing to API Documentation
+
+We welcome contributions to improve our API documentation! If you'd like to help:
+
+1. Add docstrings to undocumented functions
+2. Provide usage examples
+3. Document edge cases and gotchas
+4. Submit a pull request
+
+See our [Development Guide](index.md) for more details.
\ No newline at end of file
diff --git a/docs/development/index.md b/docs/development/index.md
index 0c710bd2..e8a82765 100644
--- a/docs/development/index.md
+++ b/docs/development/index.md
@@ -1,26 +1,68 @@
 # Development
 
-Welcome to the GenAI Bench development guide! This section covers everything you need to contribute to the project.
+Welcome, and thank you for your interest in contributing to genai-bench! This guide covers everything you need to contribute to the project.
 
 ## Getting Started with Development
 
 
 
-- :material-source-pull:{ .lg .middle } **Contributing**
+- :material-cog:{ .lg .middle } **Adding New Features**
 
     ---
 
-    Learn how to contribute to GenAI Bench
+    Learn how to add new providers and tasks
 
-    [:octicons-arrow-right-24: Contributing Guide](contributing.md)
+    [:octicons-arrow-right-24: Adding New Features](adding-new-features.md)
+
+- :material-book:{ .lg .middle } **API Reference**
+
+    ---
+
+    Programmatic usage and integration
+
+    [:octicons-arrow-right-24: API Reference](api-reference.md)
 
 
 
+
+## Coding Style Guide
+
+genai-bench uses Python 3.11, and we adhere to the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html).
+
+We use `make format` to format our code using `isort` and `ruff`. The detailed configuration can be found in
+[pyproject.toml](https://github.com/sgl-project/genai-bench/blob/main/pyproject.toml).
+
+### Guidelines
+
+- Follow PEP 8
+- Use type hints
+- Write docstrings for public APIs (see the example after this list)
+- Keep functions focused and small
+- Add tests for new features
+
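+For instance, a small function written in this style (type hints plus a Google-style docstring) might look like the example below; it is a toy function, not taken from the genai-bench codebase.
+
+```python
+def percentile(values: list[float], q: float) -> float:
+    """Return the q-th percentile of `values` using linear interpolation.
+
+    Args:
+        values: Sampled metric values, e.g. per-request latencies in seconds.
+        q: Percentile in the range [0, 100].
+
+    Returns:
+        The interpolated percentile value.
+    """
+    ordered = sorted(values)
+    idx = (len(ordered) - 1) * q / 100
+    lower, upper = int(idx), min(int(idx) + 1, len(ordered) - 1)
+    return ordered[lower] + (ordered[upper] - ordered[lower]) * (idx - lower)
+```
+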
+## Pull Requests
+
+Please follow the PR template, which will be automatically populated when you open a new [Pull Request on GitHub](https://github.com/sgl-project/genai-bench/compare).
+
+### Code Reviews
+
+All submissions, including submissions by project members, require a code review.
+To make the review process as smooth as possible, please:
+
+1. Keep your changes as concise as possible.
+   If your pull request involves multiple unrelated changes, consider splitting it into separate pull requests.
+2. Respond to all comments within a reasonable time frame.
+   If a comment isn't clear,
+   or you disagree with a suggestion, feel free to ask for clarification or discuss the suggestion.
+3. Provide constructive feedback and meaningful comments. Focus on specific improvements
+   and suggestions that can enhance the code quality or functionality. Remember to
+   acknowledge and respect the work the author has already put into the submission.
+
 ## Development Setup
 
 ### Prerequisites
 
-- Python 3.8+
+- Python 3.11
 - Git
 - Make (optional but recommended)
 
@@ -31,17 +73,44 @@ git clone https://github.com/sgl-project/genai-bench.git
 cd genai-bench
 ```
 
-### Create a Virtual Environment
+### Development Environment Setup
 
-```bash
-python -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
+#### `make`
+
+genai-bench uses `make` for many useful commands.
+
+If your machine doesn't have GNU `make` installed (check by running `make --version` in your terminal),
+you can ask a generative AI chatbot how to install it on your system.
+
+#### `uv`
+
+Install uv with `make uv` or install it from the [official website](https://docs.astral.sh/uv/).
+If installing from the website, create a project venv with `uv venv -p python3.11`.
+
+Once you have `make` and `uv` installed, you can use the commands below to set up your environment and install dependencies:
+
+```shell
+# check out commands genai-bench supports
+make help
+# activate the virtual env managed by uv
+source .venv/bin/activate
+# install dependencies
+make install
 ```
 
-### Install in Development Mode
+You can also build a wheel to install genai-bench on another machine.
 
-```bash
-pip install -e ".[dev]"
+```shell
+# build a .whl under genai-bench/dist
+make build
+# send the wheel to your remote machine if applicable
+rsync --delete -avz ~/genai-bench/dist/<.wheel> @:
+```
+
+On your remote machine, you can simply use `pip` to install genai-bench.
+
+```shell
+pip install /<.wheel>
 ```
 
 ### Run Tests
@@ -66,54 +135,22 @@ make lint
 
 ```
 genai-bench/
-├── genai_bench/          # Main package
-│   ├── auth/            # Authentication providers
-│   ├── cli/             # CLI implementation
-│   ├── metrics/         # Metrics collection
-│   ├── storage/         # Storage providers
-│   └── user/            # User implementations
-├── tests/               # Test suite
-├── docs/                # Documentation
-└── examples/            # Example configurations
+├── genai_bench/        # Main package
+│   ├── analysis/       # Result analysis and reporting
+│   ├── auth/           # Authentication providers
+│   ├── cli/            # CLI implementation
+│   ├── data/           # Dataset loading and management
+│   ├── distributed/    # Distributed execution
+│   ├── metrics/        # Metrics collection
+│   ├── sampling/       # Data sampling
+│   ├── scenarios/      # Traffic generation scenarios
+│   ├── storage/        # Storage providers
+│   ├── ui/             # User interface components
+│   └── user/           # User implementations
+├── tests/              # Test suite
+└── docs/               # Documentation
 ```
 
-## Key Components
-
-### Authentication System
-
-- Unified factory for creating auth providers
-- Support for multiple cloud providers
-- Extensible architecture for new providers
-
-### Storage System
-
-- Abstract base class for storage providers
-- Implementations for AWS S3, Azure Blob, GCP Cloud Storage, etc.
-- Consistent interface across providers
-
-### CLI Architecture
-
-- Click-based command structure
-- Modular option groups
-- Comprehensive validation
-
-## Adding New Features
-
-### Adding a New Model Provider
-
-1. Create auth provider in `genai_bench/auth/`
-2. Create user class in `genai_bench/user/`
-3. Update `UnifiedAuthFactory`
-4. Add validation in `cli/validation.py`
-5. Write tests
-
-### Adding a New Storage Provider
-
-1. Create storage auth in `genai_bench/auth/`
-2. Create storage implementation in `genai_bench/storage/`
-3. Update `StorageFactory`
-4. Write tests
-
 ## Testing
 
 We use pytest for testing:
@@ -147,16 +184,8 @@ make docs-serve
 make docs-build
 ```
 
-## Code Style
-
-- Follow PEP 8
-- Use type hints
-- Write docstrings for public APIs
-- Keep functions focused and small
-- Add tests for new features
-
 ## Questions?
-
+- Check out the [Adding New Features](./adding-new-features.md) and [API Reference](./api-reference.md) pages for more information on the project.
 - Open an issue on GitHub
 - Join our community discussions
 - Check existing issues and PRs
\ No newline at end of file
diff --git a/docs/examples/index.md b/docs/examples/index.md
deleted file mode 100644
index 2036a353..00000000
--- a/docs/examples/index.md
+++ /dev/null
@@ -1,92 +0,0 @@
-# Examples
-
-This section provides practical examples and configurations for GenAI Bench.
-
-## Quick Examples
-
-### OpenAI GPT-4 Benchmark
-
-```bash
-genai-bench benchmark \
-  --api-backend openai \
-  --api-base https://api.openai.com/v1 \
-  --api-key $OPENAI_API_KEY \
-  --api-model-name gpt-4 \
-  --model-tokenizer gpt2 \
-  --task text-to-text \
-  --max-requests-per-run 1000 \
-  --max-time-per-run 10
-```
-
-### AWS Bedrock Claude Benchmark
-
-```bash
-genai-bench benchmark \
-  --api-backend aws-bedrock \
-  --api-base https://bedrock-runtime.us-east-1.amazonaws.com \
-  --aws-profile default \
-  --aws-region us-east-1 \
-  --api-model-name anthropic.claude-3-sonnet-20240229-v1:0 \
-  --model-tokenizer Anthropic/claude-3-sonnet \
-  --task text-to-text \
-  --max-requests-per-run 500 \
-  --max-time-per-run 10
-```
-
-### Multi-Modal Benchmark
-
-```bash
-genai-bench benchmark \
-  --api-backend gcp-vertex \
-  --api-base https://us-central1-aiplatform.googleapis.com \
-  --gcp-project-id my-project \
-  --gcp-location us-central1 \
-  --gcp-credentials-path /path/to/service-account.json \
-  --api-model-name gemini-1.5-pro-vision \
-  --model-tokenizer google/gemini \
-  --task image-text-to-text \
-  --dataset-path /path/to/images \
-  --max-requests-per-run 100 \
-  --max-time-per-run 10
-```
-
-### Embedding Benchmark with Batch Sizes
-
-```bash
-genai-bench benchmark \
-  --api-backend openai \
-  --api-base https://api.openai.com/v1 \
-  --api-key $OPENAI_API_KEY \
-  --api-model-name text-embedding-3-large \
-  --model-tokenizer cl100k_base \
-  --task text-to-embeddings \
-  --batch-size 1 --batch-size 8 --batch-size 32 --batch-size 64 \
-  --max-requests-per-run 2000 \
-  --max-time-per-run 10
-```
-
-## Traffic Scenarios
-
-GenAI Bench supports various traffic patterns:
-
-### Text Generation Scenarios
-
-- `D(100,100)` - Deterministic: 100 input tokens, 100 output tokens
-- `N(480,240)/(300,150)` - Normal distribution
-- `U(50,100)/(200,250)` - Uniform distribution
-
-### Embedding Scenarios
-
-- `E(64)` - 64 tokens per document
-- `E(512)` - 512 tokens per document
-- `E(1024)` - 1024 tokens per document
-
-### Vision Scenarios
-
-- `I(512,512)` - 512x512 pixel images
-- `I(1024,512)` - 1024x512 pixel images
-- `I(2048,2048)` - 2048x2048 pixel images
-
-## Contributing Examples
-
-Have a useful configuration or example? We welcome contributions! Please submit a pull request with your example following our [contribution guidelines](../development/contributing.md).
\ No newline at end of file
diff --git a/docs/getting-started/command-guidelines.md b/docs/getting-started/command-guidelines.md
index fd0228dd..4e76b8ba 100644
--- a/docs/getting-started/command-guidelines.md
+++ b/docs/getting-started/command-guidelines.md
@@ -1,13 +1,8 @@
 # Command Guidelines
 
-Once you install it in your local environment, you can use `--help` to read
-about what command options it supports.
+GenAI Bench provides three main CLI commands for running benchmarks, generating reports, and creating visualizations. This guide covers the essential options for each command.
 
-```shell
-genai-bench --help
-```
-
-`genai-bench` supports three commands:
+## Overview
 
 ```shell
 Commands:
@@ -16,4 +11,138 @@ Commands:
   plot       Plots the experiment(s) results based on filters and group...
 ```
 
-You can also refer to [option_groups.py](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/cli/option_groups.py).
\ No newline at end of file
+## Benchmark
+
+The `benchmark` command runs performance tests against AI models. It's the core command for executing benchmarks.
+
+### Example Usage
+```bash
+# Start a chat benchmark
+genai-bench benchmark --api-backend openai \
+            --api-base "http://localhost:8082" \
+            --api-key "your-openai-api-key" \
+            --api-model-name "meta-llama/Meta-Llama-3-70B-Instruct" \
+            --model-tokenizer "/mnt/data/models/Meta-Llama-3.1-70B-Instruct" \
+            --task text-to-text \
+            --max-time-per-run 15 \
+            --max-requests-per-run 300 \
+            --server-engine "SGLang" \
+            --server-gpu-type "H100" \
+            --server-version "v0.6.0" \
+            --server-gpu-count 4
+```
+
+### Essential Options
+
+#### **API Configuration**
+- `--api-backend` - Choose your model provider (openai, oci-cohere, aws-bedrock, azure-openai, gcp-vertex, vllm, sglang)
+- `--api-base` - API endpoint URL
+- `--api-model-name` - Model name for the request body
+- `--task` - Task type (text-to-text, text-to-embeddings, image-text-to-text, etc.)
+
+#### **Authentication**
+- `--api-key` - API key (for OpenAI)
+- `--model-api-key` - Alternative API key parameter
+- Cloud-specific auth options (AWS, Azure, GCP, OCI)
+
+#### **Experiment Parameters**
+- `--max-requests-per-run` - Maximum number of requests to send in each run
+- `--max-time-per-run` - Maximum duration for each run in minutes
+- `--num-concurrency` - Number of concurrent requests to send (multiple values supported; each value is benchmarked in a separate run)
+- `--batch-size` - Batch sizes for embeddings/rerank tasks
+- `--traffic-scenario` - Define input/output token distributions, more info in [Traffic Scenarios](../user-guide/scenario-definition.md)
+- `--model-tokenizer` - Path to the model tokenizer
+
+#### **Dataset Options**
+- `--dataset-path` - Path to dataset (local file, HuggingFace ID, or 'default')
+- `--dataset-config` - JSON config file for advanced dataset options, more info in [Selecting Datasets](../user-guide/run-benchmark.md/#selecting-datasets)
+- `--dataset-prompt-column` - Column name for prompts
+- `--dataset-image-column` - Column name for images (multimodal)
+
+#### **Server Information**
+- `--server-engine` - Backend engine (vLLM, SGLang, TGI, etc.)
+- `--server-version` - Server version
+- `--server-gpu-type` - GPU type (H100, A100-80G, etc.)
+- `--server-gpu-count` - Number of GPUs
+
+For more information and examples, check out [Run Benchmark](../user-guide/run-benchmark.md).
+
+## Excel
+
+The `excel` command exports experiment results to Excel spreadsheets for detailed analysis.
+
+### Example Usage
+
+```bash
+# Export with mean metrics in seconds
+genai-bench excel \
+  --experiment-folder ./experiments/openai_gpt-3.5-turbo_20241201_120000 \
+  --excel-name benchmark_results \
+  --metric-percentile mean \
+  --metrics-time-unit s
+
+# Export with 95th percentile in milliseconds
+genai-bench excel \
+  --experiment-folder ./experiments/my_experiment \
+  --excel-name detailed_analysis \
+  --metric-percentile p95 \
+  --metrics-time-unit ms
+```
+
+### Essential Options
+
+- `--experiment-folder` - Path to experiment results folder (required)
+- `--excel-name` - Name for the output Excel file (required)
+- `--metric-percentile` - Statistical percentile to report (mean, p25, p50, p75, p90, p95, p99)
+- `--metrics-time-unit [s|ms]` - Time unit to use when showing latency metrics in the spreadsheet. Defaults to seconds
+
+## Plot
+
+The `plot` command generates visualizations from experiment data with flexible configuration options.
+
+### Example Usage
+
+```bash
+# Simple plot with default 2x4 layout
+genai-bench plot \
+  --experiments-folder ./experiments \
+  --group-key traffic_scenario \
+  --filter-criteria "{'model': 'gpt-3.5-turbo'}"
+
+# Use built-in preset for latency analysis
+genai-bench plot \
+  --experiments-folder ./experiments \
+  --group-key server_version \
+  --preset multi_line_latency \
+  --metrics-time-unit ms
+```
+
+### Essential Options
+
+- `--experiments-folder` - Path to the experiments folder, which may contain more than one experiment (required)
+- `--group-key` - Key to group data by (e.g., 'traffic_scenario', 'server_version', 'none') (required)
+- `--filter-criteria` - Dictionary of filter criteria
+- `--plot-config` - Path to a JSON plot configuration file. For more information, see [Advanced Plot Configuration](../user-guide/generate-plot.md/#advanced-plot-configuration)
+- `--preset` - Built-in plot presets (2x4_default, simple_2x2, multi_line_latency, single_scenario_analysis). Overrides `--plot-config` if both are given
+- `--metrics-time-unit [s|ms]` - Time unit for latency display, defaults to seconds
+
+### Advanced Options
+
+- `--list-fields` - List available data fields and exit
+- `--validate-only` - Validate configuration without generating plots
+- `--verbose` - Enable detailed logging
+
+For more information and examples, check out [Generate Plot](../user-guide/generate-plot.md).
+
+## Getting Help
+
+For detailed help on any command:
+
+```bash
+genai-bench --help
+genai-bench benchmark --help
+genai-bench excel --help
+genai-bench plot --help
+```
+
+For further information, refer to the [User Guide](../user-guide/index.md) and the [API Reference](../development/api-reference.md). You can also look at [option_groups.py](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/cli/option_groups.py) directly.
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
index c0d6dfb9..1c18b764 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -68,16 +68,21 @@ GenAI Bench supports multiple benchmark types:
 ### π User Guide
 
 - [Run Benchmark](user-guide/run-benchmark.md) - How to run benchmarks
+- [Traffic Scenarios](user-guide/scenario-definition.md) - Understanding traffic scenario syntax
 - [Multi-Cloud Authentication & Storage](user-guide/multi-cloud-auth-storage.md) - Comprehensive guide for cloud provider authentication
 - [Multi-Cloud Quick Reference](user-guide/multi-cloud-quick-reference.md) - Quick examples for common scenarios
 - [Docker Deployment](user-guide/run-benchmark-using-docker.md) - Docker-based benchmarking
-- [Generate Excel Sheet](user-guide/generate-excel-sheet.md) - Creating Excel reports
-- [Generate Plot](user-guide/generate-plot.md) - Creating visualizations
-- [Upload Benchmark Results](user-guide/upload-benchmark-result.md) - Uploading results
+- [Excel Reports](user-guide/generate-excel-sheet.md) - Creating Excel reports
+- [Visualizations](user-guide/generate-plot.md) - Creating visualizations
+- [Upload Results](user-guide/upload-benchmark-result.md) - Uploading results
 
 ### π§ Development
 
-- [Contributing](development/contributing.md) - How to contribute to GenAI Bench
+- [Development](development/index.md) - How to contribute to GenAI Bench
+
+### π API Reference
+
+- [API Documentation](development/api-reference.md) - Complete API reference and code examples
 
 ## Support
 
diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md
index f099a32f..f6227255 100644
--- a/docs/user-guide/index.md
+++ b/docs/user-guide/index.md
@@ -74,6 +74,4 @@ Support for text, embeddings, and vision tasks:
 
 ## Need Help?
 
-- Check the [Quick Reference](multi-cloud-quick-reference.md) for common commands
-- Review [Command Guidelines](../getting-started/command-guidelines.md) for detailed options
 - See [Troubleshooting](multi-cloud-auth-storage.md#troubleshooting) for common issues
\ No newline at end of file
diff --git a/docs/user-guide/multi-cloud-auth-storage.md b/docs/user-guide/multi-cloud-auth-storage.md
index f4b72922..5b4f4b8f 100644
--- a/docs/user-guide/multi-cloud-auth-storage.md
+++ b/docs/user-guide/multi-cloud-auth-storage.md
@@ -1,6 +1,6 @@
 # Multi-Cloud Authentication and Storage Guide
 
-genai-bench now supports comprehensive multi-cloud authentication for both model endpoints and storage services. This guide covers how to configure and use authentication for various cloud providers.
+Genai-bench now supports comprehensive multi-cloud authentication for both model endpoints and storage services. This guide covers how to configure and use authentication for various cloud providers.
 
 ## Table of Contents
 
diff --git a/docs/user-guide/run-benchmark.md b/docs/user-guide/run-benchmark.md
index 63f09c24..030239a7 100644
--- a/docs/user-guide/run-benchmark.md
+++ b/docs/user-guide/run-benchmark.md
@@ -1,6 +1,6 @@
 # Run Benchmark
 
-> **Note**: GenAI Bench now supports multiple cloud providers for both model endpoints and storage. For detailed multi-cloud configuration, see the [Multi-Cloud Authentication & Storage Guide](multi-cloud-auth-storage.md) or the [Quick Reference](multi-cloud-quick-reference.md).
+> **Note**: GenAI Bench now supports multiple cloud providers for both model endpoints and storage. For detailed multi-cloud configuration, see the [Multi-Cloud Authentication & Storage Guide](multi-cloud-auth-storage.md) or the [Multi-Cloud Quick Reference](multi-cloud-quick-reference.md).
 
 ## Start a chat benchmark
 
@@ -131,16 +131,22 @@ genai-bench benchmark --api-backend oci-cohere \
             --num-workers 4
 ```
 
-## Monitor a benchmark
+## Specify a custom benchmark load
 
 **IMPORTANT**: logs in genai-bench are all useful. Please keep an eye on WARNING logs when you finish one benchmark.
 
-### Specify --traffic-scenario and --num-concurrency
+You can specify a custom benchmark load by setting the traffic scenarios and concurrencies to run.
+
+Traffic scenarios let you define the shape of requests when benchmarking. See [Traffic Scenarios](./scenario-definition.md) for more information.
+
+Concurrency is the number of concurrent users making requests. Running a range of concurrencies lets you benchmark performance under different loads; each specified scenario is run at each concurrency. Specify the concurrencies to run with `--num-concurrency`.
 
 **IMPORTANT**: Please use `genai-bench benchmark --help` to check out the latest default value of `--num-concurrency`
 and `--traffic-scenario`.
 
-Both options are defined as [multi-value options](https://click.palletsprojects.com/en/8.1.x/options/#multi-value-options) in click. Meaning you can pass this command multiple times. If you want to define your own `--num-concurrency` or `--traffic-scenario`, you can use
+Both options are defined as [multi-value options](https://click.palletsprojects.com/en/8.1.x/options/#multi-value-options) in click, meaning you can pass each option multiple times.
+
+For example, the benchmark command below runs two scenarios at concurrencies 1, 2, 4, 8, 16, and 32: one with normally distributed input and output tokens (input mean=480, stddev=240; output mean=300, stddev=150), and a deterministic scenario, D(100,100), with 100 input and 100 output tokens.
 
 ```shell
 genai-bench benchmark \
@@ -153,9 +159,9 @@ genai-bench benchmark \
             --traffic-scenario "N(480,240)/(300,150)" --traffic-scenario "D(100,100)"
 ```
 
-### Notes on specific options
+### Notes on benchmark duration
 
-To manage each run or iteration in an experiment, genai-bench uses two parameters to control the exit logic. You can find more details in the `manage_run_time` function located in [utils.py](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/cli/utils.py). Combination of `--max-time-per-run` and `--max-requests-per-run` should save overall time of one benchmark.
+To manage each run or iteration in an experiment, genai-bench uses two parameters to control the exit logic. A run terminates once it exceeds either the maximum time limit or the maximum number of requests, specified with `--max-time-per-run` and `--max-requests-per-run` respectively. You can find more details in the `manage_run_time` function located in [utils.py](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/cli/utils.py).
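+
+As a quick illustration (the endpoint, model, and limit values below are placeholders, not recommendations), the following run exits as soon as either limit is exceeded:
+
+```shell
+# Illustrative limits only: whichever threshold is crossed first ends the run
+genai-bench benchmark --api-backend openai \
+            --api-base "http://localhost:8080" \
+            --api-key "your-api-key" \
+            --api-model-name "your-model" \
+            --task text-to-text \
+            --max-time-per-run 10 \
+            --max-requests-per-run 300
+```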
 
 For light traffic scenarios, such as D(7800,200) or lighter, we recommend the following settings:
 
@@ -197,9 +203,17 @@ To address this, you can increase the number of worker processes using the `--nu
 
 This distributes the load across multiple processes on a single machine, improving performance and ensuring your benchmark runs smoothly.
 
+### Notes on Usage
+
+1. This feature is experimental, so monitor the system's behavior when enabling multiple workers.
+2. Recommended Limit: Do **not** set the number of workers to more than 16, as excessive worker processes can lead to resource contention and diminished performance.
+3. Ensure your system has sufficient CPU and memory resources to support the desired number of workers.
+4. Adjust the number of workers based on your target load and system capacity to achieve optimal results.
+5. For high-concurrency tests with large payloads, use `--spawn-rate` to prevent worker overload, as shown in the sketch after this list.
+
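+A minimal sketch that combines the notes above (the worker, spawn-rate, and limit values are illustrative, not tuned recommendations):
+
+```shell
+# Hypothetical high-concurrency run: 8 workers share the load and users ramp up over ~8 seconds
+genai-bench benchmark --api-backend openai \
+            --api-base "http://localhost:8080" \
+            --api-key "your-api-key" \
+            --api-model-name "your-model" \
+            --task text-to-text \
+            --num-concurrency 256 \
+            --num-workers 8 \
+            --spawn-rate 32 \
+            --max-time-per-run 15 \
+            --max-requests-per-run 1000
+```
+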
 ### Controlling User Spawn Rate
 
-When running high-concurrency benchmarks with large payloads (e.g., 20k+ tokens), workers may become overwhelmed if all users are spawned immediately. This can cause worker heartbeat failures and restarts.
+By default, users are spawned at a rate equal to the concurrency, meaning it takes one second for all users to be created. When running high-concurrency benchmarks with large payloads (e.g., 20k+ tokens), workers may become overwhelmed if all users are spawned immediately. This can cause worker heartbeat failures and restarts.
 
 To prevent this, use the `--spawn-rate` option to control how quickly users are spawned:
 
@@ -215,17 +229,9 @@ To prevent this, use the `--spawn-rate` option to control how quickly users are
 - `--spawn-rate 100`: Spawn 100 users per second (takes 5 seconds to reach 500 users)
 - `--spawn-rate 500`: Spawn all users immediately (default behavior)
 
-### Notes on Usage
-
-1. This feature is experimental, so monitor the system's behavior when enabling multiple workers.
-2. Recommended Limit: Do **not** set the number of workers to more than 16, as excessive worker processes can lead to resource contention and diminished performance.
-3. Ensure your system has sufficient CPU and memory resources to support the desired number of workers.
-4. Adjust the number of workers based on your target load and system capacity to achieve optimal results.
-5. For high-concurrency tests with large payloads, use `--spawn-rate` to prevent worker overload.
-
-## Using Dataset Configurations
+## Selecting datasets
 
-Genai-bench supports flexible dataset configurations through two approaches:
+By default, genai-bench samples tokens from [sonnet.txt](https://github.com/sgl-project/genai-bench/blob/main/genai_bench/data/sonnet.txt) for `text-to-text` and `text-to-embeddings` tasks; image tasks have no default dataset. To benchmark against a dataset of your choosing, genai-bench supports flexible dataset configurations through two approaches:
 
 ### Simple CLI Usage (for basic datasets)
 
@@ -345,4 +351,4 @@ If you want to benchmark a specific portion of a vision dataset, you can use the
 
 ## Picking units
 
-Genai-bench defaults to measuring latency (End-to-end latency, TTFT, TPOT, Input/Output latencies) in seconds. If you prefer milliseconds, you can select them with `--metrics-time-unit [s|ms]`. 
\ No newline at end of file
+Genai-bench defaults to measuring latency metrics (end-to-end latency, TTFT, TPOT, input/output latencies) in seconds. If you prefer milliseconds, switch the unit with `--metrics-time-unit [s|ms]`.
\ No newline at end of file
diff --git a/docs/user-guide/scenario-definition.md b/docs/user-guide/scenario-definition.md
index 1b525099..a0d70f6b 100644
--- a/docs/user-guide/scenario-definition.md
+++ b/docs/user-guide/scenario-definition.md
@@ -12,6 +12,7 @@ Scenarios are optional. If you donβt provide any and you supply a dataset, gen
 
 - The CLI accepts one or more scenarios via `--traffic-scenario`. Each run iterates over the supplied scenarios and the selected iteration parameter (concurrency or batch size).
 - Internally, each scenario string is parsed into a Scenario class and passed to samplers to control request construction.
+- Scenarios are defined as [multi-value options](https://click.palletsprojects.com/en/8.1.x/options/#multi-value-options) in click, meaning you can pass `--traffic-scenario` multiple times to benchmark different loads, as shown in the example below.
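+
+For example, a command like the following (endpoint and model values are placeholders) benchmarks two different loads in a single experiment, one deterministic and one normally distributed:
+
+```shell
+genai-bench benchmark --api-backend openai \
+            --api-base "http://localhost:8080" \
+            --api-key "your-api-key" \
+            --api-model-name "your-model" \
+            --task text-to-text \
+            --max-time-per-run 10 \
+            --max-requests-per-run 100 \
+            --traffic-scenario "D(100,100)" \
+            --traffic-scenario "N(480,240)/(300,150)"
+```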
 
 ### Scenario types and formats