Commit 2577f0b

quic-jhugo, quic-aashwins, quic-amitraj, anujgupt-github, and quic-mamta committed

Initial commit

Initial commit of the Efficient Transformers library.

Co-authored-by: Aashwin Sreenivasan <[email protected]>
Signed-off-by: Aashwin Sreenivasan <[email protected]>
Co-authored-by: Amit Raj <[email protected]>
Signed-off-by: Amit Raj <[email protected]>
Co-authored-by: Anuj Gupta <[email protected]>
Signed-off-by: Anuj Gupta <[email protected]>
Co-authored-by: Mamta Singh <[email protected]>
Signed-off-by: Mamta Singh <[email protected]>
Co-authored-by: Onkar Chougule <[email protected]>
Signed-off-by: Onkar Chougule <[email protected]>
Co-authored-by: Vinayak Baddi <[email protected]>
Signed-off-by: Vinayak Baddi <[email protected]>
Signed-off-by: Jeffrey Hugo <[email protected]>

0 parents  commit 2577f0b


70 files changed: 7,634 additions, 0 deletions

.pre-commit-config.yaml

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
repos:
2+
- repo: https://github.com/astral-sh/ruff-pre-commit
3+
# Ruff version.
4+
rev: v0.3.4
5+
hooks:
6+
# Run the linter.
7+
- id: ruff
8+
types_or: [ python, pyi, jupyter ]
9+
args: [ --fix ]
10+
# Run the formatter.
11+
- id: ruff-format
12+
types_or: [ python, pyi, jupyter ]

CODE-OF-CONDUCT.md

Lines changed: 73 additions & 0 deletions
# Contributor Covenant Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

CONTRIBUTING.md

Lines changed: 27 additions & 0 deletions
## Contributing to PROJECT

Hi there! We’re thrilled that you’d like to contribute to this project. Your help is essential for keeping this project great and for making it better.

## Branching Strategy

In general, contributors should develop on branches based off of `master`, and pull requests should be made against `master`.

## Submitting a pull request

1. Please read our [code of conduct](CODE-OF-CONDUCT.md) and [license](LICENSE).
1. Fork and clone the repository.
1. Create a new branch based on `master`: `git checkout -b <my-branch-name> master`.
1. Make your changes, add tests, and make sure the tests still pass.
1. Commit your changes using the [DCO](http://developercertificate.org/). You can attest to the DCO by committing with the **-s** or **--signoff** option, or by manually adding the "Signed-off-by" trailer.
1. Push to your fork and submit a pull request from your branch to `master`.
1. Pat yourself on the back and wait for your pull request to be reviewed.

Here are a few things you can do that will increase the likelihood of your pull request being accepted:

- Follow the existing style where possible.
- Write tests.
- Keep your change as focused as possible. If you want to make multiple independent changes, please consider submitting them as separate pull requests.
- Write a [good commit message](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
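The DCO sign-off step above can be tried out locally. The sketch below (assuming `git` is installed; the repository, name, and email are throwaway examples, not part of this commit) shows that committing with `-s` appends the `Signed-off-by` trailer automatically:

```python
import subprocess
import tempfile

# Create a throwaway repository so the demo does not touch any real work.
repo = tempfile.mkdtemp()

def git(*args):
    # Run a git command inside the throwaway repository and return its stdout.
    return subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    ).stdout

git("init", "-q")
git("config", "user.name", "Jane Developer")
git("config", "user.email", "jane@example.com")

# An empty commit is enough to see the trailer; -s appends Signed-off-by
# derived from user.name and user.email.
git("commit", "-q", "--allow-empty", "-s", "-m", "DCO demo")

message = git("log", "-1", "--format=%B")
print(message)
```

The printed commit message ends with a `Signed-off-by: Jane Developer <jane@example.com>` line, which is exactly what the DCO check looks for.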

Dockerfile

Lines changed: 58 additions & 0 deletions
```dockerfile
# Use Ubuntu 20.04 as the base image
# Create a temp image that has build tools that we can use to build wheel
# files for dependencies only available as source.
FROM docker-registry.qualcomm.com/library/ubuntu:20.04

# Update the package lists and install required packages
RUN apt-get update && apt-get install -y \
    git \
    tmux \
    python3.8 \
    python3.8-venv \
    python3-pip

# pip recognizes this variable
ENV PIP_CACHE_DIR=/var/cache/pip
WORKDIR /app

# Sample command to register and clone the repository
# Clone the GitHub repository
RUN git config --global user.email [email protected] && \
    git config --global user.name none

RUN mkdir -p /app/qefficient-library
COPY . /app/qefficient-library

# Create a virtual env for the docker image.
# Note: sourcing the activate script in its own RUN layer does not persist
# across layers; putting the venv's bin directory on PATH has the intended
# effect for all later layers.
RUN python3.8 -m venv /app/llm_env
ENV PATH="/app/llm_env/bin:$PATH"
WORKDIR /app/qefficient-library

# Install the required Python packages
RUN pip install torch==2.0.0+cpu --extra-index-url https://download.pytorch.org/whl/cpu --no-deps
RUN pip install datasets==2.17.0 fsspec==2023.10.0 multidict==6.0.5 sentencepiece --no-deps

RUN python3.8 -m pip install .
WORKDIR /app/qefficient-library

# Set the environment variables for the model card name and token ID.
# Note: "ENV KEY = value" (with spaces) does not assign what it looks like;
# the KEY=value form is required.
ENV HF_HOME="/app/qefficient-library/docs"
ENV MODEL_NAME=""
ENV CACHE_DIR=""
ENV TOKEN_ID=""

# Only the last CMD in a Dockerfile takes effect, so the status messages are
# folded into the single command below.
CMD echo "qefficient-transformers repository cloned and setup installed inside Docker image." && \
    echo "Starting the Model Download and Export to Onnx Stage for QEff." && \
    python3.8 -m QEfficient.cloud.export --model-name "$MODEL_NAME"

# Example usage:
# docker build -t qefficient-library .

# Minimum system requirements before running docker containers:
# 1. Clear the tmp space.
# 2. For smaller models, 32 GiB RAM is sufficient, but larger LLMs require more
#    CPU/RAM (for context, a 7B model needs at least 64 GiB).
# 3. The exact minimum system configuration is hard to pin down, since it is a
#    function of the model parameters.

# docker run -e MODEL_NAME=gpt2 -e TOKEN_ID=<your-token-id> qefficient-library
```

LICENSE

Lines changed: 33 additions & 0 deletions
Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted (subject to the limitations in the disclaimer below) provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

* Neither the name of Qualcomm Innovation Center, Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

SPDX-License-Identifier: BSD-3-Clause

MANIFEST.in

Lines changed: 2 additions & 0 deletions
```
include README.md
include LICENSE
```

Makefile

Lines changed: 3 additions & 0 deletions
```makefile
format:
	isort --profile black -l 140 QEfficient/
	black -l 140 QEfficient/
```

QEfficient/__init__.py

Lines changed: 20 additions & 0 deletions
```python
# -----------------------------------------------------------------------------
#
# Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------

import torch.nn as nn

from QEfficient.transformers.modeling_utils import transform as transform_hf


def transform(model: nn.Module, type="Transformers", form_factor="cloud"):
    """Low-level API of the library.

    model: instance of nn.Module
    type: "Transformers" | "Diffusers", default: "Transformers"
    form_factor: target form factor, default: "cloud"
    """
    if type == "Transformers":
        return transform_hf(model, form_factor)
    else:
        raise NotImplementedError
```
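The `transform` entry point is a thin dispatcher over the string `type` argument. A minimal standalone sketch of that pattern (using a stand-in `transform_hf` rather than the real QEfficient import, so it runs anywhere) behaves like this:

```python
def transform_hf(model, form_factor):
    # Stand-in for QEfficient.transformers.modeling_utils.transform;
    # the real function rewrites the model's attention modules.
    return f"transformed({model}, form_factor={form_factor})"


def transform(model, type="Transformers", form_factor="cloud"):
    # Dispatch on the library family; only Transformers is implemented so far,
    # every other value raises NotImplementedError.
    if type == "Transformers":
        return transform_hf(model, form_factor)
    else:
        raise NotImplementedError


print(transform("my_model"))  # transformed(my_model, form_factor=cloud)
```

The string-dispatch design keeps the public surface to a single function while leaving room for a future "Diffusers" branch.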

QEfficient/cloud/__init__.py

Lines changed: 7 additions & 0 deletions
```python
# -----------------------------------------------------------------------------
#
# Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------
```

QEfficient/cloud/compile.py

Lines changed: 134 additions & 0 deletions
```python
# -----------------------------------------------------------------------------
#
# Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------

import argparse
import json
import os
from typing import List

from QEfficient.exporter.export_utils import compile_kv_model_on_cloud_ai_100
from QEfficient.utils.logging_utils import logger


def create_and_dump_specializations(batch_size: int, prompt_len: int, ctx_len: int, path: str):
    # Create the two specializations: prefill (seq_len == prompt_len) and
    # decode (seq_len == 1).
    specializations = {
        "specializations": [
            {
                "batch_size": str(batch_size),
                "seq_len": str(prompt_len),
                "ctx_len": str(ctx_len),
            },
            {"batch_size": str(batch_size), "seq_len": "1", "ctx_len": str(ctx_len)},
        ]
    }
    # Dump to JSON
    with open(path, "w") as file:
        json.dump(specializations, file, indent=4)


def main(
    onnx_path: str,
    qpc_path: str,
    num_cores: int,
    device_group: List[int],
    aic_enable_depth_first: bool = False,
    mos: int = -1,
    batch_size: int = 1,
    prompt_len: int = 32,
    ctx_len: int = 128,
    mxfp6: bool = True,
) -> str:
    """
    API to compile the ONNX model on the Cloud AI 100 platform with the given config.

    :param onnx_path: str. Path of the generated ONNX model.
    :param qpc_path: str. Path under which the compiled qpc binaries are stored.
    :param num_cores: int. Number of cores to compile the model on. Available options: [1 to 16].
    :param device_group: List[int]. Cloud AI 100 device ids.
    :param batch_size: int. Batch size to compile the model for.
    :param prompt_len: int. Prompt length to compile the model for.
    :param ctx_len: int. Maximum context length to compile the model for.
    :param mxfp6: bool. Enable compilation for MXFP6 precision.
    """
    # Dynamically create the specializations JSON
    os.makedirs(qpc_path, exist_ok=True)
    specialization_json_path = os.path.join(qpc_path, "specializations.json")
    create_and_dump_specializations(
        batch_size=batch_size, prompt_len=prompt_len, ctx_len=ctx_len, path=specialization_json_path
    )
    custom_io_file_path = os.path.join(os.path.dirname(onnx_path), "custom_io.yaml")

    if not os.path.isfile(custom_io_file_path):
        raise FileNotFoundError(f"file {custom_io_file_path} needs to exist in the same directory as onnx model files.")

    _, qpc_path = compile_kv_model_on_cloud_ai_100(
        onnx_path=onnx_path,
        specializations_json=specialization_json_path,
        num_cores=num_cores,
        custom_io_path=custom_io_file_path,
        base_path=qpc_path,
        mxfp6=mxfp6,
        aic_enable_depth_first=aic_enable_depth_first,
        mos=mos,
        device_group=device_group,
    )

    logger.info(f"Compiled QPC files can be found here: {qpc_path}")
    return qpc_path


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Compilation script.")
    parser.add_argument("--onnx_path", "--onnx-path", required=True, help="ONNX model path")
    parser.add_argument(
        "--qpc_path",
        "--qpc-path",
        required=True,
        help="Compiled qpc binaries will be stored under this folder",
    )
    parser.add_argument("--batch_size", "--batch-size", type=int, default=1, help="Batch size for text generation")
    parser.add_argument(
        "--prompt_len",
        "--prompt-len",
        default=32,
        type=int,
        help="Sequence length for text generation.",
    )
    parser.add_argument("--ctx_len", "--ctx-len", default=128, type=int, help="Context length for text generation.")
    parser.add_argument(
        "--mxfp6",
        action="store_true",
        help="Compress constant MatMul weights to MXFP6 E2M3; default is no compression",
    )
    parser.add_argument(
        "--num_cores",
        "--num-cores",
        required=True,
        type=int,
        help="Number of cores to compile the model on",
    )
    parser.add_argument(
        "--device_group",
        "--device-group",
        required=True,
        type=lambda device_ids: [int(x) for x in device_ids.strip("[]").split(",")],
        help="Cloud AI 100 device ids (comma-separated), e.g. [0]",
    )
    parser.add_argument(
        "--aic_enable_depth_first",
        "--aic-enable-depth-first",
        action="store_true",
        help="If passed, this option will be enabled during compilation; disabled by default",
    )
    parser.add_argument(
        "--mos",
        type=int,
        default=-1,
        help="Effort level to reduce the on-chip memory",
    )
    args = parser.parse_args()
    main(**vars(args))
```
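Two small pieces of this script are easy to sanity-check in isolation: the `--device_group` lambda, which strips brackets and splits on commas, and the specializations dictionary, which always contains a prefill entry (`seq_len` equal to the prompt length) and a decode entry (`seq_len` of 1). The standalone sketch below reproduces both without touching the filesystem (the helper names `parse_device_group` and `build_specializations` are illustrative, not part of the library):

```python
import json


def parse_device_group(device_ids: str) -> list:
    # Same logic as the --device_group argparse lambda above:
    # "[0,1]" -> strip the brackets, split on commas, convert to int.
    return [int(x) for x in device_ids.strip("[]").split(",")]


def build_specializations(batch_size: int, prompt_len: int, ctx_len: int) -> dict:
    # Same shape as create_and_dump_specializations produces, minus the
    # file write: one prefill specialization and one decode specialization.
    return {
        "specializations": [
            {"batch_size": str(batch_size), "seq_len": str(prompt_len), "ctx_len": str(ctx_len)},
            {"batch_size": str(batch_size), "seq_len": "1", "ctx_len": str(ctx_len)},
        ]
    }


print(parse_device_group("[0,1]"))  # [0, 1]
print(json.dumps(build_specializations(1, 32, 128), indent=4))
```

The two-entry JSON reflects how KV-cache decoding works on the device: the first specialization compiles the graph for the full prompt, and the second compiles it for one-token-at-a-time generation within the same context window.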
