
Commit 0fe6649

add bench

Signed-off-by: yiliu30 <[email protected]>
1 parent 0ea9563

11 files changed: +124 -10 lines

.devcontainer/Dockerfile (+1 -1)

@@ -7,6 +7,6 @@ ENV FLIT_ROOT_INSTALL=1
 
 COPY pyproject.toml .
 RUN touch README.md \
-    && mkdir -p src/python_package \
+    && mkdir -p src/torchutils \
     && python -m flit install --only-deps --deps develop \
     && rm -r pyproject.toml README.md src

README.md (+1 -1)

@@ -84,7 +84,7 @@ You can also use a Dockerfile to automate dev container creation. In your Docker
 #### Setup
 This project includes three files in the .devcontainer and .vscode directories that enable you to use GitHub Codespaces or Docker and VSCode locally to set up an environment that includes all the necessary extensions and tools for Python development.
 
-The Dockerfile specifies the base image and dependencies needed for the development container. The Dockerfile installs the necessary dependencies for the development container, including Python 3 and flit, a tool used to build and publish Python packages. It sets an environment variable to indicate that flit should be installed globally. It then copies the pyproject.toml file into the container and creates an empty README.md file. It creates a directory src/python_package and installs only the development dependencies using flit. Finally, it removes unnecessary files, including the pyproject.toml, README.md, and src directory.
+The Dockerfile specifies the base image and dependencies needed for the development container. The Dockerfile installs the necessary dependencies for the development container, including Python 3 and flit, a tool used to build and publish Python packages. It sets an environment variable to indicate that flit should be installed globally. It then copies the pyproject.toml file into the container and creates an empty README.md file. It creates a directory src/torchutils and installs only the development dependencies using flit. Finally, it removes unnecessary files, including the pyproject.toml, README.md, and src directory.
 
 The devcontainer.json file is a configuration file that defines the development container's settings, including the Docker image to use, any additional VSCode extensions to install, and whether or not to mount the project directory into the container. It uses the python-3-miniconda container as its base, which is provided by Microsoft, and also includes customizations for VSCode, such as recommended extensions for Python development and specific settings for those extensions. In addition to the above, the settings.json file also contains a handy command that can automatically install pre-commit hooks. These hooks can help ensure the quality of the code before it's committed to the repository, improving the overall codebase and making collaboration easier.
 

docs/modules.rst (+1 -1)

@@ -4,4 +4,4 @@ src
 .. toctree::
    :maxdepth: 4
 
-   python_package
+   torchutils

docs/python_package.hello_world.rst (+2 -2)

@@ -7,15 +7,15 @@ Submodules
 python\_package.hello\_world.hello\_world module
 ------------------------------------------------
 
-.. automodule:: python_package.hello_world.hello_world
+.. automodule:: torchutils.hello_world.hello_world
    :members:
    :undoc-members:
    :show-inheritance:
 
 Module contents
 ---------------
 
-.. automodule:: python_package.hello_world
+.. automodule:: torchutils.hello_world
    :members:
    :undoc-members:
    :show-inheritance:

docs/python_package.rst (+3 -3)

@@ -7,23 +7,23 @@ Subpackages
 .. toctree::
    :maxdepth: 4
 
-   python_package.hello_world
+   torchutils.hello_world
 
 Submodules
 ----------
 
 python\_package.setup module
 ----------------------------
 
-.. automodule:: python_package.setup
+.. automodule:: torchutils.setup
    :members:
    :undoc-members:
    :show-inheritance:
 
 Module contents
 ---------------
 
-.. automodule:: python_package
+.. automodule:: torchutils
    :members:
    :undoc-members:
    :show-inheritance:

pyproject.toml (+1 -1)

@@ -52,7 +52,7 @@ Source = "https://github.com/microsoft/python-package-template"
 Tracker = "https://github.com/microsoft/python-package-template/issues"
 
 [tool.flit.module]
-name = "python_package"
+name = "torchutils"
 
 [tool.bandit]
 exclude_dirs = ["build","dist","tests","scripts"]

src/python_package/__init__.py → src/torchutils/__init__.py (renamed, +3)

@@ -6,3 +6,6 @@
 from __future__ import annotations
 
 __version__ = "0.0.2"
+
+
+from torchutils.bench import bench_module, bench_more, inspect_tensor, see_memory_usage
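
With these re-exports, the helpers are importable from the package root. A minimal sketch of the intended usage (assumes a CUDA device and the package installed; the shapes are illustrative):

    import torch
    from torchutils import bench_module

    x = torch.randn(4096, 4096, device="cuda")
    ms = bench_module(lambda: x @ x)  # mean latency in milliseconds
    print(f"matmul: {ms:.3f} ms")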

src/torchutils/bench.py (new file, +101)

import gc
import os
import random
import time

import numpy as np
import torch
from triton.testing import do_bench

DEBUG = os.environ.get("DEBUG", "0") == "1"

seed = 0


def freeze_seed(seed):
    """Seed Python, NumPy, and torch (CPU and CUDA) RNGs for reproducible runs."""
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    np.random.seed(seed)


freeze_seed(seed)


def bench_module(func, warmup=25, rep=200):
    """Wall-clock benchmark of `func`; returns the mean latency in milliseconds."""
    torch.cuda.synchronize()
    for _ in range(warmup):
        func()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(rep):
        func()
    torch.cuda.synchronize()
    end = time.perf_counter()
    return (end - start) / rep * 1000


@torch.no_grad()
def bench_more(func, warmup=25, rep=200, kernel=True, profile=True, msg="", export_trace=False):
    """Benchmark `func` end to end, optionally with Triton's do_bench and the torch profiler."""
    module_bench_time = bench_module(func, warmup, rep)
    kernel_bench_time = do_bench(func, warmup, rep) if kernel else None
    if profile:
        print(f"----{msg}----")
        # Alias the import so it does not shadow the `profile` parameter.
        from torch.profiler import ProfilerActivity, profile as torch_profile

        activities = [ProfilerActivity.CPU, ProfilerActivity.CUDA]
        with torch_profile(activities=activities, with_stack=True) as prof:
            for _ in range(rep):
                func()
        if export_trace or os.environ.get("EXPORT_TRACE", "0") == "1":
            prof.export_chrome_trace(f"{msg}.json")
            print(f"Exported trace to {msg}.json")
        print("----" * 10, "CPU time", "----" * 10)
        print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=20))
        print("----" * 10, "CUDA time", "----" * 10)
        print(prof.key_averages().table(sort_by="self_cuda_time_total", row_limit=20))
    return module_bench_time, kernel_bench_time


def inspect_tensor(x, msg="", force=False):
    """Print a tensor's metadata; also print its values when DEBUG or `force` is set."""
    print(
        f"{msg}: shape={x.shape}, dtype={x.dtype}, device={x.device}, layout={x.layout}, "
        f"strides={x.stride()}, is_contiguous={x.is_contiguous()}"
    )
    if DEBUG or force:
        print(x)


def see_memory_usage(message, force=True):
    """Report current and peak CUDA memory usage. Modified from DeepSpeed."""
    if not force:
        return
    # if dist.is_initialized() and not dist.get_rank() == 0:
    #     return

    # Python doesn't do real-time garbage collection, so collect explicitly
    # to get accurate memory reports.
    gc.collect()

    print(message)
    print(
        f"AllocatedMem {round(torch.cuda.memory_allocated() / (1024 ** 3), 2)} GB "
        f"MaxAllocatedMem {round(torch.cuda.max_memory_allocated() / (1024 ** 3), 2)} GB "
        f"ReservedMem {round(torch.cuda.memory_reserved() / (1024 ** 3), 2)} GB "
        f"MaxReservedMem {round(torch.cuda.max_memory_reserved() / (1024 ** 3), 2)} GB"
    )

    # Peak counters accumulate, so reset them for the next call.
    torch.cuda.reset_peak_memory_stats()
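
A hedged usage sketch for the new module (assumes a CUDA GPU and triton installed; the shapes and the `msg` label are illustrative, not part of the commit):

    import torch
    from torchutils.bench import bench_more, inspect_tensor, see_memory_usage

    a = torch.randn(2048, 2048, device="cuda", dtype=torch.float16)
    b = torch.randn(2048, 2048, device="cuda", dtype=torch.float16)

    inspect_tensor(a, msg="lhs")
    # module time is wall clock around the call; kernel time comes from do_bench
    module_ms, kernel_ms = bench_more(lambda: a @ b, msg="fp16_matmul", profile=False)
    print(f"module: {module_ms:.3f} ms, kernel: {kernel_ms:.3f} ms")
    see_memory_usage("after matmul")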

src/torchutils/config.py (new file, +10)

import dataclasses
import os


@dataclasses.dataclass
class Config:
    debug: bool = False


config = Config(debug=os.environ.get("DEBUG", "0") == "1")
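
The module-level `config` instance gives one switch for debug-only behavior across the package. A sketch (assumes `DEBUG=1` is exported before the process starts, since the flag is read at import time):

    from torchutils.config import config

    if config.debug:
        print("debug mode: extra checks enabled")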
File renamed without changes.

tests/test_methods.py (+1 -1)

@@ -5,7 +5,7 @@
 """This is a sample python file for testing functions from the source code."""
 from __future__ import annotations
 
-from python_package.hello_world import hello_world
+from torchutils.hello_world import hello_world
 
 
 def hello_test():
