[Runtime] solve the problem that llama frequently calls autotuner #332

StrongSpoon · 2024-11-27T08:46:51Z

PR Category

Runtime

Type of Change

New Feature

Description

implement LibTuner to take place of Autotuner.
it stores tuned configs in database and preloads them before calling.

Issue

Progress

Change is properly reviewed (1 reviewer required, 2 recommended).
Change is responded to an issue.
Change is fully covered by a UT.

Performance

src/flag_gems/ops/log_softmax.py

tongxin · 2024-11-27T09:33:59Z

src/flag_gems/utils/libentry.py

+        connect = sqlite3.connect(self.cache_dir)
+        c = connect.cursor()
+        c.execute(f"CREATE TABLE IF NOT EXISTS {self.__name__} (key TEXT, config TEXT)")
+        cursor = c.execute(f"SELECT key, config from {self.__name__}")
+
+        for row in cursor:
+            key_str, config_str = row
+            key = (self.getvalue(k) for k in key_str[1:-1].split(", "))
+
+            cfg_ls = [item.split(": ") for item in config_str.split(", ")]
+            config = triton.Config({})
+            for k, v in cfg_ls[:-4]:
+                config.kwargs[k] = eval(v)
+            config.num_warps = int(cfg_ls[-4][1])
+            config.num_ctas = int(cfg_ls[-3][1])
+            config.num_stages = int(cfg_ls[-2][1])
+            config.enable_fp_fusion = eval(cfg_ls[-1][1])
+
+            self.cache[key] = config
+
+        connect.close()


Probably we could leave the connection open and shared because we need to fetch stored params for all gems ops. A try catch block is possibly necessary for external IO operations.

But I think its fine at this stage. Let's improve it later.

there is not always a explicit exit of flag_gems. I'm not sure what time to close the connection if keep it open.

iclementine · 2024-11-27T09:59:33Z

src/flag_gems/utils/code_cache.py

+
+
+def config_cache_dir() -> Path:
+    _config_cache_dir = cache_dir() / "config_cache"


Suggestion for name: "tunning_cache"

what's the difference?

src/flag_gems/utils/libentry.py

* new feature, muti_backend * update auto_tune_module * update auto_tune_module * update auto_tune_module * update __init__ * rebase * fix bug * modifiy auto_tune_config * fix bug * fix bug * update * update * update scatter&gather * fix auto_tune * add gen_torch_device_fn * fix codestyle * fix codestyle * Modify code based on comments * Modify gen_impl with loops instead of recursion * Update code structure * Polish code * update * Polish code * Modify code based on comments * modify based on comment * Modify code based on comments * update * final fix

…agOpen#332) * [operator] turn autotuner to heuristics * [operator] heuristics for gather & index_select * [runtime] libtuner for matmul * [runtime] store config data in one db * [bugfix] parse key as list instead of tuple * [no ci] update var name * [Muti_backend] muti_backend-part_1-framework-and-tune_config (FlagOpen#294) * new feature, muti_backend * update auto_tune_module * update auto_tune_module * update auto_tune_module * update __init__ * rebase * fix bug * modifiy auto_tune_config * fix bug * fix bug * update * update * update scatter&gather * fix auto_tune * add gen_torch_device_fn * fix codestyle * fix codestyle * Modify code based on comments * Modify gen_impl with loops instead of recursion * Update code structure * Polish code * update * Polish code * Modify code based on comments * modify based on comment * Modify code based on comments * update * final fix * [bugfix] update libtuner to be compatible with triton2 * [no ci]reformat * [operator] update log_softmax * [pretune] move pretune to ./examples for models * [format] delete useless print * [format] delete unused import * [format] [no ci] remove useless print --------- Co-authored-by: Galaxy1458 <[email protected]>

StrongSpoon force-pushed the dev_lzx branch from f8b24aa to 94547a9 Compare November 27, 2024 08:48

StrongSpoon changed the title ~~Dev lzx~~ [Runtime] solve the problem that llama frequently calls autotuner Nov 27, 2024

tongxin reviewed Nov 27, 2024

View reviewed changes

iclementine reviewed Nov 27, 2024

View reviewed changes

src/flag_gems/utils/libentry.py Outdated Show resolved Hide resolved

StrongSpoon force-pushed the dev_lzx branch from ed60542 to dc59da9 Compare December 2, 2024 01:36

StrongSpoon force-pushed the dev_lzx branch from 654028c to 55d9b81 Compare December 12, 2024 02:33

kiddyjinjin reviewed Dec 12, 2024

View reviewed changes

src/flag_gems/utils/libentry.py Show resolved Hide resolved

StrongSpoon and others added 11 commits December 16, 2024 09:32

[operator] turn autotuner to heuristics

06bfd2a

[operator] heuristics for gather & index_select

9713eb8

[runtime] libtuner for matmul

0bfaaba

[runtime] store config data in one db

e3846ff

[bugfix] parse key as list instead of tuple

ac07373

[no ci] update var name

916a572

[bugfix] update libtuner to be compatible with triton2

c79c54b

[no ci]reformat

40a5df3

[operator] update log_softmax

e8eb82e

[pretune] move pretune to ./examples for models

85764a4

StrongSpoon force-pushed the dev_lzx branch from 65c157d to 85764a4 Compare December 16, 2024 02:57

StrongSpoon added 3 commits December 16, 2024 11:01

[format] delete useless print

3f59476

[format] delete unused import

e1bb20f

[format] [no ci] remove useless print

11a8479

StrongSpoon marked this pull request as ready for review December 18, 2024 10:02

kiddyjinjin approved these changes Dec 19, 2024

View reviewed changes

StrongSpoon merged commit e9c7aa7 into master Dec 19, 2024

StrongSpoon deleted the dev_lzx branch December 19, 2024 02:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Runtime] solve the problem that llama frequently calls autotuner #332

[Runtime] solve the problem that llama frequently calls autotuner #332

Uh oh!

StrongSpoon commented Nov 27, 2024

Uh oh!

Uh oh!

tongxin Nov 27, 2024

Uh oh!

tongxin Nov 27, 2024

Uh oh!

StrongSpoon Dec 18, 2024

Uh oh!

iclementine Nov 27, 2024

Uh oh!

StrongSpoon Dec 18, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!



		def config_cache_dir() -> Path:
		_config_cache_dir = cache_dir() / "config_cache"

[Runtime] solve the problem that llama frequently calls autotuner #332

[Runtime] solve the problem that llama frequently calls autotuner #332

Uh oh!

Conversation

StrongSpoon commented Nov 27, 2024

PR Category

Type of Change

Description

Issue

Progress

Performance

Uh oh!

Uh oh!

tongxin Nov 27, 2024

Choose a reason for hiding this comment

Uh oh!

tongxin Nov 27, 2024

Choose a reason for hiding this comment

Uh oh!

StrongSpoon Dec 18, 2024

Choose a reason for hiding this comment

Uh oh!

iclementine Nov 27, 2024

Choose a reason for hiding this comment

Uh oh!

StrongSpoon Dec 18, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!