[Runtime] Constant cache manager and runtime pipeline #342

Open · wants to merge 94 commits into base: main

Commits (94):
13faa33
add cpuruntime dialect
May 14, 2024
161848e
format
May 14, 2024
447ef12
add dependency
May 14, 2024
a73dcc1
fix new MLIR
May 14, 2024
1cfede8
add
May 15, 2024
57ba92e
Merge remote-tracking branch 'origin/main' into yijie/cpuruntime
May 15, 2024
3d3308c
move codes from dnn-compiler
niuxiaog May 15, 2024
4d25de6
Merge branch 'yijie/cpuruntime' into yijie/pipeline
May 15, 2024
475faf8
update
May 15, 2024
4f112c0
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog May 15, 2024
0ac087d
fix
May 15, 2024
74b0d34
remove at exit
May 16, 2024
2cebba9
fix lint
May 16, 2024
d1b35a1
Merge branch 'yijie/cpuruntime' into yijie/pipeline
May 16, 2024
34d10ea
Add kmp_* wrapper for gomp environment
May 16, 2024
55c1043
Merge remote-tracking branch 'origin' into yijie/pipeline
May 16, 2024
e1490bb
Merge branch 'yijie/fake_omp' into yijie/pipeline
May 16, 2024
80a597f
fix
May 16, 2024
0b4332b
fix
May 16, 2024
c43f481
Merge branch 'main' into yijie/fake_omp
May 23, 2024
b1c79a2
add wrapper
May 23, 2024
382171b
fix lint
May 23, 2024
ef75da8
Merge branch 'yijie/fake_omp' of https://github.com/intel/graph-compi…
May 23, 2024
f1fd0ae
fix
May 23, 2024
a773ea6
f
May 23, 2024
84933c2
fix
May 23, 2024
4cca4df
add reference
May 23, 2024
678cef9
enable const cache
May 24, 2024
c12156c
reduce size
May 24, 2024
d50a3e8
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog May 27, 2024
6219935
Add single operand check
niuxiaog May 27, 2024
5eb0ac0
Add cache manager
niuxiaog May 27, 2024
c3e186d
Use llvm global [need to cowork with yijie/mainfunc_wrapper]
niuxiaog May 28, 2024
e24b1df
rename
May 28, 2024
34064f3
Merge branch 'main' of https://github.com/intel/graph-compiler into y…
May 28, 2024
70c5e97
Merge branch 'main' of https://github.com/intel/graph-compiler into y…
May 28, 2024
1e06c98
fix license.py
May 28, 2024
3f656b7
Merge branch 'yijie/fake_omp' into yijie/pipeline
May 28, 2024
24cee01
Merge branch 'yijie/pipeline' into yijie/mainfunc_wrapper
May 28, 2024
7c32bc5
fix
May 28, 2024
4540fb6
fix lint
May 28, 2024
381677a
fix comments
May 28, 2024
8c50b67
Rename; Add llvm dependence
niuxiaog May 28, 2024
25f611e
Change dtype
niuxiaog May 28, 2024
4363915
Fix visibility and type
niuxiaog May 29, 2024
fdfc53e
Merge branch 'main' of https://github.com/intel/graph-compiler into y…
May 29, 2024
60042e1
Merge branch 'yijie/pipeline' into yijie/mainfunc_wrapper
May 29, 2024
b54b310
fix
May 29, 2024
9d04cd2
format
May 29, 2024
206c3f3
cleanup
May 30, 2024
824946b
Revert "cleanup"
May 30, 2024
bc9a7ad
refine options
May 30, 2024
3bd954c
Merge branch 'yijie/mainfunc_wrapper' into yijie/const_cache_jit
May 30, 2024
94f2813
Support complex topo
niuxiaog May 30, 2024
0f67f75
Rename
niuxiaog Jun 3, 2024
d7663a5
Split into short functions
niuxiaog Jun 4, 2024
3f34e97
Add a test
niuxiaog Jun 5, 2024
22c3d76
Adapt to constant PropertyType
niuxiaog Jun 11, 2024
5c92931
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Jul 24, 2024
9218762
Revert "Adapt to constant PropertyType"
niuxiaog Jul 24, 2024
4e447dd
Fix link
niuxiaog Jul 24, 2024
d4d81a6
Fold arith.constant
niuxiaog Jul 25, 2024
afec52a
Add compile_time_fold and runtime_fold.
niuxiaog Jul 25, 2024
9c4fd70
Fix license and tidy
niuxiaog Jul 26, 2024
fad5f92
Fix link
niuxiaog Jul 26, 2024
57f887d
Only enable runtime folding
niuxiaog Jul 29, 2024
1fc3b9f
Rename and polish
niuxiaog Jul 29, 2024
aaa4ed4
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Jul 31, 2024
bfc12c7
Add accuracy tests on mlp
niuxiaog Aug 7, 2024
346965f
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Aug 7, 2024
75fcaed
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Aug 19, 2024
f9c2425
Support MemRef args
niuxiaog Aug 20, 2024
d8d2d79
Add to pipeline
niuxiaog Aug 20, 2024
fc739e5
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Aug 26, 2024
22c4474
Forbid buffer_to_tensor case
niuxiaog Aug 26, 2024
968677d
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Sep 2, 2024
1473a88
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Sep 5, 2024
e20d059
Add shape info to global
niuxiaog Sep 6, 2024
99811f2
Merge branch 'xgniu/constant_weights_folding' into xgniu/folding_manager
niuxiaog Sep 11, 2024
3a47e28
Merge branch 'yijie/const_cache_jit' into xgniu/folding_manager
niuxiaog Sep 11, 2024
36fc758
Make things work
niuxiaog Sep 13, 2024
362ad2b
Merge branch 'xgniu/constant_weights_folding' into xgniu/folding_manager
niuxiaog Sep 13, 2024
8d08752
Unify attr name
niuxiaog Sep 13, 2024
ad24768
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Sep 14, 2024
edbb708
Clean tests.
niuxiaog Sep 14, 2024
fa30e4a
Updates
niuxiaog Sep 14, 2024
b8b0dd2
Move manager
niuxiaog Sep 14, 2024
361bed6
Merge branch 'xgniu/constant_weights_folding' into xgniu/folding_manager
niuxiaog Sep 14, 2024
6a041dd
Use atomic
niuxiaog Sep 14, 2024
c876358
Fix
niuxiaog Sep 14, 2024
a255c7b
Merge branch 'main' into xgniu/constant_weights_folding
niuxiaog Sep 18, 2024
77e0f02
Merge into one pass
niuxiaog Sep 18, 2024
2df16c2
Skip case
niuxiaog Sep 18, 2024
d8aedad
Merge branch 'xgniu/constant_weights_folding' into xgniu/folding_manager
niuxiaog Sep 19, 2024
Changes from 3 commits:
38 changes: 33 additions & 5 deletions lib/gc/Transforms/ConstantTensorFolding.cpp
@@ -592,7 +592,7 @@ func::FuncOp buildFoldFunc(MLIRContext *context, OpBuilder &builder,
globalIndexes.insert(globalIndexes.begin(), globalIndexes.size());
auto moduleOp = dyn_cast<ModuleOp>(topOp);
addGlobalI64Array(moduleOp, moduleOp.getLoc(), builder,
"__" + name + "_buffer_ids_", globalIndexes);
"__" + name + "_buffer_ids", globalIndexes);

auto returnOp =
builder.create<func::ReturnOp>(topOp->getLoc(), outputValuesInFold);
@@ -605,6 +605,24 @@ func::FuncOp buildFoldFunc(MLIRContext *context, OpBuilder &builder,
});
}

+  // the ranks of folded results.
+  SmallVector<int32_t> foldRanks;
+  // the shapes of folded results.
+  SmallVector<int64_t> foldShapes;
+  for (Value &tensor : outputValuesInFold) {
+    auto t = dyn_cast<TensorType>(tensor.getType());
+    Type eleType = t.getElementType();
+    int64_t bitWidth = eleType.getIntOrFloatBitWidth() / 8; // bytes
+    ArrayRef<int64_t> shape = t.getShape();
+    foldRanks.push_back(shape.size());
+    foldShapes.insert(foldShapes.end(), shape.begin(), shape.end());
+    foldShapes.push_back(bitWidth);
+  }
+  addGlobalI32Array(moduleOp, moduleOp.getLoc(), builder, "__folded_ranks",
+                    foldRanks);
+  addGlobalI64Array(moduleOp, moduleOp.getLoc(), builder, "__folded_shapes",
+                    foldShapes);

foldFunc.setVisibility(SymbolTable::Visibility::Public);
foldFunc->setAttr(LLVM::LLVMDialect::getEmitCWrapperAttrName(),
UnitAttr::get(context));
@@ -621,11 +639,13 @@ void modifyComputeFunc(MLIRContext *context, OpBuilder &builder,
std::unordered_set<int> &constArgsIndexes,
SmallVector<Type> &outputTypes,
SmallVector<Value> &outputValues) {
-  // the indexes of args to the folding func.
+  // the indexes of args to the folding func, including to-fold tensors and
+  // folded results.
   SmallVector<int32_t> foldArgs;
-  // the indexes of folded args.
+  // the indexes of folded results.
   SmallVector<int32_t> foldIds;
-  // the indexes of args to the computing func.
+  // the indexes of args to the computing func, including non-fold tensors and
+  // folded results.
   SmallVector<int32_t> computeArgs;

// modify the BlockArguments of block
@@ -705,7 +725,7 @@ void modifyComputeFunc(MLIRContext *context, OpBuilder &builder,
addGlobalI32Array(moduleOp, moduleOp.getLoc(), builder, "__compute_args",
computeArgs);

-  addGlobalI32(moduleOp, moduleOp.getLoc(), builder, "__num_orig_num_args",
+  addGlobalI32(moduleOp, moduleOp.getLoc(), builder, "__num_orig_args",
oriNumArgs);
}

@@ -730,6 +750,14 @@ void canonicalizeAndClean(MLIRContext *context, Operation *topOp) {
op->removeAttr("onednn_graph.in_const_subgraph");
}
});
+  topOp->walk([&](func::FuncOp op) {
+    if (op.getOperation()->getAttr("compiletime_const_args_index")) {
+      op.getOperation()->removeAttr("compiletime_const_args_index");
+    }
+    if (op.getOperation()->getAttr("runtime_const_args_index")) {
+      op.getOperation()->removeAttr("runtime_const_args_index");
+    }
+  });
}

// Operate on tensors. Create fold() and compute() on module. The
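As context for the new metadata: the pass stores one rank per folded result in __folded_ranks and, for each result, its dims followed by the element size in bytes in __folded_shapes; the number of folded results is not stored in either array and would come from the count-prefixed __fold_buffer_ids global. Below is a minimal sketch of a decoder, assuming the runtime has already resolved the global symbols; the FoldedBuffer struct and decodeFoldedShapes are illustrative, not part of this PR.

#include <cstddef>
#include <cstdint>
#include <vector>

struct FoldedBuffer {
  std::vector<int64_t> shape; // dims of one folded result
  int64_t sizeInBytes;        // numel * element size in bytes
};

// foldedRanks/foldedShapes point at the JIT-resolved globals; numFolded is
// taken from element 0 of the count-prefixed __fold_buffer_ids array.
std::vector<FoldedBuffer> decodeFoldedShapes(const int32_t *foldedRanks,
                                             int64_t numFolded,
                                             const int64_t *foldedShapes) {
  std::vector<FoldedBuffer> out;
  std::size_t pos = 0;
  for (int64_t i = 0; i < numFolded; ++i) {
    FoldedBuffer buf;
    int64_t numel = 1;
    // Read this tensor's dims from the flat shape array.
    for (int32_t d = 0; d < foldedRanks[i]; ++d) {
      buf.shape.push_back(foldedShapes[pos]);
      numel *= foldedShapes[pos++];
    }
    // The trailing per-tensor entry is the element size in bytes.
    buf.sizeInBytes = numel * foldedShapes[pos++];
    out.push_back(std::move(buf));
  }
  return out;
}

Appending the element size after each shape keeps __folded_shapes a single flat array; a reader only has to walk it in lockstep with __folded_ranks.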
2 changes: 1 addition & 1 deletion test/gc/Transforms/test_constant_tensor_folding-1.mlir
@@ -32,7 +32,7 @@ module {

// COM: expected output:
// COM: module {
-// COM: llvm.mlir.global external constant @__num_orig_num_args(3 : i32) {addr_space = 0 : i32} : i32
+// COM: llvm.mlir.global external constant @__num_orig_args(3 : i32) {addr_space = 0 : i32} : i32
// COM: llvm.mlir.global external constant @__compute_args(dense<[3, 2, 3, 4]> : tensor<4xi32>) {addr_space = 0 : i32} : !llvm.array<4 x i32>
// COM: llvm.mlir.global external constant @__fold_args(dense<[4, 0, 1, 3, 4]> : tensor<5xi32>) {addr_space = 0 : i32} : !llvm.array<5 x i32>
// COM: llvm.mlir.global external constant @__fold_buffer_ids(dense<[2, 0, 1]> : tensor<3xi64>) {addr_space = 0 : i32} : !llvm.array<3 x i64>
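Reading these expectations with the count-prefixed layout the pass uses (element 0 of each array is the entry count): __num_orig_args records that the original function took 3 arguments; __compute_args = [3 | 2, 3, 4] says compute() takes three arguments, the remaining original argument 2 plus the folded results at indexes 3 and 4; __fold_args = [4 | 0, 1, 3, 4] says fold() is passed the constant arguments 0 and 1 and the folded-result slots 3 and 4; __fold_buffer_ids = [2 | 0, 1] registers two cached buffers with ids 0 and 1. (This decoding is inferred from the pass code above; the test itself only checks the raw arrays.)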
2 changes: 1 addition & 1 deletion test/gc/Transforms/test_constant_tensor_folding.mlir
@@ -74,7 +74,7 @@ module {

// COM: expected output:
// COM: module {
-// COM: llvm.mlir.global external constant @__num_orig_num_args(5 : i32) {addr_space = 0 : i32} : i32
+// COM: llvm.mlir.global external constant @__num_orig_args(5 : i32) {addr_space = 0 : i32} : i32
// COM: llvm.mlir.global external constant @__compute_args(dense<[5, 0, 5, 6, 7, 8]> : tensor<6xi32>) {addr_space = 0 : i32} : !llvm.array<6 x i32>
// COM: llvm.mlir.global external constant @__fold_args(dense<[8, 1, 2, 3, 4, 5, 6, 7, 8]> : tensor<9xi32>) {addr_space = 0 : i32} : !llvm.array<9 x i32>
// COM: llvm.mlir.global external constant @__fold_buffer_ids(dense<[4, 0, 1, 2, 3]> : tensor<5xi64>) {addr_space = 0 : i32} : !llvm.array<5 x i64>
(changes to an additional test file)
@@ -141,7 +141,7 @@
#map3 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
#map4 = affine_map<(d0, d1, d2, d3) -> (d1, d3)>
module {
-llvm.mlir.global external constant @__num_orig_num_args(5 : i32) {addr_space = 0 : i32} : i32
+llvm.mlir.global external constant @__num_orig_args(5 : i32) {addr_space = 0 : i32} : i32
llvm.mlir.global external constant @__compute_args(dense<[5, 0, 5, 6, 7, 8]> : tensor<6xi32>) {addr_space = 0 : i32} : !llvm.array<6 x i32>
llvm.mlir.global external constant @__fold_args(dense<[8, 1, 2, 3, 4, 5, 6, 7, 8]> : tensor<9xi32>) {addr_space = 0 : i32} : !llvm.array<9 x i32>
llvm.mlir.global external constant @__runtime_fold_buffer_ids_(dense<[4, 0, 1, 2, 3]> : tensor<5xi64>) {addr_space = 0 : i32} : !llvm.array<5 x i64>
(changes to an additional test file)
@@ -111,7 +111,7 @@
#map3 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
#map4 = affine_map<(d0, d1, d2, d3) -> (d1, d3)>
module {
-llvm.mlir.global external constant @__num_orig_num_args(5 : i32) {addr_space = 0 : i32} : i32
+llvm.mlir.global external constant @__num_orig_args(5 : i32) {addr_space = 0 : i32} : i32
llvm.mlir.global external constant @__compute_args(dense<[5, 0, 5, 6, 7, 8]> : tensor<6xi32>) {addr_space = 0 : i32} : !llvm.array<6 x i32>
llvm.mlir.global external constant @__fold_args(dense<[8, 1, 2, 3, 4, 5, 6, 7, 8]> : tensor<9xi32>) {addr_space = 0 : i32} : !llvm.array<9 x i32>
llvm.mlir.global external constant @__runtime_fold_buffer_ids_(dense<[4, 0, 1, 2, 3]> : tensor<5xi64>) {addr_space = 0 : i32} : !llvm.array<5 x i64>