Key changes:
* CSR buckets keep data on-device → near-zero host↔GPU traffic
* on-GPU radix sort makes preprocessing parallel

## Future

### Technical Improvements
- **Modern Dependencies**: Update to the `objc2` and `objc2-metal` crates ([objc2](https://github.com/madsmtm/objc2))
- **Metal 4**: Adopt the latest [Metal 4](https://developer.apple.com/metal/whats-new/) features
- **Refactor with SIMD in mind**:
  - Instruction-level parallelism using vector types for faster FMA within SIMD groups
  - Memory coalescing to increase locality (e.g., structure of arrays instead of array of structures)
  - Optimized input reading patterns (e.g., `[X_i || Y_i]_0^{n-1}` instead of separate arrays)
  - Latency hiding and occupancy fine-tuning
  - Minimize thread divergence

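To make the two layout ideas above concrete, here is a minimal host-side sketch in Rust. It is an illustration only, not this crate's code: the `Coord` type, the 8 x `u32` limb count, and the function names `interleave_xy` and `to_limb_major` are all assumptions introduced for the example.

```rust
/// Limbs per coordinate. 8 x u32 = 256 bits is an assumed size for illustration.
const LIMBS: usize = 8;

/// Hypothetical coordinate type: one field element as little-endian u32 limbs.
type Coord = [u32; LIMBS];

/// Pack separate x and y arrays into one buffer laid out as
/// [X_0 || Y_0, X_1 || Y_1, ...], so a whole point sits in one contiguous run.
fn interleave_xy(xs: &[Coord], ys: &[Coord]) -> Vec<u32> {
    assert_eq!(xs.len(), ys.len(), "need exactly one y per x");
    let mut out = Vec::with_capacity(xs.len() * 2 * LIMBS);
    for (x, y) in xs.iter().zip(ys) {
        out.extend_from_slice(x);
        out.extend_from_slice(y);
    }
    out
}

/// Structure-of-arrays ("limb-major") layout: limb j of element i is stored at
/// index j * n + i, so threads i and i + 1 touch adjacent words when they read
/// the same limb.
fn to_limb_major(coords: &[Coord]) -> Vec<u32> {
    let n = coords.len();
    let mut out = vec![0u32; n * LIMBS];
    for (i, coord) in coords.iter().enumerate() {
        for (j, &limb) in coord.iter().enumerate() {
            out[j * n + i] = limb;
        }
    }
    out
}
```

The two layouts serve different access patterns: the interleaved buffer favors one thread loading a whole point, while the limb-major buffer favors many threads loading the same limb at once. Which one wins on a given kernel is something to measure rather than assume.
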
### Algorithm & Integration
- **CPU-GPU Hybrid**: Research interleaving GPU work with a CPU MSM crate, and update to `arkworks 0.5`
- **Advanced Algorithms**:
  - [Elastic MSM](https://eprint.iacr.org/2024/057.pdf) implementation
  - Faster modular reduction with Logjumps ([article by Wei Jie](https://kohweijie.com/articles/25/logjumps.html), [Barrett-Montgomery](https://hackmd.io/@Ingonyama/Barret-Montgomery))

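For readers unfamiliar with the modular-reduction item, the sketch below shows textbook single-word Barrett reduction in Rust as a baseline. It is deliberately not Logjumps and not the multi-precision scheme from the linked articles; the `Barrett` type and its methods are hypothetical names for illustration, and the 32-bit modulus is an assumption that keeps every intermediate inside `u64`/`u128`.

```rust
/// Precomputed constants for Barrett reduction modulo a 32-bit `n`.
/// Hypothetical helper type for illustration; not part of this crate.
struct Barrett {
    n: u64,
    /// floor(2^64 / n), computed once per modulus.
    mu: u64,
}

impl Barrett {
    fn new(n: u32) -> Self {
        assert!(n > 1, "modulus must be at least 2");
        let mu = ((1u128 << 64) / n as u128) as u64;
        Barrett { n: n as u64, mu }
    }

    /// Reduce any 64-bit `x` modulo `n` using a multiply and a shift
    /// instead of a division.
    fn reduce(&self, x: u64) -> u64 {
        // q underestimates floor(x / n) by at most 1, so q * n <= x.
        let q = ((x as u128 * self.mu as u128) >> 64) as u64;
        let mut r = x - q * self.n;
        // At most one correction is needed; the loop form is just defensive.
        while r >= self.n {
            r -= self.n;
        }
        r
    }

    /// Multiply two already-reduced values modulo `n`.
    fn mul_mod(&self, a: u64, b: u64) -> u64 {
        debug_assert!(a < self.n && b < self.n);
        // a, b < n < 2^32, so the product fits in a u64.
        self.reduce(a * b)
    }
}
```

A field used in MSM spans several limbs, so a production version would apply the same idea across multi-word integers. That multi-word setting is where the linked techniques aim to save work.
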
### Platform Expansion
- **Cross-platform**: WGSL support with native execution environment
- **Crypto Math Library**: Maintain a Metal/WebGPU crypto math library

## Community

- X account: <a href="https://twitter.com/zkmopro"><img src="https://img.shields.io/twitter/follow/zkmopro?style=flat-square&logo=x&label=zkmopro"></a>