Commit 2b459a2

docs: add the future roadmap of this work
1 parent eb60ad6 commit 2b459a2

1 file changed

Lines changed: 22 additions & 0 deletions

README.md

@@ -158,6 +158,28 @@ Key changes:
* CSR buckets keep data on-device → near-zero host↔GPU traffic
* on-GPU radix sort makes preprocessing parallel

## Future

### Technical Improvements
- **Modern Dependencies**: Update to `objc2` and `objc2-metal` ([objc2](https://github.com/madsmtm/objc2))
- **Metal 4**: Adopt the latest [Metal 4](https://developer.apple.com/metal/whats-new/) features
- **Refactor with SIMD in mind**:
  - Instruction-level parallelism using vector types for faster FMA within SIMD groups
  - Memory coalescing to increase locality (e.g., structure of arrays instead of array of structures)
  - Optimized input reading patterns (e.g., `[X_i || Y_i]_0^{n-1}` instead of separate arrays)
  - Latency hiding and occupancy fine-tuning
  - Minimize thread divergence
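
The two layout ideas in the list above can be sketched on the host side in Rust. This is a minimal illustration, not the project's actual buffer format: the limb width (32-bit), the limb count (`LIMBS = 8`, i.e. 256-bit coordinates), and the names `Coord`, `interleave_xy`, and `to_limb_major` are all assumptions made for the example.

```rust
/// Number of 32-bit limbs per coordinate (assumption: 256-bit field elements).
const LIMBS: usize = 8;

/// One coordinate stored as little-endian 32-bit limbs (hypothetical type).
type Coord = [u32; LIMBS];

/// Pack separate x and y arrays into one flat buffer laid out as
/// [X_0 || Y_0 || X_1 || Y_1 || ...], so a point is read in one pass.
fn interleave_xy(xs: &[Coord], ys: &[Coord]) -> Vec<u32> {
    assert_eq!(xs.len(), ys.len(), "need exactly one y per x");
    let mut out = Vec::with_capacity(xs.len() * 2 * LIMBS);
    for (x, y) in xs.iter().zip(ys) {
        out.extend_from_slice(x);
        out.extend_from_slice(y);
    }
    out
}

/// Limb-major (structure-of-arrays) layout: limb `l` of element `i` lands
/// at index `l * n + i`. Threads `i` and `i + 1` reading the same limb
/// then touch adjacent addresses, which is what coalescing wants.
fn to_limb_major(coords: &[Coord]) -> Vec<u32> {
    let n = coords.len();
    let mut out = vec![0u32; n * LIMBS];
    for (i, c) in coords.iter().enumerate() {
        for (l, &limb) in c.iter().enumerate() {
            out[l * n + i] = limb;
        }
    }
    out
}
```

Which layout is faster depends on how the kernel assigns work to threads, so either would need to be measured on the target GPU before being adopted.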

### Algorithm & Integration
- **CPU-GPU Hybrid**: Research interleaving with the CPU MSM crate and update to `arkworks 0.5`
- **Advanced Algorithms**:
  - [Elastic MSM](https://eprint.iacr.org/2024/057.pdf) implementation
  - Faster modular reduction with LogJump ([article by Wei Jie](https://kohweijie.com/articles/25/logjumps.html), [Barrett-Montgomery](https://hackmd.io/@Ingonyama/Barret-Montgomery))
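
For background on the modular-reduction item, here is a minimal sketch of textbook Montgomery reduction (REDC) on a single 64-bit limb. It is not LogJump and not the Barrett-Montgomery variant from the linked articles; it only shows the baseline those techniques aim to speed up. The function names are hypothetical, and the sketch assumes an odd modulus `n < 2^63` so that all intermediate sums fit in a `u128`.

```rust
/// Compute -n^{-1} mod 2^64 for odd `n` using Newton iteration.
fn neg_inv(n: u64) -> u64 {
    assert!(n & 1 == 1, "modulus must be odd");
    let mut inv: u64 = 1; // correct modulo 2
    for _ in 0..6 {
        // Each step doubles the number of correct low bits: 1 -> 64.
        inv = inv.wrapping_mul(2u64.wrapping_sub(n.wrapping_mul(inv)));
    }
    inv.wrapping_neg()
}

/// Montgomery reduction: returns t * 2^{-64} mod n for t < n * 2^64.
/// Assumes n < 2^63 so that `t + m * n` cannot overflow a u128.
fn redc(t: u128, n: u64, n_prime: u64) -> u64 {
    // m is chosen so that t + m * n is divisible by 2^64.
    let m = (t as u64).wrapping_mul(n_prime);
    let u = ((t + (m as u128) * (n as u128)) >> 64) as u64;
    // u < 2n, so one conditional subtraction finishes the reduction.
    if u >= n { u - n } else { u }
}

/// Convert a < n into Montgomery form: a * 2^64 mod n.
fn to_mont(a: u64, n: u64) -> u64 {
    (((a as u128) << 64) % (n as u128)) as u64
}

/// Convert back from Montgomery form.
fn from_mont(a: u64, n: u64, n_prime: u64) -> u64 {
    redc(a as u128, n, n_prime)
}

/// Multiply two values that are already in Montgomery form.
fn mont_mul(a: u64, b: u64, n: u64, n_prime: u64) -> u64 {
    redc((a as u128) * (b as u128), n, n_prime)
}
```

A multi-limb GPU implementation would apply the same idea limb by limb; the single-limb version is used here only because it is easy to check against plain `u128` arithmetic.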

### Platform Expansion
- **Cross-platform**: WGSL support with a native execution environment
- **Crypto Math Library**: Maintain a Metal/WebGPU crypto math library

## Community

- X account: <a href="https://twitter.com/zkmopro"><img src="https://img.shields.io/twitter/follow/zkmopro?style=flat-square&logo=x&label=zkmopro"></a>
