The project follows a simple layered layout:
- Signal generation or input
- Core DSP kernels
- Validation through deterministic tests
- Benchmarking and reporting
- Documentation and visual artifacts
That separation keeps the reference implementation easy to audit while still leaving room for optimization work.

| Algorithm | Current implementation | Typical complexity | Main bottleneck |
|---|---|---|---|
| FIR convolution | direct time-domain | O(N * M) | multiply-accumulate throughput |
| FIR convolution (alt) | FFT overlap-save | O(N log N) blocks | transform overhead vs filter length |
| Goertzel | direct recurrence | O(N) per target | scalar recurrence |
| GCC-PHAT | FFT-based correlation flow | O(N log N) | transform cost |
| Rational resampler | upsample + FIR + decimate | O(N * M) | filtering cost |

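To make the "direct recurrence, O(N) per target" row concrete, here is a minimal Python sketch of a Goertzel kernel for a single integer DFT bin. It is an illustration only, not the repository's implementation; the names `goertzel_power` and `dft_bin_power` are hypothetical, and the plain DFT sum is included only as a reference to check against.

```python
import cmath
import math


def goertzel_power(samples, k):
    """Return |X[k]|^2 for one integer DFT bin via the Goertzel recurrence.

    Hypothetical helper for illustration. One pass over the input with a
    loop-carried dependency, which is why the recurrence is scalar-bound.
    """
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0   # s[n-1]
    s_prev2 = 0.0  # s[n-2]
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def dft_bin_power(samples, k):
    """Reference: evaluate the same DFT bin directly, for validation."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return abs(acc) ** 2


# Unit-amplitude sine placed exactly on bin 5 of a 64-point frame.
signal = [math.sin(2.0 * math.pi * 5 * i / 64) for i in range(64)]
print(goertzel_power(signal, 5), dft_bin_power(signal, 5))
```

For a unit-amplitude sine on an exact bin, both values should be close to (N/2)^2, which is 1024 for this 64-sample frame. Each additional target frequency costs another full pass, which is where batching several targets becomes attractive.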
- tighten input validation and error handling
- reduce avoidable allocations
- improve cache locality in hot loops
- keep scalar reference paths easy to verify
- optional AVX2/FMA acceleration
- batched frequency analysis
- target-specific compiler tuning
- more aggressive FFT planning choices
- polyphase resampling
- streaming SDR pipeline integration
- fixed-point and FPGA-oriented variants
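The polyphase resampling item in the list above can be motivated with a small sketch. The naive upsample + FIR + decimate flow multiplies many taps by inserted zeros and computes outputs that decimation then discards; a polyphase form visits only the taps and outputs that matter. The Python below is a hedged illustration under simple assumptions (causal FIR, output truncated to the upsampled length); `resample_naive`, `resample_polyphase`, and the example filter taps are hypothetical and not taken from the repository.

```python
def resample_naive(x, h, up, down):
    """Zero-stuff by `up`, filter with `h`, keep every `down`-th sample."""
    upsampled = [0.0] * (len(x) * up)
    upsampled[::up] = x
    filtered = []
    for n in range(len(upsampled)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * upsampled[n - k]
        filtered.append(acc)
    return filtered[::down]


def resample_polyphase(x, h, up, down):
    """Same result, computing only kept outputs and nonzero products."""
    out = []
    for t in range(0, len(x) * up, down):
        # t indexes the (virtual) upsampled stream; only taps whose index
        # is congruent to t modulo `up` line up with real input samples.
        phase, base = t % up, t // up
        acc = 0.0
        for j, hk in enumerate(h[phase::up]):
            if base - j < 0:
                break
            acc += hk * x[base - j]
        out.append(acc)
    return out


# Illustrative input and a short triangular filter (assumed values).
x = [float(i % 7) - 3.0 for i in range(20)]
h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
a = resample_naive(x, h, up=3, down=2)
b = resample_polyphase(x, h, up=3, down=2)
print(len(a), max(abs(p - q) for p, q in zip(a, b)))
```

Keeping the naive version alongside the polyphase one gives a reference path to verify against, in line with the goal of keeping scalar reference paths easy to check.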
This repository is meant to be inspected and taught from. That means correctness, testability, and traceability are more important than chasing every last percent of performance in the baseline path.
The optimization rule is:
measure first -> optimize the bottleneck -> verify again
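As a rough sketch of that rule in practice, the Python below times a readable direct FIR against a restructured candidate and then re-checks that the two agree on a deterministic, seeded input. Everything here is illustrative: `fir_reference`, `fir_candidate`, `measure`, the seed, and the signal sizes are assumptions, and whether the candidate is actually faster is exactly what the measurement is meant to decide.

```python
import random
import timeit


def fir_reference(x, h):
    """Direct time-domain FIR written for readability: O(N * M)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


def fir_candidate(x, h):
    """Candidate variant: bounds check hoisted out of the inner loop."""
    y = []
    for n in range(len(x)):
        taps = min(len(h), n + 1)
        y.append(sum(h[k] * x[n - k] for k in range(taps)))
    return y


def measure(fn, x, h, repeats=5):
    """Best-of-`repeats` wall time in seconds for one call of fn(x, h)."""
    return min(timeit.repeat(lambda: fn(x, h), number=1, repeat=repeats))


def max_abs_error(a, b):
    """Largest element-wise difference between two equal-length outputs."""
    return max(abs(p - q) for p, q in zip(a, b))


# Deterministic input so timing runs and checks are repeatable.
rng = random.Random(1234)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
h = [rng.uniform(-1.0, 1.0) for _ in range(32)]

baseline = measure(fir_reference, x, h)   # measure first
candidate = measure(fir_candidate, x, h)  # measure the optimized path
error = max_abs_error(fir_reference(x, h), fir_candidate(x, h))  # verify again
print(f"reference {baseline:.4f}s, candidate {candidate:.4f}s, "
      f"max error {error:.3e}")
```

The comparison uses a tolerance rather than exact equality, because a reordered or differently accumulated sum can legitimately differ in the last bits.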
The long-term value of the project is not only in the algorithms themselves, but in the engineering workflow around them:
- define a clear DSP kernel
- validate numerically
- benchmark on real toolchains
- document the result
- prepare the algorithm for downstream hardware or embedded work