Releases: Eamon2009/Quadtrix.cpp
v1.1.12
macOS/iOS:
macOS Apple Silicon (arm64)
macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED
macOS Intel (x64) SKIPPED
iOS XCFramework DISABLED
Linux:
Ubuntu x64 (CPU)
Ubuntu arm64 (CPU)
Ubuntu s390x (CPU) SKIPPED
Ubuntu x64 (Vulkan) DISABLED
Ubuntu arm64 (Vulkan) DISABLED
Ubuntu x64 (ROCm 7.2) DISABLED
Ubuntu x64 (OpenVINO) DISABLED
Ubuntu x64 (SYCL FP32) DISABLED
Android:
Android arm64 (CPU) DISABLED
Windows:
Windows x64 (CPU)
Windows arm64 (CPU)
Windows x64 (CUDA 12) - CUDA 12.4 DLLs DISABLED
Windows x64 (CUDA 13) - CUDA 13.3 DLLs DISABLED
Windows x64 (Vulkan) DISABLED
Windows x64 (SYCL) DISABLED
Windows x64 (HIP) DISABLED
v1.1.11
macOS/iOS:
macOS Apple Silicon (arm64)
macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED
macOS Intel (x64) SKIPPED
iOS XCFramework DISABLED
Linux:
Ubuntu x64 (CPU)
Ubuntu arm64 (CPU)
Ubuntu s390x (CPU) SKIPPED
Ubuntu x64 (Vulkan) DISABLED
Ubuntu arm64 (Vulkan) DISABLED
Ubuntu x64 (ROCm 7.2) DISABLED
Ubuntu x64 (OpenVINO) DISABLED
Ubuntu x64 (SYCL FP32) DISABLED
Android:
Android arm64 (CPU) DISABLED
Windows:
Windows x64 (CPU)
Windows arm64 (CPU)
Windows x64 (CUDA 12) - CUDA 12.4 DLLs DISABLED
Windows x64 (CUDA 13) - CUDA 13.3 DLLs DISABLED
Windows x64 (Vulkan) DISABLED
Windows x64 (SYCL) DISABLED
Windows x64 (HIP) DISABLED
What's Changed
- chore(deps): bump actions/github-script from 7 to 9 by @dependabot[bot] in #71
Full Changelog: v1.1.10...v1.1.11
v1.1.10
macOS/iOS:
macOS Apple Silicon (arm64)
macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED
macOS Intel (x64) SKIPPED
iOS XCFramework DISABLED
Linux:
Ubuntu x64 (CPU)
Ubuntu arm64 (CPU)
Ubuntu s390x (CPU) SKIPPED
Ubuntu x64 (Vulkan) DISABLED
Ubuntu arm64 (Vulkan) DISABLED
Ubuntu x64 (ROCm 7.2) DISABLED
Ubuntu x64 (OpenVINO) DISABLED
Ubuntu x64 (SYCL FP32) DISABLED
Android:
Android arm64 (CPU) DISABLED
Windows:
Windows x64 (CPU)
Windows arm64 (CPU)
Windows x64 (CUDA 12) - CUDA 12.4 DLLs DISABLED
Windows x64 (CUDA 13) - CUDA 13.3 DLLs DISABLED
Windows x64 (Vulkan) DISABLED
Windows x64 (SYCL) DISABLED
Windows x64 (HIP) DISABLED
v1.1.9
macOS/iOS:
macOS Apple Silicon (arm64)
macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED
macOS Intel (x64) SKIPPED
iOS XCFramework DISABLED
Linux:
Ubuntu x64 (CPU)
Ubuntu arm64 (CPU)
Ubuntu s390x (CPU) SKIPPED
Ubuntu x64 (Vulkan) DISABLED
Ubuntu arm64 (Vulkan) DISABLED
Ubuntu x64 (ROCm 7.2) DISABLED
Ubuntu x64 (OpenVINO) DISABLED
Ubuntu x64 (SYCL FP32) DISABLED
Android:
Android arm64 (CPU) DISABLED
Windows:
Windows x64 (CPU)
Windows arm64 (CPU)
Windows x64 (CUDA 12) - CUDA 12.4 DLLs DISABLED
Windows x64 (CUDA 13) - CUDA 13.3 DLLs DISABLED
Windows x64 (Vulkan) DISABLED
Windows x64 (SYCL) DISABLED
Windows x64 (HIP) DISABLED
v1.1.8
Full Changelog: v1.1.7...v1.1.8
v1.1.7
Full Changelog: v1.1.6...v1.1.7
v1.1.6
Full Changelog: v1.1.5...v1.1.6
v1.1.5
What's Changed
- docs: report [run_20260530_165216] (~791 tok/s) by @codeaddict-119 in #60
- Codeaddict master by @Eamon2009 in #62
- chore: clang-format configuration file based on LLVM by @codeaddict-119 in #63
- feat(cuda): add attention forward and backward kernel declarations by @Eamon2009 in #64
Full Changelog: v1.1.4...v1.1.5
v1.1.4
Release Date: May 30, 2026
Hardware Profile: CPU (x86)
Model Architecture & Configuration
- Parameters: 6,684,497 (~6.68M)
- Batch Size: 16
- Block Size: 32
- Learning Rate: 1e-3
- Total Steps: 6,000
Training Performance & Metrics
- Best Validation Loss: 4.1319 (achieved at step 3,900)
- Total Training Time: 77m 16s
- Average Throughput: 791 tok/s (peaked at 885 tok/s during warmup)
- Average Step Time: 656.7 ms
- Evaluation Frequency: Every 100 steps
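The figures above can be cross-checked with a little arithmetic. The sketch below assumes that each step consumes one batch of Batch Size × Block Size tokens with no gradient accumulation; the release notes do not state this, so treat the derived numbers as rough estimates. All constant names are illustrative.

```python
# Figures copied from the v1.1.4 release notes.
BATCH_SIZE = 16
BLOCK_SIZE = 32
TOTAL_STEPS = 6_000
AVG_STEP_TIME_S = 0.6567        # 656.7 ms
TOTAL_TIME_S = 77 * 60 + 16     # 77m 16s
EVAL_EVERY = 100

# Assumption: one step processes BATCH_SIZE sequences of BLOCK_SIZE tokens.
tokens_per_step = BATCH_SIZE * BLOCK_SIZE

# Throughput implied by the average step time alone.
implied_tok_per_s = tokens_per_step / AVG_STEP_TIME_S

# Time spent in training steps versus total wall-clock time.
step_time_total_s = TOTAL_STEPS * AVG_STEP_TIME_S
unaccounted_s = TOTAL_TIME_S - step_time_total_s

# Number of evaluation passes at the stated frequency.
num_evals = TOTAL_STEPS // EVAL_EVERY

print(f"tokens per step:       {tokens_per_step}")
print(f"implied throughput:    {implied_tok_per_s:.1f} tok/s")
print(f"time in steps:         {step_time_total_s:.1f} s")
print(f"time outside of steps: {unaccounted_s:.1f} s")
print(f"evaluation passes:     {num_evals}")
```

Under this assumption the implied throughput is about 780 tok/s, close to the reported 791 tok/s; the small gap may come from how the averages were computed, which the notes do not specify. Roughly 696 s of the total time falls outside the training steps, which could plausibly be evaluation and other overhead across the 60 evaluation passes.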
Note: The model achieved its best validation loss at step 3,900. Beyond this point the generalization gap widened, indicating the onset of overfitting.
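One common way to act on the observation in the note above is to keep the checkpoint with the lowest validation loss and stop once the loss has failed to improve for a set number of evaluations. The following is a minimal sketch of that idea, not the project's implementation; `best_checkpoint`, `should_stop`, and `patience` are hypothetical names, and the history values are illustrative rather than taken from the actual run log.

```python
from typing import Iterable, Optional, Tuple

Eval = Tuple[int, float]  # (step, validation loss)


def best_checkpoint(history: Iterable[Eval]) -> Optional[Eval]:
    """Return the (step, val_loss) pair with the lowest validation loss.

    Ties are resolved in favour of the first entry encountered.
    """
    best: Optional[Eval] = None
    for step, val_loss in history:
        if best is None or val_loss < best[1]:
            best = (step, val_loss)
    return best


def should_stop(history: Iterable[Eval], patience: int) -> bool:
    """True once `patience` evaluations after the best one brought no improvement."""
    entries = list(history)
    best = best_checkpoint(entries)
    if best is None:
        return False
    evals_since_best = sum(1 for step, _ in entries if step > best[0])
    return evals_since_best >= patience


# Illustrative values only, not the run's real evaluation log.
history = [(100, 5.0), (200, 4.5), (300, 4.2), (400, 4.3), (500, 4.4)]
print(best_checkpoint(history))
print(should_stop(history, patience=2))
```

With an evaluation every 100 steps, a `patience` of a few evaluations would have ended a run like this one shortly after the best step instead of continuing to the final step; the right value depends on how noisy the validation loss is.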
What's Changed
- Refactor Dockerfile to use ARG for CUDA version by @codeaddict-119 in #57
Full Changelog: v1.1.3...v1.1.4
v1.1.3
Docker / CUDA
- Fix import torch failing at container startup, caused by ENTRYPOINT resolving to system Python instead of the venv Python
- ENTRYPOINT now uses the absolute venv path (/app/venv/bin/python3) to avoid PATH resolution ambiguity
- Added libgl1 to runtime stage dependencies, required by torchvision
Rebuild your image to apply the fix; no other changes are required.
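To confirm after rebuilding that the container starts the venv interpreter and not the system one, a small diagnostic like the one below can help. It is a sketch using only the Python standard library; `interpreter_report` is a hypothetical helper, not something shipped with the project, and how you run it inside the container is up to your setup.

```python
import importlib.util
import sys


def interpreter_report(module_name: str = "torch") -> dict:
    """Describe which Python is running and whether a module can be found."""
    return {
        # Absolute path of the interpreter executing this script.
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the venv, while
        # sys.base_prefix points at the interpreter it was created from.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level module is not installed.
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If the fix is in effect, the executable is expected to be the venv path named in the notes (/app/venv/bin/python3), with the venv check and the module lookup both reporting true.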
What's Changed
- Enhance tensor management and CUDA utilities with benchmarks by @codeaddict-119 in #51
- feat :tensor management with benchmarks (#52) by @codeaddict-119 in #52
New Contributors
- @codeaddict-119 made their first contribution in #51
Full Changelog: v1.1.2...v1.1.3