This repository implements a general augmentation and evaluation framework for time-series and image data, hereafter referred to as the TISA algorithm (Time-series & Image Streaming Augmentor). TISA is designed for unified modeling and representation learning of multi-source heterogeneous data in industrial, manufacturing, and sensor-network scenarios. Through systematic data augmentation, feature extraction, training strategies, and evaluation protocols, it enhances model robustness and generalization across classification, regression, and detection tasks.
The TISA algorithm is designed around the complete pipeline from "data processing → model training → feature evaluation → visualization and application", organized by the following files and modules:
- Top-level training and evaluation scripts (e.g., train.py, test.py, single_data_test.py, test_sample.py) for quickly starting experiments.
- Configuration management and running scripts (configs/, run/, utils/config.py) for parameterized, reproducible experiment setups.
- Feature evaluation and downstream task scripts (eval/, run/eval/, ensemble_models.py, optimize_knn.py) for multi-protocol evaluation and combination of feature quality.
- Data augmentation and visualization modules (src/augmentation.py, batch_generate_heatmaps.py, src/visualize.py) for interpretable analysis and visualization at both time-series and image levels.
- Example datasets and interfaces for industrial applications (data/fault_diagnosis.npy, data/soft_sensor.npy), covering typical scenarios such as fault diagnosis and soft sensing.
From the file and directory naming, it is clear that TISA targets both 1D time-series signals and 2D image / pseudo-image representations:
- Time-series dimension: window slicing, frequency-domain transforms, or time-channel combinations map sensor sequences into input tensors usable by models.
- Image dimension: by rearranging or encoding time-series segments, or by using image data directly, TISA produces inputs compatible with visual models (e.g., Transformer-based backbones).
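The window-slicing idea above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `sliding_windows`, the `(T, C)` input layout, and the window and stride values are all assumptions.

```python
import numpy as np

def sliding_windows(signal: np.ndarray, window: int, stride: int) -> np.ndarray:
    """Slice a (T, C) multichannel signal into overlapping windows.

    Returns an array of shape (N, C, window). Each window is a
    channel-by-time matrix that can be treated as a pseudo-image.
    """
    if signal.ndim != 2:
        raise ValueError("expected a (T, C) array")
    n_steps = signal.shape[0]
    if window > n_steps:
        raise ValueError("window is longer than the signal")
    starts = range(0, n_steps - window + 1, stride)
    return np.stack([signal[s:s + window].T for s in starts])

# Hypothetical example: 1000 time steps from 8 sensors.
rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 8))
windows = sliding_windows(signal, window=128, stride=64)
print(windows.shape)  # (14, 8, 128)
```

Each `(8, 128)` slice is a time–channel matrix, which is one way to hand a sensor sequence to a model that expects 2D input.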
Scripts in the repository root and under run/train/ (e.g., train.py, run/train/train.py) form the main training entry points for TISA:
- Specify data paths, model configs, optimizer settings, etc. via command-line arguments or configuration files.
- Support both single-source data (e.g., time-series from a single device) and multi-source data (e.g., multiple operating conditions or sensor combinations) for joint training.
- Adopt a unified training loop for both time-series and image data to facilitate extension and migration.
In train/ssl_meta_arch.py, TISA introduces a self-supervised learning meta-architecture:
- Obtain multiple views of time-series or image data via multi-view augmentations (e.g., cropping, masking, perturbation).
- Use a shared backbone network to extract features and construct training signals via contrastive, reconstruction, or prediction tasks.
- Generate discriminative, general-purpose representations for downstream tasks without requiring large-scale annotations.
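As a rough illustration of the multi-view contrastive setup described above, the sketch below builds two noisy, masked views of a batch and scores them with an InfoNCE loss. It is written in plain NumPy and is not taken from train/ssl_meta_arch.py: the names `make_view` and `info_nce`, the noise and mask parameters, and the random linear projection that stands in for the shared backbone are all assumptions.

```python
import numpy as np

def make_view(x, rng, noise_std=0.05, mask_ratio=0.2):
    """One stochastic view: additive Gaussian noise, then random zero-masking."""
    noisy = x + rng.normal(scale=noise_std, size=x.shape)
    keep = rng.random(x.shape) >= mask_ratio
    return noisy * keep

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss for two (N, D) embedding batches.

    Row i of z1 and row i of z2 form the positive pair; all other
    rows of z2 act as negatives for row i of z1.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # (N, N) scaled cosine similarity
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
batch = rng.normal(size=(16, 64))        # 16 flattened windows (hypothetical)
projection = rng.normal(size=(64, 32))   # stand-in for the shared backbone
z1 = make_view(batch, rng) @ projection
z2 = make_view(batch, rng) @ projection
loss = info_nce(z1, z2)
print(f"InfoNCE loss: {loss:.3f}")
```

The loss is low when the two views of the same sample map to nearby embeddings and high when they do not, which is the training signal a contrastive objective relies on.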
In configs/ssl_default_config.yaml and configs/train/, you can define configuration files for different model scales and training strategies to achieve:
- Unified management of basic hyperparameters such as batch size, learning rate, and number of epochs.
- Fast switching between different backbone architectures, input sizes, and projection head structures.
- Specialized augmentation and normalization strategies for time-series vs. image data.
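One common way to get this kind of layered configuration is to merge a task-specific override into a default dictionary. The sketch below assumes that pattern; the function `merge_config` and every key and value shown are hypothetical and do not reflect the actual contents of configs/ssl_default_config.yaml.

```python
import copy

def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`; override wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Hypothetical defaults and a time-series override.
default_cfg = {
    "train": {"batch_size": 64, "lr": 1e-3, "epochs": 100},
    "model": {"backbone": "vit_small", "input_size": 224},
}
timeseries_cfg = {
    "train": {"batch_size": 128},
    "model": {"input_size": 128},
}

cfg = merge_config(default_cfg, timeseries_cfg)
print(cfg["train"])  # {'batch_size': 128, 'lr': 0.001, 'epochs': 100}
```

Only the overridden keys change, so a small override file is enough to switch input size or batch size while keeping the remaining defaults.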
src/backbones.py and directories such as eval/depth/models/backbones/, eval/segmentation/models/backbones/, and segmentation_m2f/models/backbones/ indicate that the repository supports multiple Transformer-based backbone structures:
- Unified backbone builder interfaces (e.g., eval/depth/models/builder.py, segmentation_m2f/models/builder.py) dynamically construct models based on configuration.
- For different tasks (e.g., depth estimation, segmentation), similar or identical backbones can be reused to share a common representation space.
- For time-series data, TISA provides an "image-like" perspective by treating time steps as spatial dimensions to leverage mature vision backbones.
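To make the "image-like" perspective concrete, the sketch below splits a channel-by-time window into non-overlapping square patches, the token layout that Transformer-based vision backbones typically consume. The function `patchify` and the patch size are assumptions for illustration and are not taken from src/backbones.py.

```python
import numpy as np

def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) matrix into non-overlapping patch x patch tokens.

    Returns (num_patches, patch * patch), ordered row-major over the patch grid.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    grid = image.reshape(h // patch, patch, w // patch, patch)
    return grid.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

# Hypothetical window: 8 channels by 128 time steps.
window = np.arange(8 * 128, dtype=np.float32).reshape(8, 128)
tokens = patchify(window, patch=8)
print(tokens.shape)  # (16, 64)
```

Here each token covers all 8 channels over 8 consecutive time steps, so the sequence of tokens follows the time axis.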
Directories such as eval/depth/decode_heads/, eval/segmentation/decode_heads/, and segmentation_m2f/models/decode_heads/ contain various decoding and task heads:
- For regression tasks (e.g., depth estimation, soft sensing), continuous-value prediction heads output pixel-level or sequence-level scalars.
- For segmentation and detection tasks, structured output modules such as masks and bounding boxes are introduced (e.g., segmentation_m2f/core/anchor, segmentation_m2f/core/box).
- The time dimension can be treated analogously to the pixel dimension, so that each time step or time window receives a label or weight, enabling a unified time–space prediction structure.
The eval/ and run/eval/ directories contain several scripts for feature evaluation and downstream training:
- eval/knn.py, run/eval/knn.py: k-NN-based feature quality evaluation for both time-series and image features.
- eval/linear.py, run/eval/linear.py: linear probing to assess the linear separability of frozen features.
- eval/log_regression.py, run/eval/log_regression.py: logistic regression evaluation for binary or multi-class classification.
- eval/metrics.py, spot-diff-main/utils/metrics.py: unified implementations of common metrics such as accuracy, recall, and AUC.
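A k-NN evaluation of frozen features can be sketched in a few lines. The version below uses cosine similarity and majority voting on synthetic features; it is an assumption about the general protocol, not a copy of eval/knn.py, and the name `knn_predict` and the synthetic data are hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=5):
    """Majority-vote k-NN on L2-normalized features (cosine similarity)."""
    train_x = train_x / np.linalg.norm(train_x, axis=1, keepdims=True)
    test_x = test_x / np.linalg.norm(test_x, axis=1, keepdims=True)
    sim = test_x @ train_x.T                        # (n_test, n_train)
    nearest = np.argsort(-sim, axis=1)[:, :k]       # indices of the k most similar
    votes = train_y[nearest]                        # (n_test, k) neighbor labels
    n_classes = int(train_y.max()) + 1
    counts = np.eye(n_classes)[votes].sum(axis=1)   # (n_test, n_classes) vote counts
    return counts.argmax(axis=1)

# Synthetic stand-in for extracted features: two classes in 16 dimensions.
rng = np.random.default_rng(0)
centers = np.zeros((2, 16))
centers[0, 0] = 3.0
centers[1, 1] = 3.0
train_y = rng.integers(0, 2, size=200)
test_y = rng.integers(0, 2, size=50)
train_x = centers[train_y] + rng.normal(scale=0.5, size=(200, 16))
test_x = centers[test_y] + rng.normal(scale=0.5, size=(50, 16))

pred = knn_predict(train_x, train_y, test_x, k=5)
accuracy = float((pred == test_y).mean())
print(f"k-NN accuracy: {accuracy:.2f}")
```

Because no parameters are trained, the accuracy reflects how well the feature space itself separates the classes.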
These evaluation scripts can be run on:
- Features extracted purely from time-series inputs (e.g., sensor signals in data/fault_diagnosis.npy).
- Features extracted purely from image inputs (e.g., "image-like" feature maps converted from time-series).
- Or fused multimodal features for joint evaluation.
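For the fused case, one simple option is to L2-normalize each modality's features and concatenate them, so that neither modality dominates purely because of its scale. This is an assumed fusion strategy shown for illustration; the repository may fuse features differently, and `fuse_features` is a hypothetical name.

```python
import numpy as np

def fuse_features(*blocks: np.ndarray) -> np.ndarray:
    """L2-normalize each (N, D_i) feature block, then concatenate to (N, sum of D_i)."""
    normalized = []
    for block in blocks:
        norms = np.linalg.norm(block, axis=1, keepdims=True)
        normalized.append(block / np.maximum(norms, 1e-12))  # guard against zero rows
    return np.concatenate(normalized, axis=1)

rng = np.random.default_rng(0)
ts_features = 100.0 * rng.normal(size=(4, 8))   # time-series features, large scale
img_features = rng.normal(size=(4, 16))         # image features, unit scale
fused = fuse_features(ts_features, img_features)
print(fused.shape)  # (4, 24)
```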
- ensemble_models.py: combines predictions from multiple models or training runs via weighting or rule-based fusion to improve stability and accuracy.
- optimize_knn.py: searches hyperparameters such as k values and distance metrics to optimize nearest-neighbor classification in feature space.
For time-series tasks, you can train multiple models at different temporal granularities and window lengths; for image tasks, you can train models with different input sizes and view transformations, then use the above scripts for post-processing and ensembling.
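Weighted fusion of predictions can be as simple as a weighted average of class probabilities. The sketch below assumes that form; `weighted_ensemble`, the two example models, and the weights are hypothetical and are not taken from ensemble_models.py.

```python
import numpy as np

def weighted_ensemble(probas, weights):
    """Weighted average of per-model class probabilities.

    probas: list of (N, K) arrays, one per model.
    weights: one non-negative weight per model (normalized internally).
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(probas)                       # (M, N, K)
    return np.tensordot(weights, stacked, axes=1)    # (N, K)

# Two hypothetical models trained on short and long windows.
short_window = np.array([[0.6, 0.4], [0.3, 0.7]])
long_window = np.array([[0.2, 0.8], [0.1, 0.9]])
fused = weighted_ensemble([short_window, long_window], weights=[1, 3])
print(fused.argmax(axis=1))  # [1 1]
```

With weights 1 and 3, the long-window model contributes three quarters of the final probability, so it overrides the short-window model on the first sample.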
src/augmentation.py and src/idaly/augmentation.py (the latter inferred to be a submodule for augmentation) provide various data augmentation operations:
- For time-series signals: random cropping, scaling, noise injection, time jittering, channel shuffling, etc.
- For images or time-series pseudo-images: cropping, flipping, color jittering, occlusion/masking, etc.
- For self-supervised tasks: generating multi-view, complementary inputs to improve diversity and robustness of representations.
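The time-series operations listed above are short to write in NumPy. The functions below are a sketch under assumed conventions (a `(C, T)` array layout and an explicit random generator argument); their names and default strengths are hypothetical and need not match src/augmentation.py.

```python
import numpy as np

def jitter(x, rng, std=0.03):
    """Add independent Gaussian noise to every sample."""
    return x + rng.normal(scale=std, size=x.shape)

def scale(x, rng, std=0.1):
    """Multiply each channel by its own random factor drawn around 1."""
    factors = rng.normal(loc=1.0, scale=std, size=(x.shape[0], 1))
    return x * factors

def random_crop(x, rng, length):
    """Cut a random contiguous segment of `length` time steps."""
    start = rng.integers(0, x.shape[1] - length + 1)
    return x[:, start:start + length]

def channel_shuffle(x, rng):
    """Randomly permute the channel order."""
    return x[rng.permutation(x.shape[0])]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 256))            # 4 channels, 256 time steps
view = random_crop(jitter(scale(x, rng), rng), rng, length=128)
shuffled = channel_shuffle(x, rng)
print(view.shape)  # (4, 128)
```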
Through a unified augmentation interface, you can:
- Easily switch or stack different augmentation strategies during training and evaluation.
- Adjust augmentation strength and types for different industrial scenarios (e.g., vibration monitoring, process control).
- Remain compatible with future modalities (e.g., acoustic signals, image sequences).
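A unified interface of this kind is often a small composition class that applies each transform with its own probability. The sketch below assumes that design; `Compose`, `add_noise`, `time_mask`, and the two example pipelines are hypothetical and only illustrate how strategies might be stacked and tuned per scenario.

```python
import numpy as np
from typing import Callable, List, Sequence, Tuple

Transform = Callable[[np.ndarray, np.random.Generator], np.ndarray]

class Compose:
    """Apply each transform in order, each with its own probability."""

    def __init__(self, steps: Sequence[Tuple[Transform, float]]):
        self.steps: List[Tuple[Transform, float]] = list(steps)

    def __call__(self, x: np.ndarray, rng: np.random.Generator) -> np.ndarray:
        for transform, prob in self.steps:
            if rng.random() < prob:
                x = transform(x, rng)
        return x

def add_noise(std: float) -> Transform:
    return lambda x, rng: x + rng.normal(scale=std, size=x.shape)

def time_mask(width: int) -> Transform:
    def apply(x, rng):
        start = rng.integers(0, x.shape[-1] - width + 1)
        out = x.copy()
        out[..., start:start + width] = 0.0   # zero out one contiguous time span
        return out
    return apply

# Hypothetical scenario-specific pipelines: stronger for vibration data.
vibration_aug = Compose([(add_noise(0.10), 0.8), (time_mask(32), 0.5)])
process_aug = Compose([(add_noise(0.01), 0.5)])

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 256))
print(vibration_aug(x, rng).shape)  # (4, 256)
```

Changing the augmentation strength for a new scenario then means editing a list of `(transform, probability)` pairs rather than the training loop.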
- batch_generate_heatmaps.py: batch-generates heatmaps to visualize model attention on time-series or image inputs.
- src/visualize.py: general visualization utilities to align feature maps, predictions, and raw data.
- src/ui_idap_v1.py: a GUI wrapper for interactive workflows such as parameter tuning, data browsing, and result comparison.
For time-series tasks, heatmaps can show:
- Which time windows contribute most to classification or regression outputs.
- The importance distribution across channels in multichannel scenarios.
- Time windows where anomalies, drifts, or faults occur.
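One generic way to obtain a per-window importance score is occlusion: mask one time window at a time and measure how much the model output changes. The sketch below uses that technique as a stand-in; it is not necessarily how batch_generate_heatmaps.py computes its heatmaps, and `occlusion_importance` and the toy model are hypothetical.

```python
import numpy as np

def occlusion_importance(model, x, window, baseline=0.0):
    """Score each non-overlapping time window by the output change when it is masked.

    x: (C, T) input; model: callable that returns a scalar for a (C, T) array.
    """
    reference = model(x)
    n_windows = x.shape[1] // window
    scores = np.zeros(n_windows)
    for i in range(n_windows):
        occluded = x.copy()
        occluded[:, i * window:(i + 1) * window] = baseline
        scores[i] = abs(reference - model(occluded))
    return scores

def toy_model(x):
    """Toy stand-in for a trained model: it only looks at time steps 40 to 59."""
    return float(np.abs(x[:, 40:60]).mean())

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 100))
scores = occlusion_importance(toy_model, x, window=20)
print(scores.argmax())  # 2
```

Only the third window (steps 40 to 59) receives a nonzero score, because it is the only span the toy model depends on. Plotting `scores` along the time axis gives a one-dimensional heatmap.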
For image tasks, heatmaps can show:
- Different levels of attention on target regions vs. background.
- Potential bias or misprediction patterns.
- The impact of different augmentation strategies on attention distribution.
The presence of data/fault_diagnosis.npy and data/soft_sensor.npy indicates that TISA is tailored to the following typical industrial tasks:
- Fault diagnosis: use time-series signals from industrial sensors (e.g., temperature, vibration, current) for classification or anomaly detection of fault patterns.
- Soft sensing: when only partial variables are measured online, use models to predict hard-to-measure process indicators or product quality.
In these two scenarios, TISA:
- Reduces dependence on large labeled datasets via self-supervised pretraining followed by downstream fine-tuning.
- Enhances sensitivity to both slow trends and abrupt shocks with multi-timescale augmentation (combining short and long windows).
- Leverages an image perspective by transforming time-series into time–frequency or time–channel matrices to exploit visual models.
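The time–frequency transformation mentioned above can be illustrated with a short-time Fourier transform. The sketch below is a minimal NumPy version; the function `spectrogram`, the Hann window, the frame parameters, and the 125 Hz test tone sampled at 1000 Hz are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, n_fft=64, hop=32):
    """Magnitude STFT of a 1D signal; returns (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([
        signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# One second of a 125 Hz tone sampled at 1000 Hz (hypothetical sensor signal).
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 125 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
print(spec.shape)                      # (33, 30)
print(peak_bin * sample_rate / 64)     # frequency of the strongest bin in Hz
```

The resulting frequency-by-time matrix can be fed to a vision backbone in the same way as a grayscale image.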
The files spot-diff-main/split_csv/1cls.csv, 2cls_fewshot.csv, 2cls_highshot.csv and spot-diff-main/utils/prepare_data.py show that the repository supports data partitioning and loading for few-shot and many-shot settings:
- Few-shot: evaluate how quickly models adapt with very few labeled samples.
- High-shot: evaluate upper-bound performance with abundant labels.
- Single-/multi-class splits: for anomaly detection or one-class classification scenarios.
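A few-shot split can be drawn by sampling a fixed number of labeled examples per class and holding out the rest. The sketch below shows that idea on synthetic labels; it does not reproduce the CSV format under spot-diff-main/split_csv/, and `few_shot_indices` is a hypothetical helper.

```python
import numpy as np

def few_shot_indices(labels, shots, rng):
    """Pick `shots` training indices per class; return (train_idx, rest_idx)."""
    picked = []
    for cls in np.unique(labels):
        candidates = np.flatnonzero(labels == cls)
        if len(candidates) < shots:
            raise ValueError(f"class {cls} has fewer than {shots} samples")
        picked.append(rng.choice(candidates, size=shots, replace=False))
    train_idx = np.sort(np.concatenate(picked))
    rest_idx = np.setdiff1d(np.arange(len(labels)), train_idx)
    return train_idx, rest_idx

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)     # 50 normal and 50 faulty samples (synthetic)
train_idx, rest_idx = few_shot_indices(labels, shots=5, rng=rng)
print(len(train_idx), len(rest_idx))  # 10 90
```

Raising `shots` moves the same data from the few-shot regime toward the high-shot regime, which makes the two settings directly comparable.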
By comparing few-shot vs. high-shot results within a unified TISA representation space, you can:
- Analyze how self-supervised features improve label efficiency.
- Assess feature stability in small-sample regimes.
- Guide strategies for data collection and labeling.
- utils/config.py: centralized management of configuration parsing and merging, unifying command-line arguments and YAML files.
- utils/cluster.py: clustering utilities for time-series segments or image features, useful for unsupervised pattern discovery.
- utils/utils.py: common utilities for logging, random seed control, distributed training helpers, etc.
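As an example of the kind of clustering such a utility might provide, the sketch below implements plain k-means with a deterministic farthest-first initialization. The algorithm choice and the names `farthest_first`, `assign_clusters`, and `kmeans` are assumptions; utils/cluster.py may use a different method.

```python
import numpy as np

def farthest_first(x, k):
    """Deterministic init: start at x[0], then repeatedly add the farthest point."""
    centers = [x[0]]
    for _ in range(k - 1):
        dist = np.linalg.norm(
            x[:, None, :] - np.array(centers)[None, :, :], axis=2
        ).min(axis=1)
        centers.append(x[dist.argmax()])
    return np.array(centers)

def assign_clusters(x, centers):
    """Index of the nearest center for every row of x."""
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(x, k, n_iter=50):
    """Plain k-means on (N, D) features; returns (labels, centers)."""
    centers = farthest_first(x, k)
    for _ in range(n_iter):
        labels = assign_clusters(x, centers)
        updated = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return assign_clusters(x, centers), centers

# Two synthetic groups of 2D features standing in for segment embeddings.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(10.0, 0.5, size=(50, 2)),
])
labels, centers = kmeans(features, k=2)
print(np.bincount(labels))  # [50 50]
```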
- scripts/lint.sh: code quality and style checking.
- rename_files.py: batch file renaming for easier organization of data and results.
- process_all_categories.py: unified preprocessing for multi-category datasets.
These scripts and tools help to:
- Maintain consistency and reproducibility of experiments.
- Simplify data preparation and result management in multi-dataset, multi-task scenarios.
- Lay foundations for further engineering deployment (e.g., services, embedded systems).
Based on the repository structure, a typical workflow for time-series and image tasks can be outlined as follows (schematic only):
- Data preparation:
  - Organize raw time-series files into .npy or similar formats and place them in the data/ directory.
  - For image tasks, convert time-series into time–frequency maps, 2D matrices, or directly use image data.
  - Use the splits in spot-diff-main/split_csv/ or define custom partition strategies.
- Configuration:
  - Select or create configurations under configs/train/ and configs/eval/.
  - Adjust model size, input resolution, and augmentation strategies according to task type (time-series / image / hybrid).
  - For self-supervised pretraining, refer to ssl_default_config.yaml and train/ssl_meta_arch.py.
- Model training:
  - Launch training with train.py in the repository root or run/train/train.py.
  - Use utilities in utils/ for logging, checkpoint saving, and loading.
  - If needed, combine with other scripts under train/ to implement specific training strategies.
- Feature evaluation and downstream tasks:
  - Use eval/knn.py, eval/linear.py, eval/log_regression.py to evaluate feature quality.
  - Use the wrapped entry points under run/eval/ for batched evaluations.
  - For depth estimation or segmentation tasks, refer to the models and heads in eval/depth/ and eval/segmentation/.
- Visualization and analysis:
  - Use batch_generate_heatmaps.py to generate time-series / image heatmaps and inspect model attention.
  - Use src/visualize.py and src/ui_idap_v1.py for result visualization and interaction.
  - Further optimize and compare results with ensemble_models.py and optimize_knn.py.
- Industrial deployment and iteration:
  - Integrate TISA-trained models into production systems for fault diagnosis, soft sensing, etc.
  - Use utils/cluster.py and metric scripts to continuously monitor data distribution and model performance.
  - For new devices, operating conditions, or image modalities, continue to leverage self-supervision and few-shot strategies for rapid adaptation.
The TISA algorithm is designed around the goal of "unified handling of time-series and image data" by:
- Combining self-supervised pretraining with multi-task heads to improve feature generality and transferability.
- Providing rich data augmentation and visualization tools to support interpretation and debugging of model behavior.
- Supporting multiple evaluation protocols and few-shot scenarios to systematically characterize model performance in complex industrial environments.