📑 Report
The increasing prevalence of time-series data across domains such as healthcare, finance, and IoT has driven the need for flexible and generalizable modeling approaches. Traditional time-series analysis relies heavily on supervised models, which suffer from data scarcity and high deployment costs. This project explores the potential of Large Language Models (LLMs) as general-purpose few-shot learners for time-series classification tasks. Specifically, we investigate how different input representations affect LLM performance on two representative tasks: Dual-Tone Multi-Frequency (DTMF) signal decoding and Human Activity Recognition (HAR). By comparing text-based, visual, and multimodal inputs across models including GPT-4o, GPT-o1, and DeepSeek-R1, we assess their reasoning capabilities and robustness. Our findings show that while LLMs demonstrate potential, performance varies significantly depending on input representation and model type. In DTMF, GPT-o1 and DeepSeek-R1 consistently outperform GPT-4o, particularly in tasks requiring text-based numerical reasoning. In HAR, visualization aids interpretation, and few-shot learning significantly boosts performance. However, challenges remain, especially in precise plot reading, domain knowledge retrieval (GPT-4o), and multimodal integration. Domain-specific enhancements and more robust representation strategies are still needed for current LLMs.
python dtmf_run.py <subcommand> -m <model> -n <noise-type> [optional arguments]
- -m, --model:
Choose the model (LLM) for this task. Supported models: GPT-4o, GPT-o1, DeepSeek-R1.
  - Options: gpt-4o, o1-2024-12-17, deepseek-reasoner
- -n, --noise-type:
Choose the input data noise type. 'clean' means data generated with exact DTMF frequencies; 'noise' means data generated with added noise.
  - Options: noise, clean
- -r, --result-save-filename:
The file name to save results. (default: results_{model}_{noise-type}_{subcommand}[_{optional arguments}].csv)
- freq_text: Raw frequency-magnitude text input
python dtmf_run.py freq_text -m <model> -n <noise-type> [-g]
Optional Arguments:
- -g, --guide:
Add step-by-step guidance if set; otherwise no guidance will be added to the prompt.
- freq_plot: Plot frequency-magnitude pairs as a line plot for input
python dtmf_run.py freq_plot -m <model> -n <noise-type> [-g -gr]
Note that freq_plot only supports models GPT-4o and GPT-o1.
Optional Arguments:
- -g, --guide:
Add step-by-step guidance if set; otherwise no guidance will be added to the prompt.
- -gr, --grid:
Add grid lines to the input plots if set; otherwise input plots will not contain any grid lines.
- freq_pair: Input low/high frequency pair directly
python dtmf_run.py freq_pair -m <model> -n <noise-type> [-map]
Optional Arguments:
- -map, --map:
Provide the true DTMF frequency-key mapping if set.
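For reference, the freq_pair subcommand asks the model to map a detected (low, high) frequency pair to a keypad key against the standard DTMF table. A minimal sketch of that lookup (the function names and the tolerance value are illustrative, not part of the scripts):

```python
# Standard DTMF keypad: rows select the low tone, columns the high tone.
DTMF_LOW = [697, 770, 852, 941]        # row (low) frequencies in Hz
DTMF_HIGH = [1209, 1336, 1477, 1633]   # column (high) frequencies in Hz
DTMF_KEYS = [["1", "2", "3", "A"],
             ["4", "5", "6", "B"],
             ["7", "8", "9", "C"],
             ["*", "0", "#", "D"]]

def nearest(freq, candidates, tol):
    """Return the closest candidate frequency, or None if outside tolerance."""
    best = min(candidates, key=lambda c: abs(c - freq))
    return best if abs(best - freq) <= tol else None

def decode_pair(low, high, tol=15):
    """Map a detected (low, high) frequency pair to its DTMF key."""
    lo = nearest(low, DTMF_LOW, tol)
    hi = nearest(high, DTMF_HIGH, tol)
    if lo is None or hi is None:
        return None
    return DTMF_KEYS[DTMF_LOW.index(lo)][DTMF_HIGH.index(hi)]

print(decode_pair(695, 1212))  # -> 1 (snaps to 697/1209 Hz)
```

With the --map option set, exactly this table is given to the model; without it, the model must recall the mapping from its own knowledge.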
python dtmf_eval.py -r <result-save-filename> [optional arguments]
This script calculates the overall accuracy and three breakdown accuracies, unless --no-detail-acc is set. It also plots the corresponding confusion matrices. All evaluation results are saved under ./results/dtmf/ by default.
- Accuracy of the recognized key compared to the true key (overall accuracy). Its confusion matrix filename defaults to conf_matrix_{result-save-filename}_overall.png.
- Accuracy of the detected low frequency compared to the true low frequency. Its confusion matrix filename defaults to conf_matrix_{result-save-filename}_low_freq_{err-tolerance}Hz.png.
- Accuracy of the detected high frequency compared to the true high frequency. Its confusion matrix filename defaults to conf_matrix_{result-save-filename}_high_freq_{err-tolerance}Hz.png. The distribution plot of frequency detection errors is saved in freq_error_dist_{result-save-filename}.png.
- Accuracy of the recognized key compared to the detected frequencies. Its confusion matrix filename defaults to conf_matrix_{result-save-filename}_freq2key_{err-tolerance}Hz.png.
- -r, --result-save-filename:
Results file name used in dtmf_run.py. A prefix of "result_" and a file extension of ".csv" will be automatically added.
- -e, --err-tolerance:
An integer error tolerance range for frequency detection (default: 15 for freq_plot results, 5 for results of all other input types)
- --no-detail-acc:
Whether to calculate step-by-step accuracies. If guidance is included, step-by-step accuracies are calculated by default; set --no-detail-acc to disable this. Otherwise only the overall accuracy is calculated, regardless of whether this argument is set.
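The frequency breakdown accuracies count a detection as correct when it falls within --err-tolerance Hz of the true value. A minimal sketch of that check, with an illustrative function name and data (not taken from the scripts):

```python
def freq_accuracy(detected, truth, tol=5):
    """Fraction of detected frequencies within +/- tol Hz of the true value."""
    hits = sum(abs(d - t) <= tol for d, t in zip(detected, truth))
    return hits / len(truth)

# 697 and 772 Hz fall inside the 5 Hz band; 860 Hz misses 852 Hz by 8 Hz.
print(freq_accuracy([697, 772, 860], [697, 770, 852]))  # -> 2/3
```

The wider default tolerance for freq_plot results (15 Hz) reflects that reading frequencies off a plot is inherently less precise than parsing them from text.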
python shl_run.py -m <model> -i <input_type> [optional arguments]
- -m, --model:
Choose the model (LLM) for this task. Supported models: GPT-4o, GPT-o1, DeepSeek-R1.
  - Options: gpt-4o, o1-2024-12-17, deepseek-reasoner
- -i, --input:
Select the input representation format.
  - Options:
- time_text: IMU time-series as raw text
- time_text_fewshot: Raw text + one raw text example for each class
- time_text_description: Raw text + textual summary
(Note: the following input types only support models GPT-4o and GPT-o1.)
- time_plot: Time-series line plot
- time_plot_fewshot: Line plot + one line plot example for each class
- time_plot_env: Line plot + environment photo taken when the activity is happening
- env_only: Environment photo only
- -dn, --data-num:
Number of test samples per class (default: 30)
- -df, --data-folder:
Path to the dataset folder (default: ./datasets/SHL_processed/User1/220617/Torso_video/)
- -l, --location:
The body location at which the smartphone used for IMU data collection is placed (default: Torso)
- -f, --frequency:
Sampling frequency in Hz (default: 10 for time_text_fewshot and time_text_description, otherwise 100)
- -r, --result-save-filename:
The file name to save results. Defaults to results_{model}_User1_220617_{4*data-num}_{location}_{input}_4class.csv if not set.
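The -f/--frequency option controls the sampling rate of the IMU series fed to the model; lower rates shorten the prompt, which matters for raw-text inputs. One plausible way to decimate a 100 Hz series to 10 Hz (a sketch under the assumption of simple stride-based decimation; the scripts' actual resampling may differ):

```python
def decimate(samples, src_hz=100, dst_hz=10):
    """Keep every (src_hz // dst_hz)-th sample; assumes dst_hz divides src_hz."""
    if src_hz % dst_hz != 0:
        raise ValueError("dst_hz must divide src_hz evenly")
    return samples[:: src_hz // dst_hz]

one_second = list(range(100))     # one second of samples at 100 Hz
print(len(decimate(one_second)))  # -> 10 samples per second at 10 Hz
```

Stride-based decimation without a low-pass filter can alias high-frequency motion, which is one reason the text-only input types that use 10 Hz may lose fine-grained activity cues.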
python shl_eval.py -r <result-save-filename>
This script calculates the recognition accuracy and plots the confusion matrix. The confusion matrix is saved as ./results/HAR/conf_matrix_{result-save-filename}.png by default.
- -r, --result-save-filename:
Results file name used in shl_run.py. A prefix of "result_" and a file extension of ".csv" will be automatically added. This filename is also used to generate the confusion matrix file name.
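The confusion matrix that shl_eval.py plots can be tallied directly from the true/predicted label columns of the results CSV. A dependency-free sketch (the label set and function name are illustrative, not the project's actual class names):

```python
from collections import Counter

ACTIVITIES = ["Still", "Walking", "Run", "Car"]  # illustrative 4-class label set

def confusion_matrix(true_labels, pred_labels, labels=ACTIVITIES):
    """Tally a matrix where rows are true classes and columns are predictions."""
    counts = Counter(zip(true_labels, pred_labels))
    return [[counts[(t, p)] for p in labels] for t in labels]

cm = confusion_matrix(["Still", "Walking", "Walking"],
                      ["Still", "Walking", "Run"])
print(cm[1])  # Walking row: one correct, one confused with Run -> [0, 1, 1, 0]
```

Per-class accuracy is each diagonal entry divided by its row sum, which makes class-specific confusions (e.g. similar motion patterns) easy to spot in the saved plot.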