Mathia Learning Analytics Project

This project contains machine learning models for predicting three learning outcomes: Metacognition, Math Learning, and Motivation based on student behavioral data.

Project Overview

The project consists of three separate models, each predicting different learning outcomes:

Meta Model: Predicts Post3Meta (Metacognition Assessment)
Math Model: Predicts PostMath (Math Learning Outcomes)
Motivation Model: Predicts PostMotivation3 (Motivation Scores)

Each model uses behavioral features extracted from student interaction data and workspace summaries.

Installation

1. Recommended - Create virtual/conda environment

2. Install Required Libraries

# Install all required packages
pip install -r requirements.txt

3. Running the Models

RECOMMENDED: Only make required changes to test notebooks and run on pre-trained uploaded models IMPORTANT: Follow the exact order below for each model type

Step 1: Meta Model (Metacognition)

1.1 Train Meta Model

# Run the training notebook
jupyter notebook meta_model_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of meta_model_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_meta.to_csv("preprocessed_data/df_cleaned_meta.csv", index=False)

1.2 Test Meta Model

# Run the test notebook
jupyter notebook meta_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of meta_test_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_meta.to_csv("preprocessed_data/df_cleaned_meta.csv", index=False)

Step 2: Math Model (Math Learning)

2.1 Train Math Model

# Run the training notebook
jupyter notebook math_model_team2.ipynb

🔧 REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of math_model_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_math.to_csv("preprocessed_data/df_cleaned_math.csv", index=False)

2.2 Test Math Model

# Run the test notebook
jupyter notebook math_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of math_test_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_math.to_csv("preprocessed_data/df_cleaned_math.csv", index=False)

Motivation Model

3.1 Train Motivation Model

# Run the training notebook
jupyter notebook motivation_model_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of motivation_model_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_map.to_csv("preprocessed_data/df_cleaned_map.csv", index=False)
df_cleaned_se.to_csv("preprocessed_data/df_cleaned_se.csv", index=False)

3.2 Test Motivation Model

# Run the test notebook
jupyter notebook motivation_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):

In the preprocessing section of motivation_test_team2.ipynb, replace:

# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")

# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)

# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_map.to_csv("preprocessed_data/df_cleaned_map.csv", index=False)
df_cleaned_se.to_csv("preprocessed_data/df_cleaned_se.csv", index=False)

Model Details

Meta Model (Classification)

Target: Post3Meta (Metacognition Assessment)
Models: RandomForest, GradientBoosting, XGBoost
Evaluation: Quadratic Weighted Kappa (QWK)
Features: Learning behavior patterns, help-seeking behavior
Output: meta_model.pkl (model, label_encoder, feature_columns)

Math Model (Regression)

Target: PostMath (Math Learning Outcomes)
Model: ExtraTreesRegressor (single model)
Evaluation: R², RMSE, MAE
Features: Workspace completion, error rates, skills mastery
Output: math_model.pkl (model, feature_columns)

Motivation Model (Classification)

Target: PostMotivation3 (Average of PostSE and PostMAP)
Models: RandomForest, GradientBoosting
Evaluation: Quadratic Weighted Kappa (QWK)
Features: MAP-specific and SE-specific behavioral patterns
Output: motivation_model.pkl (model, feature_columns)

Output Files

Trained Models

meta_model.pkl - Trained metacognition model
math_model.pkl - Trained math model
motivation_model.pkl - Trained motivation model

Predictions

meta_predictions.csv - Metacognition predictions
math_predictions.csv - Math predictions
motivation_predictions.csv - Motivation predictions

Preprocessed Data

df_main_allws.csv - Main interaction data (all workspaces)
df_ws_allws.csv - Workspace data (all workspaces)
df_main.csv - Main interaction data (filtered)
df_ws.csv - Workspace data (filtered)
df_scores.csv - Student scores
df_cleaned_math.csv - Cleaned math scores
df_cleaned_meta.csv - Cleaned metacognition scores
df_cleaned_map.csv - Cleaned MAP scores
df_cleaned_se.csv - Cleaned SE scores

Important Notes

File Paths: Always replace the file paths in the preprocessing sections with your actual data locations
Order: Train models before testing them
Data Requirements: Ensure all required CSV files are present in the correct directories
Model Files: Trained models (.pkl files) must be in the same directory as the test notebooks
Memory: Some notebooks may require significant memory for large datasets

Troubleshooting

Common Issues:

FileNotFoundError: Check that all file paths are correctly specified
MemoryError: Consider using smaller datasets or increasing system memory
ImportError: Ensure all required libraries are installed via requirements.txt

Getting Help:

If you encounter issues, check:

All file paths are correctly specified
All required libraries are installed
Data files are in the correct format
Sufficient system resources are available

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Mathia Learning Analytics Project

Table of Contents

Project Overview

Installation

1. Recommended - Create virtual/conda environment

2. Install Required Libraries

3. Running the Models

Step 1: Meta Model (Metacognition)

1.1 Train Meta Model

1.2 Test Meta Model

Step 2: Math Model (Math Learning)

2.1 Train Math Model

2.2 Test Math Model

Motivation Model

3.1 Train Motivation Model

3.2 Test Motivation Model

Model Details

Meta Model (Classification)

Math Model (Regression)

Motivation Model (Classification)

Output Files

Trained Models

Predictions

Preprocessed Data

Important Notes

Troubleshooting

Common Issues:

Getting Help:

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
MATHia Challenge Summary Document - Team 2 (Erica, Tobasum, Sefika).pdf		MATHia Challenge Summary Document - Team 2 (Erica, Tobasum, Sefika).pdf
README.md		README.md
math_model.pkl		math_model.pkl
math_model_team2.ipynb		math_model_team2.ipynb
math_test_team2.ipynb		math_test_team2.ipynb
meta_model_team2.ipynb		meta_model_team2.ipynb
meta_model_team2.pkl		meta_model_team2.pkl
meta_test_team2.ipynb		meta_test_team2.ipynb
motivation_model.pkl		motivation_model.pkl
motivation_model_team2.ipynb		motivation_model_team2.ipynb
motivation_test_team2.ipynb		motivation_test_team2.ipynb

Folders and files

Latest commit

History

Repository files navigation

Mathia Learning Analytics Project

Table of Contents

Project Overview

Installation

1. Recommended - Create virtual/conda environment

2. Install Required Libraries

3. Running the Models

Step 1: Meta Model (Metacognition)

1.1 Train Meta Model

1.2 Test Meta Model

Step 2: Math Model (Math Learning)

2.1 Train Math Model

2.2 Test Math Model

Motivation Model

3.1 Train Motivation Model

3.2 Test Motivation Model

Model Details

Meta Model (Classification)

Math Model (Regression)

Motivation Model (Classification)

Output Files

Trained Models

Predictions

Preprocessed Data

Important Notes

Troubleshooting

Common Issues:

Getting Help:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages