This project contains machine learning models that predict three learning outcomes (Metacognition, Math Learning, and Motivation) from student behavioral data.
It consists of three separate models, one per outcome:
- Meta Model: Predicts Post3Meta (Metacognition Assessment)
- Math Model: Predicts PostMath (Math Learning Outcomes)
- Motivation Model: Predicts PostMotivation3 (Motivation Scores)
Each model uses behavioral features extracted from student interaction data and workspace summaries.
# Install all required packages
pip install -r requirements.txt

RECOMMENDED: Make only the required changes to the test notebooks and run them on the pre-trained, uploaded models.
IMPORTANT: Follow the exact order below for each model type.
# Run the training notebook
jupyter notebook meta_model_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of meta_model_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_meta.to_csv("preprocessed_data/df_cleaned_meta.csv", index=False)

# Run the test notebook
jupyter notebook meta_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of meta_test_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_meta.to_csv("preprocessed_data/df_cleaned_meta.csv", index=False)

# Run the training notebook
jupyter notebook math_model_team2.ipynb

🔧 REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of math_model_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_math.to_csv("preprocessed_data/df_cleaned_math.csv", index=False)

# Run the test notebook
jupyter notebook math_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of math_test_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_math.to_csv("preprocessed_data/df_cleaned_math.csv", index=False)

# Run the training notebook
jupyter notebook motivation_model_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of motivation_model_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_map.to_csv("preprocessed_data/df_cleaned_map.csv", index=False)
df_cleaned_se.to_csv("preprocessed_data/df_cleaned_se.csv", index=False)

# Run the test notebook
jupyter notebook motivation_test_team2.ipynb

REQUIRED FILE PATH REPLACEMENTS (Preprocessing Section):
In the preprocessing section of motivation_test_team2.ipynb, replace:
# REPLACE THESE PATHS WITH YOUR ACTUAL TEST DATA LOCATIONS
df_main = pd.read_csv("./raw_data/training_set_with_formatted_time.csv")
df_ws = pd.read_csv("./raw_data/workspace_summary_train.csv")
df_scores = pd.read_csv("./raw_data/student_scores_train.csv")
# REPLACE THESE OUTPUT PATHS
df_main.to_csv('preprocessed_data/df_main_allws.csv', index=False)
df_ws.to_csv('preprocessed_data/df_ws_allws.csv', index=False)
# REPLACE THESE OUTPUT PATHS
df_main.to_csv("preprocessed_data/df_main.csv", index=False)
df_ws.to_csv("preprocessed_data/df_ws.csv", index=False)
df_scores.to_csv("preprocessed_data/df_scores.csv", index=False)
df_cleaned_map.to_csv("preprocessed_data/df_cleaned_map.csv", index=False)
df_cleaned_se.to_csv("preprocessed_data/df_cleaned_se.csv", index=False)

- Target: Post3Meta (Metacognition Assessment)
- Models: RandomForest, GradientBoosting, XGBoost
- Evaluation: Quadratic Weighted Kappa (QWK)
- Features: Learning behavior patterns, help-seeking behavior
- Output: meta_model.pkl (model, label_encoder, feature_columns)
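The QWK metric named above can be sketched in plain Python to show what it measures. This is a minimal illustration, not the notebooks' code: the function name `quadratic_weighted_kappa` is hypothetical, and it assumes the labels are integer-coded ordinal ratings. scikit-learn offers this metric as `cohen_kappa_score(..., weights="quadratic")`.

```python
def quadratic_weighted_kappa(y_true, y_pred):
    """Quadratic Weighted Kappa for two equal-length lists of integer ratings.

    Hypothetical helper for illustration; assumes integer-coded ordinal labels.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    n = hi - lo + 1
    if n == 1:
        return 1.0  # only one rating level occurs, so agreement is trivial

    # Observed confusion matrix: rows are true ratings, columns are predictions.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - lo][p - lo] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    total = len(y_true)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement penalty
            expected = hist_true[i] * hist_pred[j] / total  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A score of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. Because the penalty grows with the squared distance between ratings, predictions that miss by one level cost far less than predictions that miss by several.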
- Target: PostMath (Math Learning Outcomes)
- Model: ExtraTreesRegressor (single model)
- Evaluation: R², RMSE, MAE
- Features: Workspace completion, error rates, skills mastery
- Output: math_model.pkl (model, feature_columns)
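For reference, the three regression metrics listed above can be computed as follows. This is a small standard-library sketch, assuming plain lists of numeric scores; the helper name `regression_metrics` is hypothetical and the notebooks may compute these with a library instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for two equal-length lists of numbers.

    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    rmse = math.sqrt(ss_res / n)                      # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the true values have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "rmse": rmse, "mae": mae}
```

RMSE and MAE are in the same units as PostMath, with RMSE penalizing large misses more heavily. R² compares the model against always predicting the mean score, so a value near 0 means the model adds little over that baseline.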
- Target: PostMotivation3 (Average of PostSE and PostMAP)
- Models: RandomForest, GradientBoosting
- Evaluation: Quadratic Weighted Kappa (QWK)
- Features: MAP-specific and SE-specific behavioral patterns
- Output: motivation_model.pkl (model, feature_columns)
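Each output file above bundles the model with the feature column list it was trained on, so test data can be arranged in the same column order. The sketch below shows one way such a bundle might be loaded and used. It assumes the bundle is pickled as a tuple in the order listed, which this README does not confirm (it could equally be a dict); `load_bundle` and `align_features` are hypothetical names.

```python
import pickle
from pathlib import Path


def load_bundle(path, expected_parts):
    """Load a pickled bundle and check that it has the expected number of parts.

    Assumes a tuple/list layout such as (model, feature_columns); the real
    files may be stored differently.
    """
    with Path(path).open("rb") as fh:
        bundle = pickle.load(fh)  # only unpickle files from a trusted source
    if not isinstance(bundle, (tuple, list)) or len(bundle) != expected_parts:
        raise ValueError(f"{path}: expected a bundle of {expected_parts} parts")
    return tuple(bundle)


def align_features(row, feature_columns):
    """Return the row's values in training column order; fail on missing columns."""
    missing = [col for col in feature_columns if col not in row]
    if missing:
        raise KeyError(f"missing feature columns: {missing}")
    return [row[col] for col in feature_columns]
```

Under the same assumption, `model, feature_columns = load_bundle("math_model.pkl", expected_parts=2)` would unpack the math bundle, and the meta bundle would use `expected_parts=3` to include the label encoder. Failing early on a missing column is a deliberate choice here: silently filling it would produce predictions from features the model never saw in that position.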
- meta_model.pkl - Trained metacognition model
- math_model.pkl - Trained math model
- motivation_model.pkl - Trained motivation model

- meta_predictions.csv - Metacognition predictions
- math_predictions.csv - Math predictions
- motivation_predictions.csv - Motivation predictions

- df_main_allws.csv - Main interaction data (all workspaces)
- df_ws_allws.csv - Workspace data (all workspaces)
- df_main.csv - Main interaction data (filtered)
- df_ws.csv - Workspace data (filtered)
- df_scores.csv - Student scores
- df_cleaned_math.csv - Cleaned math scores
- df_cleaned_meta.csv - Cleaned metacognition scores
- df_cleaned_map.csv - Cleaned MAP scores
- df_cleaned_se.csv - Cleaned SE scores
- File Paths: Always replace the file paths in the preprocessing sections with your actual data locations
- Order: Train models before testing them
- Data Requirements: Ensure all required CSV files are present in the correct directories
- Model Files: Trained models (.pkl files) must be in the same directory as the test notebooks
- Memory: Some notebooks may require significant memory for large datasets
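Several of the notes above come down to files being where the notebooks expect them. A quick pre-flight check along the following lines can catch that before a notebook run. It is only a sketch: the directory and file names mirror the placeholder paths shown in this README and should be replaced with your actual locations, and `missing_files` and `ensure_output_dir` are hypothetical helpers.

```python
from pathlib import Path

# Placeholder names copied from the README's example paths; adjust as needed.
RAW_FILES = [
    "training_set_with_formatted_time.csv",
    "workspace_summary_train.csv",
    "student_scores_train.csv",
]
MODEL_FILES = ["meta_model.pkl", "math_model.pkl", "motivation_model.pkl"]


def missing_files(raw_dir="raw_data", model_dir=".", need_models=True):
    """Return the expected input files that do not exist yet."""
    expected = [Path(raw_dir) / name for name in RAW_FILES]
    if need_models:  # only the test notebooks need the trained .pkl files
        expected += [Path(model_dir) / name for name in MODEL_FILES]
    return [path for path in expected if not path.is_file()]


def ensure_output_dir(path="preprocessed_data"):
    """Create the output directory for preprocessed CSVs if it is absent."""
    out = Path(path)
    out.mkdir(parents=True, exist_ok=True)
    return out


if __name__ == "__main__":
    for path in missing_files():
        print(f"missing: {path}")
```

Creating the output directory up front is worthwhile because `DataFrame.to_csv` does not create missing parent directories. Before training, call `missing_files(need_models=False)`, since the .pkl files do not exist until training has finished.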
- FileNotFoundError: Check that all file paths are correctly specified
- MemoryError: Consider using smaller datasets or increasing system memory
- ImportError: Ensure all required libraries are installed via requirements.txt
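To narrow down an ImportError before opening a notebook, a check like the one below reports which modules Python cannot locate. The module names are assumptions inferred from the models and calls this README mentions (pandas, scikit-learn, XGBoost); requirements.txt remains the authoritative list, and `find_missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed top-level import names; consult requirements.txt for the real list.
ASSUMED_MODULES = ("pandas", "sklearn", "xgboost")


def find_missing_modules(names=ASSUMED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("missing modules:", ", ".join(missing) if missing else "none")
```

Note that import names can differ from package names on PyPI: scikit-learn, for example, is imported as `sklearn`.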
If you encounter issues, check:
- All file paths are correctly specified
- All required libraries are installed
- Data files are in the correct format
- Sufficient system resources are available