A sample data science project that uses a linear regression model built in R to predict house prices from the Ames Housing dataset. Specifically, this example demonstrates how to create ModelOp Center (MOC)-compliant code.
To run locally, first make sure the R version and libraries match the training environment. The model was trained on R 4.2.1. To install the required packages, run:
$ R -e 'install.packages("remotes", repos="http://cran.rstudio.com", dependencies=TRUE);'
$ R -e 'remotes::install_url(url="https://cran.r-project.org/src/contrib/Archive/readr/readr_1.3.0.tar.gz", dependencies=TRUE, upgrade=TRUE);'
$ R -e 'remotes::install_url(url="https://cran.r-project.org/src/contrib/Archive/tidymodels/tidymodels_0.1.4.tar.gz", dependencies=TRUE, upgrade=TRUE);'
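After installing, it can be useful to confirm that the local environment matches the versions named above. The following is a minimal sketch using only base R; the helper name `version_matches` is hypothetical and not part of this repository.

```r
# Expected versions, taken from the install instructions above.
expected_r <- "4.2.1"
expected_pkgs <- c(readr = "1.3.0", tidymodels = "0.1.4")

# Hypothetical helper: TRUE when the installed version equals the expected
# one, FALSE when it differs, NA when the package is not installed.
version_matches <- function(pkg, expected) {
  if (!nzchar(system.file(package = pkg))) {
    return(NA)
  }
  packageVersion(pkg) == package_version(expected)
}

# Compare the running R version and each package against the expectations.
r_ok <- getRversion() == package_version(expected_r)
pkg_ok <- vapply(
  names(expected_pkgs),
  function(p) version_matches(p, expected_pkgs[[p]]),
  logical(1)
)

print(r_ok)
print(pkg_ok)
```

A `FALSE` or `NA` entry points at a package that should be reinstalled before running the model locally.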
- house_price.R is the R code that houses the MOC-compliant code to predict and get metrics on data.
- trained_model.RData is the trained model artifact that is loaded upon prediction. In our case, the artifact is a workflow built on top of a recipe that includes a few data cleaning steps and a call to a linear regression model.
- The datasets used for scoring are baseline.json and sample.json. These datasets represent raw data that would first be run through a batch scoring job. A sample of the output of the scoring job is provided in the output_action_sample.json file.
- The datasets for metrics are baseline_scored.json and sample_scored.json. These datasets represent data to which the predictions from a scoring job have been appended. The column Sale_Price is renamed to ground_truth (not a necessary step).
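To see what an .RData artifact such as trained_model.RData contains, one option is to load it into a separate environment and list the stored objects. The sketch below uses a stand-in `lm` model saved to a temporary file, because the name of the object stored in the real artifact is not stated here; `load_artifact` is a hypothetical helper.

```r
# Hypothetical helper: load an .RData file into its own environment and
# return that environment, so nothing in the global workspace is overwritten.
load_artifact <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env
}

# Demonstration with a stand-in artifact; a real run would point at
# trained_model.RData instead of this temporary file.
demo_path <- tempfile(fileext = ".RData")
stand_in_model <- lm(dist ~ speed, data = cars)
save(stand_in_model, file = demo_path)

artifact <- load_artifact(demo_path)
print(ls(artifact))  # names of the objects stored in the file
```

For the real artifact, the tidymodels package would presumably need to be attached before calling `predict()` on the loaded workflow.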
- For a scoring job, use the baseline.json or the sample.json files. The output is a JSON object that has the original Sale_Price and prediction for each input row.
- For a metrics job, use the baseline_scored.json or the sample_scored.json files. The output is a list of the relevant metrics (RMSE, R2, MAE) for the regression model.
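The three metrics named above can be computed from a ground-truth column and a prediction column in a few lines of base R. This is a minimal sketch, not the code in house_price.R: the helper name `regression_metrics` is hypothetical, and R2 is assumed here to be 1 - SSE/SST, whereas the model code may use another definition (for example, the squared correlation).

```r
# Hypothetical helper: regression metrics from ground truth and predictions.
regression_metrics <- function(actual, predicted) {
  residuals <- actual - predicted
  sse <- sum(residuals^2)                  # sum of squared errors
  sst <- sum((actual - mean(actual))^2)    # total sum of squares
  list(
    RMSE = sqrt(mean(residuals^2)),
    # Assumption: R2 as 1 - SSE/SST; the model code may define it differently.
    R2 = 1 - sse / sst,
    MAE = mean(abs(residuals))
  )
}

# Toy values for illustration only (not taken from the datasets).
metrics <- regression_metrics(actual = c(1, 2, 3), predicted = c(1, 2, 4))
print(metrics)
```

In a metrics job, `actual` and `predicted` would correspond to the ground_truth and prediction columns of the scored datasets.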
The input data to the scoring job is sample.json, which is a JSON-lines file (one-line JSON records). Here are the first two records:
{"MS_SubClass":"One_Story_1946_and_Newer_All_Styles","MS_Zoning":"Residential_Low_Density","Lot_Frontage":81,"Lot_Area":14267,"Street":"Pave","Alley":"No_Alley_Access","Lot_Shape":"Slightly_Irregular","Land_Contour":"Lvl","Utilities":"AllPub","Lot_Config":"Corner","Land_Slope":"Gtl","Neighborhood":"North_Ames","Condition_1":"Norm","Condition_2":"Norm","Bldg_Type":"OneFam","House_Style":"One_Story","Overall_Cond":"Above_Average","Year_Built":1958,"Year_Remod_Add":1958,"Roof_Style":"Hip","Roof_Matl":"CompShg","Exterior_1st":"Wd Sdng","Exterior_2nd":"Wd Sdng","Mas_Vnr_Type":"BrkFace","Mas_Vnr_Area":108,"Exter_Cond":"Typical","Foundation":"CBlock","Bsmt_Cond":"Typical","Bsmt_Exposure":"No","BsmtFin_Type_1":"ALQ","BsmtFin_SF_1":1,"BsmtFin_Type_2":"Unf","BsmtFin_SF_2":0,"Bsmt_Unf_SF":406,"Total_Bsmt_SF":1329,"Heating":"GasA","Heating_QC":"Typical","Central_Air":"Y","Electrical":"SBrkr","First_Flr_SF":1329,"Second_Flr_SF":0,"Gr_Liv_Area":1329,"Bsmt_Full_Bath":0,"Bsmt_Half_Bath":0,"Full_Bath":1,"Half_Bath":1,"Bedroom_AbvGr":3,"Kitchen_AbvGr":1,"TotRms_AbvGrd":6,"Functional":"Typ","Fireplaces":0,"Garage_Type":"Attchd","Garage_Finish":"Unf","Garage_Cars":1,"Garage_Area":312,"Garage_Cond":"Typical","Paved_Drive":"Paved","Wood_Deck_SF":393,"Open_Porch_SF":36,"Enclosed_Porch":0,"Three_season_porch":0,"Screen_Porch":0,"Pool_Area":0,"Pool_QC":"No_Pool","Fence":"No_Fence","Misc_Feature":"Gar2","Misc_Val":12500,"Mo_Sold":6,"Year_Sold":2010,"Sale_Type":"WD ","Sale_Condition":"Normal","Sale_Price":172000,"Longitude":-93.6194,"Latitude":42.0527,"Sale_Price_log":5.2355}
{"MS_SubClass":"One_Story_PUD_1946_and_Newer","MS_Zoning":"Residential_Low_Density","Lot_Frontage":39,"Lot_Area":5389,"Street":"Pave","Alley":"No_Alley_Access","Lot_Shape":"Slightly_Irregular","Land_Contour":"Lvl","Utilities":"AllPub","Lot_Config":"Inside","Land_Slope":"Gtl","Neighborhood":"Stone_Brook","Condition_1":"Norm","Condition_2":"Norm","Bldg_Type":"TwnhsE","House_Style":"One_Story","Overall_Cond":"Average","Year_Built":1995,"Year_Remod_Add":1996,"Roof_Style":"Gable","Roof_Matl":"CompShg","Exterior_1st":"CemntBd","Exterior_2nd":"CmentBd","Mas_Vnr_Type":"None","Mas_Vnr_Area":0,"Exter_Cond":"Typical","Foundation":"PConc","Bsmt_Cond":"Typical","Bsmt_Exposure":"No","BsmtFin_Type_1":"GLQ","BsmtFin_SF_1":3,"BsmtFin_Type_2":"Unf","BsmtFin_SF_2":0,"Bsmt_Unf_SF":415,"Total_Bsmt_SF":1595,"Heating":"GasA","Heating_QC":"Excellent","Central_Air":"Y","Electrical":"SBrkr","First_Flr_SF":1616,"Second_Flr_SF":0,"Gr_Liv_Area":1616,"Bsmt_Full_Bath":1,"Bsmt_Half_Bath":0,"Full_Bath":2,"Half_Bath":0,"Bedroom_AbvGr":2,"Kitchen_AbvGr":1,"TotRms_AbvGrd":5,"Functional":"Typ","Fireplaces":1,"Garage_Type":"Attchd","Garage_Finish":"RFn","Garage_Cars":2,"Garage_Area":608,"Garage_Cond":"Typical","Paved_Drive":"Paved","Wood_Deck_SF":237,"Open_Porch_SF":152,"Enclosed_Porch":0,"Three_season_porch":0,"Screen_Porch":0,"Pool_Area":0,"Pool_QC":"No_Pool","Fence":"No_Fence","Misc_Feature":"None","Misc_Val":0,"Mo_Sold":3,"Year_Sold":2010,"Sale_Type":"WD ","Sale_Condition":"Normal","Sale_Price":236500,"Longitude":-93.6329,"Latitude":42.0611,"Sale_Price_log":5.3738}
The input data to the metrics job is sample_scored.json, which is a JSON-lines file (one-line JSON records). Here are the first two records:
{"ground_truth":172000,"prediction":143116.3251,"MS_SubClass":"One_Story_1946_and_Newer_All_Styles","MS_Zoning":"Residential_Low_Density","Lot_Frontage":81,"Lot_Area":14267,"Street":"Pave","Alley":"No_Alley_Access","Lot_Shape":"Slightly_Irregular","Land_Contour":"Lvl","Utilities":"AllPub","Lot_Config":"Corner","Land_Slope":"Gtl","Neighborhood":"North_Ames","Condition_1":"Norm","Condition_2":"Norm","Bldg_Type":"OneFam","House_Style":"One_Story","Overall_Cond":"Above_Average","Year_Built":1958,"Year_Remod_Add":1958,"Roof_Style":"Hip","Roof_Matl":"CompShg","Exterior_1st":"Wd Sdng","Exterior_2nd":"Wd Sdng","Mas_Vnr_Type":"BrkFace","Mas_Vnr_Area":108,"Exter_Cond":"Typical","Foundation":"CBlock","Bsmt_Cond":"Typical","Bsmt_Exposure":"No","BsmtFin_Type_1":"ALQ","BsmtFin_SF_1":1,"BsmtFin_Type_2":"Unf","BsmtFin_SF_2":0,"Bsmt_Unf_SF":406,"Total_Bsmt_SF":1329,"Heating":"GasA","Heating_QC":"Typical","Central_Air":"Y","Electrical":"SBrkr","First_Flr_SF":1329,"Second_Flr_SF":0,"Gr_Liv_Area":1329,"Bsmt_Full_Bath":0,"Bsmt_Half_Bath":0,"Full_Bath":1,"Half_Bath":1,"Bedroom_AbvGr":3,"Kitchen_AbvGr":1,"TotRms_AbvGrd":6,"Functional":"Typ","Fireplaces":0,"Garage_Type":"Attchd","Garage_Finish":"Unf","Garage_Cars":1,"Garage_Area":312,"Garage_Cond":"Typical","Paved_Drive":"Paved","Wood_Deck_SF":393,"Open_Porch_SF":36,"Enclosed_Porch":0,"Three_season_porch":0,"Screen_Porch":0,"Pool_Area":0,"Pool_QC":"No_Pool","Fence":"No_Fence","Misc_Feature":"Gar2","Misc_Val":12500,"Mo_Sold":6,"Year_Sold":2010,"Sale_Type":"WD ","Sale_Condition":"Normal","Longitude":-93.6194,"Latitude":42.0527,"Sale_Price_log":5.2355}
{"ground_truth":236500,"prediction":254932.8695,"MS_SubClass":"One_Story_PUD_1946_and_Newer","MS_Zoning":"Residential_Low_Density","Lot_Frontage":39,"Lot_Area":5389,"Street":"Pave","Alley":"No_Alley_Access","Lot_Shape":"Slightly_Irregular","Land_Contour":"Lvl","Utilities":"AllPub","Lot_Config":"Inside","Land_Slope":"Gtl","Neighborhood":"Stone_Brook","Condition_1":"Norm","Condition_2":"Norm","Bldg_Type":"TwnhsE","House_Style":"One_Story","Overall_Cond":"Average","Year_Built":1995,"Year_Remod_Add":1996,"Roof_Style":"Gable","Roof_Matl":"CompShg","Exterior_1st":"CemntBd","Exterior_2nd":"CmentBd","Mas_Vnr_Type":"None","Mas_Vnr_Area":0,"Exter_Cond":"Typical","Foundation":"PConc","Bsmt_Cond":"Typical","Bsmt_Exposure":"No","BsmtFin_Type_1":"GLQ","BsmtFin_SF_1":3,"BsmtFin_Type_2":"Unf","BsmtFin_SF_2":0,"Bsmt_Unf_SF":415,"Total_Bsmt_SF":1595,"Heating":"GasA","Heating_QC":"Excellent","Central_Air":"Y","Electrical":"SBrkr","First_Flr_SF":1616,"Second_Flr_SF":0,"Gr_Liv_Area":1616,"Bsmt_Full_Bath":1,"Bsmt_Half_Bath":0,"Full_Bath":2,"Half_Bath":0,"Bedroom_AbvGr":2,"Kitchen_AbvGr":1,"TotRms_AbvGrd":5,"Functional":"Typ","Fireplaces":1,"Garage_Type":"Attchd","Garage_Finish":"RFn","Garage_Cars":2,"Garage_Area":608,"Garage_Cond":"Typical","Paved_Drive":"Paved","Wood_Deck_SF":237,"Open_Porch_SF":152,"Enclosed_Porch":0,"Three_season_porch":0,"Screen_Porch":0,"Pool_Area":0,"Pool_QC":"No_Pool","Fence":"No_Fence","Misc_Feature":"None","Misc_Val":0,"Mo_Sold":3,"Year_Sold":2010,"Sale_Type":"WD ","Sale_Condition":"Normal","Longitude":-93.6329,"Latitude":42.0611,"Sale_Price_log":5.3738}
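JSON-lines files like the ones above can be read into a data frame one record per line. The sketch below assumes the jsonlite package is available (it is not listed in the install steps above) and uses two shortened stand-in records written to a temporary file, keeping only a few of the fields for brevity.

```r
library(jsonlite)

# Two shortened stand-in records in the same one-record-per-line layout.
json_lines <- c(
  '{"ground_truth":172000,"prediction":143116.3251,"Year_Built":1958}',
  '{"ground_truth":236500,"prediction":254932.8695,"Year_Built":1995}'
)
demo_file <- tempfile(fileext = ".json")
writeLines(json_lines, demo_file)

# stream_in() parses one JSON record per line and returns a data frame.
# A real run would pass file("sample_scored.json") instead.
scored <- stream_in(file(demo_file), verbose = FALSE)

print(nrow(scored))
print(scored[, c("ground_truth", "prediction")])
```

The resulting data frame has one row per record and one column per field, so the ground_truth and prediction columns can be used directly for metrics.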