non-parametric modeling
-
dendrogram
-
silhouette scores
-
aggregate variables
-
descriptive analysis
-
feature importance (coef, quantify the impact on clustering)
-
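A minimal sketch of the dendrogram and silhouette steps listed above, assuming hierarchical (Ward) clustering with SciPy and scikit-learn. The notes do not name a clustering method or dataset, so the synthetic blobs, the Ward linkage, and the range of k are all assumptions for illustration.

```python
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): three separated groups.
X, _ = make_blobs(
    n_samples=150,
    centers=[[-6, -6], [0, 6], [6, -6]],
    cluster_std=0.8,
    random_state=0,
)
X = StandardScaler().fit_transform(X)

# Ward linkage builds the merge tree that the dendrogram displays.
Z = linkage(X, method="ward")
# no_plot=True only computes the layout; drop it (with matplotlib) to draw.
tree = dendrogram(Z, no_plot=True)

# Cut the tree at several k and compare mean silhouette scores.
scores = {}
for k in range(2, 7):
    labels = fcluster(Z, t=k, criterion="maxclust")
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The k with the highest mean silhouette is one candidate for the number of clusters; the dendrogram gives a visual cross-check on where to cut.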
describe the clusters: novel characteristics of each cluster (descriptive analysis)
gradient boosting, random forest: non-parametric and non-linear
feature selection, then re-describe the clusters
2 weeks: modeling, literature summary sheets
random forest feature selection, feature selection based on buckets
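A rough sketch of random forest feature selection against cluster labels, one way to quantify each feature's impact on the clustering. Everything here is assumed for illustration: the synthetic data, the hypothetical feature names f0..f4, KMeans as the clustering step, and keeping the top two features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical data: f0 and f1 carry the cluster structure, f2..f4 are noise.
informative, _ = make_blobs(
    n_samples=300,
    centers=[[-6, -6], [0, 6], [6, -6]],
    cluster_std=0.8,
    random_state=0,
)
noise = rng.normal(size=(300, 3))
X = np.hstack([informative, noise])
names = [f"f{i}" for i in range(X.shape[1])]

# Cluster first, then treat the cluster labels as a classification target.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Impurity-based importances show which features separate the clusters.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, labels)
importances = forest.feature_importances_

# Rank features and keep the top two (the cutoff is an arbitrary choice).
ranking = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
selected = [name for name, _ in ranking[:2]]
print(ranking, selected)
```

After selection, the clustering and descriptive analysis can be re-run on the reduced feature set to check whether the cluster descriptions stay stable.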