Check out my solution for the FORCE 2020 Lithology Classification Contest. The objective of the competition is to create a machine learning model that correctly predicts lithology labels from the provided well logs, the provided NPD lithostratigraphy, and the X, Y position of each well. The dataset can be accessed from the contest page. Contest link: https://github.com/bolgebrygg/Force-2020-Machine-Learning-competition/tree/master/lithology_competition
My solution follows this strategy:
- Clustering Based on Spatial Location
- Missing Data Flags
- Interpolated Zone Flags
- Despiking of the logs
- Flagging Bad Holes
- Train models to fill the flagged zones
- Outlier Analysis
- Feature Engineering
- Dimensionality Reduction
- Final Lithology Classification
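The two flagging steps near the top of the list can be sketched roughly as follows. This is only an illustration, not the exact code of the solution: the function names, the `_MISSING` column suffix, the `min_run` and `tol` parameters, and the idea of detecting interpolated zones through a near-zero second difference are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def add_missing_flags(df, log_cols):
    """Add a 0/1 column per log marking samples where that log is missing."""
    out = df.copy()
    for col in log_cols:
        out[f"{col}_MISSING"] = out[col].isna().astype(int)
    return out


def flag_interpolated(series, min_run=5, tol=1e-9):
    """Flag runs of at least `min_run` samples that lie on one straight line.

    Assumption: a linearly interpolated gap has a constant first difference,
    so its second difference is (close to) zero.
    """
    values = series.to_numpy(dtype=float)
    second_diff = np.abs(np.diff(values, n=2))
    flat = second_diff < tol  # comparisons involving NaN evaluate to False
    flags = np.zeros(len(values), dtype=int)
    run_start = None
    # A trailing False closes any run that reaches the end of the log.
    for i, is_flat in enumerate(np.append(flat, False)):
        if is_flat and run_start is None:
            run_start = i
        elif not is_flat and run_start is not None:
            # flat[k] spans samples k..k+2, so the run spans run_start..i+1.
            if i - run_start + 2 >= min_run:
                flags[run_start:i + 2] = 1
            run_start = None
    return pd.Series(flags, index=series.index)


# Tiny synthetic example (made-up values, not contest data).
logs = pd.DataFrame({"GR": [50.0, np.nan, 70.0]})
logs = add_missing_flags(logs, ["GR"])
```

Keeping the flags as separate columns lets the classifier see where a value was missing or interpolated instead of silently trusting the filled value.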
The logs are despiked using thresholds taken from the literature and from the statistical nature of the data. Bad holes are flagged in two ways: by comparing the Caliper log with the Borehole Size log, and by predicting DTC with an ensemble model and comparing the predicted DTC with the actual log values. A median filter is used to denoise the features, and new features such as Shale Volume and Carbon Index are added.
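A minimal sketch of the steps in the paragraph above, using only pandas and NumPy. Everything specific here is an assumption for illustration: the function names, the window sizes, the thresholds (`max_dev`, `max_enlargement`, `max_residual`), and the choice of a linear gamma-ray index for Shale Volume. The actual thresholds and formulas used in the solution may differ, and the Carbon Index is left out because its formula is not given here. The DTC check takes the predicted curve as an input, so any ensemble regressor could supply it.

```python
import numpy as np
import pandas as pd


def despike(series, window=5, max_dev=30.0):
    """Replace samples further than `max_dev` from the local rolling median."""
    local_median = series.rolling(window, center=True, min_periods=1).median()
    is_spike = (series - local_median).abs() > max_dev
    return series.where(~is_spike, local_median), is_spike


def flag_bad_hole(cali, bs, max_enlargement=2.0):
    """Flag samples where the caliper exceeds the borehole size by too much."""
    return ((cali - bs) > max_enlargement).astype(int)


def flag_dtc_mismatch(dtc, dtc_pred, max_residual=10.0):
    """Flag samples where measured DTC departs from the model prediction."""
    return ((dtc - dtc_pred).abs() > max_residual).astype(int)


def median_denoise(series, window=5):
    """Smooth a log with a centered rolling median."""
    return series.rolling(window, center=True, min_periods=1).median()


def shale_volume(gr, gr_clean=None, gr_shale=None):
    """Linear gamma-ray index clipped to [0, 1] (assumed Shale Volume proxy)."""
    if gr_clean is None:
        gr_clean = np.nanpercentile(gr, 5)   # assumed clean-sand baseline
    if gr_shale is None:
        gr_shale = np.nanpercentile(gr, 95)  # assumed shale baseline
    igr = (gr - gr_clean) / (gr_shale - gr_clean)
    return igr.clip(0.0, 1.0)


# Tiny synthetic example (made-up values, not contest data).
gr = pd.Series([60.0, 62.0, 61.0, 300.0, 63.0, 64.0, 62.0])
gr_clean, spikes = despike(gr, window=5, max_dev=30.0)
vsh = shale_volume(gr_clean)
```

Despiking before the median filter keeps a single extreme sample from dragging the smoothed curve, and the bad-hole flags mark the zones whose log values are later replaced by model predictions.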