Jeff-67/Compare-classifiers-with-Random-Forest-to-build-a-motor-false-prediction-and-classification-system

Using random forest to design an online electrical motor fault classification and prediction system

Abstract

Electric motors are an important power source for intelligent manufacturing; however, motor eccentricity is a serious fault that can develop after robots or machines have operated for a while. This fault can damage power modules, yet traditional solutions mostly depend on expensive sensors and cannot make precise predictions. In this project, we focus on the eccentricity fault by collecting large amounts of operating data and performing classification and prediction based on random forest.

Here is the code!!

Outline

Data introduction

  • Real-time raw data

  • Feature extraction

Model training

  • Random Forest algorithm introduction

  • Training and test data preprocessing

  • Feature filtering

  • Data size choosing

  • Hyperparameter tuning

  • Status Voting Method

Why Random Forest?

  • KNN

  • SVM

  • Decision Tree

  • Comparison Result

Future work

Data introduction

A complete data set for electrical motor fault classification consists of real-time operating raw data and controlling commands of rolling element bearings captured via the acquisition station (shown in Fig.1), followed by data processing, feature extraction from the data sets, and classification into functional (State0) or defective (State1, State2, State3) status (shown in Fig.2) of the rolling element bearing. To be more specific, the data come from the motor of a real packing machine.

Fig.1 Entire process of collecting big data | Fig.2 Status definitions
  • Real-time raw data

    The rolling element bearing's real-time raw data and the controlling commands from the servo motor driver were acquired with the oscilloscope shown in Fig.3. It provides 8 channels, each with 16 bytes of memory and a 4 kHz sampling frequency. Fig.4 displays the acquisition rule for the training data, which specifies the experimental temperature, working time, and running speed of the experimental station.

    Fig.3 Oscilloscope and command interface | Fig.4 Acquisition rule
  • Feature extraction

    The servo driver provides many real-time signals and commands that can be read from the oscilloscope. I chose 8 motor-related real-time signals (all in the time domain) as the features for model learning (shown in Fig.5).

    Fig.5 Description of the 8 channels used for data acquisition

Model training

Scikit-learn is an easy-to-use open-source machine learning library that provides classification, regression, clustering, and dimensionality-reduction tools for Python. I used scikit-learn's random forest algorithm to classify the rolling element bearing and motor condition and to make predictions.

  • Random Forest algorithm introduction

    Here is the code!!

    Random forest adds an extra layer of randomness to bagging: each tree is built from a different bootstrap sample of the data. Bagging is a well-known ensemble method for classification trees in which successive trees do not depend on earlier trees; each tree is constructed independently from its bootstrap sample, and a simple majority vote is taken for the final prediction. In a standard decision tree, each node is split using the best split among all variables; in a random forest, each node is split using the best variable among a randomly chosen subset of predictors at that node. This strategy lets random forests outperform many other classifiers, including decision trees, discriminant analysis, and support vector machines, and it also makes them robust against overfitting.


  • Training and test data preprocessing

    Here is the code!!

    The training data used in this part come from 10 groups of original training data sets covering the four motor states. From every state I randomly selected 2,500 rows of data containing the CH1~CH8 information, giving a training set of 10,000 rows covering all four states. The purpose of this step is to make the training data more varied, and therefore more credible.

    (Note: the test data are handled in the same way; the test set also contains 10,000 rows at first.)

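A sketch of this shuffling step, assuming a pandas DataFrame with CH1~CH8 columns and a state label; the column names and synthetic values are illustrative, not the project's actual data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Stand-in for the raw acquisitions: 40,000 rows across the four states.
raw = pd.DataFrame(rng.normal(size=(40000, 8)),
                   columns=[f"CH{i}" for i in range(1, 9)])
raw["state"] = np.repeat([0, 1, 2, 3], 10000)

# Draw 2,500 random rows per state -> 10,000 mixed training rows.
train = raw.groupby("state").sample(n=2500, random_state=0)
print(train.shape)  # (10000, 9)
```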

    In addition to the normal data preprocessing, I apply an advanced preprocessing step (shown in Fig.6) that yields better prediction accuracy (shown in Fig.7).

    CH1max: the maximum value of CH1 in the training data set

    CH1min: the minimum value of CH1 in the training data set

    CH1peak: the peak value of CH1 in the training data set

    CH1peak = CH1max - (CH1max - CH1min) * 2%

    Fig.6 Advanced data preprocessing focusing on the peak value | Fig.7 Comparison of different data preprocessing methods
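The peak-value formula above can be written as a small helper. The 2% margin follows the formula exactly; how the peak is then applied (e.g. clipping or filtering rows) is left open, as in the text:

```python
import numpy as np

def peak_value(channel, margin=0.02):
    """CHpeak = CHmax - (CHmax - CHmin) * 2%, per the formula above."""
    ch_max, ch_min = channel.max(), channel.min()
    return ch_max - (ch_max - ch_min) * margin

# Example on a toy CH1 trace: peak = 10.0 - (10.0 - 0.0) * 0.02 = 9.8
ch1 = np.array([0.0, 1.0, 2.0, 10.0])
print(peak_value(ch1))
```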
  • Feature filtering

    In this part, I separate all the features into four groups (shown in Fig.8): speed-related features, location-related features, torque-related features, and the remaining feature. I then combine these feature groups in various ways (shown in Fig.9); the results indicate that the combination of all features gets the highest OOB_Score, so I use CH1~CH8 as the data features.

    Fig.8 Feature clusters | Fig.9 Results
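A sketch of this experiment: train one forest per feature-group combination and compare OOB_Score. The mapping of channels to groups below is a placeholder assumption, and the data are synthetic:

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8)) + np.repeat(np.arange(4), 100)[:, None]
y = np.repeat(np.arange(4), 100)

# Hypothetical channel-to-group assignment (not the project's real one).
groups = {"speed": [0, 1], "location": [2, 3], "torque": [4, 5], "other": [6, 7]}

scores = {}
for r in range(1, len(groups) + 1):
    for combo in combinations(groups, r):
        cols = sorted(c for g in combo for c in groups[g])
        clf = RandomForestClassifier(n_estimators=50, oob_score=True,
                                     random_state=0).fit(X[:, cols], y)
        scores[combo] = clf.oob_score_

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```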
  • Data size choosing

    The results for six training data sizes are shown below. The prediction accuracy is highest when the amount of training data is 100K, which suggests that, within this range, more training data yields a better model.

    Comparison of different data sizes
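The experiment can be sketched by training on growing subsets and scoring a held-out set; the sizes here are scaled down from the README's (which reach 100K rows) so the example runs quickly, and the data are synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 8)) + np.repeat(np.arange(4), 750)[:, None] * 0.8
y = np.repeat(np.arange(4), 750)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

accs = {}
for n in (100, 500, 2000):
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X_tr[:n], y_tr[:n])  # train on the first n shuffled rows
    accs[n] = clf.score(X_te, y_te)
    print(n, round(accs[n], 3))
```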
  • Hyperparameter tuning

    Here is the code!!

    After performing feature extraction, preprocessing the training data, and choosing a proper training data size, the model is improved by hyperparameter tuning, with the best setting chosen by OOB_Score. I use sklearn.model_selection.GridSearchCV to search over the parameters first and then narrow down to 'n_estimators'. Fig.10 shows that with n_estimators=70 the OOB_Score is 0.9389, the best score for n_estimators=10~80. Finally, Fig.11 shows that with n_estimators=70 the prediction accuracy for each state is 0.998, 0.9981, 0.9989, and 0.998, all better than with n_estimators=10. Consequently, the learning model of this project uses the 8 features of real-time raw data and controlling commands from the servo motor driver, 100K rows of training data, the advanced CH1 peak-value preprocessing, and n_estimators=70.

    Fig.10 OOB_Score for different n_estimators | Fig.11 Prediction accuracy for n_estimators=10 and 70
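A sketch of this tuning procedure on synthetic stand-in data: GridSearchCV narrowed to n_estimators, then an OOB check on the winner, mirroring the steps above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8)) + np.repeat(np.arange(4), 100)[:, None]
y = np.repeat(np.arange(4), 100)

# Cross-validated search over n_estimators only.
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    param_grid={"n_estimators": [10, 30, 50, 70]},
                    cv=3)
grid.fit(X, y)
best_n = grid.best_params_["n_estimators"]

# Confirm the chosen value with the out-of-bag estimate (OOB_Score).
clf = RandomForestClassifier(n_estimators=best_n, oob_score=True,
                             random_state=0).fit(X, y)
print(best_n, round(clf.oob_score_, 3))
```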
  • Status Voting Method

    After constructing the learning model with the random forest algorithm, a status voting method is designed to make the diagnosis predictions more reliable. Fig.12 displays how the status voting method works; the algorithm is:

    w0 = number of State0 votes

    w1 = number of State1 votes

    w2 = number of State2 votes

    w3 = number of State3 votes

    w0 + w1 + w2 + w3 = 1000 (total number of votes)

    wmax = max(w0, w1, w2, w3)

    State_result = the state corresponding to wmax

    Fig.12 The status voting method
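The voting algorithm above can be sketched as follows; the 1,000-sample window is drawn from a synthetic State2-like distribution as a stand-in for a real acquisition window:

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8)) + np.repeat(np.arange(4), 100)[:, None]
y = np.repeat(np.arange(4), 100)
clf = RandomForestClassifier(n_estimators=70, random_state=0).fit(X, y)

# 1,000 new samples from a State2-like distribution (an assumption).
window = rng.normal(size=(1000, 8)) + 2

votes = Counter(clf.predict(window))           # w0..w3
state_result, w_max = votes.most_common(1)[0]  # majority vote wins
print(f"State{state_result}: {w_max}/1000 votes")
```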

Why Random Forest?

Here is the code!!

I compare random forest with three other classifiers, namely KNN, decision tree, and SVM, and I give a brief introduction to each algorithm.

Here is the code!!

  • KNN

  • SVM

  • Decision Tree

Fig.13 and Fig.14 show the results of comparing random forest with the three classifiers mentioned above; as I predicted, random forest is the best classifier for this project.

Fig.13 | Fig.14 Comparison results
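A sketch of this comparison on synthetic stand-in data, scoring each classifier with 5-fold cross-validation (the README compares them on the real motor data instead):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8)) + np.repeat(np.arange(4), 100)[:, None]
y = np.repeat(np.arange(4), 100)

models = {"KNN": KNeighborsClassifier(),
          "SVM": SVC(),
          "Decision Tree": DecisionTreeClassifier(random_state=0),
          "Random Forest": RandomForestClassifier(n_estimators=70,
                                                  random_state=0)}

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}
for name, s in scores.items():
    print(f"{name}: {s:.3f}")
```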

Last but not least, decision tree and random forest are quite similar in some cases, so I list some key differences between them.


Future work

Future work could include implementing the presented approach on a cloud system, IoT, or embedded computation platform, as well as extracting more non-real-time motor fault features to make the prediction system more complete.

About

In this project, I am using random forest to predict and diagnose the failure of electrical motor
