Gas Sensor Array Drift Project

Data Pre-Processing

There are 129 features in the dataset:

1. The very first feature has the gas class and concentration concatenated using a semicolon.
2. The other 128 features have the feature number concatenated with the actual data using a colon.

Data pre-processing is required to address these two formatting issues. The first feature was split into two features, GAS and CONC (concentration). For all other features, the feature number was discarded and an appropriate column name was assigned. For example, the S11I_001 column is the increasing current reading for sensor 11 when alpha = 0.001.
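The split described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: the function name `parse_line`, the sample record, and the placeholder column names `F001`, `F002`, ... are assumptions (the project uses descriptive names such as S11I_001, whose full scheme is not listed here).

```python
def parse_line(line, n_features=128):
    """Parse one raw record such as '1;10.0 1:15596.4162 2:2.602348 ...'."""
    head, *tokens = line.split()
    if len(tokens) != n_features:
        raise ValueError(f"expected {n_features} features, got {len(tokens)}")

    # First field: gas class and concentration joined by a semicolon.
    gas, conc = head.split(";")
    row = {"GAS": int(gas), "CONC": float(conc)}

    # Remaining fields: 'feature_number:value'. The number only fixes the
    # column position; the name used here is a hypothetical placeholder.
    for token in tokens:
        index, value = token.split(":")
        row[f"F{int(index):03d}"] = float(value)
    return row


# Shortened sample record with two features instead of 128.
sample = parse_line("1;10.0 1:15596.4162 2:2.602348", n_features=2)
```

Keeping the feature number only as a position makes it straightforward to swap the placeholder names for the descriptive ones afterwards.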

All ten batches were combined into a single data set, with an additional column 'BATCH' denoting the batch to which each observation belongs. The final clean data set contains 131 variables (128 current reading features + gas + concentration + batch) and 13910 rows.
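A minimal sketch of the combining step, assuming each batch has already been parsed into a list of row dictionaries. The function name `combine_batches` and the toy rows are hypothetical; only the 'BATCH' column name comes from the description above.

```python
def combine_batches(batches):
    """Merge per-batch rows into one list, tagging each row with its batch.

    `batches` maps a batch number to a list of row dictionaries.
    """
    combined = []
    for batch_id in sorted(batches):
        for row in batches[batch_id]:
            # Copy the row so the per-batch input is left unmodified.
            combined.append({**row, "BATCH": batch_id})
    return combined


# Toy input: two batches with three observations in total.
demo = {
    2: [{"GAS": 3, "CONC": 50.0}],
    1: [{"GAS": 1, "CONC": 10.0}, {"GAS": 2, "CONC": 25.0}],
}
combined = combine_batches(demo)

# Column count stated above: 128 features + gas + concentration + batch.
n_columns = 128 + 3
```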

Current Task

Data exploration and dimensionality reduction using PCA are currently in progress.
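For orientation, PCA on a feature matrix can be sketched with NumPy's SVD as shown here. This is an illustrative outline, not the project's analysis: the function name `pca`, the choice to standardise each feature first, and the synthetic toy matrix are all assumptions.

```python
import numpy as np


def pca(X, n_components):
    """Return component scores and explained-variance ratios for X."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # leave constant columns unscaled
    Z = (X - X.mean(axis=0)) / std           # assumed step: standardise features
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)          # share of variance per component
    scores = Z @ Vt[:n_components].T         # project onto leading components
    return scores, explained[:n_components]


# Toy stand-in for the sensor features: 6 columns driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))
scores, ratio = pca(X, n_components=2)
```

On the real data set, `X` would be the 128 feature columns, with GAS, CONC, and BATCH kept aside as labels for plotting the scores.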
