urban-eriksson/gaussifier

gaussifier

Gaussifier / MOAT (Mother Of All Transforms)

This transform is inspired by the procedure for removing disparate impact, which has been presented previously in the literature and which I have also implemented in another repository of mine. It could possibly be used in statistics as well as for preprocessing data in ML.

https://github.com/urban-eriksson/ml-datapreprocessing

The idea is to take one sample, the training sample, and assign a cumulative probability to each of its datapoints. If the datapoints are first sorted, this probability is F(x_i) = P(X <= x_i) = (i + 1) / (N + 1), where i goes from 0 to N-1. With this choice F(x_i) always lies strictly between 0 and 1, so the normal percent point function used later stays finite at every training point.
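As an illustration of the formula above, here is a minimal sketch in Python with NumPy. The function name `empirical_cdf` is hypothetical and not taken from this repository; it only shows how the sorted training values and their probabilities (i + 1) / (N + 1) could be computed.

```python
import numpy as np

def empirical_cdf(x):
    """Return the sorted training values and F(x_i) = (i + 1) / (N + 1).

    Hypothetical helper for illustration; not the repository's own code.
    """
    x_sorted = np.sort(np.asarray(x, dtype=float))
    n = x_sorted.size
    # i runs from 0 to N-1, so the probabilities run from 1/(N+1) to N/(N+1)
    f = (np.arange(n) + 1) / (n + 1)
    return x_sorted, f

x_sorted, f = empirical_cdf([3.0, 1.0, 2.0])
print(x_sorted)  # [1. 2. 3.]
print(f)         # [0.25 0.5  0.75]
```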

The test data, xp_j, j=1..M, can then, for instance, be mapped to probabilities by linear interpolation over the point set (x, F(x)), with extrapolation when xp_j < min(x) or xp_j > max(x). Quite arbitrarily, a Gaussian shape can then be generated by passing the obtained probabilities through the percent point function (ppf) of the norm distribution in the statistical functions of scipy.
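The whole procedure can be sketched as follows. This is not the repository's implementation: the function name `gaussify`, the choice to extrapolate linearly from the first and last training segments, and the clipping parameter `eps` are all assumptions made for illustration. The sketch also assumes the training values are distinct, since tied values at the ends would give a zero-width segment.

```python
import numpy as np
from scipy.stats import norm

def gaussify(x_train, x_test, eps=1e-6):
    """Map x_test to an approximately Gaussian scale using x_train.

    Hypothetical sketch; extrapolation rule and eps are assumptions.
    """
    xs = np.sort(np.asarray(x_train, dtype=float))
    n = xs.size
    f = (np.arange(n) + 1) / (n + 1)          # F(x_i) = (i + 1) / (N + 1)
    xp = np.atleast_1d(np.asarray(x_test, dtype=float))

    # Linear interpolation inside [min(x), max(x)]
    p = np.interp(xp, xs, f)

    # Assumed rule: extend the first and last segments linearly
    lo, hi = xp < xs[0], xp > xs[-1]
    slope_lo = (f[1] - f[0]) / (xs[1] - xs[0])
    slope_hi = (f[-1] - f[-2]) / (xs[-1] - xs[-2])
    p[lo] = f[0] + slope_lo * (xp[lo] - xs[0])
    p[hi] = f[-1] + slope_hi * (xp[hi] - xs[-1])

    # Keep probabilities strictly inside (0, 1) so norm.ppf stays finite
    p = np.clip(p, eps, 1 - eps)
    return norm.ppf(p)

# Usage: gaussify uniformly distributed data against itself
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 1000)
z = gaussify(x, x)
```

Clipping is one simple way to handle test points that extrapolate outside (0, 1); it caps the output at roughly ±4.75 for `eps=1e-6`, so a different tail treatment may be preferable when extreme test values matter.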

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html

Figure 1. Transformation when the x and xp datasets are equal

Figure 2. Gaussifying data with uniform distribution
