From 0c49bfb3df906098152f65c0678998f7713dbe63 Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Fri, 1 Mar 2019 14:40:29 +0100
Subject: [PATCH 01/22] SLEP005: Outlier Rejection API

---
 index.rst            |  1 +
 slep005/proposal.rst | 98 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)
 create mode 100644 slep005/proposal.rst

diff --git a/index.rst b/index.rst
index 9713a84..cbe75c6 100644
--- a/index.rst
+++ b/index.rst
@@ -26,6 +26,7 @@
     slep002/proposal
     slep003/proposal
     slep004/proposal
+    slep005/proposal
 
 .. toctree::
     :maxdepth: 1
diff --git a/slep005/proposal.rst b/slep005/proposal.rst
new file mode 100644
index 0000000..33dcf4d
--- /dev/null
+++ b/slep005/proposal.rst
@@ -0,0 +1,98 @@
+.. _slep_005:
+
+=====================
+Outlier rejection API
+=====================
+
+:Author: Oliver Raush (oliverrausch99@gmail.com), Guillaume Lemaitre (g.lemaitre58@gmail.com)
+:Status: Draft
+:Type: Standards Track
+:Created: 2019-03-01
+:Resolution: <url>
+
+Abstract
+--------
+
+We propose a new mixin ``OutlierRejectionMixin`` implementing a
+``fit_resample(X, y)`` method. This method will remove samples from
+``X`` and ``y`` to get a outlier-free dataset. This method is also
+handle in ``Pipeline``.
+
+Detailed description
+--------------------
+
+Fitting a machine learning model on an outlier-free dataset can be
+beneficial.  Currently, the family of outlier detection algorithms
+allows to detect outliers using `estimator.fit_predict(X, y)`. However,
+there is no mechanism to remove outliers without any manual step. It
+is even impossible when a ``Pipeline`` is used.
+
+We propose the following changes:
+
+* implement an ``OutlierRejectionMixin``;
+* this mixin add a method ``fit_resample(X, y)`` removing outliers
+  from ``X`` and ``y``;
+* ``fit_resample`` should be handled in ``Pipeline``.
+
+Implementation
+--------------
+
+API changes are implemented in
+https://github.com/scikit-learn/scikit-learn/pull/13269
+
+Estimator implementation
+........................
+
+The new mixin is implemented as::
+  
+  class OutlierRejectionMixin:
+    _estimator_type = "outlier_rejector"
+    def fit_resample(self, X, y):
+        inliers = self.fit_predict(X) == 1
+        return safe_mask(X, inliers), safe_mask(y, inliers)
+
+This will be used as follows for the outlier detection algorithms::
+  
+  class IsolationForest(BaseBagging, OutlierMixin, OutlierRejectionMixin):
+      ...
+      
+One can use the new algorithm with::
+  
+  from sklearn.ensemble import IsolationForest
+  estimator = IsolationForest()
+  X_free, y_free = estimator.fit_resample(X, y)
+
+Pipeline implementation
+.......................
+
+To handle outlier rejector in ``Pipeline``, we enforce the following:
+
+* an estimator cannot implement both ``fit_resample(X, y)`` and
+  ``fit_transform(X)`` / ``transform(X)``.
+* ``fit_predict(X)`` (i.e., clustering methods) should not be called if an
+  outlier rejector is in the pipeline.
+
+Backward compatibility
+----------------------
+
+There are no backward incompatibilities with the current API.
+
+Discussion
+----------
+
+* https://github.com/scikit-learn/scikit-learn/pull/13269
+
+References and Footnotes
+------------------------
+
+.. [1] Each SLEP must either be explicitly labeled as placed in the public
+   domain (see this SLEP as an example) or licensed under the `Open
+   Publication License`_.
+
+.. _Open Publication License: https://www.opencontent.org/openpub/
+
+
+Copyright
+---------
+
+This document has been placed in the public domain. [1]_
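[Editorial note] The mixin sketched in this patch can be exercised end to end. Below is a minimal runnable illustration of the same idea; note that scikit-learn's ``safe_mask`` returns a boolean mask rather than the masked arrays, so this sketch indexes with the mask directly, and ``ToyRejector`` is a hypothetical detector used only for illustration:

```python
import numpy as np

class OutlierRejectionMixin:
    """Adds fit_resample, which drops samples flagged as outliers."""
    _estimator_type = "outlier_rejector"

    def fit_resample(self, X, y):
        # fit_predict returns +1 for inliers and -1 for outliers
        inliers = self.fit_predict(X) == 1
        return X[inliers], y[inliers]

class ToyRejector(OutlierRejectionMixin):
    """Hypothetical detector: flags samples beyond 1.5 std devs."""
    def fit_predict(self, X):
        z = np.abs(X - X.mean(axis=0)) / X.std(axis=0)
        return np.where((z > 1.5).any(axis=1), -1, 1)

X = np.array([[0.1], [0.2], [0.0], [-0.1], [100.0]])
y = np.array([0, 1, 0, 1, 1])
X_free, y_free = ToyRejector().fit_resample(X, y)
print(X_free.ravel())  # the extreme sample (100.0) is dropped
```

The outlier and its target are removed together, which is exactly what a manual masking step would otherwise have to do outside the estimator.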

From 4ecc51bc00f113f8be9b6c4aad8ee477e3c9a02c Mon Sep 17 00:00:00 2001
From: Oliver Rausch <Oliverrausch99@gmail.com>
Date: Sat, 2 Mar 2019 19:55:11 +0100
Subject: [PATCH 02/22] Update slep005/proposal.rst

Co-Authored-By: glemaitre <g.lemaitre58@gmail.com>
---
 slep005/proposal.rst | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 33dcf4d..6df9eed 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -71,6 +71,18 @@ To handle outlier rejector in ``Pipeline``, we enforce the following:
   ``fit_transform(X)`` / ``transform(X)``.
 * ``fit_predict(X)`` (i.e., clustering methods) should not be called if an
   outlier rejector is in the pipeline.
+* We propose that resamplers are only applied during fit time. Specifically, the pipeline will act as follows:
+===================== ================================
+Method                Resamplers applied               
+===================== ================================
+``fit``               Yes
+``fit_transform``     Yes
+``transform``         Yes
+``fit_resample``      Yes
+``predict``           No
+``score``             No
+``fit_predict``       not supported 
+===================== ================================
 
 Backward compatibility
 ----------------------

From c855ffe16c14266a8241e37d04c3e3fcc32845ba Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Sat, 2 Mar 2019 23:09:04 +0100
Subject: [PATCH 03/22] Update slep

---
 slep005/proposal.rst | 123 +++++++++++++++++++++----------------------
 1 file changed, 61 insertions(+), 62 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 6df9eed..7f34af5 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -1,10 +1,12 @@
 .. _slep_005:
 
-=====================
-Outlier rejection API
-=====================
+=============
+Resampler API
+=============
 
-:Author: Oliver Raush (oliverrausch99@gmail.com), Guillaume Lemaitre (g.lemaitre58@gmail.com)
+:Author: Oliver Raush (oliverrausch99@gmail.com),
+         Christos Aridas (char@upatras.gr),
+         Guillaume Lemaitre (g.lemaitre58@gmail.com)
 :Status: Draft
 :Type: Standards Track
 :Created: 2019-03-01
@@ -13,77 +15,74 @@ Outlier rejection API
 Abstract
 --------
 
-We propose a new mixin ``OutlierRejectionMixin`` implementing a
-``fit_resample(X, y)`` method. This method will remove samples from
-``X`` and ``y`` to get a outlier-free dataset. This method is also
-handle in ``Pipeline``.
+We propose the inclusion of a new type of estimator: resampler. The
+resampler will change the samples in ``X`` and ``y``. In short:
 
-Detailed description
---------------------
+* resamplers will reduce or augment the number of samples in ``X`` and
+  ``y``;
+* ``Pipeline`` should treat them as a separate type of estimator.
 
-Fitting a machine learning model on an outlier-free dataset can be
-beneficial.  Currently, the family of outlier detection algorithms
-allows to detect outliers using `estimator.fit_predict(X, y)`. However,
-there is no mechanism to remove outliers without any manual step. It
-is even impossible when a ``Pipeline`` is used.
+Motivation
+----------
 
-We propose the following changes:
+Sample reduction or augmentation are part of machine-learning
+pipeline. The current scikit-learn API does not offer support for such
+use cases.
 
-* implement an ``OutlierRejectionMixin``;
-* this mixin add a method ``fit_resample(X, y)`` removing outliers
-  from ``X`` and ``y``;
-* ``fit_resample`` should be handled in ``Pipeline``.
+Two possible use cases are currently reported:
 
+* sample rebalancing to correct bias toward class with large cardinality;
+* outlier rejection to fit a clean dataset.
+   
 Implementation
 --------------
 
-API changes are implemented in
-https://github.com/scikit-learn/scikit-learn/pull/13269
-
-Estimator implementation
-........................
-
-The new mixin is implemented as::
-  
-  class OutlierRejectionMixin:
-    _estimator_type = "outlier_rejector"
-    def fit_resample(self, X, y):
-        inliers = self.fit_predict(X) == 1
-        return safe_mask(X, inliers), safe_mask(y, inliers)
+To handle outlier rejector in ``Pipeline``, we enforce the following:
 
-This will be used as follows for the outlier detection algorithms::
+* an estimator cannot implement both ``fit_resample(X, y)`` and
+  ``fit_transform(X)`` / ``transform(X)``. If both are implemented,
+  ``Pipeline`` will not be able to know which of the two methods to
+  call.
+* resamplers are only applied during ``fit``. Otherwise, scoring will
+  be harder. Specifically, the pipeline will act as follows:
   
-  class IsolationForest(BaseBagging, OutlierMixin, OutlierRejectionMixin):
-      ...
-      
-One can use the new algorithm with::
+  ===================== ================================
+  Method                Resamplers applied               
+  ===================== ================================
+  ``fit``               Yes
+  ``fit_transform``     Yes
+  ``fit_resample``      Yes
+  ``transform``         No
+  ``predict``           No
+  ``score``             No
+  ``fit_predict``       not supported 
+  ===================== ================================
+
+* ``fit_predict(X)`` (i.e., clustering methods) should not be called
+  if an outlier rejector is in the pipeline. The output will be of
+  different size than ``X`` breaking metric computation.
+* in a supervised scheme, resampler will need to validate which type
+  of target is passed. Up to our knowledge, supervised are used for
+  binary and multiclass classification.
   
-  from sklearn.ensemble import IsolationForest
-  estimator = IsolationForest()
-  X_free, y_free = estimator.fit_resample(X, y)
+Alternative implementation
+..........................
 
-Pipeline implementation
-.......................
+Alternatively ``sample_weight`` could be used as a placeholder to
+perform resampling. However, the current limitations are:
 
-To handle outlier rejector in ``Pipeline``, we enforce the following:
-
-* an estimator cannot implement both ``fit_resample(X, y)`` and
-  ``fit_transform(X)`` / ``transform(X)``.
-* ``fit_predict(X)`` (i.e., clustering methods) should not be called if an
-  outlier rejector is in the pipeline.
-* We propose that resamplers are only applied during fit time. Specifically, the pipeline will act as follows:
-===================== ================================
-Method                Resamplers applied               
-===================== ================================
-``fit``               Yes
-``fit_transform``     Yes
-``transform``         Yes
-``fit_resample``      Yes
-``predict``           No
-``score``             No
-``fit_predict``       not supported 
-===================== ================================
+* ``sample_weight`` is not available for all estimators;
+* ``sample_weight`` only supports sample reduction, not augmentation;
+* ``sample_weight`` can be applied at both fit and predict time;
+* ``sample_weight`` needs to be passed and modified within a
+  ``Pipeline``.
+  
+Current implementation
+......................
 
+* Outlier rejection are implemented in:
+  https://github.com/scikit-learn/scikit-learn/pull/13269
+  
 Backward compatibility
 ----------------------
 
@@ -92,7 +91,7 @@ There are no backward incompatibilities with the current API.
 Discussion
 ----------
 
-* https://github.com/scikit-learn/scikit-learn/pull/13269
+* https://github.com/scikit-learn/scikit-learn/pull/13269{
 
 References and Footnotes
 ------------------------

From c16ef7b21c20ada88d59baa475e42b610f520643 Mon Sep 17 00:00:00 2001
From: Adrin Jalali <adrin.jalali@gmail.com>
Date: Tue, 5 Mar 2019 13:23:29 +0100
Subject: [PATCH 04/22] Update slep005/proposal.rst

Co-Authored-By: glemaitre <g.lemaitre58@gmail.com>
---
 slep005/proposal.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 7f34af5..7848635 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -91,7 +91,7 @@ There are no backward incompatibilities with the current API.
 Discussion
 ----------
 
-* https://github.com/scikit-learn/scikit-learn/pull/13269{
+* https://github.com/scikit-learn/scikit-learn/pull/13269
 
 References and Footnotes
 ------------------------

From e2f6a7059ec2f04681949d953313949736e76df9 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Tue, 25 Jun 2019 23:51:53 +0200
Subject: [PATCH 05/22] Update proposal based on discussion

- removed the proposal for pipeline modification
- added some more usecases
- added a description of the api and the constraints
---
 slep005/proposal.rst | 87 +++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 42 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 7848635..685f87f 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -4,7 +4,7 @@
 Resampler API
 =============
 
-:Author: Oliver Raush (oliverrausch99@gmail.com),
+:Author: Oliver Rausch (oliverrausch99@gmail.com),
          Christos Aridas (char@upatras.gr),
          Guillaume Lemaitre (g.lemaitre58@gmail.com)
 :Status: Draft
@@ -18,53 +18,57 @@ Abstract
 We propose the inclusion of a new type of estimator: resampler. The
 resampler will change the samples in ``X`` and ``y``. In short:
 
-* resamplers will reduce or augment the number of samples in ``X`` and
-  ``y``;
-* ``Pipeline`` should treat them as a separate type of estimator.
+* resamplers will reduce and/or augment the number of samples in ``X`` and
+  ``y`` during ``fit``, but will perform no changes during ``predict``.
+* a new verb/method that all resamplers must implement is introduced: ``fit_resample``.
+* A new meta-estimator, ``ResampledTrainer``, that allows for the composition of
+  resamplers and estimators is proposed.
+
 
 Motivation
 ----------
 
-Sample reduction or augmentation are part of machine-learning
-pipeline. The current scikit-learn API does not offer support for such
+Sample reduction or augmentation are common parts of machine-learning
+pipelines. The current scikit-learn API does not offer support for such
 use cases.
 
-Two possible use cases are currently reported:
+Usecases
+........
+
+* sample rebalancing to correct bias toward class with large cardinality
+* outlier rejection to fit a clean dataset
+* representing a dataset by generating centroids of clustering methods.
+* adding unlabeled samples to a dataset during semi-supervised fit time for
+  cross validation (simply passing a semi-supervised dataset to cross validation
+  methods doesn't work since the cross validation will treat the label -1 as a
+  separate class). Alternative approach is a new cv splitter.
 
-* sample rebalancing to correct bias toward class with large cardinality;
-* outlier rejection to fit a clean dataset.
-   
 Implementation
 --------------
+API and Constraints
+...................
+Resamplers implement a method ``fit_resample(X, y)``, a pure function which
+returns ``Xt, yt`` corresponding to the resampled dataset, where samples may
+have been added and/or removed.
+
+Resamplers cannot be transformers, that is, a resampler cannot implement
+``fit_transform`` or ``transform``. Similarly, transformers cannot implement ``fit_resample``.
+
+Resamplers may not change the order, meaning or format of features (This is left
+to Transformers).
+
+ResampledTrainer
+................
+This meta-estimator composes a resampler and a predictor. It
+behaves as follows:
+
+ ``fit(X, y)``: resample ``X, y`` with the resampler, then fit on the resampled
+  dataset.
+* ``predict(X)``: simply predict on ``X`` with the predictor.
+* ``score(X, y)``: simply score on ``X, y`` with the predictor.
+
+See PR #13269 for an implementation.
 
-To handle outlier rejector in ``Pipeline``, we enforce the following:
-
-* an estimator cannot implement both ``fit_resample(X, y)`` and
-  ``fit_transform(X)`` / ``transform(X)``. If both are implemented,
-  ``Pipeline`` will not be able to know which of the two methods to
-  call.
-* resamplers are only applied during ``fit``. Otherwise, scoring will
-  be harder. Specifically, the pipeline will act as follows:
-  
-  ===================== ================================
-  Method                Resamplers applied               
-  ===================== ================================
-  ``fit``               Yes
-  ``fit_transform``     Yes
-  ``fit_resample``      Yes
-  ``transform``         No
-  ``predict``           No
-  ``score``             No
-  ``fit_predict``       not supported 
-  ===================== ================================
-
-* ``fit_predict(X)`` (i.e., clustering methods) should not be called
-  if an outlier rejector is in the pipeline. The output will be of
-  different size than ``X`` breaking metric computation.
-* in a supervised scheme, resampler will need to validate which type
-  of target is passed. Up to our knowledge, supervised are used for
-  binary and multiclass classification.
-  
 Alternative implementation
 ..........................
 
@@ -76,13 +80,12 @@ perform resampling. However, the current limitations are:
 * ``sample_weight`` can be applied at both fit and predict time;
 * ``sample_weight`` needs to be passed and modified within a
   ``Pipeline``.
-  
+
 Current implementation
 ......................
 
-* Outlier rejection are implemented in:
-  https://github.com/scikit-learn/scikit-learn/pull/13269
-  
+https://github.com/scikit-learn/scikit-learn/pull/13269
+
 Backward compatibility
 ----------------------
 

From 8f8ebb6b81e7a619a09e6aa6e33a2ffe3582dbf1 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Wed, 26 Jun 2019 00:08:44 +0200
Subject: [PATCH 06/22] Reword semisupervised usecase

---
 slep005/proposal.rst | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 685f87f..dd4448d 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -32,16 +32,17 @@ Sample reduction or augmentation are common parts of machine-learning
 pipelines. The current scikit-learn API does not offer support for such
 use cases.
 
-Usecases
-........
+Possible Usecases
+.................
 
 * sample rebalancing to correct bias toward class with large cardinality
 * outlier rejection to fit a clean dataset
 * representing a dataset by generating centroids of clustering methods.
-* adding unlabeled samples to a dataset during semi-supervised fit time for
-  cross validation (simply passing a semi-supervised dataset to cross validation
-  methods doesn't work since the cross validation will treat the label -1 as a
-  separate class). Alternative approach is a new cv splitter.
+* currently semi-supervised learning is not supported by scoring-based
+  functions like ``cross_val_score``, ``GridSearchCV`` or ``validation_curve``
+  since the scorers will regard "unlabeled" as a separate class. A resampler
+  could add the unlabeled samples to the dataset during fit time to solve this
+  (note that this can also be solved by a new cv splitter).
 
 Implementation
 --------------
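[Editorial note] The reworded semi-supervised use case can be sketched as a resampler that appends stored unlabeled samples (labeled ``-1``) at fit time only, so scorers never see the unlabeled rows. ``SemiSupervisedAdder`` is a hypothetical name used for illustration, not part of the proposal:

```python
import numpy as np

class SemiSupervisedAdder:
    """Hypothetical resampler for the semi-supervised use case:
    appends stored unlabeled samples (label -1) during fit only,
    so cross-validation scorers never see the -1 "class"."""
    def __init__(self, X_unlabeled):
        self.X_unlabeled = X_unlabeled

    def fit_resample(self, X, y):
        Xt = np.vstack([X, self.X_unlabeled])
        yt = np.concatenate([y, np.full(len(self.X_unlabeled), -1)])
        return Xt, yt

X = np.array([[0.0], [1.0]])
y = np.array([0, 1])
X_unlabeled = np.array([[2.0], [3.0]])
Xt, yt = SemiSupervisedAdder(X_unlabeled).fit_resample(X, y)
print(yt)  # labeled targets followed by the -1 markers
```

Because resampling happens only at fit time, scoring-based utilities would evaluate on the labeled data alone.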

From ae03400215adfb0dc210f8a20b11e78fe775c20e Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Wed, 26 Jun 2019 21:07:16 +0200
Subject: [PATCH 07/22] Add description of first few pipeline methods

---
 slep005/proposal.rst | 42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index dd4448d..8439aaa 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -43,6 +43,7 @@ Possible Usecases
   since the scorers will regard "unlabeled" as a separate class. A resampler
   could add the unlabeled samples to the dataset during fit time to solve this
   (note that this can also be solved by a new cv splitter).
+* Dataset augmentation (very common in vision problems)
 
 Implementation
 --------------
@@ -55,8 +56,10 @@ have been added and/or removed.
 Resamplers cannot be transformers, that is, a resampler cannot implement
 ``fit_transform`` or ``transform``. Similarly, transformers cannot implement ``fit_resample``.
 
-Resamplers may not change the order, meaning or format of features (This is left
-to Transformers).
+Resamplers may not change the order, meaning, dtype or format of features (this is left
+to transformers).
+
+Resamplers should also resample any kwargs that are array-like and have the same `shape[0]` as `X` and `y`.
 
 ResampledTrainer
 ................
@@ -70,6 +73,41 @@ behaves as follows:
 
 See PR #13269 for an implementation.
 
+Modifying Pipeline
+..................
+As an alternative to ``ResampledTrainer``, ``Pipeline`` could be modified to
+accommodate resamplers.
+The functionality is described in terms of the head (all stages except the last)
+and the tail (the last stage) of the ``Pipeline``. Note that we assume
+resamplers and transformers are exclusive so that the pipeline can decide which
+method to call. Further note that ``Xt, yt`` are the outputs of the stage, and
+``X, y`` are the inputs to the stage.
+
+``fit``:
+  head for resamplers: `Xt, yt = est.fit_resample(X, y)`
+  head for transformers: `Xt, yt = est.fit_transform(X, y)`
+  tail for transformers and predictors: `est.fit(X, y)`
+  tail for resamplers: `pass`
+
+``fit_transform``:
+  Equivalent to `fit(X, y).transform(X)` overall
+
+``predict``
+  head for resamplers: `Xt = X`
+  head for transformers: `Xt = est.transform(X)`
+  tail for predictors: `return est.predict(X)`
+  tail for transformers and resamplers: `error`
+
+``transform``
+  head for resamplers: `Xt = X`
+  head for transformers: `Xt = est.transform(X)`
+  tail for predictors and resamplers: `error`
+  tail for transformers: `return est.transform(X)`
+
+``score``
+  see predict
+
+
 Alternative implementation
 ..........................
 

From 10c85ff580ae2aa9b753a5f3d08781570e90db01 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Wed, 26 Jun 2019 21:13:33 +0200
Subject: [PATCH 08/22] Add code examples

---
 slep005/proposal.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 8439aaa..0725aaf 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -73,6 +73,19 @@ behaves as follows:
 
 See PR #13269 for an implementation.
 
+Example Usage
+"""""""""""""
+::
+    est = ResampledTrainer(RandomUnderSampler(), SVC())
+    est = make_pipeline(
+        StandardScaler(),
+        ResamplingTrainer(Birch(), make_pipeline(SelectKBest(), SVC()))
+    )
+    est = ResampledTrainer(
+        RandomUnderSampler(),
+        make_pipeline(StandardScaler(), SelectKBest(), SVC()),
+    )
+
 Modifying Pipeline
 ..................
 As an alternative to ``ResampledTrainer``, ``Pipeline`` could be modified to
@@ -107,6 +120,12 @@ method to call. Further note that ``Xt, yt`` are the outputs of the stage, and
 ``score``
   see predict
 
+Example Usage::
+    est = make_pipeline(RandomUnderSampler(), SVC())
+    est = make_pipeline(StandardScaler(), Birch(), SelectKBest(), SVC())
+    est = make_pipeline(
+        RandomUnderSampler(), StandardScaler(), SelectKBest(), SVC()
+    )
 
 Alternative implementation
 ..........................

From c39d615439746c785f8900d242de9f2c58a79d8b Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Wed, 26 Jun 2019 21:16:20 +0200
Subject: [PATCH 09/22] formatting

---
 slep005/proposal.rst | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 0725aaf..a64326e 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -73,9 +73,7 @@ behaves as follows:
 
 See PR #13269 for an implementation.
 
-Example Usage
-"""""""""""""
-::
+Example Usage::
    est = ResampledTrainer(RandomUnderSampler(), SVC())
     est = make_pipeline(
         StandardScaler(),

From 2de0d4885005ec552ed106e3eb62b645cb473889 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Thu, 27 Jun 2019 02:48:16 +0200
Subject: [PATCH 10/22] Formatting and cleanup

---
 index.rst            |  2 +-
 slep005/proposal.rst | 82 +++++++++++++++++++++++++++++---------------
 2 files changed, 55 insertions(+), 29 deletions(-)

diff --git a/index.rst b/index.rst
index cbe75c6..68c4028 100644
--- a/index.rst
+++ b/index.rst
@@ -10,6 +10,7 @@
     :caption: Under review
 
     under_review
+    slep005/proposal
 
 .. toctree::
     :maxdepth: 1
@@ -26,7 +27,6 @@
     slep002/proposal
     slep003/proposal
     slep004/proposal
-    slep005/proposal
 
 .. toctree::
     :maxdepth: 1
diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index a64326e..0317d5f 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -18,11 +18,15 @@ Abstract
 We propose the inclusion of a new type of estimator: resampler. The
 resampler will change the samples in ``X`` and ``y``. In short:
 
-* resamplers will reduce and/or augment the number of samples in ``X`` and
-  ``y`` during ``fit``, but will perform no changes during ``predict``.
-* a new verb/method that all resamplers must implement is introduced: ``fit_resample``.
-* A new meta-estimator, ``ResampledTrainer``, that allows for the composition of
-  resamplers and estimators is proposed.
+* a new verb/method that all resamplers must implement is introduced:
+  ``fit_resample``.
+* resamplers are able to reduce and/or augment the number of samples in
+  ``X`` and ``y`` during ``fit``, but will perform no changes during
+  ``predict``.
+* to facilitate this behavior, a new meta-estimator (``ResampledTrainer``) that
+  allows for the composition of resamplers and estimators is proposed.
+  Alternatively, we propose changes to ``Pipeline`` that also enable similar
+  compositions.
 
 
 Motivation
@@ -35,34 +39,41 @@ use cases.
 Possible Usecases
 .................
 
-* sample rebalancing to correct bias toward class with large cardinality
-* outlier rejection to fit a clean dataset
-* representing a dataset by generating centroids of clustering methods.
-* currently semi-supervised learning is not supported by scoring-based
+* Sample rebalancing to correct bias toward classes with large cardinality.
+* Outlier rejection to fit a clean dataset.
+* Sample reduction, e.g. representing a dataset by its k-means centroids.
+* Currently semi-supervised learning is not supported by scoring-based
   functions like ``cross_val_score``, ``GridSearchCV`` or ``validation_curve``
   since the scorers will regard "unlabeled" as a separate class. A resampler
   could add the unlabeled samples to the dataset during fit time to solve this
-  (note that this can also be solved by a new cv splitter).
-* Dataset augmentation (very common in vision problems)
+  (note that this could also be solved by a new cv splitter).
+* NaNRejector (drop all samples that contain NaN).
+* Dataset augmentation (as is commonly done in deep learning).
 
 Implementation
 --------------
+
 API and Constraints
 ...................
-Resamplers implement a method ``fit_resample(X, y)``, a pure function which
-returns ``Xt, yt`` corresponding to the resampled dataset, where samples may
-have been added and/or removed.
 
-Resamplers cannot be transformers, that is, a resampler cannot implement
-``fit_transform`` or ``transform``. Similarly, transformers cannot implement ``fit_resample``.
+* Resamplers implement a method ``fit_resample(X, y, **kwargs)``, a pure function which
+  returns ``Xt, yt, kwargs`` corresponding to the resampled dataset, where
+  samples may have been added and/or removed.
+* An estimator may only implement either ``fit_transform`` or ``fit_resample``.
+* Resamplers may not change the order, meaning, dtype or format of features
+  (this is left to transformers).
+* Resamplers should also resample any kwargs.
 
-Resamplers may not change the order, meaning, dtype or format of features (this is left
-to transformers).
+Composition
+-----------
 
-Resamplers should also resample any kwargs that are array-like and have the same `shape[0]` as `X` and `y`.
+A key part of the proposal is the introduction of a way of composing resamplers
+with predictors. We present two options: ``ResampledTrainer`` and modifications
+to ``Pipeline``.
 
 ResampledTrainer
 ................
+
 This meta-estimator composes a resampler and a predictor. It
 behaves as follows:
 
@@ -73,7 +84,10 @@ behaves as follows:
 
 See PR #13269 for an implementation.
 
-Example Usage::
+Example Usage:
+
+.. code-block:: python
+
    est = ResampledTrainer(RandomUnderSampler(), SVC())
     est = make_pipeline(
         StandardScaler(),
@@ -83,6 +97,11 @@ Example Usage::
         RandomUnderSampler(),
         make_pipeline(StandardScaler(), SelectKBest(), SVC()),
     )
+    clf = ResampledTrainer(
+        NaNRejector(), # removes samples containing NaN
+        ResampledTrainer(RandomUnderSampler(),
+            make_pipeline(StandardScaler(), SGDClassifier()))
+    )
 
 Modifying Pipeline
 ..................
 accommodate resamplers.
 The functionality is described in terms of the head (all stages except the last)
 and the tail (the last stage) of the ``Pipeline``. Note that we assume
 resamplers and transformers are exclusive so that the pipeline can decide which
-method to call. Further note that ``Xt, yt`` are the outputs of the stage, and
-``X, y`` are the inputs to the stage.
+method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage, and
+``X, y, **kw`` are the inputs to the stage.
 
 ``fit``:
-  head for resamplers: `Xt, yt = est.fit_resample(X, y)`
-  head for transformers: `Xt, yt = est.fit_transform(X, y)`
-  tail for transformers and predictors: `est.fit(X, y)`
-  tail for resamplers: `pass`
+  head for resamplers: `Xt, yt, kwt = est.fit_resample(X, y, **kw)`.
+  head for transformers: `Xt, yt = est.fit_transform(X, y, **kw)`.
+  tail for transformers and predictors: `est.fit(X, y, **kw)`.
+  tail for resamplers: `pass`.
 
 ``fit_transform``:
-  Equivalent to `fit(X, y).transform(X)` overall
+  Equivalent to `fit(X, y).transform(X)` overall.
 
 ``predict``
   head for resamplers: `Xt = X`
@@ -118,12 +137,19 @@ method to call. Further note that ``Xt, yt`` are the outputs of the stage, and
 ``score``
   see predict
 
-Example Usage::
+Example Usage:
+
+.. code-block:: python
+
     est = make_pipeline(RandomUnderSampler(), SVC())
     est = make_pipeline(StandardScaler(), Birch(), SelectKBest(), SVC())
     est = make_pipeline(
         RandomUnderSampler(), StandardScaler(), SelectKBest(), SVC()
     )
+    est = make_pipeline(
+        NaNRejector(), RandomUnderSampler(), StandardScaler(), SGDClassifier()
+    )
+
 
 Alternative implementation
 ..........................
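[Editorial note] The ``ResampledTrainer`` composition described in this patch can be sketched as follows. This is an illustration of the SLEP's stated semantics, not the PR #13269 implementation; ``RandomUnderSampler`` and ``MajorityClassifier`` here are toy stand-ins, not library classes:

```python
import numpy as np

class RandomUnderSampler:
    """Toy rebalancing resampler: keeps as many samples of each class
    as the minority class has (deterministic, for illustration)."""
    def fit_resample(self, X, y):
        classes, counts = np.unique(y, return_counts=True)
        n_min = counts.min()
        keep = np.concatenate(
            [np.flatnonzero(y == c)[:n_min] for c in classes])
        keep.sort()
        return X[keep], y[keep]

class MajorityClassifier:
    """Toy predictor: always predicts the most common training label."""
    def fit(self, X, y):
        classes, counts = np.unique(y, return_counts=True)
        self.majority_ = classes[np.argmax(counts)]
        return self
    def predict(self, X):
        return np.full(len(X), self.majority_)

class ResampledTrainer:
    """Sketch of the proposed meta-estimator: resample at fit time only;
    predict and score pass X through untouched."""
    def __init__(self, resampler, predictor):
        self.resampler = resampler
        self.predictor = predictor
    def fit(self, X, y):
        Xt, yt = self.resampler.fit_resample(X, y)
        self.predictor.fit(Xt, yt)
        return self
    def predict(self, X):
        return self.predictor.predict(X)   # no resampling here
    def score(self, X, y):
        return float((self.predict(X) == y).mean())

X = np.arange(12.0).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])
clf = ResampledTrainer(RandomUnderSampler(), MajorityClassifier()).fit(X, y)
```

After undersampling, the predictor is trained on a balanced set, while scoring still runs against the full, unresampled data.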

From 387b338f0e6eb7fadf1a98852d34095648dbe463 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Thu, 27 Jun 2019 02:51:00 +0200
Subject: [PATCH 11/22] even more formatting

---
 slep005/proposal.rst | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 0317d5f..f7f870b 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -77,7 +77,7 @@ ResampledTrainer
 This meta-estimator composes a resampler and a predictor. It
 behaves as follows:
 
- ``fit(X, y)``: resample ``X, y`` with the resampler, then fit on the resampled
+* ``fit(X, y)``: resample ``X, y`` with the resampler, then fit on the resampled
   dataset.
 * ``predict(X)``: simply predict on ``X`` with the predictor.
 * ``score(X, y)``: simply score on ``X, y`` with the predictor.
@@ -114,25 +114,25 @@ method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage,
 ``X, y, **kw`` are the inputs to the stage.
 
 ``fit``:
-  head for resamplers: `Xt, yt, kwt = est.fit_resample(X, y, **kw)`.
-  head for transformers: `Xt, yt = est.fit_transform(X, y, **kw)`.
-  tail for transformers and predictors: `est.fit(X, y, **kw)`.
-  tail for resamplers: `pass`.
+  head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``.
+  head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``.
+  tail for transformers and predictors: ``est.fit(X, y, **kw)``.
+  tail for resamplers: ``pass``.
 
 ``fit_transform``:
-  Equivalent to `fit(X, y).transform(X)` overall.
+  Equivalent to ``fit(X, y).transform(X)`` overall.
 
 ``predict``
-  head for resamplers: `Xt = X`
-  head for transformers: `Xt = est.transform(X)`
-  tail for predictors: `return est.predict(X)`
-  tail for transformers and resamplers: `error`
+  head for resamplers: ``Xt = X``
+  head for transformers: ``Xt = est.transform(X)``
+  tail for predictors: ``return est.predict(X)``
+  tail for transformers and resamplers: ``error``
 
 ``transform``
-  head for resamplers: `Xt = X`
-  head for transformers: `Xt = est.transform(X)`
-  tail for predictors and resamplers: `error`
-  tail for transformers: `return est.transform(X)`
+  head for resamplers: ``Xt = X``
+  head for transformers: ``Xt = est.transform(X)``
+  tail for predictors and resamplers: ``error``
+  tail for transformers: ``return est.transform(X)``
 
 ``score``
   see predict

From 5ecfead026aa618288baf4fd893d5dec37fab5ec Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Thu, 27 Jun 2019 02:52:37 +0200
Subject: [PATCH 12/22] more formatting

---
 slep005/proposal.rst | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index f7f870b..8fdc3e5 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -114,25 +114,25 @@ method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage,
 ``X, y, **kw`` are the inputs to the stage.
 
 ``fit``:
-  head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``.
-  head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``.
-  tail for transformers and predictors: ``est.fit(X, y, **kw)``.
-  tail for resamplers: ``pass``.
+* head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``.
+* head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``.
+* tail for transformers and predictors: ``est.fit(X, y, **kw)``.
+* tail for resamplers: ``pass``.
 
 ``fit_transform``:
-  Equivalent to ``fit(X, y).transform(X)`` overall.
+* Equivalent to ``fit(X, y).transform(X)`` overall.
 
 ``predict``
-  head for resamplers: ``Xt = X``
-  head for transformers: ``Xt = est.transform(X)``
-  tail for predictors: ``return est.predict(X)``
-  tail for transformers and resamplers: ``error``
+* head for resamplers: ``Xt = X``
+* head for transformers: ``Xt = est.transform(X)``
+* tail for predictors: ``return est.predict(X)``
+* tail for transformers and resamplers: ``error``
 
 ``transform``
-  head for resamplers: ``Xt = X``
-  head for transformers: ``Xt = est.transform(X)``
-  tail for predictors and resamplers: ``error``
-  tail for transformers: ``return est.transform(X)``
+* head for resamplers: ``Xt = X``
+* head for transformers: ``Xt = est.transform(X)``
+* tail for predictors and resamplers: ``error``
+* tail for transformers: ``return est.transform(X)``
 
 ``score``
   see predict

From e7faa6ee3151540d5759c04c4d308ae1bc18792e Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Thu, 27 Jun 2019 02:54:57 +0200
Subject: [PATCH 13/22] try these headings

---
 slep005/proposal.rst | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 8fdc3e5..58921de 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -113,28 +113,33 @@ resamplers and transformers are exclusive so that the pipeline can decide which
 method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage, and
 ``X, y, **kw`` are the inputs to the stage.
 
-``fit``:
+fit
+~~~
 * head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``.
 * head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``.
 * tail for transformers and predictors: ``est.fit(X, y, **kw)``.
 * tail for resamplers: ``pass``.
 
-``fit_transform``:
+fit_transform
+~~~~~~~~~~~~~
 * Equivalent to ``fit(X, y).transform(X)`` overall.
 
-``predict``
+predict
+~~~~~~~
 * head for resamplers: ``Xt = X``
 * head for transformers: ``Xt = est.transform(X)``
 * tail for predictors: ``return est.predict(X)``
 * tail for transformers and resamplers: ``error``
 
-``transform``
+transform
+~~~~~~~~~
 * head for resamplers: ``Xt = X``
 * head for transformers: ``Xt = est.transform(X)``
 * tail for predictors and resamplers: ``error``
 * tail for transformers: ``return est.transform(X)``
 
-``score``
+score
+~~~~~
   see predict
 
 Example Usage:
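
The ``fit`` dispatch rules tabulated above can be sketched as plain
Python (a simplified illustration, not scikit-learn's ``Pipeline``
code: ``fit_transform`` is shown returning only ``Xt``, and the
``**kw`` routing is omitted):

```python
def pipeline_fit(steps, X, y):
    """Sketch of the fit dispatch rules above: head steps resample or
    transform the data flowing onward; the tail step is fit directly,
    except a tail resampler, for which fitting is a no-op."""
    *head, tail = steps
    for est in head:
        if hasattr(est, "fit_resample"):   # head for resamplers
            X, y = est.fit_resample(X, y)
        else:                              # head for transformers
            X = est.fit_transform(X, y)
    if not hasattr(tail, "fit_resample"):  # tail for resamplers: pass
        tail.fit(X, y)                     # tail for transformers/predictors
    return steps
```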

From a4019ed832ffe833d8d61999765005e79efb8b41 Mon Sep 17 00:00:00 2001
From: Oliver Rausch <oliverrausch99@gmail.com>
Date: Thu, 27 Jun 2019 02:55:45 +0200
Subject: [PATCH 14/22] last one

---
 slep005/proposal.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 58921de..025ae8d 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -140,7 +140,7 @@ transform
 
 score
 ~~~~~
-  see predict
+* see predict
 
 Example Usage:
 

From 5ddc6f9c8b72a63c1222b9c767a0380fdfec2283 Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Wed, 3 Jul 2019 11:23:40 +0200
Subject: [PATCH 15/22] minor rephrasing

---
 slep005/proposal.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 025ae8d..4af1f94 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -163,8 +163,8 @@ Alternatively ``sample_weight`` could be used as a placeholder to
 perform resampling. However, the current limitations are:
 
 * ``sample_weight`` is not available for all estimators;
-* ``sample_weight`` will implement only sample reductions;
-* ``sample_weight`` can be applied at both fit and predict time;
+* ``sample_weight`` can only implement simple resampling (i.e. resampling
+  that reuses the original samples);
 * ``sample_weight`` need to be passed and modified within a
   ``Pipeline``.
 

From cde164b52ce7dd82e7680b598bd8ada8e7989da7 Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Wed, 3 Jul 2019 11:32:58 +0200
Subject: [PATCH 16/22] address comments

---
 slep005/proposal.rst | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 4af1f94..4a33cb4 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -8,7 +8,6 @@ Resampler API
          Christos Aridas (char@upatras.gr),
          Guillaume Lemaitre (g.lemaitre58@gmail.com)
 :Status: Draft
-:Type: Standards Track
 :Created: 2019-03-01
 :Resolution: <url>
 
@@ -16,7 +15,8 @@ Abstract
 --------
 
 We propose the inclusion of a new type of estimator: resampler. The
-resampler will change the samples in ``X`` and ``y``. In short:
+resampler will change the samples in ``X`` and ``y`` and return both
+``Xt`` and ``yt``. In short:
 
 * a new verb/method that all resamplers must implement is introduced:
   ``fit_resample``.
@@ -85,6 +85,7 @@ behaves as follows:
 See PR #13269 for an implementation.
 
 Example Usage:
+~~~~~~~~~~~~~~
 
 .. code-block:: python
 
@@ -105,6 +106,7 @@ Example Usage:
 
 Modifying Pipeline
 ..................
+
 As an alternative to ``ResampledTrainer``, ``Pipeline`` could be modified to
 accommodate resamplers.
 The functionality is described in terms of the head (all stages except the last)
@@ -115,10 +117,10 @@ method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage,
 
 fit
 ~~~
-* head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``.
-* head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``.
-* tail for transformers and predictors: ``est.fit(X, y, **kw)``.
-* tail for resamplers: ``pass``.
+* head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``
+* head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``
+* tail for transformers and predictors: ``est.fit(X, y, **kw)``
+* tail for resamplers: ``pass``
 
 fit_transform
 ~~~~~~~~~~~~~
@@ -143,6 +145,7 @@ score
 * see predict
 
 Example Usage:
+~~~~~~~~~~~~~~
 
 .. code-block:: python
 

From e87fd7e906804843d3b7df6d80edf8e75c64a839 Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Wed, 3 Jul 2019 13:19:19 +0200
Subject: [PATCH 17/22] Apply suggestions from code review

Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
---
 slep005/proposal.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 4a33cb4..59270f2 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -39,8 +39,8 @@ use cases.
 Possible Usecases
 .................
 
-* Sample rebalancing to correct bias toward class with large cardinality
-  outlier rejection to fit a clean dataset.
+* Sample rebalancing to correct bias toward class with large cardinality.
+* Outlier rejection to fit a clean dataset.
 * Sample reduction e.g. representing a dataset by its k-means centroids.
 * Currently semi-supervised learning is not supported by scoring-based
   functions like ``cross_val_score``, ``GridSearchCV`` or ``validation_curve``
@@ -168,8 +168,8 @@ perform resampling. However, the current limitations are:
 * ``sample_weight`` is not available for all estimators;
 * ``sample_weight`` can only implement simple resampling (i.e. resampling
   that reuses the original samples);
-* ``sample_weight`` need to be passed and modified within a
-  ``Pipeline``.
+* ``sample_weight`` needs to be passed and modified within a
+  ``Pipeline``, which isn't possible without something like resamplers.
 
 Current implementation
 ......................

From ad4e94fdbe2b07f55b0e80d013fddc6c230cabe7 Mon Sep 17 00:00:00 2001
From: Joel Nothman <joel.nothman@gmail.com>
Date: Wed, 3 Jul 2019 21:39:52 +1000
Subject: [PATCH 18/22] Some text about resampling pipelines and their issues

---
 slep005/proposal.rst | 126 +++++++++++++++++++++++++++----------------
 1 file changed, 80 insertions(+), 46 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 4a33cb4..7ffeef5 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -71,30 +71,44 @@ An key part of the proposal is the introduction of a way of composing resamplers
 with predictors. We present two options: ``ResampledTrainer`` and modifications
 to ``Pipeline``.
 
-ResampledTrainer
-................
+Alternative 1: ResampledTrainer
+...............................
 
 This metaestimator composes a resampler and a predictor. It
 behaves as follows:
 
-* ``fit(X, y)``: resample ``X, y`` with the resampler, then fit on the resampled
-  dataset.
+* ``fit(X, y)``: resample ``X, y`` with the resampler, then fit the predictor
+  on the resampled dataset.
 * ``predict(X)``: simply predict on ``X`` with the predictor.
 * ``score(X)``: simply score on ``X`` with the predictor.
 
 See PR #13269 for an implementation.
 
+One benefit of the ``ResampledTrainer`` is that it does not stop the resampler
+having other methods, such as ``transform``, as it is clear that the
+``ResampledTrainer`` will only call ``fit_resample``.
+
+There are complications around supporting ``fit_transform``, ``fit_predict``
+and ``fit_resample`` methods in ``ResampledTrainer``. ``fit_transform`` support
+is only possible by implementing ``fit_transform(X, y)`` as ``fit(X,
+y).transform(X)``, rather than calling ``fit_transform`` of the predictor.
+``fit_predict`` would have to behave similarly.  Thus ``ResampledTrainer``
+would not work with non-inductive estimators (TSNE, AgglomerativeClustering,
+etc.) as their final step.  If the predictor of a ``ResampledTrainer`` is
+itself a resampler, it's unclear how ``ResampledTrainer.fit_resample`` should
+behave.  These caveats also apply to the Pipeline modification below.
+
 Example Usage:
 ~~~~~~~~~~~~~~
 
 .. code-block:: python
 
-    est = ResamplingTrainer(RandomUnderSampler(), SVC())
+    est = ResampledTrainer(RandomUnderSampler(), SVC())
     est = make_pipeline(
         StandardScaler(),
-        ResamplingTrainer(Birch(), make_pipeline(SelectKBest(), SVC()))
+        ResampledTrainer(Birch(), make_pipeline(SelectKBest(), SVC()))
     )
-    est = ResamplingTrainer(
+    est = ResampledTrainer(
         RandomUnderSampler(),
         make_pipeline(StandardScaler(), SelectKBest(), SVC()),
     )
@@ -104,45 +118,65 @@ Example Usage:
             make_pipeline(StandardScaler(), SGDClassifier()))
     )
 
-Modifying Pipeline
-..................
-
-As an alternative to ``ResampledTrainer``, ``Pipeline`` could be modified to
-accommodate resamplers.
-The functionality is described in terms of the head (all stages except the last)
-and the tail (the last stage) of the ``Pipeline``. Note that we assume
-resamplers and transformers are exclusive so that the pipeline can decide which
-method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage, and
-``X, y, **kw`` are the inputs to the stage.
-
-fit
-~~~
-* head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``
-* head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``
-* tail for transformers and predictors: ``est.fit(X, y, **kw)``
-* tail for resamplers: ``pass``
-
-fit_transform
-~~~~~~~~~~~~~
-* Equivalent to ``fit(X, y).transform(X)`` overall.
-
-predict
-~~~~~~~
-* head for resamplers: ``Xt = X``
-* head for transformers: ``Xt = est.transform(X)``
-* tail for predictors: ``return est.predict(X)``
-* tail for transformers and resamplers: ``error``
-
-transform
-~~~~~~~~~
-* head for resamplers: ``Xt = X``
-* head for transformers: ``Xt = est.transform(X)``
-* tail for predictors and resamplers: ``error``
-* tail for transformers: ``return est.transform(X)``
-
-score
-~~~~~
-* see predict
+Alternative 2: Prediction Pipeline
+..................................
+
+As an alternative to ``ResampledTrainer``, ``Pipeline`` can be modified to
+accommodate resamplers.  The essence of the operation is this: one or more steps
+of the pipeline may be a resampler. When fitting the Pipeline, ``fit_resample``
+will be called on each resampler instead of ``fit_transform``, and the output
+of ``fit_resample`` will be used in place of the original ``X``, ``y``, etc.,
+to fit the subsequent step (and so on).  When predicting in the Pipeline,
+the resampler will act as a passthrough step.
+
+Limitations
+~~~~~~~~~~~
+
+.. rubric:: Prohibiting ``transform`` on resamplers
+
+It may be problematic for a resampler to provide ``transform`` if Pipelines
+support resampling:
+
+1. It is unclear what to do at test time if a resampler has a transform
+   method.
+2. Adding fit_resample to the API of an existing transformer may
+   drastically change its behaviour in a Pipeline.
+
+For this reason, it may be best to reject resamplers supporting ``transform``
+from being used in a Pipeline.
+
+.. rubric:: Prohibiting ``transform`` on resampling Pipelines
+
+Providing a ``transform`` method on a Pipeline that contains a resampler
+presents several problems:
+
+1. A resampling Pipeline needs to use a special code path for ``fit_transform``
+   that would call ``fit(X, y, **kw).transform(X)`` on the Pipeline.
+   Ordinarily a Pipeline would pass the transformed data to ``fit_transform``
+   of the left step. If the Pipeline contains a resampler, it rather needs to
+   fit the Pipeline excluding the last step, then transform the original
+   training data until the last step, then fit_transform the last step. This
+   means special code paths for pipelines containing resamplers; the effect of
+   the resampler is not localised in terms of code maintenance.
+2. As a result of issue 1, appending a step to the transformation Pipeline
+   means that the transformer which was previously last, and previously trained
+   on the full dataset, will now be trained on the resampled dataset.
+3. As a result of issue 1, the last step cannot be 'passthrough' as in other
+   transformer pipelines.
+
+For this reason, it may be best to disable ``fit_transform`` and ``transform``
+on the Pipeline. A resampling Pipeline would therefore not be usable as a
+transformation within a ``FeatureUnion`` or ``ColumnTransformer``. Thus the
+``ResampledTrainer`` would be strictly more expressive than a resampling
+Pipeline.
+
+.. rubric:: Handling ``fit`` parameters
+
+Sample props or weights cannot be routed to steps downstream of a resampler in
+a Pipeline, unless they too are resampled. It's very unclear how this would
+work with Pipeline's current prefix-based fit parameter routing.
+
+TODO: propose solutions
 
 Example Usage:
 ~~~~~~~~~~~~~~
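
The ``ResampledTrainer`` behaviour described above can be sketched as
follows (an illustration only; the actual implementation lives in PR
#13269, and this version omits cloning, validation, and the
``fit_transform``/``fit_predict`` complications discussed above):

```python
class ResampledTrainer:
    """Sketch of the metaestimator behaviour described above: fit is
    performed on the resampled data, while predict and score operate
    on the incoming data unchanged."""

    def __init__(self, resampler, predictor):
        self.resampler = resampler
        self.predictor = predictor

    def fit(self, X, y):
        Xt, yt = self.resampler.fit_resample(X, y)  # resample first
        self.predictor.fit(Xt, yt)                  # then fit the predictor
        return self

    def predict(self, X):
        return self.predictor.predict(X)            # no resampling at predict time

    def score(self, X, y):
        return self.predictor.score(X, y)           # no resampling at score time
```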

From b989562f865e5dbd1d4e2895580fcb04dbea1f4a Mon Sep 17 00:00:00 2001
From: Joel Nothman <joel.nothman@gmail.com>
Date: Wed, 3 Jul 2019 21:48:39 +1000
Subject: [PATCH 19/22] Some text about resampling pipelines and their issues
 (#2)

---
 slep005/proposal.rst | 126 +++++++++++++++++++++++++++----------------
 1 file changed, 80 insertions(+), 46 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 59270f2..1e375ea 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -71,30 +71,44 @@ An key part of the proposal is the introduction of a way of composing resamplers
 with predictors. We present two options: ``ResampledTrainer`` and modifications
 to ``Pipeline``.
 
-ResampledTrainer
-................
+Alternative 1: ResampledTrainer
+...............................
 
 This metaestimator composes a resampler and a predictor. It
 behaves as follows:
 
-* ``fit(X, y)``: resample ``X, y`` with the resampler, then fit on the resampled
-  dataset.
+* ``fit(X, y)``: resample ``X, y`` with the resampler, then fit the predictor
+  on the resampled dataset.
 * ``predict(X)``: simply predict on ``X`` with the predictor.
 * ``score(X)``: simply score on ``X`` with the predictor.
 
 See PR #13269 for an implementation.
 
+One benefit of the ``ResampledTrainer`` is that it does not stop the resampler
+having other methods, such as ``transform``, as it is clear that the
+``ResampledTrainer`` will only call ``fit_resample``.
+
+There are complications around supporting ``fit_transform``, ``fit_predict``
+and ``fit_resample`` methods in ``ResampledTrainer``. ``fit_transform`` support
+is only possible by implementing ``fit_transform(X, y)`` as ``fit(X,
+y).transform(X)``, rather than calling ``fit_transform`` of the predictor.
+``fit_predict`` would have to behave similarly.  Thus ``ResampledTrainer``
+would not work with non-inductive estimators (TSNE, AgglomerativeClustering,
+etc.) as their final step.  If the predictor of a ``ResampledTrainer`` is
+itself a resampler, it's unclear how ``ResampledTrainer.fit_resample`` should
+behave.  These caveats also apply to the Pipeline modification below.
+
 Example Usage:
 ~~~~~~~~~~~~~~
 
 .. code-block:: python
 
-    est = ResamplingTrainer(RandomUnderSampler(), SVC())
+    est = ResampledTrainer(RandomUnderSampler(), SVC())
     est = make_pipeline(
         StandardScaler(),
-        ResamplingTrainer(Birch(), make_pipeline(SelectKBest(), SVC()))
+        ResampledTrainer(Birch(), make_pipeline(SelectKBest(), SVC()))
     )
-    est = ResamplingTrainer(
+    est = ResampledTrainer(
         RandomUnderSampler(),
         make_pipeline(StandardScaler(), SelectKBest(), SVC()),
     )
@@ -104,45 +118,65 @@ Example Usage:
             make_pipeline(StandardScaler(), SGDClassifier()))
     )
 
-Modifying Pipeline
-..................
-
-As an alternative to ``ResampledTrainer``, ``Pipeline`` could be modified to
-accommodate resamplers.
-The functionality is described in terms of the head (all stages except the last)
-and the tail (the last stage) of the ``Pipeline``. Note that we assume
-resamplers and transformers are exclusive so that the pipeline can decide which
-method to call. Further note that ``Xt, yt, kwt`` are the outputs of the stage, and
-``X, y, **kw`` are the inputs to the stage.
-
-fit
-~~~
-* head for resamplers: ``Xt, yt, kwt = est.fit_resample(X, y, **kw)``
-* head for transformers: ``Xt, yt = est.fit_transform(X, y, **kw)``
-* tail for transformers and predictors: ``est.fit(X, y, **kw)``
-* tail for resamplers: ``pass``
-
-fit_transform
-~~~~~~~~~~~~~
-* Equivalent to ``fit(X, y).transform(X)`` overall.
-
-predict
-~~~~~~~
-* head for resamplers: ``Xt = X``
-* head for transformers: ``Xt = est.transform(X)``
-* tail for predictors: ``return est.predict(X)``
-* tail for transformers and resamplers: ``error``
-
-transform
-~~~~~~~~~
-* head for resamplers: ``Xt = X``
-* head for transformers: ``Xt = est.transform(X)``
-* tail for predictors and resamplers: ``error``
-* tail for transformers: ``return est.transform(X)``
-
-score
-~~~~~
-* see predict
+Alternative 2: Prediction Pipeline
+..................................
+
+As an alternative to ``ResampledTrainer``, ``Pipeline`` can be modified to
+accommodate resamplers.  The essence of the operation is this: one or more steps
+of the pipeline may be a resampler. When fitting the Pipeline, ``fit_resample``
+will be called on each resampler instead of ``fit_transform``, and the output
+of ``fit_resample`` will be used in place of the original ``X``, ``y``, etc.,
+to fit the subsequent step (and so on).  When predicting in the Pipeline,
+the resampler will act as a passthrough step.
+
+Limitations
+~~~~~~~~~~~
+
+.. rubric:: Prohibiting ``transform`` on resamplers
+
+It may be problematic for a resampler to provide ``transform`` if Pipelines
+support resampling:
+
+1. It is unclear what to do at test time if a resampler has a transform
+   method.
+2. Adding fit_resample to the API of an existing transformer may
+   drastically change its behaviour in a Pipeline.
+
+For this reason, it may be best to reject resamplers supporting ``transform``
+from being used in a Pipeline.
+
+.. rubric:: Prohibiting ``transform`` on resampling Pipelines
+
+Providing a ``transform`` method on a Pipeline that contains a resampler
+presents several problems:
+
+1. A resampling Pipeline needs to use a special code path for ``fit_transform``
+   that would call ``fit(X, y, **kw).transform(X)`` on the Pipeline.
+   Ordinarily a Pipeline would pass the transformed data to ``fit_transform``
+   of the left step. If the Pipeline contains a resampler, it rather needs to
+   fit the Pipeline excluding the last step, then transform the original
+   training data until the last step, then fit_transform the last step. This
+   means special code paths for pipelines containing resamplers; the effect of
+   the resampler is not localised in terms of code maintenance.
+2. As a result of issue 1, appending a step to the transformation Pipeline
+   means that the transformer which was previously last, and previously trained
+   on the full dataset, will now be trained on the resampled dataset.
+3. As a result of issue 1, the last step cannot be 'passthrough' as in other
+   transformer pipelines.
+
+For this reason, it may be best to disable ``fit_transform`` and ``transform``
+on the Pipeline. A resampling Pipeline would therefore not be usable as a
+transformation within a ``FeatureUnion`` or ``ColumnTransformer``. Thus the
+``ResampledTrainer`` would be strictly more expressive than a resampling
+Pipeline.
+
+.. rubric:: Handling ``fit`` parameters
+
+Sample props or weights cannot be routed to steps downstream of a resampler in
+a Pipeline, unless they too are resampled. It's very unclear how this would
+work with Pipeline's current prefix-based fit parameter routing.
+
+TODO: propose solutions
 
 Example Usage:
 ~~~~~~~~~~~~~~

From ee197cbc0b88cbfdaa18910827afe8ab1c2cdaec Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Wed, 3 Jul 2019 13:58:30 +0200
Subject: [PATCH 20/22] minor changes

---
 slep005/proposal.rst | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 1e375ea..2402cb3 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -56,18 +56,20 @@ Implementation
 API and Constraints
 ...................
 
-* Resamplers implement a method ``fit_resample(X, y, **kwargs)``, a pure function which
-  returns ``Xt, yt, kwargs`` corresponding to the resampled dataset, where
-  samples may have been added and/or removed.
-* An estimator may only implement either ``fit_transform`` or ``fit_resample``.
+* Resamplers implement a method ``fit_resample(X, y, **kwargs)``, a pure
+  function which returns ``Xt, yt, kwargs`` corresponding to the resampled
+  dataset, where samples may have been added and/or removed.
+* An estimator may only implement either ``fit_transform`` or ``fit_resample``
+  if support for ``Resamplers`` in ``Pipeline`` is enabled.
 * Resamplers may not change the order, meaning, dtype or format of features
   (this is left to transformers).
-* Resamplers should also resample any kwargs.
+* Resamplers should also handle (e.g. resample, generate anew, etc.) any
+  kwargs.
 
 Composition
 -----------
 
-An key part of the proposal is the introduction of a way of composing resamplers
+A key part of the proposal is the introduction of a way of composing resamplers
 with predictors. We present two options: ``ResampledTrainer`` and modifications
 to ``Pipeline``.
 

From 35c140d01a290a8af8348df9c04f2fbb9d83c79f Mon Sep 17 00:00:00 2001
From: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Date: Wed, 3 Jul 2019 14:05:20 +0200
Subject: [PATCH 21/22] iter

---
 slep005/proposal.rst | 40 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 2402cb3..79f839f 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -60,7 +60,8 @@ API and Constraints
   function which returns ``Xt, yt, kwargs`` corresponding to the resampled
   dataset, where samples may have been added and/or removed.
 * An estimator may only implement either ``fit_transform`` or ``fit_resample``
-  if support for ``Resamplers`` in ``Pipeline`` is enabled.
+  if support for ``Resamplers`` in ``Pipeline`` is enabled
+  (see Sect. "Limitations").
 * Resamplers may not change the order, meaning, dtype or format of features
   (this is left to transformers).
 * Resamplers should also handle (e.g. resample, generate anew, etc.) any
@@ -136,13 +137,13 @@ Limitations
 
 .. rubric:: Prohibiting ``transform`` on resamplers
 
-It may be problematic for a resampler to provide ``transform`` if Pipelines
+It may be problematic for a resampler to provide ``transform`` if ``Pipeline``\ s
 support resampling:
 
 1. It is unclear what to do at test time if a resampler has a transform
    method.
-2. Adding fit_resample to the API of an existing transformer may
-   drastically change its behaviour in a Pipeline.
+2. Adding ``fit_resample`` to the API of an existing transformer may
+   drastically change its behaviour in a ``Pipeline``.
 
 For this reason, it may be best to reject resamplers supporting ``transform``
 from being used in a Pipeline.
@@ -152,31 +153,32 @@ from being used in a Pipeline.
 Providing a ``transform`` method on a Pipeline that contains a resampler
 presents several problems:
 
-1. A resampling Pipeline needs to use a special code path for ``fit_transform``
-   that would call ``fit(X, y, **kw).transform(X)`` on the Pipeline.
-   Ordinarily a Pipeline would pass the transformed data to ``fit_transform``
-   of the left step. If the Pipeline contains a resampler, it rather needs to
-   fit the Pipeline excluding the last step, then transform the original
-   training data until the last step, then fit_transform the last step. This
-   means special code paths for pipelines containing resamplers; the effect of
-   the resampler is not localised in terms of code maintenance.
-2. As a result of issue 1, appending a step to the transformation Pipeline
+1. A resampling ``Pipeline`` needs to use a special code path for
+   ``fit_transform`` that would call ``fit(X, y, **kw).transform(X)`` on the
+   ``Pipeline``.  Ordinarily a ``Pipeline`` would pass the transformed data to
+   ``fit_transform`` of the left step. If the ``Pipeline`` contains a
+   resampler, it rather needs to fit the ``Pipeline`` excluding the last step,
+   then transform the original training data until the last step, then
+   ``fit_transform`` the last step. This means special code paths for pipelines
+   containing resamplers; the effect of the resampler is not localised in terms
+   of code maintenance.
+2. As a result of issue 1, appending a step to the transformation ``Pipeline``
    means that the transformer which was previously last, and previously trained
    on the full dataset, will now be trained on the resampled dataset.
-3. As a result of issue 1, the last step cannot be 'passthrough' as in other
-   transformer pipelines.
+3. As a result of issue 1, the last step cannot be ``'passthrough'`` as in
+   other transformer pipelines.
 
 For this reason, it may be best to disable ``fit_transform`` and ``transform``
-on the Pipeline. A resampling Pipeline would therefore not be usable as a
+on the Pipeline. A resampling ``Pipeline`` would therefore not be usable as a
 transformation within a ``FeatureUnion`` or ``ColumnTransformer``. Thus the
 ``ResampledTrainer`` would be strictly more expressive than a resampling
-Pipeline.
+``Pipeline``.
 
 .. rubric:: Handling ``fit`` parameters
 
 Sample props or weights cannot be routed to steps downstream of a resampler in
-a Pipeline, unless they too are resampled. It's very unclear how this would
-work with Pipeline's current prefix-based fit parameter routing.
+a ``Pipeline``, unless they too are resampled. It's very unclear how this would
+work with ``Pipeline``'s current prefix-based fit parameter routing.
 
 TODO: propose solutions
 

From bc45d6aba464398b5d6ecf755a3570b087a3ecdd Mon Sep 17 00:00:00 2001
From: Joel Nothman <joel.nothman@gmail.com>
Date: Tue, 27 Aug 2019 00:23:30 +1000
Subject: [PATCH 22/22] Some comments on fit params

---
 slep005/proposal.rst | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/slep005/proposal.rst b/slep005/proposal.rst
index 7ffeef5..7f13530 100644
--- a/slep005/proposal.rst
+++ b/slep005/proposal.rst
@@ -173,10 +173,15 @@ Pipeline.
 .. rubric:: Handling ``fit`` parameters
 
 Sample props or weights cannot be routed to steps downstream of a resampler in
-a Pipeline, unless they too are resampled. It's very unclear how this would
-work with Pipeline's current prefix-based fit parameter routing.
-
-TODO: propose solutions
+a Pipeline, unless they too are resampled. To support this, a resampler
+would need to be passed all props that are required downstream, and
+``fit_resample`` should return resampled versions of them. Note that these
+must be distinct from parameters that affect the resampler's fitting.
+That is, consider the signature ``fit_resample(X, y=None, props=None, sample_weight=None)``.
+The ``sample_weight`` passed in should affect the resampling, but does not
+itself need to be resampled. A Pipeline would pass ``props`` including the fit
+parameters required downstream, which would be resampled and returned by
+``fit_resample``.
 
 Example Usage:
 ~~~~~~~~~~~~~~
@@ -191,6 +196,7 @@ Example Usage:
     est = make_pipeline(
         NaNRejector(), RandomUnderSampler(), StandardScaler(), SGDClassifier()
     )
+    est.fit(X, y, sgdclassifier__sample_weight=my_weight)
 
 
 Alternative implementation
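
The fit-parameter handling described in the last hunk above can be
sketched as a standalone function (a hypothetical illustration, not a
proposed API: ``sample_weight`` influences which samples survive but
is not itself returned, while every entry of ``props`` is resampled so
it can be routed to downstream steps; the keep-nonzero-weight rule is
a toy stand-in for a real resampling strategy):

```python
def fit_resample(X, y, props=None, sample_weight=None):
    """Sketch of the signature discussed above.  The toy resampling
    rule keeps samples with nonzero weight; sample_weight affects the
    resampling but is not resampled, whereas all entries of props are
    resampled and returned alongside Xt and yt."""
    weights = sample_weight if sample_weight is not None else [1] * len(X)
    keep = [i for i, w in enumerate(weights) if w > 0]
    Xt = [X[i] for i in keep]
    yt = [y[i] for i in keep]
    propst = {key: [values[i] for i in keep]
              for key, values in (props or {}).items()}
    return Xt, yt, propst
```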