Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
ac0cce1
commit 1ab8a58
Showing
112 changed files
with
32,381 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,272 @@ | ||
|
||
|
||
<!DOCTYPE html> | ||
<html class="writer-html5" lang="en" > | ||
<head> | ||
<meta charset="utf-8" /> | ||
|
||
<meta name="viewport" content="width=device-width, initial-scale=1.0" /> | ||
|
||
<title>hep_ml.splot — hep_ml 0.7.0 documentation</title> | ||
|
||
|
||
|
||
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" /> | ||
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> | ||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
<!--[if lt IE 9]> | ||
<script src="../../_static/js/html5shiv.min.js"></script> | ||
<![endif]--> | ||
|
||
|
||
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script> | ||
<script src="../../_static/jquery.js"></script> | ||
<script src="../../_static/underscore.js"></script> | ||
<script src="../../_static/doctools.js"></script> | ||
|
||
<script type="text/javascript" src="../../_static/js/theme.js"></script> | ||
|
||
|
||
<link rel="index" title="Index" href="../../genindex.html" /> | ||
<link rel="search" title="Search" href="../../search.html" /> | ||
</head> | ||
|
||
<body class="wy-body-for-nav"> | ||
|
||
|
||
<div class="wy-grid-for-nav"> | ||
|
||
<nav data-toggle="wy-nav-shift" class="wy-nav-side"> | ||
<div class="wy-side-scroll"> | ||
<div class="wy-side-nav-search" > | ||
|
||
|
||
|
||
<a href="../../index.html" class="icon icon-home"> hep_ml | ||
|
||
|
||
|
||
</a> | ||
|
||
|
||
|
||
|
||
<div class="version"> | ||
0.7.0 | ||
</div> | ||
|
||
|
||
|
||
|
||
<div role="search"> | ||
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get"> | ||
<input type="text" name="q" placeholder="Search docs" /> | ||
<input type="hidden" name="check_keywords" value="yes" /> | ||
<input type="hidden" name="area" value="default" /> | ||
</form> | ||
</div> | ||
|
||
|
||
</div> | ||
|
||
|
||
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation"> | ||
|
||
|
||
|
||
|
||
|
||
|
||
<ul> | ||
<li class="toctree-l1"><a class="reference internal" href="../../index.html">hep_ml documentation</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../gb.html">Gradient boosting</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../losses.html">Losses for Gradient Boosting</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../uboost.html">uBoost</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../metrics.html">Metric functions</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../nnet.html">Neural networks</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../preprocessing.html">Preprocessing data</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../reweight.html">Reweighting algorithms</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../speedup.html">Fast predictions</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../splot.html">sPlot</a></li> | ||
<li class="toctree-l1"><a class="reference internal" href="../../notebooks.html">Code Examples</a></li> | ||
</ul> | ||
|
||
|
||
|
||
</div> | ||
|
||
</div> | ||
</nav> | ||
|
||
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"> | ||
|
||
|
||
<nav class="wy-nav-top" aria-label="top navigation"> | ||
|
||
<i data-toggle="wy-nav-top" class="fa fa-bars"></i> | ||
<a href="../../index.html">hep_ml</a> | ||
|
||
</nav> | ||
|
||
|
||
<div class="wy-nav-content"> | ||
|
||
<div class="rst-content"> | ||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
<div role="navigation" aria-label="breadcrumbs navigation"> | ||
|
||
<ul class="wy-breadcrumbs"> | ||
|
||
<li><a href="../../index.html" class="icon icon-home"></a> »</li> | ||
|
||
<li><a href="../index.html">Module code</a> »</li> | ||
|
||
<li>hep_ml.splot</li> | ||
|
||
|
||
<li class="wy-breadcrumbs-aside"> | ||
|
||
</li> | ||
|
||
</ul> | ||
|
||
|
||
<hr/> | ||
</div> | ||
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> | ||
<div itemprop="articleBody"> | ||
|
||
<h1>Source code for hep_ml.splot</h1><div class="highlight"><pre> | ||
<span></span><span class="sd">"""</span> | ||
<span class="sd">**sPlot** is a reweighting technique frequently used in HEP to reconstruct the distributions of features in a mixture.</span> | ||
<span class="sd">The initial information used is the probabilities obtained after fitting.</span> | ||
|
||
<span class="sd">**hep_ml.splot** contains a standalone Python implementation of this technique.</span> | ||
<span class="sd">This implementation is brilliantly simple and clear - just as it should be.</span> | ||
|
||
<span class="sd">Example</span> | ||
<span class="sd">-------</span> | ||
|
||
<span class="sd">>>> from hep_ml.splot import compute_sweights</span> | ||
<span class="sd">>>> p = pandas.DataFrame({'signal': p_signal, 'bkg': p_bkg})</span> | ||
<span class="sd">>>> sWeights = compute_sweights(p)</span> | ||
<span class="sd">>>> # plotting reconstructed distribution of some other variable</span> | ||
<span class="sd">>>> plt.hist(other_var, weights=sWeights.signal)</span> | ||
<span class="sd">>>> plt.hist(other_var, weights=sWeights.bkg)</span> | ||
|
||
<span class="sd">For more examples and explanations, see notebooks/Splot in the repository.</span> | ||
|
||
<span class="sd">"""</span> | ||
|
||
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">division</span><span class="p">,</span> <span class="n">print_function</span><span class="p">,</span> <span class="n">absolute_import</span> | ||
<span class="kn">import</span> <span class="nn">pandas</span> | ||
<span class="kn">import</span> <span class="nn">numpy</span> | ||
<span class="kn">from</span> <span class="nn">.commonutils</span> <span class="kn">import</span> <span class="n">check_sample_weight</span> | ||
|
||
<span class="n">__author__</span> <span class="o">=</span> <span class="s1">'Alex Rogozhnikov'</span> | ||
|
||
|
||
<div class="viewcode-block" id="compute_sweights"><a class="viewcode-back" href="../../splot.html#hep_ml.splot.compute_sweights">[docs]</a><span class="k">def</span> <span class="nf">compute_sweights</span><span class="p">(</span><span class="n">probabilities</span><span class="p">,</span> <span class="n">sample_weight</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span> | ||
<span class="sd">"""Computes sWeights based on probabilities obtained from distribution fit.</span> | ||
|
||
<span class="sd"> :param probabilities: pandas.DataFrame with probabilities of shape [n_samples, n_classes].</span> | ||
<span class="sd"> These probabilities are obtained after fit (typically, mass fit).</span> | ||
<span class="sd"> Note that for each sample the probabilities must sum to 1.</span> | ||
<span class="sd"> :param sample_weight: optional non-negative weights of events, numpy.array of shape [n_samples]</span> | ||
<span class="sd"> :return: pandas.DataFrame with sWeights of shape [n_samples, n_classes]</span> | ||
<span class="sd"> """</span> | ||
<span class="c1"># converting to pandas.DataFrame</span> | ||
<span class="n">probabilities</span> <span class="o">=</span> <span class="n">pandas</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">probabilities</span><span class="p">)</span> | ||
<span class="c1"># checking sample_weight</span> | ||
<span class="n">sample_weight</span> <span class="o">=</span> <span class="n">check_sample_weight</span><span class="p">(</span><span class="n">probabilities</span><span class="p">,</span> <span class="n">sample_weight</span><span class="o">=</span><span class="n">sample_weight</span><span class="p">)</span> | ||
<span class="c1"># checking that all weights are non-negative</span> | ||
<span class="k">assert</span> <span class="n">numpy</span><span class="o">.</span><span class="n">all</span><span class="p">(</span><span class="n">sample_weight</span> <span class="o">>=</span> <span class="mi">0</span><span class="p">),</span> <span class="s1">'sample weight are expected to be non-negative'</span> | ||
|
||
<span class="n">p</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">probabilities</span><span class="p">)</span> | ||
<span class="c1"># checking that probabilities sum up to 1.</span> | ||
<span class="k">assert</span> <span class="n">numpy</span><span class="o">.</span><span class="n">allclose</span><span class="p">(</span><span class="n">p</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="mi">1</span><span class="p">,</span> <span class="n">atol</span><span class="o">=</span><span class="mf">1e-3</span><span class="p">),</span> <span class="s1">'sum of probabilities is not equal to 1.'</span> | ||
|
||
<span class="c1"># computations</span> | ||
<span class="n">initial_stats</span> <span class="o">=</span> <span class="p">(</span><span class="n">p</span> <span class="o">*</span> <span class="n">sample_weight</span><span class="p">[:,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">newaxis</span><span class="p">])</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> | ||
<span class="n">V_inv</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">T</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">p</span> <span class="o">*</span> <span class="n">sample_weight</span><span class="p">[:,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">newaxis</span><span class="p">])</span> | ||
<span class="n">V</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">inv</span><span class="p">(</span><span class="n">V_inv</span><span class="p">)</span> <span class="o">*</span> <span class="n">initial_stats</span><span class="p">[</span><span class="n">numpy</span><span class="o">.</span><span class="n">newaxis</span><span class="p">,</span> <span class="p">:]</span> | ||
|
||
<span class="c1"># Final formula</span> | ||
<span class="n">sweights</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">V</span><span class="p">)</span> <span class="o">*</span> <span class="n">sample_weight</span><span class="p">[:,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">newaxis</span><span class="p">]</span> | ||
<span class="k">return</span> <span class="n">pandas</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">sweights</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="n">probabilities</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span></div> | ||
</pre></div> | ||
|
||
</div> | ||
|
||
</div> | ||
<footer> | ||
|
||
<hr/> | ||
|
||
<div role="contentinfo"> | ||
<p> | ||
© Copyright 2015-2017, Yandex; Alex Rogozhnikov and contributors. | ||
|
||
</p> | ||
</div> | ||
|
||
|
||
|
||
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a | ||
|
||
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a> | ||
|
||
provided by <a href="https://readthedocs.org">Read the Docs</a>. | ||
|
||
</footer> | ||
</div> | ||
</div> | ||
|
||
</section> | ||
|
||
</div> | ||
|
||
|
||
<script type="text/javascript"> | ||
jQuery(function () { | ||
SphinxRtdTheme.Navigation.enable(true); | ||
}); | ||
</script> | ||
|
||
|
||
|
||
|
||
|
||
|
||
</body> | ||
</html> |
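
A note on the `compute_sweights` computation in this file: the following is a minimal NumPy-only sketch of the same arithmetic, written to check two properties that follow from it when each row of probabilities sums to 1. The function name `sweights_sketch` and the synthetic probabilities are assumptions made for illustration; they are not part of hep_ml, and the sketch skips the input checks that the real function performs.

```python
import numpy as np


def sweights_sketch(p, sample_weight=None):
    """Hypothetical re-statement of the arithmetic in compute_sweights.

    p: array of shape [n_samples, n_classes], each row summing to 1.
    sample_weight: optional non-negative array of shape [n_samples].
    """
    p = np.asarray(p, dtype=float)
    if sample_weight is None:
        sample_weight = np.ones(len(p))
    w = np.asarray(sample_weight, dtype=float)

    # Weighted sum of probabilities per class (the expected yields).
    yields = (p * w[:, np.newaxis]).sum(axis=0)
    # Weighted Gram matrix of the probabilities, shape [n_classes, n_classes].
    v_inv = p.T.dot(p * w[:, np.newaxis])
    # Invert it and scale column j by the yield of class j.
    v = np.linalg.inv(v_inv) * yields[np.newaxis, :]
    # One sWeight per event and class, scaled by the event weight.
    return p.dot(v) * w[:, np.newaxis]


# Synthetic two-class probabilities (illustrative data, not a real fit).
rng = np.random.default_rng(0)
p_signal = rng.uniform(0.05, 0.95, size=1000)
p = np.column_stack([p_signal, 1.0 - p_signal])

sw = sweights_sketch(p)
# Per-class sums of sWeights compared with per-class sums of probabilities.
print(sw.sum(axis=0), p.sum(axis=0))
# Per-event sums of sWeights; with unit event weights these should all be 1.
print(sw.sum(axis=1)[:5])
```

Under these assumptions, the sWeights of each class add up to that class's yield, and the sWeights of each event add up to the event's weight. Individual sWeights can be negative, which is expected for this technique.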