
bertilhatt/pydata_pres_small_sample


AB-testing on a small number of clusters

PyData London 2019 - Tutorial

This repository contains a tutorial presentation drafted for PyData London 2019. We look into AB-testing, also known as randomised controlled trials (RCTs), and notably at the issues raised by testing on small samples. We consider testing under constraints, typically having to assign users to segments by group or cluster rather than individually.

Structure

This presentation covers the principles of AB-tests, how to compute AA-tests and sensitivity tests, and how to interpret the insights from them. We then investigate the complications that arise from constraints. The most common type of restriction is that users belonging to the same group have to be exposed to the same experience, and therefore have to be placed in the same segment (the segments are known as Control and Treatment). This de facto reduces the size of the sample to the number of clusters. We consider the impact of testing on a small sample throughout.
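As a rough illustration of the ideas above, here is a minimal sketch of cluster-level randomisation, an AA-test and a sensitivity test. It is not the code used in the presentation. The function names, the toy cluster means and the injected effect sizes are all assumptions made for this example, and it uses only the Python standard library.

```python
import random
import statistics


def assign_clusters(cluster_ids, rng):
    """Randomly split clusters into two halves (hypothetical helper).

    Every user inherits the segment of their cluster, so the unit of
    randomisation is the cluster, not the user.
    """
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        cluster: "Control" if rank < half else "Treatment"
        for rank, cluster in enumerate(shuffled)
    }


def difference_in_means(cluster_means, assignment, effect=0.0):
    """Treatment minus Control, computed on one value per cluster."""
    control = [m for c, m in cluster_means.items() if assignment[c] == "Control"]
    treatment = [
        m + effect for c, m in cluster_means.items() if assignment[c] == "Treatment"
    ]
    return statistics.mean(treatment) - statistics.mean(control)


def simulate(cluster_means, effect=0.0, n_draws=2000, seed=0):
    """Re-randomise the clusters many times and record the observed difference.

    effect=0.0 is an AA-test: no one is treated, so the spread is pure noise.
    effect>0.0 injects a known uplift, which is the basis of a sensitivity test.
    """
    rng = random.Random(seed)
    return [
        difference_in_means(cluster_means, assign_clusters(cluster_means, rng), effect)
        for _ in range(n_draws)
    ]


def noise_band(diffs, level=0.95):
    """Empirical central interval of the AA-test differences."""
    ordered = sorted(diffs)
    low_index = int((1 - level) / 2 * len(ordered))
    high_index = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[low_index], ordered[high_index]


def detection_rate(diffs, lower, upper):
    """Share of simulated differences falling strictly outside the noise band."""
    return sum(d < lower or d > upper for d in diffs) / len(diffs)


# Toy data (assumption): one average metric per cluster, only 8 clusters.
data_rng = random.Random(42)
cluster_means = {f"cluster_{i}": data_rng.gauss(10.0, 2.0) for i in range(8)}

aa_diffs = simulate(cluster_means, effect=0.0)
lower, upper = noise_band(aa_diffs)
print(f"AA-test 95% noise band: [{lower:.2f}, {upper:.2f}]")

for effect in (0.5, 1.0, 2.0, 4.0):
    rate = detection_rate(simulate(cluster_means, effect=effect), lower, upper)
    print(f"injected effect {effect:.1f}: detected in {rate:.0%} of draws")
```

With so few clusters the noise band is wide, so only large injected effects are detected reliably. This is the practical consequence of the sample size being the number of clusters rather than the number of users.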

The repository uses a randomly generated table of events and segments to illustrate our points. However, we encourage people to use their own dataset to understand how tests could work on their own data.
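For readers who want to experiment before plugging in their own data, the sketch below shows one way such a table could be generated. It is not the generator shipped in this repository. The column names (`user_id`, `cluster_id`, `segment`, `value`), the sizes and the distributions are assumptions chosen for illustration.

```python
import random


def generate_events(n_clusters=6, users_per_cluster=20, events_per_user=5, seed=0):
    """Build a toy event table as a list of dicts (hypothetical schema)."""
    rng = random.Random(seed)

    # Randomise at cluster level: half of the clusters go to each segment.
    shuffled = list(range(n_clusters))
    rng.shuffle(shuffled)
    segment_of = {
        cluster: "Control" if rank < n_clusters // 2 else "Treatment"
        for rank, cluster in enumerate(shuffled)
    }

    rows = []
    for cluster in range(n_clusters):
        # Users in the same cluster share a common shift, which is what
        # makes their observations correlated.
        cluster_shift = rng.gauss(0.0, 1.0)
        for user in range(users_per_cluster):
            for _ in range(events_per_user):
                rows.append(
                    {
                        "user_id": f"c{cluster}_u{user}",
                        "cluster_id": cluster,
                        "segment": segment_of[cluster],
                        "value": 10.0 + cluster_shift + rng.gauss(0.0, 1.0),
                    }
                )
    return rows


events = generate_events()
print(len(events), "events; first row:", events[0])
```

The resulting list of dicts can be passed to `pandas.DataFrame(events)` if a data frame is more convenient. No treatment effect is added here, so the table is suitable for an AA-test as it stands.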

Status

Both presenters were employees of Farfetch at the time of drafting and presenting this tutorial. This presentation was made exclusively for educational purposes. No confidential information or code used by Farfetch is contained in this work.

A video recording of this presentation should be made available by the organisers of PyData London on their YouTube channel after the conference.

Contributions are welcome. Re-use of the code is encouraged. Usual disclaimers apply.

The slides are generated using the jupyter nbconvert tool.

Requirements

pip install -r requirements.txt
