Bandit code contributed by Louie Hoang at MSR.
Bandit code for algorithms such as UCB and TS (Thompson sampling), together with analyses written by me. It may also include implementations of papers and tutorials.
bandits-codes
├── README.md
├── bandit_rl_implementations
│   ├── Bandits_email_version.py
│   ├── CBThompsonSampling_LogisticReg.py
│   ├── MAB_TS_posteriors.png
│   ├── TS_Loss_ER.png
│   ├── versions=10_ignore_regret.png
│   └── versions=3,T=1000.png
└── louie_experiments
    ├── LinUCB.py
    ├── bandit_data_format.py
    ├── base_model.py
    ├── beta_bernoulli.py
    ├── data_reader.py
    ├── driver.py
    ├── driver_contextual.py
    ├── driver_gaussian_rewards.py
    ├── driver_gaussian_rewards_two_bandits.py
    ├── driver_two_bandits.py
    ├── driver_two_bandits_contextual.py
    ├── epsilonGreedyPolicy.py
    ├── epsilon_greedy.py
    ├── forced_actions.py
    ├── gaussian_reward.py
    ├── generate_single_bandit.py
    ├── greedy_model.py
    ├── logistic_regression.py
    ├── logistic_regression_scikit.py
    ├── ng_normal.py
    ├── nig_normal.py
    ├── output_format.py
    ├── plot_graph.py
    ├── queueBasedBandit.py
    ├── random_policy.py
    ├── test_logistic_regression.py
    ├── thompson_ng_policy.py
    ├── thompson_nig_policy.py
    ├── thompson_policy.py
    ├── two_factor_reward.py
    └── ucb1.py
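
For readers new to the algorithms named above, here is a minimal sketch of UCB1 and Beta-Bernoulli Thompson sampling on a simulated Bernoulli bandit. It is not the code in `ucb1.py` or `thompson_policy.py`; the function names (`ucb1_select`, `thompson_select`, `run`), the uniform Beta(1, 1) prior, and the simulation setup are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the arm with the highest UCB1 score at round t (1-indexed)."""
    # Pull every arm once before the confidence bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson_select(successes, failures, rng):
    """Draw from each arm's Beta(1 + s, 1 + f) posterior; pick the largest draw."""
    # Assumption: uniform Beta(1, 1) prior on each arm's success probability.
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run(policy, probs, horizon, seed=0):
    """Simulate a Bernoulli bandit and return the number of pulls per arm."""
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k  # pulls per arm
    wins = [0] * k    # observed rewards of 1 per arm
    for t in range(1, horizon + 1):
        if policy == "ucb1":
            arm = ucb1_select(counts, wins, t)
        else:
            losses = [c - w for c, w in zip(counts, wins)]
            arm = thompson_select(wins, losses, rng)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        wins[arm] += reward
    return counts
```

Calling `run("ucb1", [0.1, 0.9], 2000)` or `run("ts", [0.1, 0.9], 2000)` returns the pull counts per arm; with a gap this large, both policies should concentrate most pulls on the second arm.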