mcmcse: updates, cleanup, and efficiency

pvarshney1729 edited this page Mar 24, 2020 · 8 revisions

Background

The mcmcse package was built to estimate Monte Carlo standard errors for Markov chain Monte Carlo. It has since expanded to include multivariate output analysis methods and the reliable calculation of effective sample size. However, the package is structured to take only a single Markov chain as input. Standard errors from multiple chains can be estimated reliably via replicated variance calculations. Implementing these calculations requires significant updates to the package and the addition of a few functions specifically for multiple chains.
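
For reference, the effective sample size mentioned above is usually defined as follows. This is a sketch in standard notation that does not appear elsewhere on this page: n is the chain length, p the dimension of the output, λ² and Λ the variance and covariance matrix of the target distribution, and σ² and Σ the corresponding asymptotic variance and covariance matrix in the Markov chain central limit theorem.

```latex
\[
\mathrm{ESS} = n\,\frac{\lambda^{2}}{\sigma^{2}},
\qquad
\mathrm{mESS} = n\left(\frac{|\Lambda|}{|\Sigma|}\right)^{1/p}
\]
```

In practice σ² and Σ are unknown, and estimating them (for example with batch means or spectral variance estimators) is the step that has to be adapted to multiple chains.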

Most of the computationally heavy code is written in C++ using Rcpp. The released version of the package is hosted on CRAN, and the development version is on GitHub.

Related work

A few other R packages do univariate effective sample size calculations (for multiple chains), the most popular of which is coda. However, coda does not use consistent estimators of the variance, and its variance estimates are known to be liberal. In addition, we know of no other package that does multivariate effective sample size calculations.

Details of your coding project

Over the three months, I would expect the student to complete the following tasks:

  • Change the current implementation of the function multiESS to accept a list of Markov chains as input, and estimate the effective sample size using replicated variance methods, including batch means and spectral variance methods.
  • Implement all additional computationally heavy code with Rcpp and benchmark it extensively to identify the situations in which the computation becomes too burdensome.
  • Test all functions for numerical instabilities.
  • The current version of the package requires thorough user testing and code testing. This will require the addition of testthat.
  • Improve the documentation of multiESS and make the documentation uniform across the package.
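
To make the first task more concrete, here is a minimal univariate sketch of one common form of the replicated batch means idea, written in plain C++ without Rcpp. It is an illustration under stated assumptions, not the package's implementation: the names `replicated_batch_means`, `replicated_ess`, `batches_per_chain`, and `overall_mean` are hypothetical, all chains are assumed to have equal length, the batch size `b` is supplied by the caller, and any draws left over after the last full batch are ignored. The key difference from a single-chain estimator is that every batch mean is centered at the overall mean across all chains.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Chains = std::vector<std::vector<double>>;

// Hypothetical helper (not an mcmcse function): validates the input and
// returns the number of full batches of size b in each chain.
static std::size_t batches_per_chain(const Chains& chains, std::size_t b) {
    if (chains.empty() || b == 0)
        throw std::invalid_argument("need at least one chain and b > 0");
    const std::size_t n = chains.front().size();
    for (const auto& chain : chains)
        if (chain.size() != n)
            throw std::invalid_argument("chains must have equal length");
    const std::size_t a = n / b;
    if (a * chains.size() < 2)
        throw std::invalid_argument("need at least two batches in total");
    return a;
}

// Mean of the first `used` draws of every chain, pooled across chains.
static double overall_mean(const Chains& chains, std::size_t used) {
    assert(used <= chains.front().size());
    double total = 0.0;
    for (const auto& chain : chains)
        for (std::size_t i = 0; i < used; ++i) total += chain[i];
    return total / static_cast<double>(chains.size() * used);
}

// Replicated batch means estimate of the asymptotic variance:
// b / (a*m - 1) * sum over chains and batches of (batch mean - overall mean)^2.
double replicated_batch_means(const Chains& chains, std::size_t b) {
    const std::size_t a = batches_per_chain(chains, b);
    const std::size_t m = chains.size();
    const double mu = overall_mean(chains, a * b);
    double ss = 0.0;
    for (const auto& chain : chains) {
        for (std::size_t j = 0; j < a; ++j) {
            double batch_sum = 0.0;
            for (std::size_t i = j * b; i < (j + 1) * b; ++i) batch_sum += chain[i];
            const double dev = batch_sum / static_cast<double>(b) - mu;
            ss += dev * dev;
        }
    }
    return static_cast<double>(b) * ss / static_cast<double>(a * m - 1);
}

// Effective sample size: (number of draws used) * lambda^2 / sigma^2, where
// lambda^2 is the pooled sample variance around the overall mean.
double replicated_ess(const Chains& chains, std::size_t b) {
    const std::size_t used = batches_per_chain(chains, b) * b;
    const std::size_t total = chains.size() * used;
    const double mu = overall_mean(chains, used);
    double ss = 0.0;
    for (const auto& chain : chains)
        for (std::size_t i = 0; i < used; ++i) {
            const double d = chain[i] - mu;
            ss += d * d;
        }
    const double lambda2 = ss / static_cast<double>(total - 1);
    return static_cast<double>(total) * lambda2 / replicated_batch_means(chains, b);
}
```

As a small worked check, take two chains {1, 2, 3, 4} and {5, 6, 7, 8} with b = 2. The batch means are 1.5, 3.5, 5.5, and 7.5 around an overall mean of 4.5, so the variance estimate is 2 × 20 / 3 ≈ 13.33. A multivariate version would replace the squared deviations with outer products, and the choice of batch size is left open here.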

Expected impact

The package mcmcse has been downloaded over 30,000 times and has 71 citations on Google Scholar. The package has already proved useful to the general scientific community, and any improvements to it will continue to benefit this larger community.

Mentors

  • EVALUATING MENTOR: Dootika Vats is the author and maintainer of the R package mcmcse and a contributor to the R package stableGR. She was a GSoC student participant in 2015 for this same package and is an expert in MCMC output analysis.
  • James Flegal is the founding author of the package and an expert in MCMC output analysis.

Tests

Students, please do one or more of the following tests before contacting the mentors above.

  • Easy: (1) Download the mcmcse package from CRAN and use the function ess on a vector foo of length 1e4 randomly drawn from a standard normal distribution. (2) Generate a random 10 x 10 matrix and compute only its eigenvalues.
  • Medium: Write a function that simulates a Gaussian AR(1) process, and use mcmcse to estimate the effective sample size of its output.
  • Hard: Implement the replicated batch means estimator from this paper.

Solutions of tests

Students, please post a link to your test results here.
