Version numbers follow Semantic Versioning.
2020-03-18
- Dependencies are now managed differently
- For the paper comparison vignette, all of the code is pre-run and saved in the LDATS-replications repository
- Allows removal of otherwise unused packages from this package's dependency list
2020-03-02
- `stringsAsFactors` update: only involved patching one test
2019-12-22
- Don't need to make a sparse matrix to pass in now
- Tweaking the simulation functions to simplify X
- Using only as.matrix() fails if there is only 1 year in a segment and there are multiple covariates. In that case, as.matrix(X[in1, ]) returns a matrix of n_covariates rows x 1 column, instead of a matrix of 1 row and n_covariates columns. This edit should fix that by forcing it into a matrix of the correct number of rows.
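
The dimension problem described in the last item can be reproduced with a small base R sketch. The matrix `X`, the index `in1`, and the covariate names are made-up values for illustration, and the two fixes shown are common base R idioms, not necessarily the exact code used in the package.

```r
# Hypothetical covariate matrix: 3 years (rows) x 2 covariates (columns).
X <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3,
            dimnames = list(NULL, c("cov1", "cov2")))

# A segment that contains only the first year.
in1 <- c(TRUE, FALSE, FALSE)

# Single-row subsetting drops to a vector, so as.matrix() builds a
# column matrix: n_covariates rows x 1 column.
collapsed <- as.matrix(X[in1, ])
dim(collapsed)  # 2 1

# Forcing the number of rows restores 1 row x n_covariates columns.
forced <- matrix(X[in1, ], nrow = sum(in1), ncol = ncol(X))
dim(forced)     # 1 2

# drop = FALSE is an alternative that also keeps the column names.
kept <- X[in1, , drop = FALSE]
dim(kept)       # 1 2
```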
2019-07-28
- Added space between braces for auto-linking in the description text.
2019-07-24
- Output given by `print`/`cat` has been replaced with `message` messages.
- Added examples in documentation (and replacement of `\dontrun{}` with `\donttest{}`)
- Editing of description file for specs
- Reduction of test runtimes
- `messageq` replaces `qprint`
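
One practical benefit of `message` over `print`/`cat` is that callers can silence or intercept the output. The sketch below uses a hypothetical helper named `say()` with an assumed `quiet` argument; it is not the package's `messageq` and makes no claim about that function's signature.

```r
# Hypothetical helper: emit a message unless quiet is TRUE.
say <- function(msg, quiet = FALSE) {
  if (!quiet) message(msg)
  invisible(NULL)
}

# Messages are conditions, so they can be captured and muffled.
captured <- character(0)
withCallingHandlers(
  say("fitting model"),
  message = function(m) {
    captured <<- c(captured, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)

# With quiet = TRUE nothing is emitted at all.
captured_quiet <- character(0)
withCallingHandlers(
  say("fitting model", quiet = TRUE),
  message = function(m) {
    captured_quiet <<- c(captured_quiet, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)
```

Output written with `cat` would bypass both handlers, which is why it is harder to suppress in tests and vignettes.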
2019-07-10
- Including appropriate files in the Rbuildignore
- Handling the downloads so they work robustly locally
2019-07-09
- Incorporates Hao's feedback and edits on the paper comparison vignette
- Updates the vignette to work with the contemporary version of the package
- Allowed removal of the large model cache files
- Inclusion of the json file for the Zenodo page
- The .pdf describing the model (the manuscript work in progress) is now at the top level and named "LDATS_model.pdf", to allow the full model description to remain stable while the ms development happens elsewhere.
2019-07-09
- At the `LDA_TS` function level, the separate inputs for data tables (`document_term_table` and `document_covariate_table`) have been merged into a single input `data`, which can be just the `document_term_table` or a list including the `document_term_table` and optionally also a `document_covariate_table`. If covariates aren't provided, the function now constructs a covariate table assuming equi-spaced observations. If using a list, the function assumes that one and only one element of the list will have a name containing the letters "term", and at most one element containing the letters "covariate" (regular expressions are used for matching). (addresses issue 119)
- `timename` has been moved from within the `TS_controls_list` to a main argument in all associated functions.
- The control lists have been made easier to interact with. Primarily, the arguments that previously required `LDA_controls_list`, `TS_controls_list`, or `LDA_TS_controls_list` inputs now take general `list` inputs (so `LDA_TS` does not need to have a nested set of control functions). Each control list is passed through a function (`LDA_set_control`, `TS_control`, or `LDA_TS_control`) to set any non-input values to their defaults. This also allows the removal of those controls list class definitions. (addresses issue 130)
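
The "plain list in, defaults filled" pattern for control lists can be sketched in a few lines of base R. The function name `set_defaults()` and the option names `n_iter`, `thin`, and `quiet` are hypothetical and chosen only for illustration; they are not the arguments of `LDA_set_control`, `TS_control`, or `LDA_TS_control`.

```r
# Hypothetical control filler: user-supplied entries override defaults,
# anything not supplied keeps its default value.
set_defaults <- function(control = list(),
                         defaults = list(n_iter = 1000, thin = 1,
                                         quiet = FALSE)) {
  stopifnot(is.list(control))
  unknown <- setdiff(names(control), names(defaults))
  if (length(unknown) > 0) {
    stop("unknown control option(s): ", paste(unknown, collapse = ", "))
  }
  utils::modifyList(defaults, control)
}

# The user passes an ordinary list with only the values they care about.
ctrl <- set_defaults(list(quiet = TRUE))
str(ctrl)
```

Because the input is an ordinary `list`, no dedicated class is needed for it, which matches the removal of the controls list class definitions noted above.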
- Reduced the complexity of the example in the README (addresses issue 115)
- Added `control` input in the `plot` call in the example in the README (addresses issue 116)
- Reduced the number of seeds in the rodent vignette example (addresses issue 117)
- The number of observations for a VEM-fit LDA is now calculated as the number of entries in the document-term matrix (following Hoffman et al. and Buntine; see `?logLik.LDA_VEM` for references).
- Relatedly, we now include an AICc function that is general and works in this specific case as defined (addresses issue 129)
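
For reference, the standard small-sample correction is AICc = AIC + 2k(k + 1) / (n - k - 1), where k is the number of parameters and n the number of observations. The sketch below is a generic implementation of that formula with a hypothetical function name and made-up inputs; it is not the package's AICc function.

```r
# Generic AICc from a log-likelihood, parameter count k, and
# number of observations n (hypothetical helper for illustration).
aicc_sketch <- function(log_lik, k, n) {
  # The correction term is undefined unless n exceeds k + 1.
  stopifnot(n - k - 1 > 0)
  aic <- -2 * log_lik + 2 * k
  aic + (2 * k * (k + 1)) / (n - k - 1)
}

# Made-up values: log-likelihood of -100, 3 parameters, 50 observations.
example_aicc <- aicc_sketch(log_lik = -100, k = 3, n = 50)
example_aicc
```

How n is counted matters here: under the change above, n for a VEM-fit LDA is the number of entries in the document-term matrix, which determines the size of the correction term.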
- A few plotting functions use `devAskNewPage` to help flip through multiple outputs, but were only resetting it with `devAskNewPage(FALSE)` at the end of a clean execution. The code has been updated with `on.exit(devAskNewPage(FALSE))`, which accounts for failed executions. (addresses issue 118)
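
The effect of `on.exit()` can be shown without opening a graphics device. The sketch below uses a global option as a stand-in for the `devAskNewPage` setting; the function name `with_flag()` and the option name `ldats_demo_flag` are hypothetical.

```r
# Hypothetical function that changes a global setting while it runs.
with_flag <- function(fail = FALSE) {
  old <- options(ldats_demo_flag = TRUE)
  # Registered immediately, so it runs on normal exit AND on error.
  on.exit(options(old), add = TRUE)
  if (fail) stop("simulated failure")
  invisible(getOption("ldats_demo_flag"))
}

# Even when the function errors, the setting is restored afterwards.
result <- try(with_flag(fail = TRUE), silent = TRUE)
flag_after_failure <- getOption("ldats_demo_flag")
is.null(flag_after_failure)
```

A reset placed only on the last line of the function body would have been skipped by the `stop()` call, which is the situation the changelog item describes.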
- `summarize_TS` has been renamed `package_TS` to align with the other `package_` functions that package model output.
- Basic simulation functionality has been added for help with generating data sets to analyze. (addresses issue 114)
- `sim_LDA_data` simulates an LDA model's document-term matrix
- `sim_TS_data` simulates a TS model's document-topic distribution matrix
- `sim_LDA_TS_data` simulates an LDA_TS model's document-term matrix
- `softmax` and `logsumexp` are added as utility functions
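
For readers unfamiliar with the two utilities, the sketch below shows the usual numerically stable definitions. The `_sketch` names are hypothetical, and the package's own `softmax` and `logsumexp` may differ in signature and in how they handle matrices or edge cases.

```r
# log(sum(exp(x))) computed stably by factoring out the maximum.
logsumexp_sketch <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Softmax expressed through logsumexp, so large inputs do not overflow.
softmax_sketch <- function(x) {
  exp(x - logsumexp_sketch(x))
}

# A naive exp(1000) overflows to Inf; the stable versions stay finite.
big <- c(1000, 1001, 1002)
lse_big <- logsumexp_sketch(big)
probs_big <- softmax_sketch(big)
sum(probs_big)
```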
- Function organization (addresses issue 122) and navbar formatting.
- Due to a misread of earlier code, the AIC value in the output from `TS` was named "deviance". The output has been updated to return the AIC.
2019-02-11
- Creation of a standard API and code pipeline for all components of the LDATS analysis.
- Substantial refactor of the underlying code from hardcoded to generalized functions.
- Development of checking functions used to run the basic structural checks on the function inputs.
- Inclusion of control options lists for the LDA stage, TS stage, and overall to reduce the length of input lists.
- All functions used in the code base are now exported, documented, and tested.
- `AIC.LDA_VEM()` now uses the number of parameters as reported from `logLik` to calculate AIC.
- Previous by-hand calculations of AIC included variational parameters that are integrated out of the model in the total parameter count.
- Time series models allow for a flexible covariate set for regression via formula inputs to the top-level functions.
- The time series model code now also includes estimation of the parameters defining the between-change point regressions (i.e., the regressors).
- Regressor estimates come as marginal posterior distributions, and are calculated by unconditioning the estimates generated under known change points.
- The `document_weights()` function is provided to allow for appropriate weighting of documents by their sizes (number of words) so that an average-length document has a weight of 1.
- Document weighting is done automatically by default, which is easily undone by using `weights = NULL`.
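
The weighting rule described above (document size divided by the mean document size) can be sketched as follows. The function name `doc_weights_sketch()` and the toy table are hypothetical, and this is an illustration of the stated rule, not the source of `document_weights()`.

```r
# Weight each document by its size relative to the average size.
doc_weights_sketch <- function(document_term_table) {
  sizes <- rowSums(document_term_table)  # number of words per document
  sizes / mean(sizes)
}

# Toy document-term table: 3 documents (rows) x 2 terms (columns).
dtt <- matrix(c(5, 5,
                10, 10,
                15, 15), nrow = 3, byrow = TRUE)

w <- doc_weights_sketch(dtt)
w  # 0.5 1.0 1.5
```

The middle document has exactly the average size, so its weight is 1, and the weights average to 1 across documents.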
- The ptMCMC code has been refactored into functions, many of which are generalized to use in other contexts.
- The temperature schema is fully controllable via arguments to the TS controls list
- Burn-in removal and thinning of final chains is controllable via the TS controls list
- Memoisation of `multinom_TS()` and `multinom_TS_chunk()` is now optional via `memoise_fun()` and is controlled through the TS controls list.
- `LDA_set()`, `LDA_TS()`, and `TS()` now all have default plotting options on their outputs.
- `plot.TS()` provides MCMC diagnostic plots and summary plots.
- `plot.LDA_TS()` produces the combination of plots.
- Portal rodent data from Christensen et al. (2018) are now provided in a pre-formatted and ready-to-roll data object.
- Access the data using `data(rodents)`.
- Note, however, that the data in Christensen et al. 2018 are scaled according to trapping effort. The data included in LDATS are not, to allow for appropriate weighting. See comparison vignette for further details.
Comparison with Christensen et al. (2018)
- The comparison vignette provides a step-by-step comparison of the LDATS pipeline to the analysis in Christensen et al. 2018.
- The key differences are as follows:
  * The `document_term_table` in Christensen *et al.* 2018 was adjusted to account for variable trapping effort. The data included in LDATS are not adjusted, so that sampling periods can be weighted appropriately.
  * The LDA model selection criterion has changed (see LDA model AIC calculation, above), so that LDATS now identifies 6 topics compared to the 4 topics found in the paper.
  * LDATS will by default weight sampling periods according to the number of terms (see Document weighting, above).
  * Despite these changes, the updated LDATS pipeline gives qualitatively similar results to the analysis in Christensen *et al.* 2018.
2017-11-16
- Beginning initial development of package from original code used in Christensen et al. (2018).