CBM_vol2biomass.Rmd

---
title: "CBM_vol2biomass"
author:
  - Celine Boisvenue
  - Alex Chubaty
date: "14 May 2020"
output: pdf_document
editor_options: 
  chunk_output_type: console
  markdown: 
    wrap: sentence
---

# CBM_vol2biomass

## Overview

## List of input objects

| Name          | Class      | Description | Source                                                                                                                                    |
|------------|------------|-----------------------|--------------------------|
| userGCM3      | Data table |             | <https://docs.google.com/spreadsheets/d/1u7o2BzPZ2Bo7hNcC8nEctNpDmp7ce84m/edit?usp=sharing&ouid=108246386320559871010&rtpof=true&sd=true> |
| curveID       | Character  |             |                                                                                                                                           |
| gcMeta        | Data table |             | <https://docs.google.com/spreadsheets/d/1LYnShgd0Q7idNNKX9hHYju4kMDwMSkW5/>                                                               |
| spatialID     | Data table |             |                                                                                                                                           |
| canfi_species | Data table |             | <https://docs.google.com/spreadsheets/d/1YpJ9MyETyt1LBFO81xTrIdbhjO7GoK3K/>                                                               |
| ecozones      | Numeric    |             |                                                                                                                                           |
| table 3       | Data table |             | <https://nfi.nfis.org/resources/biomass_models/appendix2_table3.csv>                                                                      |
| table 4       | Data table |             | <https://nfi.nfis.org/resources/biomass_models/appendix2_table4.csv>                                                                      |
| table 5       | Data table |             | <https://nfi.nfis.org/resources/biomass_models/appendix2_table5.csv>                                                                      |
| table 6       | Data table |             | <https://nfi.nfis.org/resources/biomass_models/appendix2_table6.csv>                                                                      |
| table 7       | Data table |             | <https://nfi.nfis.org/resources/biomass_models/appendix2_table7.csv>                                                                      |
| cbmAdmin      | Data table |             | <https://drive.google.com/file/d/1xdQt9JB5KRIw72uaN5m3iOk8e34t9dyz>                                                                       |

## List of output objects

| Name              | Class      | Description |
|-------------------|------------|-------------|
| cumPoolsClean     | Data table |             |
| growth_increments | Data table |             |
| gcMetaAllCols     | Data table |             |
| volCurves         | gg (list?) |             |

## Module flow (rename this section later)
1. 
2.
3.
4.


## OLD VERSION

```{=html}
<!-- TODO: cleanup and format this file
     * one sentence per line
     * use backticks for code formatting
     * remane modules!
-->
```
# CBM_vol2biomass

## Overview

This module translates the m3/ha values into the biomass/ha increments that spadesCBMcore needs to simulate annual carbon fluxes and estimate stocks in the spadesCBMcore.R module of the spadesCBM.
This is an implementation of the Boudewyn et al 2007 stand-level translation.
Like many statistical models, this translation is not always successful.
This scripts is in line with the translation procedure in CBM-CFS3 with the addition of smoothing methods using Generalized Additive Models (GAMs).
This module provides two examples of smoothing methods: one applied to the cumulative carbon/ha curves (merch, foliage, and other for each growth curves provided), and one applied to the increment curves (differences betwee years of the cumulative curves).
**It is the responsibility of the user to decide if these curves and smoothing methods are appropriate.** This module can be run independently of the spadesCBM deck of SpaDES modules.

### User input

The user must provide two files for the growth information and one location file.
Note that these can be provided in the spadesCBMinputs module if the spadesCBM deck is used: \* one metadata file (userGcMeta) that has at a minimum an identifier that links each specific growth curve to a pixel on the study area raster (gcId - growth curve identification number) and the leading species for that curve.
\* one file that gives the m3 per ha by age for each of the identified growth curves.
\* a location information file.
This information is preferably passed along as a raster from the spadesCBMinputs module, but that can be by-passed by providing a vector of the ecozones and spatial units the growth curves apply to.
The information is used to identify the jurisdiction (province or territory) and the ecozone each gcId is in, which in turn are used to determine the appropriate parameters for the stand-level conversion from m3/ha to biomass per ha in the Boudewyn et al. approach.

### Default files provided

Two types of information are built in to this module: Boudewyn et al. parameters tables, and data frames to help link CBM-specific information and parameter categories.
\* URLs to the Boudewyn et al. five parameter tables are built into this module (see .inputObjects) and will be loaded from the NFIS site directly.
\* the cbmAdmin data frame links CBM-specific spatial units to administrative boundaries in Canada and ecozones.
\* the canfi_species data frame provides the canfi_species numbers and genus abbreviations used to identify the correct parameters in Boudewyn et al tables.
This also specifies a name for each tree species and the forest_type_id, which is an identifier used in CBM.

### Pseudocode: the general steps

This module reads in the user userGcMeta, then reads in the userGcM3.
It provides a plot of all the curves in userGcM3 for visual inspection (`sim$volCurves`).
It then matches the jurisdiction with each gcId (growth curve identification), and matches the leading species attaching a canfi_species (number) and a genus to each leading species.
canfi_species and genus are key to associating the correct parameters for conversion (Boudewyn et al. 2007).
Once all the species matches are complete, a series of functions (see the package `CBMutils`) are used to go from the provided cumulative m3/ha for each curve into cumulative tonnes of carbon/ha for each above ground live biomass pool (merch, foliage, other).
A visual check is provided via `sim$plotsRawCumulativeBiomass`.
Since most of the time these models need smoothing, two examples of smoothing, applied to the Saskatchewan default example, are provided by fitting GAMs to 1) each of the cumulative curve (per gcId) for each three pool, and 2) fitting GAMs to the increments between years for each of the three pools.
Currently, example 1) is commented out, and example 2) is used to show that sometimes, even perfect-looking curves need hard fixes.
In these examplee, the default knots for the GAMs are set at 20 (k=20) and extra weight is given to the 0 intercept and the maximum value of each curve by setting weights (wts in the script).
See <http://environmentalcomputing.net/intro-to-gams/> for an guide to GAMs.
Again, **it is the user's responsibility to decided if smoothing parameters and/or methods are appropriate**.
In the default example 1) (commented out), fitted values of the GAMs to the cumulative curves of carbon/ha are used to calculate annual increment, which are then dived by two for use in the spadesCBMcore.R module annual processing (half the growth is processes in two instances).
Halved increments can be visually assessed using `sim$checkInc` and the halved increments themselves are save in the simList (`sim$growth_increments`).
In example 2), currently executed for the default Saskatchewan example, the increments are calculated prior to fitting the GAMs.
In a final step (both examples), the halved growth increment table if hashed for processing speed (`sim$gcHash`).
This module uses an example from a region in Saskatchewan as a default simulation and can be modified to run independently (i.e., just for translation of m3 to above ground carbon in the three pools).

### Units

The user provides growth curves of cumulative m3/ha over time for one leading species (following CBM-CFS3).
Those curves are fed into the Boudewyn algorithms (CBM_vol2biomass module) with its results multiplied by 0.5 to give carbon/ha.
This deck of modules simulates on an annual basis and it is preferable that the cumulative curves of m3/ha be provided over individual years.
If these are not per individual years, the GAM fitted values should be modified to provide yearly fitted.values.
The object cumPoolsRaw (line 411) in CBM_vol2biomass.R, is the cumulative values for each of the three above-ground live pools in tonnes of carbon/ha.
All following values are in tonnes of c/ha.

### Output

This module provides the `sim$growth_increments` and its hashed version, `sim$gcHash` available for the spadesCBMcore module.
All other output created by this module is for visual checks of the growth curve themselves and their processing.

### SpaDES

There is only one event in this module (init), and this module is only scheduled once.
This module is designed to be part of SpaDES-deck spadesCBM, SpaDES modules representing CBM-CFS3, but transparent and spatialized.
There are four modules in this family: spadesCBMdefaults, spadesSBMinputs, CBM_vol2biomass, and spadesCBMcore.

### list of potential improvements

-   add a check to see if the growth curves provided are on an annual basis and if not, modify the GAM outputs to provided and annual fitted value.
-   make k (knots in the GAMs) a user defined parameter
-   provide an option to fit a GAM prior to the maximum value of each curve (from 0 to max) with its own number of knots (k) and one for after the maximum value with possibly few knots.
-   same as previous but for the weights going into the GAMs.

## Usage

```{r module_vol2biomass_usage, eval=FALSE}
library(SpaDES)
library(magrittr) # this is needed to use "%>%" below
moduleDir <- "C:/Celine/github/spadesCBM"
inputDir <- file.path(moduleDir, "inputs") %>% reproducible::checkPath(create = TRUE)
outputDir <- file.path(moduleDir, "outputs")
cacheDir <- file.path(outputDir, "cache")
times <- list(start = 0, end = 10)

parameters <- list(
  CBM_vol2biomass = list(.useCache = ".inputObjects")
  #.progress = list(type = "text", interval = 1), # for a progress bar
  ## If there are further modules, each can have its own set of parameters:
  #module1 = list(param1 = value1, param2 = value2),
  #module2 = list(param1 = value1, param2 = value2)
)

modules <- list("CBM_vol2biomass")
objects <- list(
  #userGcMetafileName <- c("/RIA2019/gcMetaRuns.csv"),
  #userGcM3 <- c("/RIA2019/gcRIAm3.csv")

)
paths <- list(
  cachePath = cacheDir,
  modulePath = moduleDir,
  inputPath = inputDir,
  outputPath = outputDir
)
options(
    rasterTmpDir = inputDir,
    reproducible.cachePath = cacheDir,
    spades.inputPath = inputDir,
    spades.outputPath = outputDir,
    spades.modulePath = moduleDir
  )

myBiomass <- simInit(times = times, params = parameters, modules = modules,
                 objects = objects)

myBiomassOut <- spades(myBiomass)
```