Draft
5 changes: 4 additions & 1 deletion _quarto.yml
@@ -13,10 +13,12 @@ website:
contents:
- section: "Unbounded"
contents:
- gen_linear_regression/continuous_unbounded_overview.qmd
- gen_linear_regression/normal.qmd
- gen_linear_regression/student_t.qmd
- section: "Bounded"
contents:
- gen_linear_regression/continuous_bounded_overview.qmd
- gen_linear_regression/beta.qmd
- section: "Discrete outcome"
contents:
@@ -25,6 +27,7 @@ website:
- gen_linear_regression/bernoulli_logit.qmd
- section: "Count"
contents:
- gen_linear_regression/discrete_count_overview.qmd
- gen_linear_regression/poisson.qmd
- gen_linear_regression/negative_binomial.qmd
- gen_linear_regression/binomial.qmd
@@ -54,7 +57,7 @@ website:
href: tools.qmd
- text: "Contributors"
href: contributors.md

format:
html:
theme: cosmo
12 changes: 6 additions & 6 deletions contributors.md
@@ -5,17 +5,17 @@
<tbody>
<tr>
<td align="center">
<a href="https://github.com/aloctavodia">
<img src="https://avatars.githubusercontent.com/u/1338958?v=4" width="100;" alt="aloctavodia"/>
<a href="https://github.com/n-kall">
<img src="https://avatars.githubusercontent.com/u/33577035?v=4" width="100;" alt="n-kall"/>
<br />
<sub><b>Osvaldo A Martin</b></sub>
<sub><b>Noa Kallioinen</b></sub>
</a>
</td>
<td align="center">
<a href="https://github.com/n-kall">
<img src="https://avatars.githubusercontent.com/u/33577035?v=4" width="100;" alt="n-kall"/>
<a href="https://github.com/aloctavodia">
<img src="https://avatars.githubusercontent.com/u/1338958?v=4" width="100;" alt="aloctavodia"/>
<br />
<sub><b>Noa Kallioinen</b></sub>
<sub><b>Osvaldo A Martin</b></sub>
</a>
</td>
<td align="center">
26 changes: 17 additions & 9 deletions gen_linear_regression/beta.qmd
@@ -4,27 +4,35 @@ title: Beta model

## Description

Beta regression is used for outcomes on the [0, 1] interval. It is a
distributional regression, not a generalized linear model.
Beta regression is used for outcomes on the [0, 1] interval.


## Definition

For continuous outcome $y$ bounded [0, 1] and predictors $x$, the model is:

$$
\begin{align}
y_i \sim \text{beta}(\text{logit}^{-1}(\mu_i), \log{\phi_i}) \\
\mu_i = \alpha_{\mu} + \beta_{\mu} \cdot x \\
\phi_i = \alpha_{\phi} + \beta_{\phi} \cdot x \\
y_i \sim \text{beta}(\text{logit}^{-1}(\mu_i), \kappa) \\
\mu_i = \alpha + \beta \cdot x_i \\
\end{align}
$$

The [logit](https://en.wikipedia.org/wiki/Logit) function is the quantile function associated with the standard logistic distribution. Its inverse maps $\mu_i$ from the real line into a value within the interval (0, 1).
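
As a minimal sketch of this mapping, the Python snippet below implements the inverse logit and applies it to a linear predictor. The coefficient values and the helper names `inv_logit` and `logit` are hypothetical and chosen only for illustration.

```python
import math


def inv_logit(mu: float) -> float:
    """Map a real-valued linear predictor into the interval (0, 1)."""
    # Branch on the sign so that exp() never receives a large positive value.
    if mu >= 0:
        return 1.0 / (1.0 + math.exp(-mu))
    e = math.exp(mu)
    return e / (1.0 + e)


def logit(p: float) -> float:
    """Log-odds of p; the inverse of inv_logit on (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept and slope, not taken from any fitted model.
alpha, beta = -1.0, 0.5

for x in (-4.0, 0.0, 2.0, 4.0):
    mu = alpha + beta * x          # linear predictor on the real line
    print(x, mu, inv_logit(mu))    # third value is the implied mean of y
```

Whatever value the linear predictor takes, the implied mean stays strictly between 0 and 1 for moderate inputs, which is why the link is applied to $\mu_i$ rather than modelling the mean directly.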


## Parameters needing priors

- $\alpha_{\mu}$ (intercept for $\mu$)
- $\alpha_{\phi}$ (intercept for $\phi$)
- $\beta_{\mu}$ (predictor weights for $\mu$)
- $\beta_{\mu}$ (predictor weights for $\phi$)
- $\alpha$ (intercept)
- $\beta$ (predictor weights)
- $\kappa$ (concentration parameter)

The larger the value of $\kappa$, the more concentrated the distribution is around the mean. This parameter is also named the precision parameter or the sample size parameter. Greek letters $\nu$ and $\phi$ are also commonly used to refer to this parameter.
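
To make the role of $\kappa$ concrete, the sketch below converts a mean and a concentration into the two shape parameters of the standard beta distribution. It assumes the common mean-concentration mapping (first shape $= \text{mean} \cdot \kappa$, second shape $= (1 - \text{mean}) \cdot \kappa$); the function names are hypothetical.

```python
def beta_shapes(mean: float, kappa: float) -> tuple[float, float]:
    """Convert (mean, concentration) to the two standard beta shape parameters.

    Assumes the mean-concentration mapping: a = mean * kappa, b = (1 - mean) * kappa.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    return mean * kappa, (1.0 - mean) * kappa


def beta_variance(mean: float, kappa: float) -> float:
    """Variance of the beta distribution under the same mapping."""
    return mean * (1.0 - mean) / (1.0 + kappa)


# For a fixed mean, a larger kappa gives a smaller variance.
for kappa in (2.0, 10.0, 100.0):
    print(kappa, beta_shapes(0.25, kappa), beta_variance(0.25, kappa))
```

Under this mapping the two shapes sum to $\kappa$, which is one way to see why it is sometimes called the sample size parameter.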


## Prior for $\kappa$

For a weakly informative prior, we can use a gamma distribution that places little probability mass on very small values of $\kappa$. Such priors disfavor u-shaped beta distributions, which are generally less common in real-world data. For instance, we could choose $\text{gamma}(4, 0.1)$, as discussed by [Solomon Kurz](https://solomonkurz.netlify.app/blog/2023-06-25-causal-inference-with-beta-regression/), which places most of the prior mass in the double-digit range, with a long right tail that allows for greater concentrations if needed.
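
A quick prior simulation can help check what such a prior implies. The sketch below draws from $\text{gamma}(4, 0.1)$, reading the two arguments as shape and rate (an assumption; Python's `random.gammavariate` takes shape and scale, so the rate is inverted). The variable names are illustrative only.

```python
import random
import statistics

# gamma(4, 0.1), read here as shape = 4 and rate = 0.1 (assumption).
SHAPE, RATE = 4.0, 0.1

rng = random.Random(1)  # fixed seed so the simulation is repeatable
# random.gammavariate expects (shape, scale), and scale = 1 / rate.
draws = [rng.gammavariate(SHAPE, 1.0 / RATE) for _ in range(20_000)]

prior_mean = statistics.fmean(draws)
# Share of prior draws with kappa in the double-digit range.
share_double_digit = sum(10.0 <= k < 100.0 for k in draws) / len(draws)
# Under the mean-concentration mapping, a u-shaped density needs both
# shape parameters below 1, which is only possible when kappa < 2.
share_below_2 = sum(k < 2.0 for k in draws) / len(draws)

print(prior_mean, share_double_digit, share_below_2)
```

The share of draws below 2 gives a rough sense of how strongly the prior discourages u-shaped distributions before seeing any data.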


## See also
9 changes: 9 additions & 0 deletions gen_linear_regression/continuous_bounded_overview.qmd
@@ -0,0 +1,9 @@
---
title: Continuous bounded outcome
---

For lower-bounded observations, the [gamma model](gamma.qmd) or the
[lognormal model](lognormal.qmd) may be appropriate.

For both upper- and lower-bounded observations, the [beta model](beta.qmd) may
be appropriate, particularly if the bounds are 0 and 1 (or the data can be transformed to this scale).
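
One simple way to move data onto the 0 to 1 scale is a linear rescaling by known bounds. The sketch below assumes the lower and upper bounds are known in advance rather than estimated from the data; the function name and the example scores are hypothetical.

```python
def rescale_to_unit(y: float, lower: float, upper: float) -> float:
    """Linearly map y from [lower, upper] onto [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    if not lower <= y <= upper:
        raise ValueError("y must lie within the stated bounds")
    return (y - lower) / (upper - lower)


# Hypothetical test scores on a 10 to 20 scale.
scores = [12.0, 15.0, 19.5]
print([rescale_to_unit(s, 10.0, 20.0) for s in scores])
```

Observations that sit exactly on a bound map to exactly 0 or 1, so they may need separate handling before a beta model is fitted.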
7 changes: 7 additions & 0 deletions gen_linear_regression/continuous_unbounded_overview.qmd
@@ -0,0 +1,7 @@
---
title: Continuous unbounded outcome
---

When observations are unbounded and continuous, the most common model
is the [normal model](normal.qmd). When the observations may include
outliers, the [Student-t model](student_t.qmd) may be appropriate.
16 changes: 16 additions & 0 deletions gen_linear_regression/discrete_count_overview.qmd
@@ -0,0 +1,16 @@
---
title: Discrete count outcome
---

Counts can be considered in different ways. If the count represents
simple counting of observations, the [Poisson model](poisson.qmd) may
be an appropriate starting point. However, if there is
over-dispersion, the [negative binomial](negative_binomial.qmd) may
better capture the data.
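
A rough first check for over-dispersion is to compare the sample variance of the counts with their mean, since a Poisson distribution has equal mean and variance. The sketch below is only a descriptive check on raw counts, ignoring predictors; the function name and the data are hypothetical.

```python
import statistics


def dispersion_ratio(counts: list[int]) -> float:
    """Sample variance divided by sample mean; values well above 1 hint at over-dispersion."""
    mean = statistics.fmean(counts)
    if mean == 0:
        raise ValueError("dispersion ratio is undefined when all counts are zero")
    return statistics.variance(counts) / mean


# Hypothetical counts with a few large values and several zeroes.
counts = [0, 0, 1, 9, 0, 12, 0, 3]
print(dispersion_ratio(counts))
```

A ratio far above 1 suggests the negative binomial is worth considering, although a proper assessment would condition on the predictors in the model.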

If the counts are of the number of successes in repeated binary
trials, then the [binomial model](binomial.qmd) is appropriate.

In cases where there is an over-representation of zeroes in the
data, this can be modelled with zero-inflated versions of the
distributions.