Commit 163c6a1 (1 parent: d5259f4). Showing 57 changed files with 4,746 additions and 57 deletions.
@@ -0,0 +1,55 @@

---
title: "Formulas for Bivariate Analyses"
subtitle: "Formulas and Tables"
author: StatsResource
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

### Introduction

* This sheet deals specifically with formulas for linear models and related bivariate analyses.

* Material related to categorical data will be published elsewhere.

### Bivariate Summations

$$\begin{eqnarray}
S_{XY} &=& \sum x_iy_i - \frac{\sum x_i\sum y_i}{n}\\
S_{XX} &=& \sum x_i^2 - \frac{(\sum x_i)^2}{n}\\
S_{YY} &=& \sum y_i^2 - \frac{(\sum y_i)^2}{n}
\end{eqnarray}$$

### Correlation

**Pearson's correlation coefficient**

$$r = \frac{S_{XY}}{\sqrt{S_{XX} \times S_{YY}}}$$

### Linear Regression Estimates

**Slope Estimate**
$$b_1 = \frac{S_{XY}}{S_{XX}}$$

**Intercept Estimate**
$$b_0 = \bar{y} - b_1\bar{x}$$

**Standard Error of the Slope**
$$\text{S.E.}(b_1) = \sqrt{\frac{s^2}{S_{XX}}}$$

where $s^2 = \frac{SSE}{n-2}$

and $SSE = S_{YY} - b_1 S_{XY}$
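
The following R chunk is a minimal sketch (with made-up illustrative data) showing how these summations reproduce the correlation, slope, and intercept returned by R's built-in `cor()` and `lm()`.

```r
# Illustrative data only -- not from the source document
x <- c(1.2, 2.4, 3.1, 4.8, 5.0, 6.3)
y <- c(2.1, 3.9, 5.2, 8.1, 8.4, 10.9)
n <- length(x)

# Bivariate summations
Sxy <- sum(x * y) - sum(x) * sum(y) / n
Sxx <- sum(x^2) - sum(x)^2 / n
Syy <- sum(y^2) - sum(y)^2 / n

r  <- Sxy / sqrt(Sxx * Syy)    # Pearson's correlation coefficient
b1 <- Sxy / Sxx                # slope estimate
b0 <- mean(y) - b1 * mean(x)   # intercept estimate

SSE   <- Syy - b1 * Sxy
s2    <- SSE / (n - 2)
se_b1 <- sqrt(s2 / Sxx)        # standard error of the slope

# Cross-check against base R
all.equal(r, cor(x, y))
coef(lm(y ~ x))                # should match c(b0, b1)
```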
@@ -0,0 +1,27 @@

---
title: "Chi Square Test"
subtitle: "Formulas and Tables"
author: StatsResource
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

### Critical Values for Chi Square Test

$$\begin{array}{|c|c|c|c|c|}
\hline
df & \alpha=0.10 & \alpha=0.05 & \alpha=0.025 & \alpha=0.01 \\ \hline
1 & 2.706 & 3.841 & 5.024 & 6.635 \\ \hline
2 & 4.605 & 5.991 & 7.378 & 9.210 \\ \hline
3 & 6.251 & 7.815 & 9.348 & 11.345 \\ \hline
4 & 7.779 & 9.488 & 11.143 & 13.277 \\ \hline
5 & 9.236 & 11.070 & 12.833 & 15.086 \\ \hline
6 & 10.645 & 12.592 & 14.449 & 16.812 \\ \hline
7 & 12.017 & 14.067 & 16.013 & 18.475 \\ \hline
8 & 13.362 & 15.507 & 17.535 & 20.090 \\ \hline
9 & 14.684 & 16.919 & 19.023 & 21.666 \\ \hline
10 & 15.987 & 18.307 & 20.483 & 23.209 \\ \hline
\end{array}$$
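
As a quick check, these upper-tail critical values can be regenerated in R with `qchisq()`:

```r
# Regenerate the table above: upper-tail chi-square quantiles
df    <- 1:10
alpha <- c(0.10, 0.05, 0.025, 0.01)
crit  <- sapply(alpha, function(a) qchisq(1 - a, df = df))
dimnames(crit) <- list(df = df, alpha = alpha)
round(crit, 3)
```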
@@ -0,0 +1,50 @@

---
title: "Factors for Control Charts"
subtitle: "Formulas and Tables"
author: StatsResource
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

## Factors for Control Charts

Control chart factors, also known as control chart constants, are used to construct and interpret control charts in statistical process control (SPC). They are used to calculate control limits and other chart parameters so that process variability is monitored accurately. Commonly used factors include the A2, D3, D4, B3, and B4 constants, which set the upper and lower control limits for charts such as the X-bar chart, R-chart, and S-chart. Each factor depends on the subgroup sample size and is derived from the sampling distribution of the relevant statistic. The table below lists the unbiasing constants $c_4$ and $c_5$, the relative-range constants $d_2$ and $d_3$, and the R-chart limit factors $D_3$ and $D_4$ for sample sizes 2 to 25.

$$\begin{array}{|c|c|c|c|c|c|c|}
\hline
\text{Sample Size (n)} & c_4 & c_5 & d_2 & d_3 & D_3 & D_4 \\ \hline
2 & 0.7979 & 0.6028 & 1.128 & 0.853 & 0 & 3.267 \\
3 & 0.8862 & 0.4633 & 1.693 & 0.888 & 0 & 2.574 \\
4 & 0.9213 & 0.3889 & 2.059 & 0.880 & 0 & 2.282 \\
5 & 0.9400 & 0.3412 & 2.326 & 0.864 & 0 & 2.114 \\
6 & 0.9515 & 0.3076 & 2.534 & 0.848 & 0 & 2.004 \\
7 & 0.9594 & 0.2820 & 2.704 & 0.833 & 0.076 & 1.924 \\
8 & 0.9650 & 0.2622 & 2.847 & 0.820 & 0.136 & 1.864 \\
9 & 0.9693 & 0.2459 & 2.970 & 0.808 & 0.184 & 1.816 \\
10 & 0.9727 & 0.2321 & 3.078 & 0.797 & 0.223 & 1.777 \\
11 & 0.9754 & 0.2204 & 3.173 & 0.787 & 0.256 & 1.744 \\
12 & 0.9776 & 0.2105 & 3.258 & 0.778 & 0.283 & 1.717 \\
13 & 0.9794 & 0.2019 & 3.336 & 0.770 & 0.307 & 1.693 \\
14 & 0.9810 & 0.1940 & 3.407 & 0.763 & 0.328 & 1.672 \\
15 & 0.9823 & 0.1873 & 3.472 & 0.756 & 0.347 & 1.653 \\
16 & 0.9835 & 0.1809 & 3.532 & 0.750 & 0.363 & 1.637 \\
17 & 0.9845 & 0.1754 & 3.588 & 0.744 & 0.378 & 1.622 \\
18 & 0.9854 & 0.1703 & 3.640 & 0.739 & 0.391 & 1.608 \\
19 & 0.9862 & 0.1656 & 3.689 & 0.734 & 0.403 & 1.597 \\
20 & 0.9869 & 0.1613 & 3.735 & 0.729 & 0.415 & 1.585 \\
21 & 0.9876 & 0.1570 & 3.778 & 0.724 & 0.425 & 1.575 \\
22 & 0.9882 & 0.1532 & 3.819 & 0.720 & 0.434 & 1.566 \\
23 & 0.9887 & 0.1499 & 3.858 & 0.716 & 0.443 & 1.557 \\
24 & 0.9892 & 0.1466 & 3.895 & 0.712 & 0.451 & 1.548 \\
25 & 0.9896 & 0.1438 & 3.931 & 0.708 & 0.459 & 1.541 \\
\hline
\end{array}$$

### Usage

By using these constants, practitioners can effectively detect process variations and maintain quality control in manufacturing and other operational processes.
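
As a rough illustration, the sketch below uses simulated (purely hypothetical) subgroup data and the $d_2$, $D_3$, and $D_4$ factors for $n = 5$ from the table above to compute X-bar and R chart limits; the $A_2$ factor is derived here from the standard relationship $A_2 = 3/(d_2\sqrt{n})$.

```r
# Hypothetical subgroup data: 20 subgroups of size n = 5
set.seed(1)
samples <- matrix(rnorm(20 * 5, mean = 10, sd = 1), nrow = 20)

xbar    <- rowMeans(samples)                              # subgroup means
R       <- apply(samples, 1, function(s) diff(range(s)))  # subgroup ranges
xbarbar <- mean(xbar)
Rbar    <- mean(R)

n  <- 5
d2 <- 2.326; D3 <- 0; D4 <- 2.114   # factors for n = 5 (table above)
A2 <- 3 / (d2 * sqrt(n))            # A2 derived from d2 and n

# X-bar chart limits
c(LCL = xbarbar - A2 * Rbar, CL = xbarbar, UCL = xbarbar + A2 * Rbar)
# R chart limits
c(LCL = D3 * Rbar, CL = Rbar, UCL = D4 * Rbar)
```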
@@ -0,0 +1,50 @@

---
title: "Dixon Q Test for Outliers"
subtitle: "Formulas and Tables"
author: StatsResource
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

## Dixon Q Test for Outliers

The **Dixon Q Test**, also known simply as the **Q Test**, is a statistical test used to identify and reject outliers in a small data set. It's particularly useful for normally distributed data sets with fewer than 30 observations.

### Test Statistic

The test calculates a Q statistic, which is the ratio of the gap between the suspected outlier and the closest data point to the range of the data set.

The test statistic for this procedure is as follows:

$$Q_{TS} = \frac{\text{Gap}}{\text{Range}}$$

### Critical Values

If the Q statistic exceeds a critical value from a Q table (which varies with sample size and confidence level), the suspected data point is considered an outlier and can be rejected.

$$\begin{array}{|c|c|c|c|}
\hline
n & \alpha=0.10 & \alpha=0.05 & \alpha=0.01 \\ \hline
3 & 0.941 & 0.970 & 0.994 \\ \hline
4 & 0.765 & 0.829 & 0.926 \\ \hline
5 & 0.642 & 0.710 & 0.821 \\ \hline
6 & 0.560 & 0.625 & 0.740 \\ \hline
7 & 0.507 & 0.568 & 0.680 \\ \hline
8 & 0.468 & 0.526 & 0.634 \\ \hline
9 & 0.437 & 0.493 & 0.598 \\ \hline
10 & 0.412 & 0.466 & 0.568 \\ \hline
11 & 0.392 & 0.444 & 0.542 \\ \hline
12 & 0.376 & 0.426 & 0.522 \\ \hline
13 & 0.361 & 0.410 & 0.503 \\ \hline
14 & 0.349 & 0.396 & 0.488 \\ \hline
15 & 0.338 & 0.384 & 0.475 \\ \hline
\end{array}$$

If the test statistic is greater than the critical value, reject the null hypothesis:
$$Q_{TS} > Q_{CV}$$
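
A minimal sketch in R, using a hypothetical data vector, of applying the test to the smallest and largest observations with the $\alpha = 0.05$ critical value for $n = 10$ from the table above:

```r
# Hypothetical measurements
x <- c(0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177)

x_sorted <- sort(x)
n <- length(x_sorted)

# Gaps for the suspected low and high outliers, and the overall range
gap_low  <- x_sorted[2] - x_sorted[1]
gap_high <- x_sorted[n] - x_sorted[n - 1]
range_x  <- x_sorted[n] - x_sorted[1]

Q_low  <- gap_low  / range_x
Q_high <- gap_high / range_x

Q_crit <- 0.466   # critical value for n = 10, alpha = 0.05 (table above)
c(Q_low = Q_low, Q_high = Q_high,
  reject_low = Q_low > Q_crit, reject_high = Q_high > Q_crit)
```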
@@ -0,0 +1,42 @@

---
title: "Kolmogorov-Smirnov Test"
subtitle: "Inference Procedures with R"
author: StatsResource
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

## Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test is defined by:

* H$_0$: The data follow a specified distribution
* H$_1$: The data do not follow the specified distribution

**Test Statistic:** The Kolmogorov-Smirnov test statistic is defined as

$$D = \sup_x \left| F_n(x) - F(x) \right|$$

where $F$ is the theoretical cumulative distribution of the distribution being tested, which must be a continuous distribution (i.e., no discrete distributions such as the binomial or Poisson) and must be fully specified, and $F_n$ is the empirical cumulative distribution function of the sample.

### Characteristics and Limitations of the K-S Test

#### Advantages

An attractive feature of this test is that the distribution of the K-S test statistic itself does not depend on the underlying cumulative distribution function being tested. Another advantage is that it is an exact test (the chi-square goodness-of-fit test depends on an adequate sample size for the approximations to be valid).

#### Limitations

Despite these advantages, the K-S test has several important limitations:

1. It only applies to continuous distributions.

2. It tends to be more sensitive near the center of the distribution than at the tails.

3. Perhaps the most serious limitation is that the distribution must be fully specified. That is, if location, scale, and shape parameters are estimated from the data, the critical region of the K-S test is no longer valid. It typically must be determined by simulation.

Due to limitations 2 and 3 above, many analysts prefer to use the Anderson-Darling goodness-of-fit test.

However, the Anderson-Darling test is only available for a few specific distributions.
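
A minimal sketch using base R's `ks.test()` on simulated data, illustrating both the fully specified case and the estimated-parameter case flagged in limitation 3:

```r
# Simulated (hypothetical) sample
set.seed(42)
x <- rnorm(50)

# H0: the data follow N(0, 1); the distribution is fully specified,
# so the standard K-S critical region is valid here.
ks.test(x, "pnorm", mean = 0, sd = 1)

# Estimating the parameters from the data, as below, violates
# limitation 3: the reported p-value is then no longer exact.
ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
```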