The goal of this assignment is to get you warmed up with git, GitHub, markdown, and R. The first few seminars will cover all you need to know for completing this practice assignment.
Note that this practice assignment, just like the later Analysis
Assignment, will need to be done in Rmarkdown (.Rmd
) because it
requires R code. You can start with the .Rmd
version of this file, and
fill it in where necessary. Note that Question 1 asks you to add an
additional file.
Check out the assignment tips to make sure your report meets our standards. It’s important to write up organized reports that are clear and easy to follow, so that your collaborators and the future you can understand what you are doing. To ensure you earn full marks on this and future assignments, be sure to address the following points:
- Well-formatted markdown (use of headings, subheadings, and numbering that makes sense)
- Codes are commented when needed and output is interpreted
- Properly labeled graphs
- No broken code
- Rmarkdown is properly knitted and the
.md
is rendered on Github (including any figures/images)
Be sure to commit, push your completed assignment, and submit the repository URL on Canvas before the deadline. Anything submitted or modified after the deadline will not be marked.
Once you have accepted this assignment through the Canvas invite link
and have a repository set up (in the format of
STAT540-UBC-2023/intro-assignment-yourgithubID
), you can start
personalizing it by adding a README.md file. This would help people who
visit your repository understand who you are and what your repository is
about.
Please be sure that your markdown is properly formatted. See this markdown reference from GitHub if you need help getting started.
Add some information to this README.md file to introduce yourself to the teaching team. Some things you may wish to include:
- Your name, your program
- Links to other profiles (e.g. Mastodon, a personal website, other relevant links, etc)
- What you hope to achieve in this course
- What this repository is for (example: “This repository contains the intro assignment for STAT540”)
- Anything else that you want. You can add images or emoji, too!
When answering the following questions, please always “sandwich” your R
code with some text. Ensure fluency by avoiding inserting R code or
outputs without any explanation. For example, if the question asks you
what is x + y
given x=1
and y=2
, you can answer it like this:
Now we want to find the sum of x and y.
x <- 1 # Note how I assign them to variables instead of just priting 1+2
y <- 2
z <- x + y
z # By doing this, I can print out the output of z. Alternately, you can also do (z <- x+y).
## [1] 3
Conclusion: we found that x + y
is 3 (notice here I’m using inline R
code to print the value of z
).
If you are looking at the .Rmd
of this file (the Rmarkdown that
creates the markdown .md
file), you’ll see that I have used inline R
code in the previous line to refer to the variable z
instead of
directly typing 3
. This is because if we ever need to change the
values of x
and y,
we could possibly forget to update where we typed
in 3
as well. You can imagine that if you have to change the initial
inputs of a long statistical reports, all your numbers in the rest of
the report will change. You don’t want to have to look through every
sentence to make sure that your report is still up-to-date.
By using variables and inline R code, I can change the value of x
and/or y
, and rest assured that when I re-knit my Rmarkdown, my
conclusion statement will be updated.
There are some data
sets
in base R for you to play around with. We will explore the Titanic
dataset. You can view it by typing Titanic
. Just like with functions,
you can use the question mark ?Titanic
to see more information about
the dataset.
Answer the following questions by subsetting, printing out the result or by graphing.
- Convert the array into a data frame by using
data.frame()
function. - How many children were on Titanic?
- How many adults were on Titanic?
- Were there more female adult or male adult passengers?
Find the answers these questions by adding code to the code block below.
# YOUR CODE HERE
Replace this text with an explanation of your code output to answer the questions above.
Using the same data frame, examine the survival rates.
- Did the children had better survival rate than the adults?
- Order the classes of passengers (Crew, 1st, 2nd, and 3rd) by survival rate worst to best.
Find the answers these questions by adding code to the code block below.
# YOUR CODE HERE
Replace this text with an explanation of your code output to answer the questions.
Here you’ll practice reading data from a text file and do some graphing. The file “PaintedTurtles.txt” contains data on several variables for a sample of Painted turtles.
- Create a figure to visualize at least 3 variables from this dataset (choose whatever graph you’d like!)
- Explain how your graph is informative: what does it tell you about the relationship between the variables? Also explain why you choose to present the data in this way.
Add code to the code block below to complete these tasks.
# YOUR CODE HERE
Replace this text with an explanation of your code output to answer the question.