Skip to content

Latest commit

 

History

History
161 lines (122 loc) · 5.97 KB

intro_assignment.md

File metadata and controls

161 lines (122 loc) · 5.97 KB

Intro Assignment

The goal of this assignment is to get you warmed up with git, GitHub, markdown, and R. The first few seminars will cover all you need to know for completing this practice assignment.

Note that this practice assignment, just like the later Analysis Assignment, will need to be done in Rmarkdown (.Rmd) because it requires R code. You can start with the .Rmd version of this file, and fill it in where necessary. Note that Question 1 asks you to add an additional file.

Check out the assignment tips to make sure your report meets our standards. It’s important to write up organized reports that are clear and easy to follow, so that your collaborators and the future you can understand what you are doing. To ensure you earn full marks on this and future assignments, be sure to address the following points:

  • Well-formatted markdown (use of headings, subheadings, and numbering that makes sense)
  • Codes are commented when needed and output is interpreted
  • Properly labeled graphs
  • No broken code
  • Rmarkdown is properly knitted and the .md is rendered on Github (including any figures/images)

Be sure to commit, push your completed assignment, and submit the repository URL on Canvas before the deadline. Anything submitted or modified after the deadline will not be marked.

QUESTION 1: Create a README in your repo (1pt)

Once you have accepted this assignment through the Canvas invite link and have a repository set up (in the format of STAT540-UBC-2023/intro-assignment-yourgithubID), you can start personalizing it by adding a README.md file. This would help people who visit your repository understand who you are and what your repository is about.

Please be sure that your markdown is properly formatted. See this markdown reference from GitHub if you need help getting started.

Add some information to this README.md file to introduce yourself to the teaching team. Some things you may wish to include:

  • Your name, your program
  • Links to other profiles (e.g. Mastodon, a personal website, other relevant links, etc)
  • What you hope to achieve in this course
  • What this repository is for (example: “This repository contains the intro assignment for STAT540”)
  • Anything else that you want. You can add images or emoji, too!

QUESTION 2: Data inspection with R (2pt)

When answering the following questions, please always “sandwich” your R code with some text. Ensure fluency by avoiding inserting R code or outputs without any explanation. For example, if the question asks you what is x + y given x=1 and y=2, you can answer it like this:


Now we want to find the sum of x and y.

x <- 1  # Note how I assign them to variables instead of just priting 1+2
y <- 2 
z <- x + y
z # By doing this, I can print out the output of z. Alternately, you can also do (z <- x+y). 
## [1] 3

Conclusion: we found that x + y is 3 (notice here I’m using inline R code to print the value of z).


If you are looking at the .Rmd of this file (the Rmarkdown that creates the markdown .md file), you’ll see that I have used inline R code in the previous line to refer to the variable z instead of directly typing 3. This is because if we ever need to change the values of x and y, we could possibly forget to update where we typed in 3 as well. You can imagine that if you have to change the initial inputs of a long statistical reports, all your numbers in the rest of the report will change. You don’t want to have to look through every sentence to make sure that your report is still up-to-date.

By using variables and inline R code, I can change the value of x and/or y, and rest assured that when I re-knit my Rmarkdown, my conclusion statement will be updated.

There are some data sets in base R for you to play around with. We will explore the Titanic dataset. You can view it by typing Titanic. Just like with functions, you can use the question mark ?Titanic to see more information about the dataset.

Answer the following questions by subsetting, printing out the result or by graphing.

2.1 Passenger breakdown (1pt)

  • Convert the array into a data frame by using data.frame() function.
  • How many children were on Titanic?
  • How many adults were on Titanic?
  • Were there more female adult or male adult passengers?

Find the answers these questions by adding code to the code block below.

# YOUR CODE HERE

Replace this text with an explanation of your code output to answer the questions above.

2.2 Survival (1pt)

Using the same data frame, examine the survival rates.

  • Did the children had better survival rate than the adults?
  • Order the classes of passengers (Crew, 1st, 2nd, and 3rd) by survival rate worst to best.

Find the answers these questions by adding code to the code block below.

# YOUR CODE HERE

Replace this text with an explanation of your code output to answer the questions.

3. Data visualization (2pt)

Here you’ll practice reading data from a text file and do some graphing. The file “PaintedTurtles.txt” contains data on several variables for a sample of Painted turtles.

  • Create a figure to visualize at least 3 variables from this dataset (choose whatever graph you’d like!)
  • Explain how your graph is informative: what does it tell you about the relationship between the variables? Also explain why you choose to present the data in this way.

Add code to the code block below to complete these tasks.

# YOUR CODE HERE

Replace this text with an explanation of your code output to answer the question.