GitHub - cornelalbrecht/GettingCleaningCourseProj: Getting and Cleaning Data Course Project

title	output
Readme	html_document

Summary and Purpose

The file run_analysis.R cleans test and training data from the Human Activity Reconition Study.

Output

The analysis function outputs a file grouped_data.txt that contains the mean and standard deviation values for the various sensor measurements grouped by activity and participant (please refer to the source code and the codebook for more detail).

The data is grouped by the activity performed and by participant within the study. The values given are mean values of all measurement readings.

At 6 different activities assessed and 30 participants, there are 180 values of 86 variables. In addition, an actvity and participant identifier are found for every observation making the outoput matrix [180 x 88]-dimensional.

Steps

The following steps are performed within the script:

Read-in of all the data files.
The data is then filtered for only mean and std values as we are only interested in these values.
The different data-frames are then merged into one large data frame.
The activities are identified by their ID and added to the data frame for better readability.
The column names are set to the respective measurements for better readability.
The final data frame is grouped by activity and participant and the mean values are calculated.
The grouped data with mean values is exported into a fixed-width delimited file called grouped_data.txt.

Source Code

# Load needed Libraries
library(dplyr)

# Read all files
feat <- read.table("./other/features.txt")
act.lab <- read.table("./other/activity_labels.txt")

test.X <- read.table("./test/X_test.txt")
test.y <- read.table("./test/y_test.txt")
test.subj <- read.table("./test/subject_test.txt")

train.X <- read.table("./train/X_train.txt")
train.y <- read.table("./train/y_train.txt")
train.subj <- read.table("./train/subject_train.txt")


# Filter the mean() and std() values from the data
filt.vars <- grep("mean()", feat$V2, ignore.case = TRUE)
filt.vars <- append(filt.vars, grep("std()", feat$V2, ignore.case = TRUE))

# RBind Test and Training Sets
X <- rbind(test.X, train.X)
y <- rbind(test.y, train.y)
subj <- rbind(test.subj, train.subj)

# Subset X variables for only mean() and std() variables
Xsub <- X[,filt.vars]

# CBind all datasets into one
dat <- cbind(y, subj, Xsub)

# Assign temporary column names for merging (this is quick and dirty code, but it works :-) )
colnames(dat) <- c("act", "subj", "data")

# Merge Data on Activity ID
dat <- merge(act.lab, dat, by.x = "V1", by.y = "act")
dat <- dat[,-1] # Remove extra column with Activity ID that stems from the merging process

# Make the Column names all nice
desc <- feat[filt.vars,]

desc$V2 <- gsub("Acc", ".Acceleration.", desc$V2)
desc$V2 <- gsub("Gyro", ".Gyroscope.", desc$V2)
desc$V2 <- gsub("mean\\(\\)", ".MeanValue.", desc$V2)
desc$V2 <- gsub("meanFreq\\(\\)", ".MeanFrequency.", desc$V2)
desc$V2 <- gsub("std\\(\\)", ".StandardDeviation.", desc$V2)
desc$V2 <- gsub("tBody", "TimeDomain.Body", desc$V2)
desc$V2 <- gsub("fBody", "FrequencyDomain.Body", desc$V2)
desc$V2 <- gsub("tGravity", "TimeDomain.Gravity", desc$V2)
desc$V2 <- gsub("-X", "-X.Direction", desc$V2)
desc$V2 <- gsub("-Y", "-Y.Direction", desc$V2)
desc$V2 <- gsub("-Z", "-Z.Direction", desc$V2)
desc$V2 <- gsub("angle\\(", "Angle.of.\\(", desc$V2)
desc$V2 <- gsub("Mag", "Magnitude", desc$V2)

colnames(dat) <- c("Activity", "Subject", as.character(desc$V2))

# Group and summarize data with dplyr package 
grouped.data <- dat %>% group_by(Activity,Subject) %>% summarise_each(funs(mean))

# Write data to file
write.table(grouped.data, file = "grouped_data.txt", row.names = FALSE)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Summary and Purpose

Output

Steps

Source Code

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
other		other
test		test
train		train
README.md		README.md
codebook.md		codebook.md
grouped_data.txt		grouped_data.txt
run_analysis.R		run_analysis.R

cornelalbrecht/GettingCleaningCourseProj

Folders and files

Latest commit

History

Repository files navigation

Summary and Purpose

Output

Steps

Source Code

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages