non-profit-organization

Final Result
https://github.com/sangje-lee/non-profit-org-employment

Process of this analysis

Load the CSV file into one large dataset.
Filter out unneeded columns/attributes and remove any null values found.
Divide the data into separate datasets by Indicator (there should be seven datasets).
Divide each into four datasets by year range, each covering three years of data (2010-2012, 2013-2015, 2016-2018, 2019-2021).
Divide each by characteristic group into four datasets.
Divide by GEO (province).
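
The first three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (REF_DATE, GEO, Indicators, Characteristics, VALUE, STATUS), the sample rows, and the `by_indicator` dictionary are assumptions made for the example.

```python
import pandas as pd

# Tiny stand-in for the real CSV. Column names and rows are assumptions.
# In the real workflow this would be something like pd.read_csv("36100651.csv").
df = pd.DataFrame({
    "REF_DATE": [2010, 2013, 2016, 2019, 2019],
    "GEO": ["Ontario", "Ontario", "Quebec", "Quebec", "Alberta"],
    "Indicators": ["Number of jobs", "Number of jobs", "Hours Worked",
                   "Hours Worked", "Number of jobs"],
    "Characteristics": ["15 to 24 years"] * 5,  # hypothetical label
    "VALUE": [100.0, 120.0, None, 300.0, 80.0],
    "STATUS": [None] * 5,  # example of a column that gets filtered out
})

# Keep only the attributes needed for the analysis.
keep = ["REF_DATE", "GEO", "Indicators", "Characteristics", "VALUE"]
df_sorted = df[keep]

# Remove rows whose value is null.
df_sorted_na = df_sorted.dropna(subset=["VALUE"])

# Split into one dataset per indicator (seven in the full data).
by_indicator = {
    name: group.reset_index(drop=True)
    for name, group in df_sorted_na.groupby("Indicators")
}
```

Keeping the per-indicator datasets in a dictionary is only a convenience for this sketch; the analysis itself uses separately named variables such as df_NumOfJob.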

Variable names used in the analysis

df - Whole dataset without any filtering or division
df_sorted - Whole dataset after filtering, such as removing unimportant attributes.
df_sorted_na - Whole dataset with the null values removed.

Division into new datasets based on Indicator

df_AvgAnnHrsWrk - Average annual hours worked
df_AvgAnnWages - Average annual wages and salaries
df_AvgHrsWages - Average hourly wage
df_AvgWeekHrsWrked - Average weekly hours worked
df_Hrs_Wrked - Hours Worked
df_NumOfJob - Number of jobs
df_WagesAndSalaries - Wages and Salaries

Division into new datasets based on the GEO/year

df_AvgAnnHrsWrk_2010 - Average annual hours worked in 2010
df_AvgAnnHrsWrk_2013 - Average annual hours worked in 2013
df_AvgAnnHrsWrk_2016 - Average annual hours worked in 2016
df_AvgAnnHrsWrk_2019 - Average annual hours worked in 2019

These are then merged into

training_df_AvgAnnHrsWrk - Average annual hours worked for training set (2013-2018)
testing_df_AvgAnnHrsWrk - Average annual hours worked for testing set (2019-2021)
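
A rough sketch of the year split and the merge into training and testing sets is shown below. The REF_DATE column name, the sample values, and the `year_window` helper are assumptions for illustration; the notebook may do this differently.

```python
import pandas as pd

# Hypothetical stand-in for one indicator dataset, one row per year.
df_AvgAnnHrsWrk = pd.DataFrame({
    "REF_DATE": list(range(2010, 2022)),      # assumed year column
    "VALUE": [float(v) for v in range(12)],   # made-up values
})

def year_window(df, start):
    """Return rows in the three-year window beginning at `start` (inclusive)."""
    return df[df["REF_DATE"].between(start, start + 2)]

df_AvgAnnHrsWrk_2010 = year_window(df_AvgAnnHrsWrk, 2010)  # 2010-2012
df_AvgAnnHrsWrk_2013 = year_window(df_AvgAnnHrsWrk, 2013)  # 2013-2015
df_AvgAnnHrsWrk_2016 = year_window(df_AvgAnnHrsWrk, 2016)  # 2016-2018
df_AvgAnnHrsWrk_2019 = year_window(df_AvgAnnHrsWrk, 2019)  # 2019-2021

# Training set covers 2013-2018, testing set covers 2019-2021.
training_df_AvgAnnHrsWrk = pd.concat(
    [df_AvgAnnHrsWrk_2013, df_AvgAnnHrsWrk_2016], ignore_index=True
)
testing_df_AvgAnnHrsWrk = df_AvgAnnHrsWrk_2019.reset_index(drop=True)
```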

No longer used

df_AvgAnnHrsWrk_below_2016 - Average annual hours worked below 2016
df_AvgAnnHrsWrk_above_2017 - Average annual hours worked above 2017

Variable names used in the analysis

Division into new datasets based on the group of Characteristics

testing_df_WagesAndSalaries_ByAge - Wages and Salaries By Age For Testing set
testing_df_WagesAndSalaries_ByGender - Wages and Salaries By Gender Group For Testing set
testing_df_WagesAndSalaries_ByEducation - Wages and Salaries By Education level For Testing set
testing_df_WagesAndSalaries_ByImmigrant - Wages and Salaries By Immigrant status For Testing set
testing_df_WagesAndSalaries_ByIndigenous - Wages and Salaries By Indigenous status For Testing set

Division into new datasets based on the provinces

testing_df_AvgAnnHrsWrk_ByAge_Provinces - Average annual hours worked for testing set by age group grouped by provinces
testing_df_AvgAnnHrsWrk_ByGender_Provinces - Average annual hours worked for testing set by gender grouped by provinces
testing_df_AvgAnnHrsWrk_ByEducation_Provinces - Average annual hours worked for testing set by education level grouped by provinces
testing_df_AvgAnnHrsWrk_ByImmigrant_Provinces - Average annual hours worked for testing set by immigrant status grouped by provinces
testing_df_AvgAnnHrsWrk_ByIndigenous_Provinces - Average annual hours worked for testing set by indigenous status grouped by provinces
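
As an illustration of the two splits above, the sketch below first selects one characteristic group and then divides it by province. The column names, the sample rows, and the AGE_LABELS list are hypothetical; the real characteristic labels come from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for a testing set of one indicator.
testing_df_AvgAnnHrsWrk = pd.DataFrame({
    "GEO": ["Ontario", "Ontario", "Quebec", "Quebec"],
    "Characteristics": ["15 to 24 years", "Female employees",
                        "15 to 24 years", "25 to 34 years"],
    "VALUE": [1.0, 2.0, 3.0, 4.0],
})

# Assumed labels that make up the age group.
AGE_LABELS = ["15 to 24 years", "25 to 34 years"]

# Characteristic split: keep only the rows that belong to the age group.
testing_df_AvgAnnHrsWrk_ByAge = testing_df_AvgAnnHrsWrk[
    testing_df_AvgAnnHrsWrk["Characteristics"].isin(AGE_LABELS)
]

# Province split: one dataset per GEO value. groupby sorts its keys,
# so the list is in alphabetical order of province name.
testing_df_AvgAnnHrsWrk_ByAge_Provinces = [
    group.reset_index(drop=True)
    for _, group in testing_df_AvgAnnHrsWrk_ByAge.groupby("GEO")
]
provinces = sorted(testing_df_AvgAnnHrsWrk_ByAge["GEO"].unique())
```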

ProvinceAnalysis(df_AvgAnnHrsWrk_201x_ByAge, pd, np, pp) - Creates a new ProvinceAnalysis object from a dataset and the other required arguments.
Variables:

self.df - the imported dataset
self.provinces - array of provinces
self.indicators - array of indicators
self.characteristics - array of characteristics
self.year - array of years being analyzed
self.dfProvinces - array of per-province analyses, computed from the df dataset

Methods:

outputAnalysis(province_id) - Outputs a detailed analysis including sum, mean, and skewness.
outputAnalysisSimple(province_id) - Outputs a summarized version of the analysis.
outputList(province_id, num) - Outputs the first "num" rows of the dataset.
outputPandaProfiling(province_id) - Runs pandas profiling for a specific province in a specific year.

Province Code [0-13]: ['Alberta', 'BC', 'GEO = Canada', 'Manitoba', 'New Brunswick', 'Newfoundland', 'Northwest Territories', 'Nova Scotia', 'Nunavut', 'Ontario', 'PEI', 'Quebec', 'Saskatchewan', 'Yukon']
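
To show how a province code might map to the statistics named above, here is a small sketch. It is not the ProvinceAnalysis class itself: the `summarize` function, the VALUE column name, and the sample data are assumptions for illustration only.

```python
import pandas as pd

# Province codes 0-13, in the order listed in this README.
PROVINCES = ['Alberta', 'BC', 'GEO = Canada', 'Manitoba', 'New Brunswick',
             'Newfoundland', 'Northwest Territories', 'Nova Scotia', 'Nunavut',
             'Ontario', 'PEI', 'Quebec', 'Saskatchewan', 'Yukon']

# Made-up per-province datasets, one per code, with an assumed VALUE column.
df_provinces = [pd.DataFrame({"VALUE": [1.0, 2.0, 3.0 + i]}) for i in range(14)]

def summarize(df_provinces, province_id):
    """Hypothetical helper: sum, mean, and skewness for one province code."""
    values = df_provinces[province_id]["VALUE"]
    return {
        "province": PROVINCES[province_id],
        "sum": values.sum(),
        "mean": values.mean(),
        "skewness": values.skew(),  # pandas' bias-corrected sample skewness
    }

result = summarize(df_provinces, 0)  # code 0 is Alberta
```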

OutputProvinceAnalysis(df_AvgAnnHrsWrk_201x_ByAge_Provinces, ProCode, "201x", pd, np, pp) - Creates a new OutputProvinceAnalysis object from a dataset and the other required arguments.

ProCode is the numeric province code listed above.
"201x" here is the year of the analysis.

self.df_output - the dataset being analyzed
self.ProCode - province to analyze (as a numeric code)
self.YearOutput - year that was analyzed (mainly for pandas profiling)
OutputResult(self) - Displays the analysis result.
OutputPandaProfiling(self) - Runs pandas profiling for the specified province.

Custom output for provinces

For the first input (variable categorized_province),

Enter the province to analyze. The full province name is required; otherwise an error is raised.

For the second input,

Choose a numeric code from 0 to 6 below (variable list_indicator),

"0. Average annual hours worked"
"1. Average annual wages and salaries"
"2. Average hourly wage"
"3. Average weekly hours worked"
"4. Hours Worked"
"5. Number of jobs"
"6. Wages and Salaries"

Enter the required indicator. A numeric code is required; any other input raises an error.
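
The two input checks described above could look something like the sketch below. The `parse_inputs` function and the short VALID_PROVINCES list are hypothetical; the notebook's own validation and the exact set of accepted province names may differ.

```python
# Indicator names by numeric code, as listed above.
INDICATORS = [
    "Average annual hours worked",
    "Average annual wages and salaries",
    "Average hourly wage",
    "Average weekly hours worked",
    "Hours Worked",
    "Number of jobs",
    "Wages and Salaries",
]

# Illustrative subset of full province names (an assumption for this sketch).
VALID_PROVINCES = ["Alberta", "Ontario", "Quebec"]

def parse_inputs(categorized_province, indicator_code):
    """Validate both inputs; return (province name, indicator name)."""
    if categorized_province not in VALID_PROVINCES:
        raise ValueError(f"Unknown province: {categorized_province!r}")
    try:
        code = int(indicator_code)
    except (TypeError, ValueError):
        raise ValueError("Indicator must be a numeric code from 0 to 6")
    if not 0 <= code < len(INDICATORS):
        raise ValueError("Indicator code must be between 0 and 6")
    return categorized_province, INDICATORS[code]

selection = parse_inputs("Ontario", "5")
```

Raising ValueError for every bad input keeps the failure mode uniform, whether the province name is abbreviated or the indicator is not a number.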

Contents of this page

Data_Analysis_x - Contains the last modified work. Last one is Data_Analysis_v07.
36100651-eng.zip - Contains the original dataset on employment in non-profit organizations.
36100651.csv - Contains the original dataset on employment in non-profit organizations as a CSV file.
EDA_Report_v00.pdf - Initial EDA report before splitting the dataset.
data_analysis_categorized_technical_report.ipynb - Contains the technical report as a Jupyter Notebook.
data_analysis_categorized_technical_report.py - Contains the technical report as a Python file.
data_analysis_categorized_technical_report.html - Contains the technical report as an HTML file.
data_analysis_categorized_technical_report.pdf - Contains the technical report as a PDF file.
