Commit

Merge pull request #40 from brianspiering/revise_module_0
Revise module 0
brianspiering authored Feb 23, 2018
2 parents 833a48b + 3badd31 commit 37f4b5d
Showing 15 changed files with 30 additions and 80 deletions.
40 changes: 22 additions & 18 deletions README.md
@@ -3,48 +3,52 @@ Introduction to Data Science for Social Good

__How can we use data for social impact?__

In this introductory course, students learn foundational theory and the coding skills necessary to translate data into actionable insights. Students learn about some of the latest tools and algorithms being used by corporate and academic machine learning teams. Each of the 6 hands-on, project-based modules use real world data from KIVA, a non-profit that connects people through lending to alleviate poverty. Data is powerful, and we believe that anyone can harness it for change.
Data is powerful, and we believe that anyone can harness it for change.

Data science is a highly interdisciplinary practice, demanding critical thinking, understanding of statistics, and technical coding ability. Irresponsible application of powerful algorithms or an inadequate exploration of underlying assumptions can lead to spurious results. In this course, we emphasize the fundamentals of proper data science and expose students to what is possible using sophisticated machine learning methods.
In this introductory course, students will learn foundational theory and the necessary coding skills to translate data into actionable insights. Students will learn the latest machine learning tools and algorithms.

Delta Analytics is a 501(c)3 Bay Area non-profit dedicated to bringing rigorous data science to problem-solving, effecting change in nonprofits and the public sector, and making data science an accessible and democratic resource for anyone with the same mission.
Data science is a highly interdisciplinary practice: demanding critical thinking, understanding of statistics, and technical coding ability. Irresponsible application of powerful algorithms or an inadequate exploration of underlying assumptions can lead to spurious results. In this course, we emphasize the fundamentals of proper data science and expose students to what is possible using sophisticated machine learning methods.

Each of the modules is hands-on, project-based, using real world data from [KIVA](https://www.kiva.org/), a non-profit that connects people through lending to alleviate poverty.

[Delta Analytics](http://www.deltanalytics.org/) is a 501(c)3 Bay Area non-profit dedicated to bringing rigorous data science to problem-solving, effecting change in nonprofits and the public sector, and making data science an accessible and democratic resource for anyone with the same mission.

Curriculum
----

Topics covered in this course include supervised learning, unsupervised learning, ensemble approaches, recommendation algorithms, and text analysis (also called Natural Language Processing, or NLP).
Topics covered in this course include: supervised learning, unsupervised learning, ensemble approaches, recommendation algorithms, and text analysis (also called Natural Language Processing or NLP).

Algorithms covered in this course include linear regression, decision tree, random forest, and k-means clustering.
Algorithms covered in this course include: linear regression, decision trees, random forest, and k-means clustering.


The slides that provide accompanying theory to the code are available [here](http://www.deltanalytics.org/curriculum.html). Our curriculum structure of presenting theory alongside a real-life long-form data science project will open doors to novices and professionals alike to harness the power of data for good.
The slides that cover the theory behind the code are available [here](http://www.deltanalytics.org/curriculum.html). Our curriculum structure of presenting theory alongside a real-life long-form data science project will open doors to novices and professionals alike to harness the power of data for good.

Modules:
----

0) Introduction / Overview of Syllabus
0) [Introduction](module_0_introduction/README.md)

- Who is Delta Analytics?
- What is data science? What is machine learning?
- Setting up your environment
- Accessing the data

1) Descriptive Statistics
1) [Descriptive Statistics](module_1_descriptive_statistics/README.md)

- Data validation and cleaning

2) Feature Engineering
2) [Feature Engineering](module_2_feature_engineering/README.md)

3) Linear Regression
3) [Linear Regression](module_3_linear_regression/README.md)

4) Decision Trees
4) [Decision Trees](module_4_decision_trees/README.md)

- Ensemble approaches
- Why use an ensemble approach?
- Decision tree, random forest and bagging
- Parametric vs. non-parametric models
- What are hyperparameters and how do you choose them?

5) Unsupervised Learning
5) [Unsupervised Learning](module_5_unsupervised_learning/README.md)

- Clustering
- K-means algorithm
@@ -54,16 +58,16 @@ Outcomes of course

At the end of the course, students will:

1. Have a solid foundational understanding of the statistical and mathematical logic underlying predominant data science methods.
2. Be able to communicate with other data scientists using technical terms about foundational concepts.
3. Write code to clean, process, analyze and visualize real world data from KIVA, a non-profit that connects people through lending to alleviate poverty.
1. Have a solid understanding of the fundamental statistics and programming underlying common data science methods.
2. Be able to communicate with other data scientists using technical terms.
3. Write code to clean, process, analyze, and visualize real world data.

Who is our target student?
----

The course is intended for any and all individuals interested in harnessing data towards solving problems in their communities. No prior coding or mathematical/statistical experience is expected, but computer proficiency is necessary.
The course is intended for any and all individuals interested in harnessing data towards solving problems in their communities. Minimal prior coding or mathematical/statistical experience is expected. Computer proficiency is necessary.

Our teachers
-----

Delta teaching fellows are all data professionals working in the Bay Area. All of our time is donated for free to build out a curriculum that makes machine learning tools and knowledge more accessible to communities around the world. You can learn more about our team [here](http://www.deltanalytics.org/delta-teaching-fellows.html).
[Delta Teaching Fellows](http://www.deltanalytics.org/delta-teaching-fellows.html) are all data professionals working in the Bay Area. All of our time is donated for free to build out a curriculum that makes machine learning tools and knowledge more accessible to communities around the world. You can learn more about our team [here](http://www.deltanalytics.org/delta-teaching-fellows.html).
1 change: 0 additions & 1 deletion module_0_introduction/README.md
@@ -1,4 +1,3 @@

Module 0: Introduction to the course
======

Binary file removed module_0_introduction/images/R.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_1.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_2.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_3.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_4.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_5.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_6.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/RStudio_7.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/R_2.PNG
Binary file not shown.
Binary file removed module_0_introduction/images/R_3.PNG
Binary file not shown.
Binary file added module_0_introduction/images/anaconda_nav.png
Binary file added module_0_introduction/images/jupyter_notebook.png
69 changes: 8 additions & 61 deletions module_0_introduction/python_installation_instructions.md
@@ -1,15 +1,11 @@
Module 0: Setup
Module 0: Installing Python with Anaconda
===============

Install Python
--------------

### Installing Anaconda
1. Get the latest version of Anaconda (requires ~1.8Gb space) for your operating system at: [https://www.continuum.io/downloads](https://www.continuum.io/downloads)
1. Get the latest version of Anaconda (requires ~1.8 GB of disk space) for your operating system at: [www.anaconda.com/download/](https://www.anaconda.com/download/). Get the Python 3 version.

![1](images/Anaconda_1.PNG)

2. Follow the prompts.
2. Follow the prompts:

![2](images/Anaconda_2.PNG)
![3](images/Anaconda_3.PNG)
@@ -19,61 +15,12 @@ Install Python
![7](images/Anaconda_7.PNG)
![8](images/Anaconda_8.PNG)

This also installs Python for you!


Install R and RStudio
---------------------

- **R** is a software environment for statistical computing and graphics.
Download the appropriate version at: [https://cran.rstudio.com/](https://cran.rstudio.com/)

- **RStudio** makes R easier to use. It includes a code editor, debugging & visualization tools.
Download the appropriate version for your platform at: [https://www.rstudio.com/products/rstudio/download2/](https://www.rstudio.com/products/rstudio/download2/)

### Installing R
Installing R:

1. Go to: https://cran.rstudio.com.
2. Click “Download R 3.x.x for [Windows or OS X]”.
3. Save the latest installer file or package binary (e.g. 3.4.0).
4. Open the installer from where it was downloaded.
5. Click “Run”.

![1](images/R.PNG)

6. Choose your language.

![2](images/R_2.PNG)

7. Continue “Next” accepting defaults.
8. “Finish” installing R!

![3](images/R_3.PNG)


### Installing RStudio
1. Go to: www.rstudio.com/products/rstudio/download2/
2. Find RStudio Desktop Open Source License.
3. Click "Download".

![1](images/RStudio_1.PNG)

4. Save the installer for your platform.

![2](images/RStudio_2.PNG)

5. Open the installer from where it was downloaded.
6. Click “Run”.

![3](images/RStudio_3.PNG)
3. Start the Anaconda Navigator application.

7. Follow prompts.
4. Within Anaconda Navigator, click the "Launch" button for Jupyter Notebook.

![4](images/RStudio_4.PNG)
![5](images/RStudio_5.PNG)
![6](images/RStudio_6.PNG)
![](images/anaconda_nav.png)

8. “Finish” installing RStudio!
That will open Jupyter Notebook in your favorite web browser.

![7](images/RStudio_7.PNG)
![](images/jupyter_notebook.png)
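
Once the notebook is open, a quick sanity check can confirm that it is running the Python 3 interpreter that Anaconda installed. The cell below is only a sketch and is not part of the original instructions; the helper name `python_major_ok` is hypothetical and was chosen for illustration.

```python
import sys


def python_major_ok(version_info=sys.version_info, required_major=3):
    """Return True if the interpreter's major version is at least required_major.

    Hypothetical helper for illustration; not part of the course materials.
    """
    return version_info[0] >= required_major


# Show which interpreter the notebook kernel is actually using.
print("Python version:", sys.version.split()[0])
print("Python 3 or newer:", python_major_ok())
```

If the second line prints `False`, the notebook is probably using an older Python installation, and reinstalling the Python 3 version of Anaconda may be the simplest fix.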
