|
4 | 4 | Introduction to **Data Science** provides a comprehensive overview of modern data science: the practice of obtaining, exploring, modeling, and interpreting data.
|
5 | 5 |
|
6 | 6 | #### What is [Data Science](https://en.wikipedia.org/wiki/Data_science)?
|
7 |
| -Data science is a "concept to unify statistics, data analysis and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the broad areas of **mathematics**, **statistics**, **information science**, and **computer science**, in particular from the subdomains of **machine learning**, **classification**, **cluster analysis**, **data mining**, **databases**, and **visualization**. Since, all of these topics are mentioned it is quite obvious that people might take it in wring way, that leads to our second question i.e What is not Data Science. |
| 7 | +Data science is a "concept to unify statistics, data analysis and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the broad areas of **mathematics**, **statistics**, **information science**, and **computer science**, in particular from the subdomains of **machine learning**, **classification**, **cluster analysis**, **data mining**, **databases**, and **visualization**. Because it spans so many topics, people can easily misunderstand it, which leads to our second question: what is not Data Science? |
8 | 8 |
|
9 | 9 | #### What is not Data Science?
|
10 | 10 | Since Data Science is a buzzword in the media, people tend to assume it means whatever they want. If you ask a group of people what exactly Data Science is, you might hear a plethora of definitions and concepts. Let's look at what it isn't:
|
@@ -122,33 +122,47 @@ Drawing insight from a piece of data involves understanding how it fits into the
|
122 | 122 |
|
123 | 123 | ## Machine Learning - hyperlink
|
124 | 124 | - What is ML?
|
125 |
| -- Numerical Var |
126 |
| -- Catagorical Var |
127 |
| -- Supervised Learning |
128 |
| -- Unsupervised Learning |
| 125 | +- Variables |
| 126 | + - Numerical |
| 127 | + - Categorical |
| 128 | +- Learning |
| 129 | + - Supervised Learning |
| 130 | + - Unsupervised Learning |
129 | 131 | - Concepts, Inputs & Attributes
|
130 | 132 | - Training & Test Data
|
131 | 133 | - Classifier
|
132 | 134 | - Prediction
|
133 | 135 | - Lift
|
134 | 136 | - Overfitting
|
| 137 | +- Underfitting |
135 | 138 | - Bias & Variance
|
136 |
| -- Trees & Classification |
137 | 139 | - Classification
|
138 |
| - - Classification Rate |
139 |
| - - Decision Trees |
140 |
| - - Boosting |
141 |
| - - Naive Bayes Classifiers |
142 |
| - - K-Nearest Neighbor |
143 |
| - - Logistic Regression |
144 |
| -- Regression |
145 |
| - - Ranking |
146 |
| - - Linear Regression |
147 |
| - - Perceptron |
| 140 | +- Classification Rate |
| 141 | +- Regression |
| 142 | + - Simple Linear Regression |
| 143 | + - Multiple Linear Regression |
| 144 | + - Polynomial Regression |
| 145 | + - Logistic Regression |
| 146 | +- Decision Trees |
| 147 | +- Random Forest |
| 148 | +- Boosting |
| 149 | + - Gradient Boosting |
| 150 | + - GBM |
| 151 | + - XGBoost |
| 152 | + - LightGBM |
| 153 | + - CatBoost |
| 154 | + - AdaBoost |
| 155 | +- Naive Bayes Classifiers |
| 156 | +- K-Nearest Neighbor |
| 157 | +- Ranking |
148 | 158 | - Clustering
|
149 |
| - - Hierarchical Clustering |
150 |
| - - K-Means Clustering |
| 159 | + - K-Means Clustering |
| 160 | + - Hierarchical Clustering |
| 161 | +- Perceptron |
151 | 162 | - Neural Networks
|
| 163 | +- Linear Discriminant Analysis |
| 164 | +- Quadratic Discriminant Analysis |
| 165 | +- Support Vector Machine |
152 | 166 | - Sentiment Analysis
|
153 | 167 | - Collaborative Filtering
|
154 | 168 | - Tagging
|
@@ -236,6 +250,67 @@ Drawing insight from a piece of data involves understanding how it fits into the
|
236 | 250 | - Stratified Sampling
|
237 | 251 | - Principal Component Analysis
|
238 | 252 |
|
| 253 | + |
| 254 | + |
| 255 | +## <p align="center">Machine Learning</p> |
| 256 | + |
| 257 | +**Topics covered**: |
| 258 | +`Intelligence` |
| 259 | +`Algorithms` |
| 260 | +`Intelligent System` |
| 261 | +`Artificial Intelligence` |
| 262 | + |
| 263 | + |
| 264 | +#### What is ML? |
| 265 | + |
| 266 | +Concept | Best Video Resource | Best Text Resource | Duration | Prerequisites |
| 267 | +:-- | :--: | :--: | :--: | :--: |
| 268 | +Basics | [Youtube](https://youtu.be/-rMMTv7XLYw?list=PLUZBeqSwWIJ2RJGqkMVwJcDrl0_haf6zo) | [Article](http://blog.hackerearth.com/explaining-basics-of-machine-learning-algorithms-applications) | 0.5 hours | None |
| 269 | +Philosophy | [Youtube](https://youtu.be/R_3bqaUlSkk) | [Documentation](http://ml.typepad.com/machine_learning_thoughts/philosophy/) | 0.5 hours | Basics |
| 270 | + |
| 271 | +<br> |
| 272 | + |
| 273 | + |
| 274 | +### Variables |
| 275 | +This section introduces the types of data (variables). |
| 276 | + |
| 277 | +**Topics covered**: |
| 278 | +`Ordinal` |
| 279 | +`Nominal` |
| 280 | +`Discrete` |
| 281 | +`Continuous` |
| 282 | + |
| 283 | +Concept | Best Video Resource | Best Text Resource | Duration | Prerequisites |
| 284 | +:-- | :--: | :--: | :--: | :--: |
| 285 | +Numerical Variable | [Youtube](https://www.youtube.com/watch?v=9OKRHhakTmE) | [Documentation](https://mathtutoring.jimdo.com/statistics/numerical-data/) | 15 mins | Basic math |
| 286 | +Categorical Variable | [Youtube](https://www.youtube.com/watch?v=9OKRHhakTmE) | [Documentation](https://cyfar.org/types-variables-categorical) | 15 mins | Basic math |
| 287 | + |
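The four variable types listed above can be illustrated with a minimal sketch. The records, field names, and the `SIZE_ORDER` ranking below are all invented for illustration and come from none of the linked resources; the point is only that each type supports different operations.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy records; every field name and value is invented for illustration.
records = [
    {"city": "Pune", "size": "small", "rooms": 2, "price": 41.5},
    {"city": "Delhi", "size": "large", "rooms": 4, "price": 88.0},
    {"city": "Pune", "size": "medium", "rooms": 3, "price": 60.5},
]

# Nominal (categorical, no inherent order): counting frequencies is meaningful.
city_counts = Counter(r["city"] for r in records)

# Ordinal (categorical, ordered): map categories to ranks before comparing them.
SIZE_ORDER = {"small": 0, "medium": 1, "large": 2}
largest = max(records, key=lambda r: SIZE_ORDER[r["size"]])

# Discrete (numerical, countable): whole-number values such as a room count.
total_rooms = sum(r["rooms"] for r in records)

# Continuous (numerical, measured on a continuum): averages are meaningful.
avg_price = mean(r["price"] for r in records)

print(city_counts, largest["city"], total_rooms, avg_price)
```

Note that averaging the `city` field would be meaningless, while sorting by `size` only works once an explicit order is supplied.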
| 288 | + |
| 289 | +<br> |
| 290 | + |
| 291 | +### Learning |
| 292 | +Explains the types of machine learning algorithms and when you should use each of them. |
| 293 | + |
| 294 | +**Topics covered**: |
| 295 | +`Classification` |
| 296 | +`Labeled data` |
| 297 | +`Unlabeled data` |
| 298 | +`Regression` |
| 299 | +`Clustering` |
| 300 | + |
| 301 | +Concept | Best Video Resource | Best Text Resource | Duration | Prerequisites |
| 302 | +:-- | :--: | :--: | :--: | :--: |
| 303 | +Supervised Learning | [Coursera](https://www.coursera.org/learn/machine-learning/lecture/1VkCb/supervised-learning) | [Documentation](https://www.saylor.org/site/wp-content/uploads/2011/11/CS405-6.2.1.2-WIKIPEDIA.pdf) | 1.5 hours | Statistics |
| 304 | +Unsupervised Learning | [Coursera](https://www.coursera.org/learn/machine-learning/lecture/olRZo/unsupervised-learning) | [Documentation](https://brilliant.org/wiki/unsupervised-learning/) | 0.5 hours | Statistics |
| 305 | + |
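As a rough sketch of the difference between labeled and unlabeled data, the snippet below contrasts a 1-nearest-neighbor classifier (supervised) with a basic two-cluster k-means loop (unsupervised). The data and the helper names `predict_1nn` and `two_means` are hypothetical, chosen for illustration; real projects would normally use a library implementation.

```python
# Hypothetical 1-D toy data, invented for illustration.
labeled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]
unlabeled = [1.0, 1.5, 8.0, 9.0]


def predict_1nn(train, x):
    """Supervised: return the label of the closest labeled training point."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(points, iterations=10):
    """Unsupervised: split points into two groups with a basic k-means loop (k=2)."""
    centers = [min(points), max(points)]  # simple deterministic initialisation
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to the nearer of the two current centers.
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Recompute each center as its group's mean; keep the old one if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


print(predict_1nn(labeled, 2.0))  # uses the known labels to classify a new point
print(two_means(unlabeled))       # finds structure without any labels
```

The supervised function needs the `"low"`/`"high"` labels to make a prediction, whereas the clustering function recovers two groups from the raw numbers alone.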
| 306 | +#### Tasks: |
| 307 | +- Make a list of popular supervised and unsupervised algorithms. |
| 308 | +- Regularly follow five reputable blogs/websites about Machine Learning. |
| 309 | +- List three famous machine learning problems and mention which algorithm can be used to solve each of them. |
| 310 | + |
| 311 | +<br> |
| 312 | + |
| 313 | + |
239 | 314 | # <p align="center">Additional Resources</p>
|
240 | 315 | [An awesome Data Science repository to learn and apply for real world problems](https://github.com/bulutyazilim/awesome-datascience).
|
241 | 316 |
|
|