DOC: Emphasize NumPy in Ecosystem openers #242
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,27 +8,44 @@ | |
</div> | ||
<div> | ||
<p> | ||
Data Science makes it possible to analyze massive amounts of data | ||
and gain meaningful insights. A typical data science workflow involves | ||
various techniques and tools such as: | ||
NumPy lies at the core of a rich ecosystem of data science libraries. | ||
</p> | ||
<p> | ||
Data science is the analysis of massive amounts of data | ||
to gain insight. A typical workflow might be: | ||
|
||
<ul class="content-tab"> | ||
<li><b>Extract, Transform, Load (ETL):</b> Pandas, Beautiful Soup, Intake</li> | ||
<li><b>Explore:</b> Seaborn, Matplotlib</li> | ||
<li><b>Model:</b> Scikit-learn, SciPy, statsmodels</li> | ||
<li><b>Evaluate:</b> NumPy, TensorFlow </li> | ||
<li><b>Extract, Transform, Load (ETL):</b> | ||
<a href="https://pandas.pydata.org">Pandas</a>, | ||
<a href="https://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a>, | ||
<a href="https://intake.readthedocs.io/en/latest/"> Intake</a> | ||
</li> | ||
|
||
<li><b>Explore:</b> | ||
<a href="https://seaborn.pydata.org"> Seaborn</a>, | ||
<a href="https://matplotlib.org">Matplotlib</a>, | ||
|
||
</li> | ||
|
||
<li><b>Model:</b> | ||
<a href="https://scikit-learn.org">scikit-learn</a>, | ||
<a href="https://www.scipy.org">SciPy</a>, | ||
<a href="https://www.statsmodels.org/stable/index.html"> statsmodels</a>. | ||
</li> | ||
|
||
<li><b>Evaluate:</b> | ||
NumPy, | ||
<a href="https://www.tensorflow.org">TensorFlow</a> | ||
</li> | ||
|
||
<li> | ||
<b>Presentation:</b> | ||
<b>Display:</b> | ||
<a href="./index.html/#tab-visual"> Data Visualization Tools</a> | ||
Review comment: New issue found: this doesn't work as well as it should; it opens the correct tab, but the view jumps back to top of the page. Noting here, can deal with it later. |
||
</li> | ||
</ul> | ||
</p> | ||
</div> | ||
</div> | ||
<p> | ||
Python has a rich ecosystem of libraries that enable Data Science | ||
workflows. <b> NumPy</b> is the foundation of almost all of these tools | ||
such as Pandas, Seaborn, Beautiful Soup and several others. | ||
</p> | ||
<div class="grid-container"> | ||
<div> | ||
<p> | ||
|
@@ -37,13 +54,13 @@ | |
data access and distribution, while | ||
<a href="https://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> | ||
is widely used for web-scraping and gathering data sets. | ||
<a href="https://seaborn.pydata.org"> Seaborn</a> is well known for its | ||
<a href="https://towardsdatascience.com/how-to-perform-exploratory-data-analysis-with-seaborn-97e3413e841d">exploratory data analysis (EDA)</a> | ||
capabilities, <a href="https://scikit-learn.org">Scikit-learn</a> and | ||
<a href="https://www.scipy.org">Scipy</a> (statistical computing) serve some | ||
<a href="https://seaborn.pydata.org"> Seaborn</a> is well known for | ||
<a href="https://towardsdatascience.com/how-to-perform-exploratory-data-analysis-with-seaborn-97e3413e841d">exploratory data analysis (EDA)</a>; | ||
<a href="https://scikit-learn.org">scikit-learn</a> and | ||
<a href="https://www.scipy.org">SciPy</a> (statistical computing) serve some | ||
of the backbone processes required for machine learning (regression methods, | ||
classification, clustering, model validation and selection). | ||
Statistical data exploration, estimation of various statistical models | ||
Statistical data exploration, estimation of various statistical models, | ||
and conducting statistical tests are some of the functions offered by | ||
<a href="https://www.statsmodels.org/stable/index.html"> statsmodels</a>. | ||
</p> | ||
|
@@ -53,11 +70,11 @@ | |
</div> | ||
</div> | ||
<p> | ||
Effective data analytics require deep knowledge of the data domain (e.g., | ||
Retail, Healthcare, Marketing, Finance, Social Media, Automation, Sales, Travel, | ||
etc.) as well as other core disciplines of Data Science, Data Engineering and | ||
Data Visualization. Tools such as <a href="https://mlflow.org">MLFlow</a> address | ||
experiment hyper-parameter and result tracking needs, while | ||
Effective data analytics requires deep knowledge of the data domain (e.g., | ||
retail, healthcare, marketing, finance, social media, automation, sales, travel, | ||
etc.) as well as other core disciplines of data science, data engineering, and | ||
data visualization. Tools such as <a href="https://mlflow.org">MLFlow</a> address | ||
experiment hyperparameter and result tracking needs, while | ||
<a href="https://dvc.org"> DVC</a> provides data version control for data science | ||
and machine learning workflows. | ||
Review comment: Still would like to make this tab a little more compact. Perhaps in a follow-up, all these changes look good. EDIT: followed up in gh-262 |
||
</p> | ||
|
Review comment: Will rethink the contents here - having NumPy right at the end again is a little odd perhaps.