Commit 1acbbcf

add sort_index
1 parent 2995ef3 commit 1acbbcf

2 files changed, +10 -15 lines changed

episodes/data-visualisation.md (+10 -15)
@@ -47,15 +47,10 @@ df_long.head()
 
 Ok! We are now ready to plot our data. Since this data is monthly data, we can plot the circulation data over time.
 
-::::::::::::::::::::::::::::::::::::: instructor
-## Instructor note: Pandas 2.2.* bug
-There is a bug in the latest release of Pandas that is causing certain plots to display in a garbled manner. This is a [known issue](https://github.com/pandas-dev/pandas/issues/59960) that the Pandas team plans to address. In the meantime, learners and instructors can user older versions of pandas *or* add `.sort_index()` before any instance of `.plot()`. For example, use `albany['circulation'].sort_index().plot()` instead of `albany['circulation'].plot()`.
-:::::::::::::::::::::::::::::::::::::::::::::::::
-
-At first, let’s focus on a specific branch. We can select the rows for the Albany Park branch:
+At first, let’s focus on a specific branch. We can select the rows for the Albany Park branch and then use `.sort_index()` to be explicit that we want our data to be sorted in the order of the date index.
 
 ``` python
-albany = df_long[df_long['branch'] == 'Albany Park']
+albany = df_long[df_long['branch'] == 'Albany Park'].sort_index()
 ```
 
 ``` python
@@ -66,13 +61,13 @@ albany.head()
 |------------|-------------|----------------------|---------|----------|--------|------|---------|-------------|
 | date | | | | | | | | |
 | 2011-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | january | 8427 |
-| 2012-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 83297 | 2012 | january | 10173 |
-| 2013-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 572 | 2013 | january | 0 |
-| 2014-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 50484 | 2014 | january | 35 |
-| 2015-01-01 | Albany Park | NaN | NaN | NaN | 133366 | 2015 | january | 10889 |
+| 2011-02-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | february | 7023 |
+| 2011-03-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | march | 9702 |
+| 2011-04-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | april | 9344 |
+| 2011-05-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | may | 8865 |
 
 
-Now we can use the `plot()` function that is built in to pandas. Let’s try it:
+Now we can use the `plot()` function that is built in to pandas. Let’s try it:
 
 ``` python
 albany.plot()
@@ -199,7 +194,7 @@ Here is a view of the [interactive output of the Plotly bar chart](learners/bar_
 ## Plotting with Pandas
 
 1. Load the dataset `df_long.pkl` using Pandas.
-2. Create a new DataFrame that only includes the data for the "Chinatown" branch.
+2. Create a new DataFrame that only includes the data for the "Chinatown" branch. (Don't forget to sort by the index)
 3. Use the Pandas plotting function to plot the "circulation" column over time.
 
 
@@ -211,7 +206,7 @@ Here is a view of the [interactive output of the Plotly bar chart](learners/bar_
 ```python
 import pandas as pd
 df_long = pd.read_pickle('data/df_long.pkl')
-chinatown = df_long[df_long['branch'] == 'Chinatown']
+chinatown = df_long[df_long['branch'] == 'Chinatown'].sort_index()
 chinatown['circulation'].plot()
 ```
 
@@ -235,7 +230,7 @@ Add a line to the code below to plot the Uptown branch circulation including the
 ```python
 import pandas as pd
 df_long = pd.read_pickle('data/df_long.pkl')
-uptown = df_long[df_long['branch'] == 'Uptown']
+uptown = df_long[df_long['branch'] == 'Uptown'].sort_index()
 ```
 
 ::::::::::::::: solution

episodes/fig/albany-plot-1.png

29.9 KB
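
Note on the change: the old `albany.head()` output in this diff lists January 2011, January 2012, January 2013, and so on, which suggests the filtered rows are not in chronological order before `.sort_index()` is applied. The sketch below illustrates that effect on a toy frame. The names `df_demo`, `albany_demo`, and `albany_sorted` are hypothetical, and the three-row frame is only a stand-in for `df_long` that reuses circulation values visible in the tables above.

```python
import pandas as pd

# Toy stand-in for df_long (an assumption, not the lesson's real dataset):
# rows are deliberately out of chronological order, like the old head() output.
dates = pd.to_datetime(["2011-01-01", "2012-01-01", "2011-02-01"])
df_demo = pd.DataFrame(
    {
        "branch": ["Albany Park"] * 3,
        "circulation": [8427, 10173, 7023],
    },
    index=pd.Index(dates, name="date"),
)

# Filtering by branch keeps the original (unsorted) row order.
albany_demo = df_demo[df_demo["branch"] == "Albany Park"]
print(albany_demo.index.is_monotonic_increasing)  # False

# sort_index() returns a new frame ordered by the date index.
albany_sorted = albany_demo.sort_index()
print(albany_sorted.index.is_monotonic_increasing)  # True

# Plotting the sorted column requires matplotlib, so it is left commented out:
# albany_sorted["circulation"].plot()
```

Checking `index.is_monotonic_increasing` is one quick way to confirm that a date index is in order before plotting. Whether it is worth showing to learners is a judgment call for the lesson maintainers.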