@@ -276,7 +276,7 @@ print(result)
 
 ## Processing Files Based on Record Length
 
-Modify this program so that it only processes files with fewer than 50 records.
+Modify this program so that it only processes files with fewer than 85 records.
 
 ``` python
 import glob
@@ -344,58 +344,6 @@ print(smallest, largest)
 
 :::::::::::::::::::::::::
 
-::::::::::::::::::::::::::::::::::::::::::::::::::
-
-::::::::::::::::::::::::::::::::::::::::: callout
-
-## Using Functions With Conditionals in Pandas
-
-Functions will often contain conditionals. Here is a short example that
-will indicate which quartile the argument is in based on hand-coded values
-for the quartile cut points.
-
-``` python
-def calculate_life_quartile(exp):
-    if exp < 58.41:
-        # This observation is in the first quartile
-        return 1
-    elif exp >= 58.41 and exp < 67.05:
-        # This observation is in the second quartile
-        return 2
-    elif exp >= 67.05 and exp < 71.70:
-        # This observation is in the third quartile
-        return 3
-    elif exp >= 71.70:
-        # This observation is in the fourth quartile
-        return 4
-    else:
-        # This observation has bad data
-        return None
-
-calculate_life_quartile(62.5)
-```
-
-``` output
-2
-```
-
-That function would typically be used within a `for` loop, but Pandas has
-a different, more efficient way of doing the same thing, and that is by
-*applying* a function to a dataframe or a portion of a dataframe. Here
-is an example, using the definition above.
-
-``` python
-data = pd.read_csv('Americas-data.csv')
-data['life_qrtl'] = data['lifeExp'].apply(calculate_life_quartile)
-```
-
-There is a lot in that second line, so let's take it piece by piece.
-On the right side of the `=` we start with `data['lifeExp']`, which is the
-column in the dataframe called `data` labeled `lifExp`. We use the
-`apply()` to do what it says, apply the `calculate_life_quartile` to the
-value of this column for every row in the dataframe.
-
-
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 :::::::::::::::::::::::::::::::::::::::: keypoints
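
For reference, the `apply` pattern described in the callout that the second hunk removes can be tried without the lesson's data file. The following is a minimal sketch, not the lesson's own code: the three-row dataframe and its values are made up for illustration (the callout reads `Americas-data.csv`, which is not reproduced here), and the function is a condensed form of `calculate_life_quartile` that keeps the same hand-coded cut points.

``` python
import pandas as pd


def calculate_life_quartile(exp):
    """Return the quartile (1-4) for a life expectancy, or None for bad data."""
    if exp < 58.41:
        return 1
    elif exp < 67.05:
        return 2
    elif exp < 71.70:
        return 3
    elif exp >= 71.70:
        return 4
    else:
        # Reached only when every comparison above is False, e.g. exp is NaN
        return None


# Hypothetical stand-in for the lesson's CSV: one column, three made-up values
data = pd.DataFrame({'lifeExp': [50.0, 62.5, 75.3]})

# Series.apply calls the function once per value and returns a new Series,
# which is then stored as a new column
data['life_qrtl'] = data['lifeExp'].apply(calculate_life_quartile)
print(data)
```

With these values the new `life_qrtl` column holds 1, 2 and 4, the same results as calling the function on each value in an explicit `for` loop.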