Commit 85174fa

Some fixes after a readthrough.
1 parent 57e61a1 commit 85174fa

2 files changed, with 40 additions and 38 deletions.

source/resampling_with_code.Rmd

Lines changed: 7 additions & 7 deletions
@@ -355,7 +355,7 @@ round(-2.7)
 ```

 Like many functions, [`round`]{.r}[`np.round`]{.python} can take more than one
-argument (component). You can send `range` the number of digits you want to
+argument (component). You can send `round` the number of digits you want to
 round to, after the number of you want it to work on, like this (see
 @fig-round_ndigits_pl):

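As a minimal sketch of the two-argument call this hunk describes, the Python side might look like the following. The value `3.14159` and the name `rounded` are illustrative assumptions, not taken from the source file.

```python
import numpy as np

# First argument: the number to round.
# Second argument: the number of digits to keep after the decimal point.
rounded = np.round(3.14159, 2)
print(rounded)
```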
@@ -502,15 +502,15 @@ such as 0 through 9.

 Ranges can be confusing in normal speech because it is not always clear whether
 they include their beginning and end. For example, if someone says "pick a
-number between 1 and 5", do they mean *all* the numbers, including the first
-and last (any of 1 or 2 or 3 or 4 or 5)? Or do they mean only the numbers that
-are *between* 1 and 5 (so 2 or 3 or 4)? Or do they mean all the numbers up to,
-but not including 5 (so 1 or 2 or 3 or 4)?
+number between 1 and 5", do they mean to pick from *all* of the numbers,
+including the first and last (any of 1 or 2 or 3 or 4 or 5)? Or do they mean
+only the numbers that are *between* 1 and 5 (so 2 or 3 or 4)? Or do they mean
+all the numbers up to, but not including 5 (so 1 or 2 or 3 or 4)?

 To avoid this confusion, we will nearly always use "from" and "through" in
 ranges, meaning that we do include both the start and the end number. For
-example, if we say "pick a number from 1 through 5" we mean one of 1 or 2 or 3
-or 4 or 5.
+example, if we say "pick a number from 1 through 5" we mean one of 1 or 2 or
+3 or 4 or 5.
 :::

 Creating ranges of numbers is so common that {{< var lang >}} has a [special

source/resampling_with_code2.Rmd

Lines changed: 33 additions & 31 deletions
@@ -289,7 +289,7 @@ Notice the argument `5` to the [`np.zeros`]{.python}[`numeric`]{.r} function.
 This tells the function how many zeros we want in the {{< var array >}} that
 the function will return.

-## {{< var array >}} length {#sec-array-length}
+## [Array]{.python}[Vector]{.r} length {#sec-array-length}

 The are various useful things we can do with this {{< var array >}} container.
 One is to ask how many elements there are in the {{< var array >}} container.
@@ -414,7 +414,7 @@ give different counts.

 ```{python}
 rnd = np.random.default_rng()
-zero_thru_99 = np.arange(0, 100)
+zero_thru_99 = np.arange(100)
 # Get 12 random numbers from 0 through 99
 a = rnd.choice(zero_thru_99, size=12)
 # How many numbers are greater than 74?
@@ -678,7 +678,7 @@ We often use `for` loops with ranges (see @sec-ranges). Here we use a loop to
 print out the numbers [0 through 3]{.python}[1 through 4]{.r}:

 ```{python}
-for n in np.arange(0, 4):
+for n in np.arange(4):
     print('The loop variable n is', n)
 ```

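The change in this hunk relies on the single-argument form of `np.arange` starting at 0 and stopping before its argument. The sketch below checks that directly; the variable names are illustrative and do not appear in the source files.

```python
import numpy as np

# With two arguments, np.arange runs from start up to (not including) stop.
explicit_start = np.arange(0, 4)
# With one argument, the start defaults to 0.
implicit_start = np.arange(4)

# Both forms give the numbers 0 through 3.
print(explicit_start)
print(implicit_start)
print(np.array_equal(explicit_start, implicit_start))

# A smaller stop value gives a shorter sequence: 0 through 2 only.
shorter = np.arange(3)
print(shorter)
```

The two forms are interchangeable only when the stop value is unchanged. `np.arange(3)` is not a drop-in replacement for `np.arange(0, 4)`.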
@@ -725,7 +725,7 @@ Using the combination of looping over a range, and {{< var array >}} indexing,
 we can print out the author position *and* the author birth year:

 ```{python}
-for n in np.arange(0, 4):
+for n in np.arange(3):
     year = author_birth_years[n]
     print('Birth year of author position', n, 'is', year)
 ```
@@ -783,43 +783,43 @@ You have just seen how we would use `np.arange` to send the numbers 0, 1, 2,
 and 3 to a `for` loop, in the example above, repeated here:

 ```{python}
-for n in np.arange(0, 4):
+for n in np.arange(3):
     year = author_birth_years[n]
     print('Birth year of author position', n, 'is', year)
 ```

 We could also use `range` instead of `np.arange` to do the same task:

 ```{python}
-for n in range(0, 4):
+for n in range(3):
     year = author_birth_years[n]
     print('Birth year of author position', n, 'is', year)
 ```

 In fact, you will see this pattern throughout the book, where we use `for`
 statements like `for value in range(10000):` to ask Python to put each number
-in the range 0 up to (not including) 100000 into the variable `value`, and then
+in the range 0 up to (not including) 10000 into the variable `value`, and then
 do something in the body of the loop. Just to be clear, we could always, and
 almost as easily, write `for value in np.arange(10000):` to do the same task.
-But — even though we could use `np.arange` to get an array of numbers, we
-generally prefer `range` in our Python `for` loops, because it is just a little
-less typing (without the `np.a` of `np.arange`, and because it is a more common
-pattern in standard Python code.[^range-efficiency]
+However, we generally prefer `range` in our Python `for` loops, because it is
+just a little less typing (without the `np.a` of `np.arange`), and because it
+is a more common pattern in standard Python code.[^range-efficiency]

 [^range-efficiency]: Actually, there is a reason why many Python programmers
-prefer `range` in their `for` loops to `np.arange`. `range` is a very
-efficient container, in that it doesn't need to take up all the memory
-required to create the full array, it just needs to keep track of the number
-to give you next. For example, consider `for i in np.arange(10000000):` — in
-this case Python has to make an array with 10,000,000 elements, and then,
-from that array, it passes each value one by one to the `for` loop. On the
-other hand, `for i in range(10000000):` will do the job just as well,
-passing the same sequence of 0 through 9,999,999 to `i`, one by one, but
-`range(10000000)` never has to make the whole 10,000,000 element array — it
-just needs to keep track of which number to give up next. Therefore `range`
-is very quick, and very efficient in memory. This doesn't have any great
-practical impact for the arrays we are using here, typically of 10,0000
-elements or so, but it is worthwhile for larger arrays.
+prefer `range` to `np.arange` in the headers for their `for` loops. `range`
+is a very efficient container, in that it doesn't need to take up all the
+memory required to create the full array, it just needs to keep track of the
+number to give you next. For example, consider `for i in
+np.arange(10000000):` — in this case Python has to make an array with
+10,000,000 elements, and then, from that array, it passes each value one by
+one to the `for` loop. On the other hand, `for i in range(10000000):` will
+do the job just as well, passing the same sequence of 0 through 9,999,999 to
+`i`, one by one, but `range(10000000)` never has to make the whole
+10,000,000 element array — it just needs to keep track of which number to
+give up next. Therefore `range` is very quick, and very efficient in
+memory. This doesn't have any great practical impact for the arrays we are
+using here, typically of 10,0000 elements or so, but it can be important for
+larger arrays.

 :::

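The memory argument in the `range-efficiency` footnote can be illustrated with a rough sketch. It assumes CPython and uses 1,000,000 elements rather than the footnote's 10,000,000 to keep the check quick. The names `lazy` and `eager` are illustrative, and `sys.getsizeof` reports only the size of the `range` object itself.

```python
import sys
import numpy as np

n = 1_000_000  # smaller than the footnote's 10,000,000, for speed

lazy = range(n)       # keeps track of start, stop and step only
eager = np.arange(n)  # allocates storage for all n integers up front

# Size in bytes of the range object itself (small in CPython).
print(sys.getsizeof(lazy))
# Bytes used by the array's data, which grows in proportion to n.
print(eager.nbytes)

# Both hold the same sequence: 0 through n - 1.
print(lazy[0], lazy[-1])
print(eager[0], eager[-1])
```

Exact byte counts depend on the platform and the array's integer type, so treat the printed numbers as indicative only.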
@@ -829,7 +829,7 @@ Here is the code we worked out above, to implement a single trial:

 ```{python}
 rnd = np.random.default_rng()
-zero_thru_99 = np.arange(0, 100)
+zero_thru_99 = np.arange(100)
 # Get 12 random numbers from 0 through 99
 a = rnd.choice(zero_thru_99, size=12)
 # How many numbers are greater than 74?
@@ -860,13 +860,13 @@ Now we can put these parts together to do 50 simulated trials:
 rnd = np.random.default_rng()

 # All the numbers from 0 through 99.
-zero_through_99 = np.arange(0, 100)
+zero_through_99 = np.arange(100)

 # An array to store the counts for each trial.
 z = np.zeros(50)

 # Repeat the trial procedure 50 times.
-for i in np.arange(0, 50):
+for i in np.arange(50):
     # Get 12 random numbers from 0 through 99
     a = rnd.choice(zero_through_99, size=12)
     # How many numbers are greater than 74?
@@ -979,7 +979,8 @@ the population, which was 26% black, would have no black jurors.
 ## Many many trials

 Our experiment above is only 50 simulated trials. The higher the number of
-trials, the more confident we can be of our estimate for `p` — the proportion of trials where we get an all-white jury.
+trials, the more confident we can be of our estimate for `p` — the proportion
+of trials where we get an all-white jury.

 It is no extra trouble for us to tell the computer to do a very large number
 of trials. For example, we might want to run 10,000 trials instead of 50.
@@ -993,10 +994,10 @@ comments, to make the code more compact.
 ```{python}
 # Full simulation procedure, with 10,000 trials.
 rnd = np.random.default_rng()
-zero_through_99 = np.arange(0, 100)
+zero_through_99 = np.arange(100)
 # 10,000 trials.
 z = np.zeros(10000)
-for i in np.arange(0, 10000):
+for i in np.arange(10000):
     a = rnd.choice(zero_through_99, size=12)
     b = np.sum(a > 74)
     z[i] = b
@@ -1020,7 +1021,8 @@ p <- n_all_white / 10000
 p
 ```

-We now have a new, more accurate estimate of the proportion of Hypothetical County juries with all-white juries. The proportion is
+We now have a new, more accurate estimate of the proportion of Hypothetical
+County juries that are all white. The proportion is
 `r round(get_var('p'), 3)`, and so
 `r round(get_var('p') * 100, 1)`%.
