<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: Programming with Python</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<a href="index.html"><h1 class="title">Programming with Python</h1></a>
<h2 class="subtitle">Command-Line Programs</h2>
<section class="objectives panel panel-warning">
<div class="panel-heading">
<h2 id="learning-objectives"><span class="glyphicon glyphicon-certificate"></span>Learning Objectives</h2>
</div>
<div class="panel-body">
<ul>
<li>Use the values of command-line arguments in a program.</li>
<li>Handle flags and files separately in a command-line program.</li>
<li>Read data from standard input in a program so that it can be used in a pipeline.</li>
</ul>
</div>
</section>
<p>The Jupyter Notebook and other interactive tools are great for prototyping code and exploring data, but sooner or later we will want to use our program in a pipeline or run it in a shell script to process thousands of data files. In order to do that, we need to make our programs work like other Unix command-line tools. For example, we may want a program that reads a dataset and prints the average inflammation per patient.</p>
<aside class="callout panel panel-info">
<div class="panel-heading">
<h2 id="switching-to-shell-commands"><span class="glyphicon glyphicon-pushpin"></span>Switching to Shell Commands</h2>
</div>
<div class="panel-body">
<p>In this lesson we are switching from typing commands in a Python interpreter to typing commands in a shell terminal window (such as bash). When you see a <code>$</code> in front of a command, that tells you to run that command in the shell rather than in the Python interpreter.</p>
</div>
</aside>
<p>The following program does exactly what we want: it prints the average inflammation per patient for a given file.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> code/readings-04.py --mean data/inflammation-01.csv
<span class="kw">5.45</span>
<span class="kw">5.425</span>
<span class="kw">6.1</span>
<span class="kw">...</span>
<span class="kw">6.4</span>
<span class="kw">7.05</span>
<span class="kw">5.9</span></code></pre></div>
<p>We might also want to look at the minimum of the first four lines:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">head</span> -4 data/inflammation-01.csv <span class="kw">|</span> <span class="kw">python</span> code/readings-04.py --min</code></pre></div>
<p>or the maximum inflammations in several files one after another:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> code/readings-04.py --max data/inflammation-*.csv</code></pre></div>
<p>Our scripts should do the following:</p>
<ol style="list-style-type: decimal">
<li>If no filename is given on the command line, read data from <a href="reference.html#standard-input">standard input</a>.</li>
<li>If one or more filenames are given, read data from them and report statistics for each file separately.</li>
<li>Use the <code>--min</code>, <code>--mean</code>, or <code>--max</code> flag to determine what statistic to print.</li>
</ol>
<p>To make this work, we need to know how to handle command-line arguments in a program, and how to get at standard input. We’ll tackle these questions in turn below.</p>
<h2 id="command-line-arguments">Command-Line Arguments</h2>
<p>Using the text editor of your choice, save the following in a text file called <code>sys-version.py</code>:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="bu">print</span>(<span class="st">'version is'</span>, sys.version)</code></pre></div>
<p>The first line imports a library called <code>sys</code>, which is short for “system”. It defines values such as <code>sys.version</code>, which describes which version of Python we are running. We can run this script from the command line like this:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> sys-version.py</code></pre></div>
<pre class="output"><code>version is 3.4.3+ (default, Jul 28 2015, 13:17:50)
[GCC 4.9.3]</code></pre>
<p>Create another file called <code>argv-list.py</code> and save the following text to it.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="bu">print</span>(<span class="st">'sys.argv is'</span>, sys.argv)</code></pre></div>
<p>The strange name <code>argv</code> stands for “argument values”. Whenever Python runs a program, it takes all of the values given on the command line and puts them in the list <code>sys.argv</code> so that the program can determine what they were. If we run this program with no arguments:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> argv-list.py</code></pre></div>
<pre class="output"><code>sys.argv is ['argv-list.py']</code></pre>
<p>the only thing in the list is the name of our script, exactly as we typed it on the command line. It is always stored in <code>sys.argv[0]</code>. If we run it with a few arguments, however:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> argv-list.py first second third</code></pre></div>
<pre class="output"><code>sys.argv is ['argv-list.py', 'first', 'second', 'third']</code></pre>
<p>then Python adds each of those arguments to that magic list.</p>
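<p>As a minimal sketch of how a program might pick this list apart, the hypothetical helper <code>split_argv</code> below (it is not part of the lesson's scripts) separates the script name from the remaining arguments. It takes the list as a parameter, so it can be tried with a hand-written list instead of the real <code>sys.argv</code>:</p>

```python
def split_argv(argv):
    """Return (script, arguments) for a list shaped like sys.argv."""
    script = argv[0]      # the script name always comes first
    arguments = argv[1:]  # everything the user typed after it
    return script, arguments


# A hand-written list standing in for the real sys.argv.
script, arguments = split_argv(['argv-list.py', 'first', 'second', 'third'])
print('script is', script)
print('arguments are', arguments)
```

<p>In a real script we would call <code>split_argv(sys.argv)</code> after importing <code>sys</code>.</p>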
<p>With this in hand, let’s build a version of <code>readings.py</code> that always prints the per-patient mean of a single data file. The first step is to write a function that outlines our implementation, and a placeholder for the function that does the actual work. By convention this function is usually called <code>main</code>, though we can call it whatever we want:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-01.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    filename <span class="op">=</span> sys.argv[<span class="dv">1</span>]
    data <span class="op">=</span> numpy.loadtxt(filename, delimiter<span class="op">=</span><span class="st">','</span>)
    <span class="cf">for</span> m <span class="op">in</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>):
        <span class="bu">print</span>(m)</code></pre></div>
<p>This function gets the name of the script from <code>sys.argv[0]</code>, because that’s where it’s always put, and the name of the file to process from <code>sys.argv[1]</code>. Here’s a simple test:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-01.py inflammation-01.csv</code></pre></div>
<p>There is no output because we have defined a function, but haven’t actually called it. Let’s add a call to <code>main</code>:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-02.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    filename <span class="op">=</span> sys.argv[<span class="dv">1</span>]
    data <span class="op">=</span> numpy.loadtxt(filename, delimiter<span class="op">=</span><span class="st">','</span>)
    <span class="cf">for</span> m <span class="op">in</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>):
        <span class="bu">print</span>(m)
main()</code></pre></div>
<p>and run that:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-02.py inflammation-01.csv</code></pre></div>
<pre class="output"><code>5.45
5.425
6.1
5.9
5.55
6.225
5.975
6.65
6.625
6.525
6.775
5.8
6.225
5.75
5.225
6.3
6.55
5.7
5.85
6.55
5.775
5.825
6.175
6.1
5.8
6.425
6.05
6.025
6.175
6.55
6.175
6.35
6.725
6.125
7.075
5.725
5.925
6.15
6.075
5.75
5.975
5.725
6.3
5.9
6.75
5.925
7.225
6.15
5.95
6.275
5.7
6.1
6.825
5.975
6.725
5.7
6.25
6.4
7.05
5.9</code></pre>
<aside class="callout panel panel-info">
<div class="panel-heading">
<h2 id="the-right-way-to-do-it"><span class="glyphicon glyphicon-pushpin"></span>The Right Way to Do It</h2>
</div>
<div class="panel-body">
<p>If our programs can take complex parameters or multiple filenames, we shouldn’t handle <code>sys.argv</code> directly. Instead, we should use Python’s <code>argparse</code> library, which handles common cases in a systematic way, and also makes it easy for us to provide sensible error messages for our users. We will not cover this module in this lesson but you can go to Tshepang Lekhonkhobe’s <a href="http://docs.python.org/dev/howto/argparse.html">Argparse tutorial</a> that is part of Python’s Official Documentation.</p>
</div>
</aside>
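<p>For readers who want a taste of it anyway, here is a small sketch of how <code>argparse</code> might describe the interface this lesson is building towards. The function name <code>build_parser</code>, the description text, and the choice to store the flag under the name <code>action</code> are assumptions made for this illustration; the lesson's own scripts continue to use <code>sys.argv</code>.</p>

```python
import argparse


def build_parser():
    """Build a parser for a readings-style program (illustrative only)."""
    parser = argparse.ArgumentParser(
        description='Print per-patient statistics for inflammation data.')
    # Exactly one of the three flags must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--min', dest='action', action='store_const', const='min')
    group.add_argument('--mean', dest='action', action='store_const', const='mean')
    group.add_argument('--max', dest='action', action='store_const', const='max')
    # Zero or more filenames may follow the flag.
    parser.add_argument('filenames', nargs='*')
    return parser


# Passing a list to parse_args lets us try the parser without a real command line.
args = build_parser().parse_args(['--max', 'small-01.csv', 'small-02.csv'])
print(args.action, args.filenames)
```

<p>Called with no argument, <code>parse_args()</code> reads <code>sys.argv[1:]</code>, and the parser prints a usage message and exits if the flag is missing or unrecognized.</p>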
<h2 id="handling-multiple-files">Handling Multiple Files</h2>
<p>The next step is to teach our program how to handle multiple files. Since 60 lines of output per file is a lot to page through, we’ll start by using three smaller files, each of which has three days of data for two patients:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">ls</span> small-*.csv</code></pre></div>
<pre class="output"><code>small-01.csv small-02.csv small-03.csv</code></pre>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> small-01.csv</code></pre></div>
<pre class="output"><code>0,0,1
0,1,2</code></pre>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-02.py small-01.csv</code></pre></div>
<pre class="output"><code>0.333333333333
1.0</code></pre>
<p>Using small data files as input also allows us to check our results more easily: here, for example, we can see that our program is calculating the mean correctly for each line, whereas we were really taking it on faith before. This is yet another rule of programming: <em>test the simple things first</em>.</p>
<p>We want our program to process each file separately, so we need a loop that executes once for each filename. If we specify the files on the command line, the filenames will be in <code>sys.argv</code>, but we need to be careful: <code>sys.argv[0]</code> will always be the name of our script, rather than the name of a file. We also need to handle an unknown number of filenames, since our program could be run for any number of files.</p>
<p>The solution to both problems is to loop over the contents of <code>sys.argv[1:]</code>. The ‘1’ tells Python to start the slice at location 1, so the program’s name isn’t included; since we’ve left off the upper bound, the slice runs to the end of the list, and includes all the filenames. Here’s our changed program <code>readings-03.py</code>:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-03.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    <span class="cf">for</span> filename <span class="op">in</span> sys.argv[<span class="dv">1</span>:]:
        data <span class="op">=</span> numpy.loadtxt(filename, delimiter<span class="op">=</span><span class="st">','</span>)
        <span class="cf">for</span> m <span class="op">in</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>):
            <span class="bu">print</span>(m)
main()</code></pre></div>
<p>and here it is in action:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-03.py small-01.csv small-02.csv</code></pre></div>
<pre class="output"><code>0.333333333333
1.0
13.6666666667
11.0</code></pre>
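<p>A quick way to convince ourselves that the slice copes with any number of filenames is to try it on ordinary lists. The lists below are made up for illustration and stand in for <code>sys.argv</code>:</p>

```python
# Made-up stand-ins for sys.argv with two, one, and zero filenames.
examples = [
    ['readings-03.py', 'small-01.csv', 'small-02.csv'],
    ['readings-03.py', 'small-01.csv'],
    ['readings-03.py'],
]

for argv in examples:
    filenames = argv[1:]  # skip the script name at index 0
    print(len(filenames), 'file(s):', filenames)
```

<p>When no filenames are given, the slice is simply an empty list, so a loop over it never runs its body and the program prints nothing.</p>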
<aside class="callout panel panel-info">
<div class="panel-heading">
<h2 id="the-right-way-to-do-it-1"><span class="glyphicon glyphicon-pushpin"></span>The Right Way to Do It</h2>
</div>
<div class="panel-body">
<p>At this point, we have created three versions of our script called <code>readings-01.py</code>, <code>readings-02.py</code>, and <code>readings-03.py</code>. We wouldn’t do this in real life: instead, we would have one file called <code>readings.py</code> that we committed to version control every time we got an enhancement working. For teaching, though, we need all the successive versions side by side.</p>
</div>
</aside>
<h2 id="handling-command-line-flags">Handling Command-Line Flags</h2>
<p>The next step is to teach our program to pay attention to the <code>--min</code>, <code>--mean</code>, and <code>--max</code> flags. These always appear before the names of the files, so we could just do this:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-04.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    action <span class="op">=</span> sys.argv[<span class="dv">1</span>]
    filenames <span class="op">=</span> sys.argv[<span class="dv">2</span>:]
    <span class="cf">for</span> f <span class="op">in</span> filenames:
        data <span class="op">=</span> numpy.loadtxt(f, delimiter<span class="op">=</span><span class="st">','</span>)
        <span class="cf">if</span> action <span class="op">==</span> <span class="st">'--min'</span>:
            values <span class="op">=</span> data.<span class="bu">min</span>(axis<span class="op">=</span><span class="dv">1</span>)
        <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--mean'</span>:
            values <span class="op">=</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>)
        <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--max'</span>:
            values <span class="op">=</span> data.<span class="bu">max</span>(axis<span class="op">=</span><span class="dv">1</span>)
        <span class="cf">for</span> m <span class="op">in</span> values:
            <span class="bu">print</span>(m)
main()</code></pre></div>
<p>This works:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-04.py --max small-01.csv</code></pre></div>
<pre class="output"><code>1.0
2.0</code></pre>
<p>but there are several things wrong with it:</p>
<ol style="list-style-type: decimal">
<li><p><code>main</code> is too large to read comfortably.</p></li>
<li><p>If we supply only one additional argument on the command line instead of two (one for the <strong>flag</strong> and one for the <strong>filename</strong>), the program does not raise an exception: it runs and prints nothing. Because <code>sys.argv[1]</code> is always treated as the <code>action</code>, even when it is actually a filename, the list of files is empty and no data is processed. <a href="reference.html#silence-failure">Silent failures</a> like this are always hard to debug.</p></li>
<li><p>The program should check if the submitted <code>action</code> is one of the three recognized flags.</p></li>
</ol>
<p>This version pulls the processing of each file out of the loop into a function of its own. It also checks that <code>action</code> is one of the allowed flags before doing any processing, so that the program fails fast:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-05.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    action <span class="op">=</span> sys.argv[<span class="dv">1</span>]
    filenames <span class="op">=</span> sys.argv[<span class="dv">2</span>:]
    <span class="cf">assert</span> action <span class="op">in</span> [<span class="st">'--min'</span>, <span class="st">'--mean'</span>, <span class="st">'--max'</span>], <span class="op">\</span>
        <span class="co">'Action is not one of --min, --mean, or --max: '</span> <span class="op">+</span> action
    <span class="cf">for</span> f <span class="op">in</span> filenames:
        process(f, action)
<span class="kw">def</span> process(filename, action):
    data <span class="op">=</span> numpy.loadtxt(filename, delimiter<span class="op">=</span><span class="st">','</span>)
    <span class="cf">if</span> action <span class="op">==</span> <span class="st">'--min'</span>:
        values <span class="op">=</span> data.<span class="bu">min</span>(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--mean'</span>:
        values <span class="op">=</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--max'</span>:
        values <span class="op">=</span> data.<span class="bu">max</span>(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">for</span> m <span class="op">in</span> values:
        <span class="bu">print</span>(m)
main()</code></pre></div>
<p>This is four lines longer than its predecessor, but broken into more digestible chunks of 8 and 12 lines.</p>
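<p>To see the fail-fast check at work without running the whole script, the sketch below copies the argument handling from <code>main</code> into a hypothetical helper called <code>parse_arguments</code>. The helper's name and the hand-written argument lists are assumptions made for this illustration only.</p>

```python
def parse_arguments(argv):
    """Split a sys.argv-style list into (action, filenames), failing fast."""
    action = argv[1]
    filenames = argv[2:]
    assert action in ['--min', '--mean', '--max'], \
        'Action is not one of --min, --mean, or --max: ' + action
    return action, filenames


# A well-formed command line is split into its flag and its files.
print(parse_arguments(['readings-05.py', '--max', 'small-01.csv']))

# A filename in the flag position is now reported instead of silently ignored.
try:
    parse_arguments(['readings-05.py', 'small-01.csv'])
except AssertionError as error:
    print('Error:', error)
```

<p>One caveat: Python skips <code>assert</code> statements when it is run with the <code>-O</code> option, so a program meant for other users may prefer an explicit <code>if</code> test that prints a message and exits.</p>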
<h2 id="handling-standard-input">Handling Standard Input</h2>
<p>The next thing our program has to do is read data from standard input if no filenames are given so that we can put it in a pipeline, redirect input to it, and so on. Let’s experiment in another script called <code>count-stdin.py</code>:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> count-stdin.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
count <span class="op">=</span> <span class="dv">0</span>
<span class="cf">for</span> line <span class="op">in</span> sys.stdin:
    count <span class="op">+=</span> <span class="dv">1</span>
<span class="bu">print</span>(count, <span class="st">'lines in standard input'</span>)</code></pre></div>
<p>This little program reads lines from a special “file” called <code>sys.stdin</code>, which is automatically connected to the program’s standard input. We don’t have to open it — Python and the operating system take care of that when the program starts up — but we can do almost anything with it that we could do to a regular file. Let’s try running it as if it were a regular command-line program:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> count-stdin.py <span class="kw">&lt;</span> small-01.csv</code></pre></div>
<pre class="output"><code>2 lines in standard input</code></pre>
<p>A common mistake is to try to run something that reads from standard input like this:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> count-stdin.py small-01.csv</code></pre></div>
<p>i.e., to forget the <code>&lt;</code> character that redirects the file to standard input. In this case, the filename is just an ordinary command-line argument that the program ignores, and there’s nothing in standard input, so the program waits at the start of the loop for someone to type something on the keyboard. Unless we type some input and then signal the end of it, the program appears to be stuck, and we have to halt it by pressing <code>Ctrl+C</code> in the terminal.</p>
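<p>Because <code>sys.stdin</code> behaves like an open file, the counting loop can also be tried on any other file-like object. The sketch below wraps the loop in a hypothetical function, <code>count_lines</code>, and feeds it an in-memory <code>io.StringIO</code> object as a stand-in for standard input; both names are choices made for this illustration.</p>

```python
import io


def count_lines(stream):
    """Count the lines in any file-like object, such as sys.stdin."""
    count = 0
    for line in stream:
        count += 1
    return count


# io.StringIO wraps a string in a file-like object, standing in for sys.stdin.
fake_stdin = io.StringIO('0,0,1\n0,1,2\n')
print(count_lines(fake_stdin), 'lines in standard input')
```

<p>Calling <code>count_lines(sys.stdin)</code> in a script would give the same behavior as <code>count-stdin.py</code>.</p>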
<p>We now need to rewrite the program so that it loads data from <code>sys.stdin</code> if no filenames are provided. Luckily, <code>numpy.loadtxt</code> can handle either a filename or an open file as its first parameter, so we don’t actually need to change <code>process</code>. Only <code>main</code> changes:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cat</span> readings-06.py</code></pre></div>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> sys
<span class="im">import</span> numpy
<span class="kw">def</span> main():
    script <span class="op">=</span> sys.argv[<span class="dv">0</span>]
    action <span class="op">=</span> sys.argv[<span class="dv">1</span>]
    filenames <span class="op">=</span> sys.argv[<span class="dv">2</span>:]
    <span class="cf">assert</span> action <span class="op">in</span> [<span class="st">'--min'</span>, <span class="st">'--mean'</span>, <span class="st">'--max'</span>], <span class="op">\</span>
        <span class="co">'Action is not one of --min, --mean, or --max: '</span> <span class="op">+</span> action
    <span class="cf">if</span> <span class="bu">len</span>(filenames) <span class="op">==</span> <span class="dv">0</span>:
        process(sys.stdin, action)
    <span class="cf">else</span>:
        <span class="cf">for</span> f <span class="op">in</span> filenames:
            process(f, action)
<span class="kw">def</span> process(filename, action):
    data <span class="op">=</span> numpy.loadtxt(filename, delimiter<span class="op">=</span><span class="st">','</span>)
    <span class="cf">if</span> action <span class="op">==</span> <span class="st">'--min'</span>:
        values <span class="op">=</span> data.<span class="bu">min</span>(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--mean'</span>:
        values <span class="op">=</span> data.mean(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">elif</span> action <span class="op">==</span> <span class="st">'--max'</span>:
        values <span class="op">=</span> data.<span class="bu">max</span>(axis<span class="op">=</span><span class="dv">1</span>)
    <span class="cf">for</span> m <span class="op">in</span> values:
        <span class="bu">print</span>(m)
main()</code></pre></div>
<p>Let’s try it out:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> readings-06.py --mean <span class="kw">&lt;</span> small-01.csv</code></pre></div>
<pre class="output"><code>0.333333333333
1.0</code></pre>
<p>That’s better. In fact, that’s done: the program now does everything we set out to do.</p>
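<p>The reason <code>process</code> needed no changes is that <code>numpy.loadtxt</code> accepts an open file as well as a filename. A small sketch, assuming an in-memory <code>io.StringIO</code> object with the same contents as <code>small-01.csv</code> in place of a real open file, shows this in isolation:</p>

```python
import io

import numpy

# An in-memory file holding the same two rows as small-01.csv.
open_file = io.StringIO('0,0,1\n0,1,2\n')

# loadtxt is given an open file-like object here rather than a filename.
data = numpy.loadtxt(open_file, delimiter=',')
for m in data.mean(axis=1):
    print(m)
```

<p>The two values printed are the per-row means, one third and one, matching the output of <code>readings-06.py</code> above up to the number of digits shown.</p>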
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="arithmetic-on-the-command-line"><span class="glyphicon glyphicon-pencil"></span>Arithmetic on the command line</h2>
</div>
<div class="panel-body">
<p>Write a command-line program that does addition and subtraction:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> arith.py add 1 2</code></pre></div>
<pre class="output"><code>3</code></pre>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> arith.py subtract 3 4</code></pre></div>
<pre class="output"><code>-1</code></pre>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="finding-particular-files"><span class="glyphicon glyphicon-pencil"></span>Finding particular files</h2>
</div>
<div class="panel-body">
<p>Using the <code>glob</code> module introduced <a href="04-files.html">earlier</a>, write a simple version of <code>ls</code> that shows files in the current directory with a particular suffix. A call to this script should look like this:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">python</span> my_ls.py py</code></pre></div>
<pre class="output"><code>left.py
right.py
zero.py</code></pre>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="changing-flags"><span class="glyphicon glyphicon-pencil"></span>Changing flags</h2>
</div>
<div class="panel-body">
<p>Rewrite <code>readings.py</code> so that it uses <code>-n</code>, <code>-m</code>, and <code>-x</code> instead of <code>--min</code>, <code>--mean</code>, and <code>--max</code> respectively. Is the code easier to read? Is the program easier to understand?</p>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="adding-a-help-message"><span class="glyphicon glyphicon-pencil"></span>Adding a help message</h2>
</div>
<div class="panel-body">
<p>Separately, modify <code>readings.py</code> so that if no parameters are given (i.e., no action is specified and no filenames are given), it prints a message explaining how it should be used.</p>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="adding-a-default-action"><span class="glyphicon glyphicon-pencil"></span>Adding a default action</h2>
</div>
<div class="panel-body">
<p>Separately, modify <code>readings.py</code> so that if no action is given it displays the means of the data.</p>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="a-file-checker"><span class="glyphicon glyphicon-pencil"></span>A file-checker</h2>
</div>
<div class="panel-body">
<p>Write a program called <code>check.py</code> that takes the names of one or more inflammation data files as arguments and checks that all the files have the same number of rows and columns. What is the best way to test your program?</p>
</div>
</section>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2 id="counting-lines"><span class="glyphicon glyphicon-pencil"></span>Counting lines</h2>
</div>
<div class="panel-body">
<p>Write a program called <code>line-count.py</code> that works like the Unix <code>wc</code> command:</p>
<ul>
<li>If no filenames are given, it reports the number of lines in standard input.</li>
<li>If one or more filenames are given, it reports the number of lines in each, followed by the total number of lines.</li>
</ul>
</div>
</section>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/python-novice-inflammation">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-37305346-2', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>