Commit 5357f55
Benjamin Moody
dl_files, dl_database: avoid using multiple processes
The standard multiprocessing module is used to distribute a task to
multiple processes, which is useful when doing heavy computation due
to the limitations of CPython; however, making this work is dependent
on the ability to fork processes or else to kludgily emulate forking
on systems that don't support it. In particular, it tends to cause
problems on Windows unless you are very scrupulous about how you write
your program.
Therefore, as a rule, the multiprocessing module shouldn't be used by
general-purpose libraries, and should only be invoked by application
programmers themselves (who are in a position to guarantee that
imports have no side effects, the main script uses 'if __name__ ==
"__main__"', etc.)
However, downloading a file isn't a CPU-bound task; it's an I/O-bound
task, and therefore for this purpose, parallel threads should work as
well as or even better than parallel processes. The
multiprocessing.dummy module provides the same API as the
multiprocessing module, but uses threads instead of processes, so it
should be safe to use in a general-purpose library.
1 parent 84cbefb commit 5357f55
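The thread-backed pool described in the commit message can be sketched as follows. This is only an illustration, not the code changed by this commit: `fetch`, `fetch_all`, and the file names are hypothetical, and `time.sleep` stands in for the network wait of a real download.

```python
import threading
import time
from multiprocessing.dummy import Pool  # same Pool API, backed by threads


def fetch(name):
    """Hypothetical stand-in for downloading one file."""
    time.sleep(0.05)  # simulate waiting on network I/O
    return name, threading.current_thread().name


def fetch_all(names, workers=4):
    """Map fetch over names using a pool of threads; results keep input order."""
    with Pool(processes=workers) as pool:
        return pool.map(fetch, names)


# Hypothetical file names, used only for illustration.
results = fetch_all(["100.dat", "100.hea", "101.dat", "101.hea"])
for name, thread_name in results:
    print(name, "handled by", thread_name)
```

Because the workers are threads in the calling process, nothing is forked or re-imported, so a sketch like this should not need an 'if __name__ == "__main__"' guard in the caller's script.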
2 files changed, +4 -4 lines changed