@@ -18,7 +18,7 @@ worker processes. The application generates a large number of small
tasks, which are distributed to workers.
As tasks access external data sources and produce their own outputs,
more and more data is pulled into local storage on cluster nodes.
- This data is used to accelerate future tasks and avoid re-computing exisiting results.
+ This data is used to accelerate future tasks and avoid re-computing existing results.
The application gradually grows "like a vine" through
the cluster.
@@ -489,14 +489,14 @@ additional setup, depending on the language in use:

#### Python Setup

- If you installed via Conda, then no further setup is needed.
+ If you installed via conda, then no further setup is needed.

- If you are running a Python application and did *not* install via Conda,
+ If you are running a Python application and did *not* install via conda,
then you will need to set the `PYTHONPATH` to point to the cctools
installation, like this:

```sh
- # Note: This is only needed if not using Conda:
+ # Note: This is only needed if not using conda:
$ PYVER=$(python -c 'import sys; print("%s.%s" % sys.version_info[:2])')
$ export PYTHONPATH=${HOME}/cctools/lib/python${PYVER}/site-packages:${PYTHONPATH}
```
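If the `PYTHONPATH` above is set correctly, the bindings should import cleanly. A minimal check, assuming the `ndcctools.taskvine` module path shipped by recent cctools releases:

```python
# Quick sanity check that the TaskVine bindings are visible on PYTHONPATH.
# Assumes the "ndcctools.taskvine" module path used by recent cctools releases.
import ndcctools.taskvine as vine

# Printing the module location confirms which installation was picked up.
print(vine.__file__)
```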
@@ -699,7 +699,7 @@ We can also create a factory directly in python. Creating a factory object does
immediately launch it, so this is a good time to configure the resources,
number of workers, etc. Factory objects function as Python context managers, so
to indicate that a set of commands should be run with a factory running, wrap
- them in a with statement. The factory will be cleaned up automtically at the
+ them in a with statement. The factory will be cleaned up automatically at the
end of the block. As an example:

```python
@@ -793,7 +793,7 @@ If peer transfers have been disabled, they may be re-enabled accordingly:
```

Transfers between workers may be impacted by transient issues which may cause intermittent transfer failures. In these situations we take note of the
- failure that occured, and avoid using the same worker as a source for a period of time. This time period has a default value of 15 seconds.
+ failure that occurred, and avoid using the same worker as a source for a period of time. This time period has a default value of 15 seconds.
It may be changed by the user using `vine_tune` with the parameter `transient-error-interval`.

### MiniTasks
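For reference, the `transient-error-interval` parameter mentioned in the hunk above can also be adjusted from Python. A minimal sketch, assuming the manager object exposes the C-level `vine_tune` call as `Manager.tune()`:

```python
# Hedged sketch: raise the retry back-off for failed peer transfer sources to 30 seconds.
# Assumes the Python Manager wraps vine_tune as m.tune(name, value).
import ndcctools.taskvine as vine

m = vine.Manager(9123)
m.tune("transient-error-interval", 30)  # default is 15 seconds, per the text above
```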
@@ -1003,7 +1003,7 @@ Now we are ready to declare the execution context from its local directory "my_c
```

- ##### Apptainer Execution Cpntext From a Mini Task
+ ##### Apptainer Execution Context From a Mini Task

In the previous section we manually built the directory structure needed for
the execution context. This is not very flexible, as we need to create one such
@@ -1115,7 +1115,7 @@ openssl rand -hex 32 > vine.password
```

This password will be particular to your application, and only managers and
- workers with the same password will be able to interoperator.
+ workers with the same password will be able to interoperate.
Then, modify your manager program to use the password:

=== "Python"
@@ -1350,7 +1350,7 @@ with an i686 architecture. These files will be named "my_exe" in the task's
sandbox, which means that the command line of the tasks does not need to
change.

- Note this feature is specifically designed for specifying and distingushing
+ Note this feature is specifically designed for specifying and distinguishing
input file names for different platforms and architectures. Also, this is
different from the $VINE_SANDBOX shell environment variable that exports
the location of the working directory of the worker to its execution
@@ -1496,7 +1496,7 @@ controlling scheduling, managing resources, and setting performance options
all apply to `PythonTask` as well.

When running a Python function remotely, it is assumed that the Python interpreter
- and libraries available at the worker correspond to the appropiate python environment for the task.
+ and libraries available at the worker correspond to the appropriate python environment for the task.
If this is not the case, an environment file can be provided with t.set_environment:

=== "Python"
@@ -1863,8 +1863,8 @@ m.pair(fn, seq1, seq2, chunk_size)
The **treeReduce** function combines an array using a given function by
breaking up the array into chunk_sized chunks, computing the results, and returning
the results to a new array. It then does the same process on the new array until there
- only one element left and then returns it. The given fucntion must accept an iterable,
- and must be an associative fucntion, or else the same result cannot be gaurenteed for
+ only one element left and then returns it. The given function must accept an iterable,
+ and must be an associative function, or else the same result cannot be guaranteed for
different chunk sizes. Again, cheaper functions work better with larger chunk_sizes,
more expensive functions work better with smaller ones. Errors will be placed in results.
Also, the minimum chunk size is 2, as going 1 element at time would not reduce the array
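The associativity requirement described above is easiest to see in a plain-Python sketch of the same chunked reduction; this only illustrates the pattern, it is not the TaskVine implementation:

```python
# Illustrative sketch of tree reduction over chunk_size-sized chunks.
# fn must accept an iterable and be associative (e.g. sum, max) so that the
# final result does not depend on the chosen chunk_size.
def tree_reduce_sketch(fn, values, chunk_size=2):
    assert chunk_size >= 2, "a chunk size of 1 would never shrink the array"
    while len(values) > 1:
        values = [fn(values[i:i + chunk_size])
                  for i in range(0, len(values), chunk_size)]
    return values[0]

print(tree_reduce_sketch(sum, list(range(10)), chunk_size=3))  # 45 for any chunk_size
print(tree_reduce_sketch(max, [5, 1, 9, 3], chunk_size=2))     # 9
```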
@@ -2092,7 +2092,7 @@ batch system for a node of the desired size.
The only caveat is when using `vine_submit_workers -T uge`, as there are many
differences across systems that the script cannot manage. For `
vine_submit_workers -T uge` you have to set **both** the resources used by the
- worker (i.e., with `--cores`, etc.) and the appropiate computing node with the `
+ worker (i.e., with `--cores`, etc.) and the appropriate computing node with the `
-p` option.

For example, say that your local UGE installation requires you to set the
@@ -2103,7 +2103,7 @@ cores:
$ vine_submit_workers -T uge --cores 4 -p "-pe smp 4" MACHINENAME 9123
```

- If you find that there are options that are needed everytime, you can compile
+ If you find that there are options that are needed every time, you can compile
CCTools using the `--uge-parameter`. For example, at Notre Dame we
automatically set the number of cores as follows:
@@ -2456,7 +2456,7 @@ cores, memory and disk have modifiers `~` and `>` as follows:

A TaskVine manager produces several logs: `debug`, `taskgraph`, `performance`,
and `transactions`. These logs are always enabled, and appear in the current
- working directory in the sudirectories:
+ working directory in the subdirectories:

```sh
vine-run-info/YYYY-mm-ddTHH:MM:SS/vine-logs
@@ -2532,7 +2532,7 @@ conda install conda-forge::gnuplot
```

The script `vine_graph_log` is a wrapper for `gnuplot`, and with it you
- can plot some of the statistics, such as total time spent transfering tasks,
+ can plot some of the statistics, such as total time spent transferring tasks,
number of tasks running, and workers connected. For example, this command:

```sh
@@ -2606,7 +2606,7 @@ Note that very large task graphs may be impractical to graph at this level of de

### Other Tools

- `vine_plot_compose` visualizes workflow executions in a variety of ways, creating a composition of multiple plots in a single visualiztion. This tool may be useful in
+ `vine_plot_compose` visualizes workflow executions in a variety of ways, creating a composition of multiple plots in a single visualization. This tool may be useful in
comparing performance across multiple executions.

```sh
@@ -2844,8 +2844,8 @@ The `compute` call above may receive the following keyword arguments:

This subsection describes the communication patterns between a library and a worker, agnostic of programming languages a library is implemented in.

- Upon library startup, it should send to its worker a json object as a byte stream.
- The json object should have the following keys and associated values' types: `{"name": type-string, "taskid": type-int, "exec\_mode": type-string}`.
+ Upon library startup, it should send to its worker a JSON object as a byte stream.
+ The JSON object should have the following keys and associated values' types: `{"name": type-string, "taskid": type-int, "exec\_mode": type-string}`.
`"name"` should be the name of the library.
`"taskid"` should be the library' taskid as assigned by a taskvine manager.
`"exec\_mode"` should be the function execution mode of the library.
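A hedged sketch of the startup message described above, from the library side; the key names come from the text, while the example values and the exact byte framing expected by the worker are assumptions:

```python
# Build the startup JSON object a library sends to its worker on launch.
# Values here are illustrative; the real name, taskid, and exec_mode come
# from the manager that created the library task.
import json

startup = {
    "name": "my-library",   # library name (string)
    "taskid": 42,           # task id assigned by the manager (int)
    "exec_mode": "fork",    # function execution mode (string; value assumed)
}

payload = json.dumps(startup).encode("utf-8")  # sent to the worker as a byte stream
```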