Execute a queue of system commands in parallel.
executeMultiProcess(
commandQueue,
finishHandler,
timeoutHandler = function(...) TRUE,
errorHandler = defMultiProcErrorHandler,
prepareHandler = NULL,
cacheName = NULL,
setHash = NULL,
procTimeout = NULL,
printOutput = FALSE,
printError = FALSE,
logSubDir = NULL,
showProgress = TRUE,
waitTimeout = 50,
batchSize = 1,
delayBetweenProc = 0,
method = NULL
)
commandQueue: A list with commands. Each command should contain command (scalar string) and args (character vector). Additional user-defined fields are allowed and are useful for attaching command information that can be used in the finish, timeout and error handlers.
finishHandler: A function that is called when a command has finished. This function is typically used to process any results generated by the command. The function is called right after spawning a new process, hence processing results can occur while the next command is running in the background. The function signature should be function(cmd), where cmd is the queue data (from commandQueue) of the command that has finished.
timeoutHandler: A function that is called whenever a timeout for a command occurs. Should return TRUE if execution of the command should be retried. The function signature should be function(cmd, retries), where cmd is the queue data for that command and retries is the number of times the command has been retried.
errorHandler: Similar to timeoutHandler, but called whenever a command has failed. The signature should be function(cmd, exitStatus, retries). The exitStatus argument is the exit code of the command (may be NA in the rare cases where it is unknown). The other arguments are the same as for timeoutHandler. The return value should be the same as for timeoutHandler, or a character with an error message, which will be thrown with stop.
prepareHandler: A function that is called prior to execution of the command. The function signature should be function(cmd), where cmd is the queue data (from commandQueue) of the command to be started. The return value must be (an updated) cmd.
cacheName, setHash: Used for caching results. Set to NULL to disable caching.
procTimeout: The maximum time a process may consume before a timeout occurs (in seconds). Set to NULL to disable timeouts. Ignored if patRoon.MP.method="future".
printOutput, printError: Set to TRUE to print stdout/stderr output to the console. Ignored if patRoon.MP.method="future".
logSubDir: The sub-directory used for log files. The final log file path is constructed from patRoon.MP.logPath, logSubDir and the logFile set in the commandQueue.
showProgress: Set to TRUE to display a progress bar. Ignored if patRoon.MP.method="future".
waitTimeout: Number of milliseconds to wait before checking if a new process should be spawned. Ignored if patRoon.MP.method="future".
batchSize: Number of commands that should be executed in sequence per process. See details. Ignored if patRoon.MP.method="future".
delayBetweenProc: Minimum number of milliseconds to wait before spawning a new process. Might be needed to work around errors. Ignored if patRoon.MP.method="future".
method: Overrides patRoon.MP.method if not NULL.
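The arguments above can be put together roughly as follows. This is a minimal sketch and has not been verified against a particular patRoon version: the label and startedAt fields are hypothetical user-defined fields, the retry limits are arbitrary, and Rscript is used only as a portable stand-in for a real system command. The actual call is wrapped in a function (runQueue, a hypothetical name) so that the queue and handlers can be defined and inspected without starting any processes.

```r
# Path to the Rscript binary of the running R installation, used here as a
# portable example command.
rscript <- file.path(R.home("bin"), "Rscript")

# Each queue entry holds 'command' and 'args'. 'label' is a user-defined
# field (hypothetical name) that is carried along for use in the handlers.
commandQueue <- lapply(1:3, function(i) list(
    command = rscript,
    args = c("-e", sprintf("cat(%d^2)", i)),
    label = paste0("square-", i)
))

# Called with the queue data of a finished command.
finishHandler <- function(cmd) cmd$label

# Return TRUE to retry after a timeout; here at most two retries (assumption).
timeoutHandler <- function(cmd, retries) retries < 2

# Retry once, then return a character message, which is thrown with stop().
errorHandler <- function(cmd, exitStatus, retries) {
    if (retries < 1) return(TRUE)
    sprintf("%s failed with exit status %s", cmd$label, exitStatus)
}

# Must return the (possibly updated) queue data.
prepareHandler <- function(cmd) {
    cmd$startedAt <- Sys.time()
    cmd
}

# Wrapper around the actual call; requires patRoon to be installed.
runQueue <- function() {
    patRoon::executeMultiProcess(
        commandQueue,
        finishHandler = finishHandler,
        timeoutHandler = timeoutHandler,
        errorHandler = errorHandler,
        prepareHandler = prepareHandler,
        procTimeout = 60,
        showProgress = FALSE
    )
}
```

Calling runQueue() would then execute the three commands. Keeping the handlers as plain functions also makes it easy to test them in isolation before running the queue.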
This function executes a queue of system commands in parallel to speed up computation. Commands are executed in the background using the processx package. Processes are created, up to a configurable maximum number, to execute multiple commands in parallel.
Multiple commands may be executed in sequence from a single parent process (as part of a batch script on Windows, or combined with the shell AND operator otherwise). Note that multiple processes are still spawned in this scenario: each of them manages a chunk of the command queue (with a size defined by the batchSize argument). This approach is typically suitable for fast-running commands, where the overhead of spawning a new process for each command from R would be significant enough to lose most of the speedup otherwise gained from parallel execution. Note that the actual batch size may be adjusted to ensure that the maximum number of processes is running simultaneously.
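The batching idea can be illustrated with a small sketch in plain R. This is not patRoon's implementation: chunkQueue and batchLine are hypothetical helpers, the rule used to shrink the batch size is an assumption about what "adjusted" could mean, and argument quoting is omitted for brevity.

```r
# Split a queue into chunks, shrinking the batch size when the requested
# size would leave some of the 'maxProcs' processes without work.
chunkQueue <- function(queue, batchSize, maxProcs) {
    n <- length(queue)
    effBatch <- max(1, min(batchSize, ceiling(n / maxProcs)))
    idx <- split(seq_len(n), ceiling(seq_len(n) / effBatch))
    lapply(unname(idx), function(i) queue[i])
}

# Combine the commands of one chunk with the shell AND operator, so that
# they run in sequence from a single parent process.
batchLine <- function(chunk) {
    cmds <- vapply(chunk, function(cmd) {
        paste(c(cmd$command, cmd$args), collapse = " ")
    }, character(1))
    paste(cmds, collapse = " && ")
}

# Ten fast commands ('tool' is a placeholder command name).
queue <- lapply(1:10, function(i) list(command = "tool", args = as.character(i)))

chunks <- chunkQueue(queue, batchSize = 8, maxProcs = 4)
chunkSizes <- lengths(chunks)      # 3 3 3 1
firstLine <- batchLine(chunks[[1]]) # "tool 1 && tool 2 && tool 3"
```

With a requested batch size of 8 and four processes, a single batch would absorb most of the ten commands, so the sketch reduces the effective size to 3 and all four processes receive work.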
Other functionalities of this function include timeout and error handling.
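How the return value of an error handler could be interpreted is sketched below. This is a simplified, sequential illustration and not patRoon's internal logic: runWithRetries, its run argument and maxRetries are hypothetical, and silently giving up when the handler returns FALSE is an assumption.

```r
# 'run' is any function that takes the queue data and returns an exit status.
runWithRetries <- function(cmd, run, errorHandler, maxRetries = 3) {
    retries <- 0
    repeat {
        exitStatus <- run(cmd)
        if (identical(exitStatus, 0L)) return(invisible(retries))

        res <- errorHandler(cmd, exitStatus, retries)
        # A character result is treated as an error message.
        if (is.character(res)) stop(res)
        # Anything other than TRUE means: do not retry (assumed behaviour).
        if (!isTRUE(res) || retries >= maxRetries) return(invisible(NA))
        retries <- retries + 1
    }
}

# A fake command that fails twice before succeeding.
state <- new.env()
state$attempts <- 0
flaky <- function(cmd) {
    state$attempts <- state$attempts + 1
    if (state$attempts < 3) 1L else 0L
}

cmd <- list(command = "tool", args = character())
nRetries <- runWithRetries(cmd, flaky, function(cmd, exitStatus, retries) TRUE)

# A handler that returns a message aborts with that message.
errMsg <- tryCatch(
    runWithRetries(cmd, function(cmd) 2L,
                   function(cmd, exitStatus, retries) "boom"),
    error = function(e) conditionMessage(e)
)
```

Here nRetries is 2, since the fake command succeeds on its third attempt, and errMsg holds the message returned by the second handler.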