ACMEdaemon

class acme.ACMEdaemon(pmap: ParallelMap, n_workers: int | str = 'auto', write_worker_results: bool = True, output_dir: str | None = None, result_shape: tuple[int | None, ...] | None = None, result_dtype: str = 'float', single_file: bool = False, write_pickle: bool = False, dryrun: bool = False, partition: str = 'auto', mem_per_worker: str = 'auto', setup_timeout: int = 60, setup_interactive: bool = True, stop_client: bool | str = 'auto', verbose: bool | None = None, logfile: bool | str | None = None)

Bases: object

Attributes Summary

acme_func

argv

client

collect_results

func

has_slurm

kwargv

n_calls

n_workers

objName

out_dir

result_dtype

result_shape

results_container

sleepTime

stacking_dim

stop_client

task_ids

tqdmFormat

Methods Summary

cleanup()

Shut down any ad-hoc distributed computing clients created by prepare_client

compute([debug])

Perform the actual parallel execution of func

estimate_memuse()

A brute-force guessing approach to determine memory consumption of provided workload

func_wrapper(*args, **kwargs)

If the output of func is saved to disk, wrap func with this static method to take care of filling up HDF5/pickle files

perform_dryrun(setup_interactive)

Execute the user function once with a randomly picked, prepared args, kwargs combination

post_process(futures)

Local helper to post-process results on disk/in-memory

pre_process(verbose, logfile, ...)

If write_* is True, set up directories for saving output HDF5 containers (or pickle files).

prepare_client([n_workers, partition, ...])

Set up or fetch dask distributed processing client.

setup_output(output_dir, result_shape, ...)

Local helper for creating output directories and preparing containers

Attributes Documentation

acme_func
argv
client
collect_results
func
has_slurm
kwargv
n_calls
n_workers
objName = '<ACMEdaemon>'
out_dir
result_dtype
result_shape
results_container
sleepTime = 0.1
stacking_dim
stop_client
task_ids
tqdmFormat = '{desc}: {percentage:3.0f}% |{bar}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}]'

Methods Documentation

cleanup() -> None

Shut down any ad-hoc distributed computing clients created by prepare_client

compute(debug: bool = False) -> List | None

Perform the actual parallel execution of func

If debug is True, use a single-threaded dask scheduler that does not actually process anything concurrently but uses the dask framework in a sequential setup.
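The idea behind the debug switch can be illustrated with a standard-library analogy. This is only a sketch and not ACME's or dask's code: the helper name run_calls and the use of ThreadPoolExecutor are assumptions made for illustration. The point is that both paths go through the same entry point and return the same results, while the sequential path keeps everything in the calling thread so that tracebacks and debuggers behave as usual.

```python
from concurrent.futures import ThreadPoolExecutor


def run_calls(func, arg_list, debug=False):
    """Apply func to every item; run sequentially if debug is True.

    Hypothetical helper, not part of ACME.
    """
    if debug:
        # Sequential path: nothing runs concurrently, so exceptions are
        # raised directly in the calling thread.
        return [func(arg) for arg in arg_list]
    # Concurrent path: Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(func, arg_list))


def square(x):
    return x * x


sequential = run_calls(square, [1, 2, 3], debug=True)
concurrent = run_calls(square, [1, 2, 3], debug=False)
```

Both calls produce the same list; only the execution strategy differs.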

estimate_memuse() -> str

A brute-force guessing approach to determine memory consumption of provided workload
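One way to picture a brute-force memory estimate is to run the workload once and record the peak allocation. The sketch below does this with the standard-library tracemalloc module. It is an assumption about the general technique, not a description of how estimate_memuse works internally; the helper name estimate_peak_memory and the "MB" output format are hypothetical.

```python
import tracemalloc


def estimate_peak_memory(func, *args, **kwargs):
    """Run func once and report its peak traced memory as a string.

    Hypothetical helper; only tracks allocations made through Python's
    memory allocator while func is running.
    """
    tracemalloc.start()
    try:
        func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return f"{peak / 1024 ** 2:.2f} MB"


def build_list(n):
    # Example workload: allocate n float objects
    return [float(i) for i in range(n)]


estimate = estimate_peak_memory(build_list, 100_000)
```

A single trial run like this only gives a rough guess, since memory use can vary between argument combinations.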

static func_wrapper(*args: Any, **kwargs: Any | None) -> None

If the output of func is saved to disk, wrap func with this static method to take care of filling up HDF5/pickle files

If writing to HDF5 fails, use an “emergency-pickling” mechanism to try to save the output of func using pickle instead
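The fallback pattern described above can be sketched with the standard library alone. This is not the code of func_wrapper: the names save_with_fallback and failing_writer are hypothetical, and a deliberately failing stand-in replaces the real HDF5 writer so that the example runs without h5py.

```python
import os
import pickle
import tempfile


def save_with_fallback(result, primary_writer, target_path):
    """Try primary_writer first; pickle the result if it raises.

    Hypothetical helper illustrating an emergency-pickling fallback.
    Returns the path of the file that was actually written.
    """
    try:
        primary_writer(result, target_path)
        return target_path
    except Exception:
        # Primary format failed: store the object with pickle instead,
        # next to the intended target but with a different extension.
        fallback_path = os.path.splitext(target_path)[0] + ".pickle"
        with open(fallback_path, "wb") as handle:
            pickle.dump(result, handle)
        return fallback_path


def failing_writer(result, path):
    # Stand-in for an HDF5 writer that cannot store this object
    raise TypeError("unsupported data type")


out_dir = tempfile.mkdtemp()
written = save_with_fallback(
    {"a": 1}, failing_writer, os.path.join(out_dir, "result_0.h5")
)
with open(written, "rb") as handle:
    restored = pickle.load(handle)
```

Returning the path that was actually written lets the caller tell from the file extension which storage format was used.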

perform_dryrun(setup_interactive: bool) -> bool

Execute the user function once with a randomly picked, prepared args, kwargs combination
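A dry run of this kind can be pictured as follows. The sketch is an illustration under assumptions, not the body of perform_dryrun: the helper name dry_run, the seed parameter and the returned (index, elapsed) pair are hypothetical.

```python
import random
import time


def dry_run(func, arg_list, kwarg_list, seed=None):
    """Call func once with one randomly chosen (args, kwargs) pair.

    Hypothetical helper: returns the chosen index and the wall-clock
    duration of the single trial call in seconds.
    """
    rng = random.Random(seed)
    idx = rng.randrange(len(arg_list))
    start = time.perf_counter()
    func(*arg_list[idx], **kwarg_list[idx])
    elapsed = time.perf_counter() - start
    return idx, elapsed


def scaled_sum(x, y, scale=1.0):
    return (x + y) * scale


all_args = [(1, 2), (3, 4), (5, 6)]
all_kwargs = [{"scale": 2.0}, {"scale": 0.5}, {}]
picked, seconds = dry_run(scaled_sum, all_args, all_kwargs, seed=0)
```

Timing one representative call gives a cheap sanity check that func accepts the prepared arguments before any workers are started.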

post_process(futures: Future) -> List | None

Local helper to post-process results on disk/in-memory

The return value is one of:

  • None: if neither in-memory results collection nor auto-writing was requested
  • list of file-names: if write_worker_results is True
  • list of objects: if in-memory results collection was requested

pre_process(verbose: bool | None, logfile: bool | str | None, write_worker_results: bool, output_dir: str | None, result_shape: tuple[int | None, ...] | None, result_dtype: str, single_file: bool, write_pickle: bool) -> None

If write_* is True, set up directories for saving output HDF5 containers (or pickle files). Warn if results are to be collected in memory.

prepare_client(n_workers: int | str = 'auto', partition: str = 'auto', mem_per_worker: str = 'auto', setup_timeout: int = 60, setup_interactive: bool = True, stop_client: bool | str = 'auto') -> None

Set up or fetch dask distributed processing client. Depending on available hardware, either start a local multi-processing client or launch a worker cluster via SLURM.

Also ensure that ad-hoc clients created here are stopped and worker jobs are properly released at the end of computation. However, ensure any client not created by prepare_client is not automatically cleaned up.
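The ownership rule described here (stop what you started, leave everything else alone) can be sketched without dask. The classes DummyClient and ClientManager below are hypothetical stand-ins, and the way "auto" is resolved is an assumption based on the description above, not ACME's actual code.

```python
class DummyClient:
    """Minimal stand-in for a distributed computing client."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ClientManager:
    """Hypothetical manager that only stops clients it created itself."""

    def __init__(self, client=None, stop_client="auto"):
        # Remember whether the client was created ad hoc or handed in
        self.created_here = client is None
        self.client = DummyClient() if client is None else client
        if stop_client == "auto":
            # "auto": shut down only clients that were started here
            self.stop_client = self.created_here
        else:
            self.stop_client = bool(stop_client)

    def cleanup(self):
        if self.stop_client:
            self.client.close()


# Case 1: no client supplied, so one is created and later stopped
own = ClientManager()
own.cleanup()

# Case 2: a user-provided client is left running
external = DummyClient()
borrowed = ClientManager(client=external)
borrowed.cleanup()
```

After both cleanup calls, own.client.closed is True while external.closed remains False.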

setup_output(output_dir: str | None, result_shape: tuple[int | None, ...] | None, single_file: bool, write_pickle: bool) -> None

Local helper for creating output directories and preparing containers

__init__(pmap: ParallelMap, n_workers: int | str = 'auto', write_worker_results: bool = True, output_dir: str | None = None, result_shape: tuple[int | None, ...] | None = None, result_dtype: str = 'float', single_file: bool = False, write_pickle: bool = False, dryrun: bool = False, partition: str = 'auto', mem_per_worker: str = 'auto', setup_timeout: int = 60, setup_interactive: bool = True, stop_client: bool | str = 'auto', verbose: bool | None = None, logfile: bool | str | None = None) -> None

Manager class for performing concurrent user function calls

Parameters:
  • pmap (ParallelMap context manager) – By default, ACMEdaemon assumes that the provided ParallelMap instance has already been properly set up to process func (all input arguments parsed and properly formatted). All other input arguments of ACMEdaemon are extracted from the provided ParallelMap instance.

  • n_workers (int or "auto") – Number of SLURM workers (=jobs) to spawn. See ParallelMap for details.

  • write_worker_results (bool) – If True, the return value(s) of func is/are saved on disk. See ParallelMap for details.

  • output_dir (str or None) – If provided, auto-generated results are stored in the given path. See ParallelMap for details.

  • result_shape (tuple or None) – If provided, results are slotted into a dataset/array with layout result_shape. See ParallelMap for details.

  • result_dtype (str) – Determines numerical datatype of dataset laid out by result_shape. See ParallelMap for details.

  • single_file (bool) – If True, parallel workers write to the same results container. See ParallelMap for details.

  • write_pickle (bool) – If True, the return value(s) of func is/are pickled to disk. See ParallelMap for details.

  • dryrun (bool) – If True, a dry-run of calling func is performed using a single args, kwargs tuple. See ParallelMap for details.

  • partition (str) – Name of SLURM partition to use. See ParallelMap for details.

  • mem_per_worker (str) – Memory booking for each SLURM worker. See ParallelMap for details.

  • setup_timeout (int) – Timeout period (in seconds) for SLURM workers to come online. See ParallelMap for details.

  • setup_interactive (bool) – If True, user input is queried in case not enough SLURM workers could be started within setup_timeout seconds. See ParallelMap for details.

  • stop_client (bool or "auto") – If “auto”, automatically started distributed computing clients are shut down at the end of computation, while user-provided clients are left untouched. See ParallelMap for details.

  • verbose (None or bool) – If None (default), general run-time information as well as warnings and errors are shown. See ParallelMap for details.

  • logfile (None or bool or str) – If None (default) or True, and write_worker_results is True, all run-time information as well as errors and warnings are tracked in a log-file. See ParallelMap for details.
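The result_shape annotation tuple[int | None, ...] together with the stacking_dim attribute suggests how the layout might be resolved. The sketch below assumes that exactly one None entry marks the dimension along which per-call results are stacked and that this entry is filled with the number of calls. This reading is an assumption; the helper name resolve_shape is hypothetical, and ParallelMap remains the authoritative description.

```python
def resolve_shape(result_shape, n_calls):
    """Replace the single None entry in result_shape by n_calls.

    Hypothetical helper: returns the index of the stacking dimension
    and the fully specified shape of the combined result.
    """
    none_positions = [i for i, dim in enumerate(result_shape) if dim is None]
    if len(none_positions) != 1:
        raise ValueError("result_shape must contain exactly one None entry")
    stacking_dim = none_positions[0]
    full_shape = tuple(n_calls if dim is None else dim for dim in result_shape)
    return stacking_dim, full_shape


# Four calls, each assumed to return a 100-element array
stacking_dim, full_shape = resolve_shape((None, 100), n_calls=4)
```

Under this assumption, a result_shape of (None, 100) with four calls corresponds to a combined dataset of shape (4, 100) stacked along dimension 0.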

Returns:

results – If write_worker_results is True, results is a list of HDF5 file-names containing computed results. If write_worker_results is False, results is a list comprising the actual return values of func. If ACMEdaemon was instantiated by ParallelMap, results are propagated back to ParallelMap.

Return type:

list

See also

ParallelMap

Context manager and main user interface