# optuna_utils

Utilities to create optuna studies for scikit-decide.

# generic_optuna_experiment_monoproblem

generic_optuna_experiment_monoproblem(
  domain_factory: Callable[[], Domain],
  solver_classes: list[type[Solver]],
  kwargs_fixed_by_solver: Optional[dict[type[Solver], dict[str, Any]]] = None,
  suggest_optuna_kwargs_by_name_by_solver: Optional[dict[type[Solver], dict[str, dict[str, Any]]]] = None,
  additional_hyperparameters_by_solver: Optional[dict[type[Solver], list[Hyperparameter]]] = None,
  n_trials: int = 150,
  allow_retry_same_trial: bool = False,
  rollout_num_episodes: int = 3,
  rollout_max_steps_by_episode: int = 1000,
  rollout_from_memory: Optional[Memory[D.T_state]] = None,
  domain_reset_is_deterministic: bool = False,
  study_basename: str = "study",
  create_another_study: bool = True,
  overwrite_study: bool = False,
  storage_path: str = "./optuna-journal.log",
  sampler: Optional[BaseSampler] = None,
  pruner: Optional[BasePruner] = None,
  seed: Optional[int] = None,
  objective: Optional[Callable[[Solver, list[tuple[list[StrDict[D.T_observation]], list[StrDict[list[D.T_event]]], list[StrDict[Value[D.T_value]]]]]], float]] = None,
  optuna_tuning_direction: str = "maximize",
  alternative_domain_factory: Optional[dict[type[Solver], Callable[[], Domain]]] = None
) -> optuna.Study

Create and run an optuna study to tune solver hyperparameters for a given domain factory.

The optuna study chooses a solver and its hyperparameters in order to optimize the cumulative reward during a rollout.

When all of the following hold,

  • the solver policy is deterministic,
  • the domain transitions are deterministic,
  • the domain state-to-observation mapping is deterministic,
  • and the rollout starts from a specified memory or domain.reset() is deterministic,

we avoid repeating episodes, as they would all be the same.
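The rule above can be sketched as a small helper. This is an illustration only, not scikit-decide's actual implementation: the function name effective_num_episodes and its boolean flags are hypothetical, chosen to mirror the four conditions listed, and the assumption that a single episode is run in the fully deterministic case is ours.

```python
from typing import Any, Optional


def effective_num_episodes(
    rollout_num_episodes: int,
    *,
    policy_is_deterministic: bool,
    transitions_are_deterministic: bool,
    observations_are_deterministic: bool,
    rollout_from_memory: Optional[Any] = None,
    domain_reset_is_deterministic: bool = False,
) -> int:
    """Return how many rollout episodes are worth running (hypothetical helper)."""
    # The starting point is fixed if a memory is given,
    # or if domain.reset() is declared deterministic.
    start_is_deterministic = (
        rollout_from_memory is not None or domain_reset_is_deterministic
    )
    everything_is_deterministic = (
        policy_is_deterministic
        and transitions_are_deterministic
        and observations_are_deterministic
        and start_is_deterministic
    )
    # Identical episodes carry no extra information, so one is assumed to be enough.
    return 1 if everything_is_deterministic else rollout_num_episodes
```

With the default domain_reset_is_deterministic=False and no rollout_from_memory, the helper always returns rollout_num_episodes, which matches the conservative default of the documented signature.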

One can

  • freeze some hyperparameters via kwargs_fixed_by_solver
  • pass other arguments needed by the solvers' __init__() via kwargs_fixed_by_solver
  • restrict the choices/ranges for some hyperparameters via suggest_optuna_kwargs_by_name_by_solver
  • add other hyperparameters to some solvers via additional_hyperparameters_by_solver
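As a rough sketch of the shape these arguments take, the dictionaries are keyed by solver class. SolverA and SolverB below are placeholder classes standing in for real Solver subclasses, and the hyperparameter names (max_depth, learning_rate, heuristic) are invented for illustration. The "low" key is assumed by analogy with the "high", "step" and "choices" keys mentioned in the parameter descriptions. additional_hyperparameters_by_solver is left out because it needs scikit-decide's Hyperparameter objects.

```python
from typing import Any


class SolverA:
    """Placeholder for a real Solver subclass (hypothetical)."""


class SolverB:
    """Placeholder for another Solver subclass (hypothetical)."""


solver_classes = [SolverA, SolverB]

# Values frozen for a solver: tunable hyperparameters that optuna should leave
# alone, or any other argument the solver's __init__() needs.
kwargs_fixed_by_solver: dict[type, dict[str, Any]] = {
    SolverA: {"max_depth": 5},
}

# Restrictions on what optuna may suggest, keyed by hyperparameter name.
suggest_optuna_kwargs_by_name_by_solver: dict[type, dict[str, dict[str, Any]]] = {
    SolverA: {"learning_rate": {"low": 1e-4, "high": 1e-2}},
    SolverB: {"heuristic": {"choices": ["zero", "manhattan"]}},
}

# Sanity check: every solver referenced in the dictionaries is actually tuned.
referenced = set(kwargs_fixed_by_solver) | set(suggest_optuna_kwargs_by_name_by_solver)
unknown_solvers = referenced - set(solver_classes)
```

Both dictionaries would then be passed by keyword to generic_optuna_experiment_monoproblem, together with solver_classes and a domain_factory.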

The optuna study can be monitored with optuna-dashboard by running

optuna-dashboard optuna-journal.log

(or the relevant path set by storage_path)

# Parameters

  • domain_factory: a callable with no argument returning the domain to solve (can be a mere domain class).
  • solver_classes: list of solvers to consider.
  • kwargs_fixed_by_solver: fixed hyperparameters by solver. Can also hold other parameters needed by the solvers' __init__().
  • suggest_optuna_kwargs_by_name_by_solver: kwargs_by_name passed to the solvers' suggest_with_optuna(). Useful to restrict or specify choices, step, high, ...
  • additional_hyperparameters_by_solver: additional user-defined hyperparameters by solver, to be suggested by optuna
  • n_trials: number of trials to be run in the optuna study
  • allow_retry_same_trial: if True, allow a trial with the same parameters as a previous one to be retried (useful if the solve process is random, for instance)
  • rollout_num_episodes: number of episodes used in rollout to compute the value associated to a set of hyperparameters
  • rollout_max_steps_by_episode: max steps by episode used in rollout to compute the value associated to a set of hyperparameters
  • rollout_from_memory: if specified, rollout episodes will start from this memory
  • domain_reset_is_deterministic: specifies whether the domain reset() method (when it exists) is deterministic. This information is used when rollout_from_memory is None (and thus domain.reset() is used), to decide whether several episodes are needed, depending on whether everything else is deterministic.
  • study_basename: base name of the generated study. If create_another_study is True, a timestamp will be added to this base name.
  • create_another_study: if True, a timestamp prefix will be added to the study base name in order to avoid overwriting or continuing a previously created study. Should be False if one wants to add trials to an existing study.
  • overwrite_study: if True, any study with the same name as the one generated here will be deleted before starting the optuna study. Should be False if one wants to add trials to an existing study.
  • storage_path: path to the journal used by optuna to log the study. Can be an NFS path to allow parallelized optuna studies.
  • sampler: sampler used by the optuna study. If None, a TPESampler is used with the provided seed.
  • pruner: pruner used by the optuna study. If None, a MedianPruner is used.
  • seed: used to create the sampler if sampler is None. Should be set to an integer if one wants to ensure reproducible results.
  • aggreg_outcome_rewards: function used to aggregate outcome.value into a scalar. Defaults to taking float(outcome.value.reward) for single-agent solvers, and to taking sum(float(v.reward) for v in outcome.value.values()) for multi-agent solvers.
  • objective: function used to compute the scalar optimized by optuna. Takes the solver and the episodes obtained by solve + rollout as arguments and should return a float. Episodes are a list, each episode being represented by a tuple of observations, actions, values. Defaults to the cumulative reward over all episodes (and all agents when on a multi-agent domain).
  • optuna_tuning_direction: direction of optuna optimization ("maximize" or "minimize")
  • alternative_domain_factory: mapping solver_class -> domain_factory when some solvers need a different domain_factory (e.g. width-based solvers need GymDomainForWidthSolvers instead of simple GymDomain)
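To make the objective parameter more concrete, here is a minimal sketch of a custom objective for a single-agent domain. It assumes only what is stated above: the callable receives the solver and a list of episodes, each a tuple of observations, actions and values, and each value exposes a reward attribute. FakeValue and mean_episode_return are hypothetical names, and the sample episodes are made up so the sketch runs without scikit-decide.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeValue:
    """Stand-in exposing a `reward` attribute, like the values of an episode."""

    reward: float


def mean_episode_return(
    solver: Any, episodes: list[tuple[list, list, list]]
) -> float:
    """Average over episodes of the summed rewards (single-agent case only)."""
    if not episodes:
        return 0.0
    returns = [
        sum(float(value.reward) for value in values)
        for _observations, _actions, values in episodes
    ]
    return sum(returns) / len(returns)


# Two made-up episodes with total rewards 1.0 and 3.0.
episodes = [
    (["s0", "s1"], ["a0"], [FakeValue(1.0)]),
    (["s0", "s1", "s2"], ["a0", "a1"], [FakeValue(1.0), FakeValue(2.0)]),
]
score = mean_episode_return(solver=None, episodes=episodes)
```

Such a function would be passed as objective=mean_episode_return. Averaging rather than summing keeps scores comparable when the number of episodes varies. If the objective is a cost rather than a reward, optuna_tuning_direction should be set to "minimize".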

# Returns

The launched optuna study.
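The returned object is a regular optuna.Study, so its usual attributes (trials, best_value, best_params) are available. The helper below is a sketch of how one might summarize it. summarize_study is a hypothetical name, and the fake study and its parameter names only imitate those attributes so the example runs without launching a real study.

```python
from types import SimpleNamespace
from typing import Any


def summarize_study(study: Any) -> dict[str, Any]:
    """Collect a few headline figures from an optuna.Study-like object."""
    return {
        "n_trials": len(study.trials),
        "best_value": study.best_value,
        "best_params": dict(study.best_params),
    }


# Stand-in mimicking the attributes read above (values are made up).
fake_study = SimpleNamespace(
    trials=[object(), object()],
    best_value=12.5,
    best_params={"solver": "SolverA", "max_depth": 5},
)
summary = summarize_study(fake_study)
```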