# hub.domain.graph_domain.GraphDomain

# GraphDomainUncertain

This domain is for goal-oriented MDPs with uncertain transitions in which all transitions, probabilities, and costs are already computed. In this case, storing them in dictionary structures improves the computing performance of the domain and therefore reduces its solving time.

# Constructor GraphDomainUncertain

GraphDomainUncertain(
  next_state_map: dict[D.T_state, dict[D.T_event, dict[D.T_state, tuple[float, float]]]],
state_terminal: dict[D.T_state, bool],
state_goal: dict[D.T_state, bool]
)

# Parameters

  • next_state_map: a dictionary whose keys are states and whose values are dictionaries keyed by action; each of those maps every reachable next state to a (proba, cost) tuple. This format could change in the future.
  • state_terminal: a dictionary indicating for each state if it's terminal or not
  • state_goal: a dictionary indicating for each state if it's a goal or not
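As an illustration of the nested layout described above, the following sketch builds the three dictionaries for a small made-up problem and checks that every outgoing distribution sums to 1. The state and action names, the helper `check_probabilities`, and the choice of an empty action dictionary for the terminal state are assumptions made for this example, not part of the library. The constructor call is shown only as a comment because it requires scikit-decide to be installed.

```python
import math

# Hypothetical toy problem: states and actions are plain strings.
# Layout: state -> action -> next_state -> (probability, cost)
next_state_map = {
    "s0": {
        "go": {"s1": (0.8, 1.0), "s0": (0.2, 1.0)},
        "wait": {"s0": (1.0, 0.5)},
    },
    "s1": {
        "go": {"goal": (1.0, 2.0)},
    },
    "goal": {},  # assumption: a terminal state has no outgoing transition
}
state_terminal = {"s0": False, "s1": False, "goal": True}
state_goal = {"s0": False, "s1": False, "goal": True}


def check_probabilities(transitions):
    """Return True if every (state, action) distribution sums to 1."""
    for actions in transitions.values():
        for outcomes in actions.values():
            total = sum(proba for proba, _cost in outcomes.values())
            if not math.isclose(total, 1.0):
                return False
    return True


maps_are_consistent = check_probabilities(next_state_map)

# With scikit-decide installed, the maps would then be passed as:
# domain = GraphDomainUncertain(next_state_map, state_terminal, state_goal)
```

Running such a check before constructing the domain is a design choice of this sketch; whether the constructor validates its inputs itself is not stated here.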

# check_value Rewards

check_value(
  self,
value: Value[D.T_value]
) -> bool

Check that a value is compliant with its reward specification.

TIP

This function always returns True by default because any kind of reward should be accepted at this level.

# Parameters

  • value: The value to check.

# Returns

True if the value is compliant (False otherwise).

# get_action_mask Events

get_action_mask(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Mask]

Get action mask for the given memory or internal one if omitted.

An action mask is another (more specific) format for applicable actions, which is meaningful only if the action space can be iterated over in some way. It is represented by a flat array of 0's and 1's ordered as the actions when enumerated: 1 for an applicable action, and 0 for a non-applicable action.

More precisely, this implementation assumes that each agent action space is an EnumerableSpace, and internally calls self.get_applicable_actions().

The action mask is used for instance by RL solvers to shut down logits associated to non-applicable actions in the output of their internal neural network.

# Parameters

  • memory: The memory to consider. If None, works on the internal memory of the domain.

# Returns

A numpy array (or a dict mapping each agent to a numpy array for multi-agent domains) of 0's and 1's indicating the applicability of each action (1 meaning applicable and 0 not applicable).
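The mask format can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function `action_mask`, the action names, and the logits are all made up for the example, and a plain list stands in for the numpy array the method returns.

```python
def action_mask(all_actions, applicable_actions):
    """Return a flat 0/1 list aligned with the enumeration order of all_actions."""
    applicable = set(applicable_actions)
    return [1 if action in applicable else 0 for action in all_actions]


# Hypothetical enumeration of the full action space, in a fixed order.
all_actions = ["left", "right", "up", "down"]
# Hypothetical applicable actions in the current memory.
applicable_now = ["right", "down"]

mask = action_mask(all_actions, applicable_now)

# Masking logits as an RL solver might: non-applicable actions get -inf,
# so they can never be selected by an argmax or a softmax.
logits = [0.3, 1.2, -0.5, 0.9]
masked_logits = [x if m == 1 else float("-inf") for x, m in zip(logits, mask)]
best_action = all_actions[masked_logits.index(max(masked_logits))]
```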

# get_action_space Events

get_action_space(
  self
) -> StrDict[Space[D.T_event]]

Get the (cached) domain action space (finite or infinite set).

By default, Events.get_action_space() internally calls Events._get_action_space_() the first time and automatically caches its value to make future calls more efficient (since the action space is assumed to be constant).

# Returns

The action space.
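The "compute once, then cache" behaviour described above can be sketched with a small stand-in class. The class name, the `compute_calls` counter, and the use of a `None` sentinel are assumptions for illustration; the library's actual caching mechanism may differ.

```python
class CachedSpaceDomain:
    """Hypothetical stand-in illustrating the 'compute once, then cache' pattern."""

    def __init__(self):
        self._action_space = None  # cache slot, empty until the first call
        self.compute_calls = 0     # counts calls to the uncached helper

    def get_action_space(self):
        # First call: delegate to the uncached helper and store the result.
        if self._action_space is None:
            self._action_space = self._get_action_space_()
        return self._action_space

    def _get_action_space_(self):
        # Trailing underscore: the result is expected to be constant.
        self.compute_calls += 1
        return frozenset({"left", "right"})


domain = CachedSpaceDomain()
first = domain.get_action_space()
second = domain.get_action_space()  # served from the cache
```

Caching is only safe because the action space is assumed to be constant; a helper whose result changed between calls would be silently ignored after the first call.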

# get_agents MultiAgent

get_agents(
  self
) -> set[str]

Return the set of available agent ids.

# get_applicable_actions Events

get_applicable_actions(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Space[D.T_event]]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history), or in the internal one if omitted.

By default, Events.get_applicable_actions() provides some boilerplate code and internally calls Events._get_applicable_actions(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of applicable actions.
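The fallback to the internal memory can be sketched as below. This is a simplified stand-in, not the library code: the class name and the dictionary of applicable actions are hypothetical, and the two-level delegation is collapsed into a single call to a helper with a mandatory memory argument.

```python
class MemoryDefaultDomain:
    """Hypothetical stand-in showing the 'memory or internal _memory' boilerplate."""

    def __init__(self, applicable_by_state, initial_state):
        self._applicable_by_state = applicable_by_state
        self._memory = initial_state  # internal memory (a single state here)

    def get_applicable_actions(self, memory=None):
        # Boilerplate: fall back to the internal memory when none is given.
        if memory is None:
            memory = self._memory
        return self._get_applicable_actions_from(memory)

    def _get_applicable_actions_from(self, memory):
        # Helper with a mandatory memory argument.
        return self._applicable_by_state[memory]


domain = MemoryDefaultDomain(
    applicable_by_state={"s0": {"go", "wait"}, "s1": {"go"}},
    initial_state="s0",
)
from_internal = domain.get_applicable_actions()      # uses _memory ("s0")
from_explicit = domain.get_applicable_actions("s1")  # uses the given memory
```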

# get_enabled_events Events

get_enabled_events(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history), or in the internal one if omitted.

By default, Events.get_enabled_events() provides some boilerplate code and internally calls Events._get_enabled_events(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of enabled events.

# get_goals Goals

get_goals(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) domain goals space (finite or infinite set).

By default, Goals.get_goals() internally calls Goals._get_goals_() the first time and automatically caches its value to make future calls more efficient (since the goals space is assumed to be constant).

WARNING

Goal states are assumed to be fully observable (i.e. observation = state) so that there is never uncertainty about whether the goal has been reached or not. This assumption guarantees that any policy that does not reach the goal with certainty incurs infinite expected cost. - Geffner, 2013: A Concise Introduction to Models and Methods for Automated Planning

# Returns

The goals space.

# get_next_state_distribution UncertainTransitions

get_next_state_distribution(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> Distribution[D.T_state]

Get the probability distribution of next state given a memory and action.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The probability distribution of next state.
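For a domain backed by a precomputed transition map, the next-state distribution for a (state, action) pair can be read directly from the nested dictionary. The sketch below assumes the state -> action -> next_state -> (proba, cost) layout; the helper names and the toy map are hypothetical, and a plain dict stands in for the library's Distribution type.

```python
import random


def next_state_distribution(next_state_map, state, action):
    """Return {next_state: probability} for the given state and action."""
    outcomes = next_state_map[state][action]
    return {next_state: proba for next_state, (proba, _cost) in outcomes.items()}


def sample_next_state(distribution, rng):
    """Draw one next state according to the distribution's probabilities."""
    states = list(distribution)
    weights = [distribution[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


# Hypothetical toy map with a single stochastic action.
toy_map = {"s0": {"go": {"s1": (0.8, 1.0), "s0": (0.2, 1.0)}}}

distribution = next_state_distribution(toy_map, "s0", "go")
sampled = sample_next_state(distribution, random.Random(0))
```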

# get_observation TransformedObservable

get_observation(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> StrDict[D.T_observation]

Get the deterministic observation given a state and action.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The deterministic observation.

# get_observation_distribution PartiallyObservable

get_observation_distribution(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> Distribution[StrDict[D.T_observation]]

Get the probability distribution of the observation given a state and action.

In mathematical terms (discrete case), given an action $a$, this function represents: $P(O|s, a)$, where $O$ is the random variable of the observation.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The probability distribution of the observation.

# get_observation_space PartiallyObservable

get_observation_space(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) observation space (finite or infinite set).

By default, PartiallyObservable.get_observation_space() internally calls PartiallyObservable._get_observation_space_() the first time and automatically caches its value to make future calls more efficient (since the observation space is assumed to be constant).

# Returns

The observation space.

# get_transition_value UncertainTransitions

get_transition_value(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]],
next_state: Optional[D.T_state] = None
) -> StrDict[Value[D.T_value]]

Get the value (reward or cost) of a transition.

The transition to consider is defined by the function parameters.

TIP

If this function never depends on the next_state parameter for its computation, it is recommended to indicate it by overriding UncertainTransitions._is_transition_value_dependent_on_next_state_() to return False. This information can then be exploited by solvers to avoid computing next state to evaluate a transition value (more efficient).

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.
  • next_state: The next state in which the transition ends (if needed for the computation).

# Returns

The transition value (reward or cost).
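With a precomputed transition map, a transition cost is a dictionary lookup, and whether the cost depends on the next state can be checked by scanning the map. The sketch below assumes the state -> action -> next_state -> (proba, cost) layout; both helper functions and both toy maps are hypothetical and are not the library's implementation.

```python
def transition_cost(next_state_map, state, action, next_state):
    """Look up the cost attached to one (state, action, next_state) transition."""
    _proba, cost = next_state_map[state][action][next_state]
    return cost


def cost_depends_on_next_state(next_state_map):
    """Return True if some (state, action) pair has costs that vary by next state."""
    for actions in next_state_map.values():
        for outcomes in actions.values():
            costs = {cost for _proba, cost in outcomes.values()}
            if len(costs) > 1:
                return True
    return False


# Same cost whatever the outcome: next_state is not needed to evaluate it.
uniform_map = {"s0": {"go": {"s1": (0.8, 1.0), "s0": (0.2, 1.0)}}}
# Cost differs by outcome: next_state is required.
varying_map = {"s0": {"go": {"s1": (0.8, 1.0), "s0": (0.2, 3.0)}}}

cost = transition_cost(varying_map, "s0", "go", "s0")
uniform_depends = cost_depends_on_next_state(uniform_map)
varying_depends = cost_depends_on_next_state(varying_map)
```

When the check returns False, a solver can evaluate a transition value from the state and action alone, which is the saving the TIP above refers to.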

# is_action Events

is_action(
  self,
event: D.T_event
) -> bool

Indicate whether an event is an action (i.e. a controllable event for the agents).

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain action space provided by Events.get_action_space(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.

# Returns

True if the event is an action (False otherwise).

# is_applicable_action Events

is_applicable_action(
  self,
action: StrDict[D.T_event],
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an action is applicable in the given memory (state or history), or in the internal one if omitted.

By default, Events.is_applicable_action() provides some boilerplate code and internally calls Events._is_applicable_action(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the action is applicable (False otherwise).

# is_enabled_event Events

is_enabled_event(
  self,
event: D.T_event,
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an uncontrollable event is enabled in the given memory (state or history), or in the internal one if omitted.

By default, Events.is_enabled_event() provides some boilerplate code and internally calls Events._is_enabled_event(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the event is enabled (False otherwise).

# is_goal Goals

is_goal(
  self,
observation: StrDict[D.T_observation]
) -> StrDict[D.T_predicate]

Indicate whether an observation belongs to the goals.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain goals space provided by Goals.get_goals(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation is a goal (False otherwise).

# is_observation PartiallyObservable

is_observation(
  self,
observation: StrDict[D.T_observation]
) -> bool

Check that an observation indeed belongs to the domain observation space.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain observation space provided by PartiallyObservable.get_observation_space(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation belongs to the domain observation space (False otherwise).

# is_terminal UncertainTransitions

is_terminal(
  self,
state: D.T_state
) -> StrDict[D.T_predicate]

Indicate whether a state is terminal.

A terminal state is a state with no outgoing transition (except to itself with value 0).

# Parameters

  • state: The state to consider.

# Returns

True if the state is terminal (False otherwise).
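The definition above (no outgoing transition except to itself with value 0) can be checked against a transition map as follows. This is an illustrative sketch only: the helper name and the toy map are hypothetical, and the domain itself reads terminal flags from the state_terminal dictionary rather than deriving them.

```python
def has_only_free_self_loops(next_state_map, state):
    """Return True if every outgoing transition of state loops back at cost 0."""
    for outcomes in next_state_map.get(state, {}).values():
        for next_state, (_proba, cost) in outcomes.items():
            if next_state != state or cost != 0:
                return False
    return True


# Hypothetical map: "goal" only loops on itself for free, "sink" has no
# transition at all, and "s0" can still move somewhere else.
toy_map = {
    "s0": {"go": {"goal": (1.0, 2.0)}},
    "goal": {"stay": {"goal": (1.0, 0.0)}},
    "sink": {},
}

terminal_flags = {s: has_only_free_self_loops(toy_map, s) for s in toy_map}
```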

# is_transition_value_dependent_on_next_state UncertainTransitions

is_transition_value_dependent_on_next_state(
  self
) -> bool

Indicate whether get_transition_value() requires the next_state parameter for its computation (cached).

By default, UncertainTransitions.is_transition_value_dependent_on_next_state() internally calls UncertainTransitions._is_transition_value_dependent_on_next_state_() the first time and automatically caches its value to make future calls more efficient (since the returned value is assumed to be constant).

# Returns

True if the transition value computation depends on next_state (False otherwise).

# sample Simulation

sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Sample one transition of the simulator's dynamics.

By default, Simulation.sample() provides some boilerplate code and internally calls Simulation._sample() (which returns a transition outcome). The boilerplate code automatically samples an observation corresponding to the sampled next state.

TIP

Whenever an existing simulator needs to be wrapped instead of implemented fully in scikit-decide, it is recommended to override Simulation.sample() to call the external simulator and not use the Simulation._sample() helper function.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The environment outcome of the sampled transition.

# set_memory Simulation

set_memory(
  self,
memory: Memory[D.T_state]
) -> None

Set internal memory attribute _memory to given one.

This can be useful to set a specific "starting point" before doing a rollout with successive Environment.step() calls.

# Parameters

  • memory: The memory to set internally.

# Example

# Set simulation_domain memory to my_state (assuming Markovian domain)
simulation_domain.set_memory(my_state)

# Start a 100-step rollout from here (applying my_action at every step)
for _ in range(100):
    simulation_domain.step(my_action)

# step Environment

step(
  self,
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Run one step of the environment's dynamics.

By default, Environment.step() provides some boilerplate code and internally calls Environment._step() (which returns a transition outcome). The boilerplate code automatically stores next state into the _memory attribute and samples a corresponding observation.

TIP

Whenever an existing environment needs to be wrapped instead of implemented fully in scikit-decide (e.g. compiled ATARI games), it is recommended to override Environment.step() to call the external environment and not use the Environment._step() helper function.

WARNING

Before calling Environment.step() the first time or when the end of an episode is reached, Initializable.reset() must be called to reset the environment's state.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The environment outcome of this step.
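The reset-then-step contract described in the WARNING can be sketched with a toy deterministic environment. Everything here is an assumption for illustration: the class name, the (next_state, cost) return shape, and the choice to raise a RuntimeError when step() is called before reset() are not taken from the library.

```python
class ToyEnvironment:
    """Hypothetical deterministic environment with an internal _memory."""

    def __init__(self, transitions, initial_state):
        self._transitions = transitions  # state -> action -> (next_state, cost)
        self._initial_state = initial_state
        self._memory = None  # unset until reset() is called

    def reset(self):
        self._memory = self._initial_state
        return self._memory

    def step(self, action):
        if self._memory is None:
            raise RuntimeError("call reset() before the first step()")
        next_state, cost = self._transitions[self._memory][action]
        self._memory = next_state  # the next state is stored internally
        return next_state, cost


env = ToyEnvironment(
    {"s0": {"go": ("s1", 1.0)}, "s1": {"go": ("s1", 0.0)}},
    initial_state="s0",
)

try:
    env.step("go")
    stepped_before_reset = True
except RuntimeError:
    stepped_before_reset = False

env.reset()
outcome = env.step("go")
```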

# _check_value Rewards

_check_value(
  self,
value: Value[D.T_value]
) -> bool

Check that a value is compliant with its cost specification (must be positive).

TIP

This function calls PositiveCosts._is_positive() to determine if a value is positive (can be overridden for advanced value types).

# Parameters

  • value: The value to check.

# Returns

True if the value is compliant (False otherwise).

# _get_action_mask Events

_get_action_mask(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Mask]

Get action mask for the given memory or internal one if omitted.

An action mask is another (more specific) format for applicable actions, which is meaningful only if the action space can be iterated over in some way. It is represented by a flat array of 0's and 1's ordered as the actions when enumerated: 1 for an applicable action, and 0 for a non-applicable action.

More precisely, this implementation assumes that each agent action space is an EnumerableSpace, and internally calls self.get_applicable_actions().

The action mask is used for instance by RL solvers to shut down logits associated to non-applicable actions in the output of their internal neural network.

# Parameters

  • memory: The memory to consider. If None, works on the internal memory of the domain.

# Returns

A numpy array (or a dict mapping each agent to a numpy array for multi-agent domains) of 0's and 1's indicating the applicability of each action (1 meaning applicable and 0 not applicable).

# _get_action_space Events

_get_action_space(
  self
) -> StrDict[Space[D.T_event]]

Get the (cached) domain action space (finite or infinite set).

By default, Events._get_action_space() internally calls Events._get_action_space_() the first time and automatically caches its value to make future calls more efficient (since the action space is assumed to be constant).

# Returns

The action space.

# _get_action_space_ Events

_get_action_space_(
  self
) -> Space[D.T_event]

Get the domain action space (finite or infinite set).

This is a helper function called by default from Events._get_action_space(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention signaling that its result should be constant.

# Returns

The action space.

# _get_applicable_actions Events

_get_applicable_actions(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Space[D.T_event]]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history), or in the internal one if omitted.

By default, Events._get_applicable_actions() provides some boilerplate code and internally calls Events._get_applicable_actions_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of applicable actions.

# _get_applicable_actions_from Events

_get_applicable_actions_from(
  self,
memory: Memory[D.T_state]
) -> Space[D.T_event]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history).

This is a helper function called by default from Events._get_applicable_actions(), the difference being that the memory parameter is mandatory here.

# Parameters

  • memory: The memory to consider.

# Returns

The space of applicable actions.

# _get_enabled_events Events

_get_enabled_events(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history), or in the internal one if omitted.

By default, Events._get_enabled_events() provides some boilerplate code and internally calls Events._get_enabled_events_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of enabled events.

# _get_enabled_events_from Events

_get_enabled_events_from(
  self,
memory: Memory[D.T_state]
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history).

This is a helper function called by default from Events._get_enabled_events(), the difference being that the memory parameter is mandatory here.

# Parameters

  • memory: The memory to consider.

# Returns

The space of enabled events.

# _get_goals Goals

_get_goals(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) domain goals space (finite or infinite set).

By default, Goals._get_goals() internally calls Goals._get_goals_() the first time and automatically caches its value to make future calls more efficient (since the goals space is assumed to be constant).

WARNING

Goal states are assumed to be fully observable (i.e. observation = state) so that there is never uncertainty about whether the goal has been reached or not. This assumption guarantees that any policy that does not reach the goal with certainty incurs infinite expected cost. - Geffner, 2013: A Concise Introduction to Models and Methods for Automated Planning

# Returns

The goals space.

# _get_goals_ Goals

_get_goals_(
  self
) -> Space[D.T_observation]

Get the domain goals space (finite or infinite set).

This is a helper function called by default from Goals._get_goals(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention signaling that its result should be constant.

# Returns

The goals space.

# _get_memory_maxlen History

_get_memory_maxlen(
  self
) -> int

Get the (cached) memory max length.

By default, FiniteHistory._get_memory_maxlen() internally calls FiniteHistory._get_memory_maxlen_() the first time and automatically caches its value to make future calls more efficient (since the memory max length is assumed to be constant).

# Returns

The memory max length.

# _get_memory_maxlen_ FiniteHistory

_get_memory_maxlen_(
  self
) -> int

Get the memory max length.

This is a helper function called by default from FiniteHistory._get_memory_maxlen(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention signaling that its result should be constant.

# Returns

The memory max length.

# _get_next_state_distribution UncertainTransitions

_get_next_state_distribution(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> Distribution[D.T_state]

Get the probability distribution of next state given a memory and action.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The probability distribution of next state.

# _get_observation TransformedObservable

_get_observation(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> StrDict[D.T_observation]

Get the deterministic observation given a state and action.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The deterministic observation.

# _get_observation_distribution PartiallyObservable

_get_observation_distribution(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> Distribution[StrDict[D.T_observation]]

Get the probability distribution of the observation given a state and action.

In mathematical terms (discrete case), given an action $a$, this function represents: $P(O|s, a)$, where $O$ is the random variable of the observation.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The probability distribution of the observation.

# _get_observation_space PartiallyObservable

_get_observation_space(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) observation space (finite or infinite set).

By default, PartiallyObservable._get_observation_space() internally calls PartiallyObservable._get_observation_space_() the first time and automatically caches its value to make future calls more efficient (since the observation space is assumed to be constant).

# Returns

The observation space.

# _get_observation_space_ PartiallyObservable

_get_observation_space_(
  self
) -> StrDict[Space[D.T_observation]]

Get the observation space (finite or infinite set).

This is a helper function called by default from PartiallyObservable._get_observation_space(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention signaling that its result should be constant.

# Returns

The observation space.

# _get_transition_value UncertainTransitions

_get_transition_value(
  self,
memory: Memory[D.T_state],
event: D.T_event,
next_state: Optional[D.T_state] = None
) -> Value[D.T_value]

Get the value (reward or cost) of a transition.

The transition to consider is defined by the function parameters.

TIP

If this function never depends on the next_state parameter for its computation, it is recommended to indicate it by overriding UncertainTransitions._is_transition_value_dependent_on_next_state_() to return False. This information can then be exploited by solvers to avoid computing next state to evaluate a transition value (more efficient).

# Parameters

  • memory: The source memory (state or history) of the transition.
  • event: The action (event) taken in the given memory (state or history) triggering the transition.
  • next_state: The next state in which the transition ends (if needed for the computation).

# Returns

The transition value (reward or cost).

# _init_memory History

_init_memory(
  self,
state: Optional[D.T_state] = None
) -> Memory[D.T_state]

Initialize memory (possibly with a state) according to its specification and return it.

This function is automatically called by Initializable._reset() to reinitialize the internal memory whenever the domain is used as an environment.

# Parameters

  • state: An optional state to initialize the memory with (typically the initial state).

# Returns

The new initialized memory.

# _is_action Events

_is_action(
  self,
event: D.T_event
) -> bool

Indicate whether an event is an action (i.e. a controllable event for the agents).

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain action space provided by Events._get_action_space(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.

# Returns

True if the event is an action (False otherwise).

# _is_applicable_action Events

_is_applicable_action(
  self,
action: StrDict[D.T_event],
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an action is applicable in the given memory (state or history), or in the internal one if omitted.

By default, Events._is_applicable_action() provides some boilerplate code and internally calls Events._is_applicable_action_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the action is applicable (False otherwise).

# _is_applicable_action_from Events

_is_applicable_action_from(
  self,
action: StrDict[D.T_event],
memory: Memory[D.T_state]
) -> bool

Indicate whether an action is applicable in the given memory (state or history).

This is a helper function called by default from Events._is_applicable_action(), the difference being that the memory parameter is mandatory here.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the space of applicable actions provided by Events._get_applicable_actions_from(), but it can be overridden for faster implementations.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider.

# Returns

True if the action is applicable (False otherwise).

# _is_enabled_event Events

_is_enabled_event(
  self,
event: D.T_event,
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an uncontrollable event is enabled in the given memory (state or history), or in the internal one if omitted.

By default, Events._is_enabled_event() provides some boilerplate code and internally calls Events._is_enabled_event_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the event is enabled (False otherwise).

# _is_enabled_event_from Events

_is_enabled_event_from(
  self,
event: D.T_event,
memory: Memory[D.T_state]
) -> bool

Indicate whether an event is enabled in the given memory (state or history).

This is a helper function called by default from Events._is_enabled_event(), the difference being that the memory parameter is mandatory here.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the space of enabled events provided by Events._get_enabled_events_from(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider.

# Returns

True if the event is enabled (False otherwise).

# _is_goal Goals

_is_goal(
  self,
state: D.T_state
) -> bool

Indicate whether an observation belongs to the goals.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain goals space provided by Goals._get_goals(), but it can be overridden for faster implementations.

# Parameters

  • state: The state to consider (it plays the role of the observation here, since goal states are assumed to be fully observable).

# Returns

True if the observation is a goal (False otherwise).

# _is_observation PartiallyObservable

_is_observation(
  self,
observation: StrDict[D.T_observation]
) -> bool

Check that an observation indeed belongs to the domain observation space.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain observation space provided by PartiallyObservable._get_observation_space(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation belongs to the domain observation space (False otherwise).

# _is_positive PositiveCosts

_is_positive(
  self,
cost: D.T_value
) -> bool

Determine if a value is positive (can be overridden for advanced value types).

# Parameters

  • cost: The cost to evaluate.

# Returns

True if the cost is positive (False otherwise).
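The relationship between the value check and the overridable positivity test can be sketched as below. Both classes are hypothetical stand-ins, and two points are assumptions of this sketch rather than documented behaviour: treating zero as positive, and representing an "advanced value type" as a tuple of per-criterion costs.

```python
class ScalarCostCheck:
    """Hypothetical stand-in: the value check delegates to a positivity test."""

    def _is_positive(self, cost):
        # Assumption: zero counts as positive (i.e. non-negative costs pass).
        return cost >= 0

    def _check_value(self, cost):
        return self._is_positive(cost)


class VectorCostCheck(ScalarCostCheck):
    """Overrides the positivity test for a tuple of per-criterion costs."""

    def _is_positive(self, cost):
        return all(component >= 0 for component in cost)


scalar_ok = ScalarCostCheck()._check_value(2.5)
scalar_bad = ScalarCostCheck()._check_value(-1.0)
vector_ok = VectorCostCheck()._check_value((1.0, 0.0, 3.0))
vector_bad = VectorCostCheck()._check_value((1.0, -0.5))
```

Only the positivity test is overridden in the subclass; the check itself is inherited unchanged, which mirrors the delegation described for _check_value above.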

# _is_terminal UncertainTransitions

_is_terminal(
  self,
state: D.T_state
) -> bool

Indicate whether a state is terminal.

A terminal state is a state with no outgoing transition (except to itself with value 0).

# Parameters

  • state: The state to consider.

# Returns

True if the state is terminal (False otherwise).

# _is_transition_value_dependent_on_next_state UncertainTransitions

_is_transition_value_dependent_on_next_state(
  self
) -> bool

Indicate whether _get_transition_value() requires the next_state parameter for its computation (cached).

By default, UncertainTransitions._is_transition_value_dependent_on_next_state() internally calls UncertainTransitions._is_transition_value_dependent_on_next_state_() the first time and automatically caches its value to make future calls more efficient (since the returned value is assumed to be constant).

# Returns

True if the transition value computation depends on next_state (False otherwise).

# _is_transition_value_dependent_on_next_state_ UncertainTransitions

_is_transition_value_dependent_on_next_state_(
  self
) -> bool

Indicate whether _get_transition_value() requires the next_state parameter for its computation.

This is a helper function called by default from UncertainTransitions._is_transition_value_dependent_on_next_state(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention signaling that its result should be constant.

# Returns

True if the transition value computation depends on next_state (False otherwise).

# _sample Simulation

_sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Sample one transition of the simulator's dynamics.

By default, Simulation._sample() provides some boilerplate code and internally calls Simulation._state_sample() (which returns a transition outcome). The boilerplate code automatically samples an observation corresponding to the sampled next state.

TIP

Whenever an existing simulator needs to be wrapped instead of implemented fully in scikit-decide, it is recommended to override Simulation._sample() to call the external simulator and not use the Simulation._state_sample() helper function.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The environment outcome of the sampled transition.

# _set_memory Simulation

_set_memory(
  self,
memory: Memory[D.T_state]
) -> None

Set internal memory attribute _memory to given one.

This can be useful to set a specific "starting point" before doing a rollout with successive Environment._step() calls.

# Parameters

  • memory: The memory to set internally.

# Example

# Set simulation_domain memory to my_state (assuming Markovian domain)
simulation_domain._set_memory(my_state)

# Start a 100-step rollout from here (applying my_action at every step)
for _ in range(100):
    simulation_domain._step(my_action)

# _state_sample Simulation

_state_sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> TransitionOutcome[D.T_state, StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Compute one sample of the transition's dynamics.

This is a helper function called by default from Simulation._sample(). It focuses on the state level, as opposed to the observation one for the latter.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The transition outcome of the sampled transition.
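The split between the state level and the observation level can be sketched as follows. This is a stand-alone illustration, not scikit-decide code: `ToySimulation`, the explicit `rng` argument, the rounding-based observation and the plain tuples standing in for transition and environment outcomes are all assumptions made for this example.

```python
import random


class ToySimulation:
    """Hypothetical simulator whose observation is the state rounded to 0.1."""

    def _state_sample(self, state, action, rng):
        # State level: next state, transition value, termination flag.
        next_state = state + action + rng.uniform(-0.05, 0.05)
        value = 1.0
        terminated = next_state >= 1.0
        return next_state, value, terminated

    def _get_observation(self, state):
        # Observation level: what the agent actually sees.
        return round(state, 1)

    def _sample(self, state, action, rng):
        # Boilerplate: sample at the state level, then derive the observation.
        next_state, value, terminated = self._state_sample(state, action, rng)
        return self._get_observation(next_state), value, terminated


sim = ToySimulation()
observation, value, terminated = sim._sample(0.0, 0.5, random.Random(0))
```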

# _state_step Environment

_state_step(
  self,
action: StrDict[list[D.T_event]]
) -> TransitionOutcome[D.T_state, StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Compute one step of the transition's dynamics.

This is a helper function called by default from Environment._step(). It focuses on the state level, as opposed to the observation one for the latter.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The transition outcome of this step.

# _step Environment

_step(
  self,
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Run one step of the environment's dynamics.

By default, Environment._step() provides some boilerplate code and internally calls Environment._state_step() (which returns a transition outcome). The boilerplate code automatically stores next state into the _memory attribute and samples a corresponding observation.

TIP

Whenever an existing environment needs to be wrapped instead of implemented fully in scikit-decide (e.g. compiled ATARI games), it is recommended to override Environment._step() so that it calls the external environment directly, without using the Environment._state_step() helper function.

WARNING

Before calling Environment._step() the first time or when the end of an episode is reached, Initializable._reset() must be called to reset the environment's state.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The environment outcome of this step.

# GraphDomain

This domain is for deterministic planning problems where all transitions and costs are already computed. In this case, storing them in dictionaries speeds up the domain's computations and therefore reduces solving time.

# Constructor GraphDomain

GraphDomain(
  next_state_map: dict[D.T_state, dict[D.T_event, D.T_state]],
next_state_attributes: dict[D.T_state, dict[D.T_event, dict[str, float]]],
targets: Optional[set[D.T_state]] = None,
attribute_weight = "weight"
)

# Parameters

  • next_state_map: a dictionary whose keys are states and whose values are dictionaries mapping each action to its next state.
  • next_state_attributes: for each transition (indexed by state, then action), a dictionary of float attributes (typically the cost of the transition).
  • targets: the set of goal states.
  • attribute_weight: the key in next_state_attributes to use as the cost attribute.
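As a minimal sketch of the dictionary layout described by these parameters, the snippet below builds both maps for a three-node graph and walks one path. It runs without scikit-decide, so the call to `GraphDomain(...)` itself is left out; the state and action names and the helper `next_state_and_cost` are hypothetical.

```python
# States and actions are plain strings here; any hashable type would do.
next_state_map = {
    "A": {"to_B": "B", "to_C": "C"},
    "B": {"to_C": "C"},
    "C": {},  # no outgoing transition
}
next_state_attributes = {
    "A": {"to_B": {"weight": 1.0}, "to_C": {"weight": 5.0}},
    "B": {"to_C": {"weight": 2.0}},
    "C": {},
}
targets = {"C"}
attribute_weight = "weight"  # key read as the transition cost


def next_state_and_cost(state, action):
    """Hypothetical helper: look up a transition and its cost."""
    next_state = next_state_map[state][action]
    cost = next_state_attributes[state][action][attribute_weight]
    return next_state, cost


# Follow A -> B -> C and accumulate the cost.
state, total = "A", 0.0
for action in ["to_B", "to_C"]:
    state, cost = next_state_and_cost(state, action)
    total += cost
```

Each lookup is a pair of dictionary accesses, which is the performance benefit of precomputing the transitions.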

# check_value Rewards

check_value(
  self,
value: Value[D.T_value]
) -> bool

Check that a value is compliant with its reward specification.

TIP

This function always returns True by default because any kind of reward should be accepted at this level.

# Parameters

  • value: The value to check.

# Returns

True if the value is compliant (False otherwise).

# get_action_mask Events

get_action_mask(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Mask]

Get action mask for the given memory or internal one if omitted.

An action mask is another (more specific) format for applicable actions, that has a meaning only if the action space can be iterated over in some way. It is represented by a flat array of 0's and 1's ordered as the actions when enumerated: 1 for an applicable action, and 0 for a not applicable action.

More precisely, this implementation assumes that each agent's action space is an EnumerableSpace, and internally calls self.get_applicable_actions().

The action mask is used, for instance, by RL solvers to mask out the logits associated with non-applicable actions in the output of their internal neural network.

# Parameters

  • memory: The memory to consider. If None, works on the internal memory of the domain.

# Returns

A numpy array (or, for multi-agent domains, a dict mapping each agent to a numpy array) of 0s and 1s indicating the applicability of each action (1 for applicable, 0 for not applicable).
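The idea of an action mask can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: it uses lists instead of numpy arrays, and the action names, logits and the helper `action_mask` are hypothetical.

```python
def action_mask(all_actions, applicable_actions):
    """Return a flat 0/1 list aligned with the enumeration order of all_actions."""
    applicable = set(applicable_actions)
    return [1 if action in applicable else 0 for action in all_actions]


all_actions = ["up", "down", "left", "right"]  # enumeration order of the action space
applicable_in_state = ["left", "up"]           # hypothetical applicable actions
mask = action_mask(all_actions, applicable_in_state)

# Typical use: neutralize the logits of non-applicable actions before choosing.
logits = [0.2, 1.5, 0.7, 2.0]
masked = [logit if flag else float("-inf") for logit, flag in zip(logits, mask)]
best = all_actions[masked.index(max(masked))]
```

Without the mask, the highest logit would select "right", which is not applicable in this state.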

# get_action_space Events

get_action_space(
  self
) -> StrDict[Space[D.T_event]]

Get the (cached) domain action space (finite or infinite set).

By default, Events.get_action_space() internally calls Events._get_action_space_() the first time and automatically caches its value to make future calls more efficient (since the action space is assumed to be constant).

# Returns

The action space.

# get_agents MultiAgent

get_agents(
  self
) -> set[str]

Return a singleton for single agent domains.

This must stay consistent with skdecide.core.autocast(), which transforms a single-agent domain into a multi-agent domain whose only agent has the id "agent".

# get_applicable_actions Events

get_applicable_actions(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Space[D.T_event]]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history), or in the internal one if omitted.

By default, Events.get_applicable_actions() provides some boilerplate code and internally calls Events._get_applicable_actions(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of applicable actions.

# get_enabled_events Events

get_enabled_events(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history), or in the internal one if omitted.

By default, Events.get_enabled_events() provides some boilerplate code and internally calls Events._get_enabled_events(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of enabled events.

# get_goals Goals

get_goals(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) domain goals space (finite or infinite set).

By default, Goals.get_goals() internally calls Goals._get_goals_() the first time and automatically caches its value to make future calls more efficient (since the goals space is assumed to be constant).

WARNING

Goal states are assumed to be fully observable (i.e. observation = state) so that there is never uncertainty about whether the goal has been reached or not. This assumption guarantees that any policy that does not reach the goal with certainty incurs in infinite expected cost. - Geffner, 2013: A Concise Introduction to Models and Methods for Automated Planning

# Returns

The goals space.

# get_initial_state DeterministicInitialized

get_initial_state(
  self
) -> D.T_state

Get the (cached) initial state.

By default, DeterministicInitialized.get_initial_state() internally calls DeterministicInitialized._get_initial_state_() the first time and automatically caches its value to make future calls more efficient (since the initial state is assumed to be constant).

# Returns

The initial state.

# get_initial_state_distribution UncertainInitialized

get_initial_state_distribution(
  self
) -> Distribution[D.T_state]

Get the (cached) probability distribution of initial states.

By default, UncertainInitialized.get_initial_state_distribution() internally calls UncertainInitialized._get_initial_state_distribution_() the first time and automatically caches its value to make future calls more efficient (since the initial state distribution is assumed to be constant).

# Returns

The probability distribution of initial states.

# get_next_state DeterministicTransitions

get_next_state(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> D.T_state

Get the next state given a memory and action.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The deterministic next state.

# get_next_state_distribution UncertainTransitions

get_next_state_distribution(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> DiscreteDistribution[D.T_state]

Get the discrete probability distribution of next state given a memory and action.

TIP

In the Markovian case (memory only holds the last state $s$), given an action $a$, this function can be mathematically represented by $P(S'|s, a)$, where $S'$ is the next state random variable.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The discrete probability distribution of next state.
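A discrete next-state distribution can be sketched from the nested layout documented for GraphDomainUncertain (state, then action, then next state mapped to a (probability, cost) pair). The snippet is stand-alone and does not use scikit-decide's DiscreteDistribution; the helpers `next_state_distribution` and `sample_next_state` and the example data are hypothetical.

```python
import random

# state -> action -> next state -> (probability, cost)
next_state_map = {
    "s0": {"go": {"s1": (0.8, 1.0), "s2": (0.2, 4.0)}},
}


def next_state_distribution(state, action):
    """Return [(next_state, probability), ...], i.e. P(S' | s, a)."""
    outcomes = next_state_map[state][action]
    return [(next_state, proba) for next_state, (proba, _cost) in outcomes.items()]


def sample_next_state(state, action, rng):
    """Draw one next state according to the distribution."""
    states, probabilities = zip(*next_state_distribution(state, action))
    return rng.choices(states, weights=probabilities, k=1)[0]


dist = next_state_distribution("s0", "go")
total_probability = sum(proba for _, proba in dist)
drawn = sample_next_state("s0", "go", random.Random(0))
```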

# get_observation TransformedObservable

get_observation(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> StrDict[D.T_observation]

Get the deterministic observation given a state and action.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The deterministic observation.

# get_observation_distribution PartiallyObservable

get_observation_distribution(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> Distribution[StrDict[D.T_observation]]

Get the probability distribution of the observation given a state and action.

In mathematical terms (discrete case), given a state $s$ and an action $a$, this function represents $P(O|s, a)$, where $O$ is the random variable of the observation.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The probability distribution of the observation.

# get_observation_space PartiallyObservable

get_observation_space(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) observation space (finite or infinite set).

By default, PartiallyObservable.get_observation_space() internally calls PartiallyObservable._get_observation_space_() the first time and automatically caches its value to make future calls more efficient (since the observation space is assumed to be constant).

# Returns

The observation space.

# get_transition_value UncertainTransitions

get_transition_value(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]],
next_state: Optional[D.T_state] = None
) -> StrDict[Value[D.T_value]]

Get the value (reward or cost) of a transition.

The transition to consider is defined by the function parameters.

TIP

If this function never depends on the next_state parameter for its computation, it is recommended to indicate it by overriding UncertainTransitions._is_transition_value_dependent_on_next_state_() to return False. This information can then be exploited by solvers to avoid computing next state to evaluate a transition value (more efficient).

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.
  • next_state: The next state in which the transition ends (if needed for the computation).

# Returns

The transition value (reward or cost).
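The efficiency gain mentioned in the TIP can be sketched as follows: when the value does not depend on the next state, a caller can skip enumerating outcomes. This is a stand-alone illustration; `expected_value`, `flat_cost`, `landing_cost` and the probability table are hypothetical and not part of scikit-decide.

```python
# (state, action) -> next state -> probability
transition_probabilities = {("s0", "go"): {"s1": 0.8, "s2": 0.2}}


def flat_cost(state, action, next_state=None):
    # Value independent of next_state.
    return 2.0


def landing_cost(state, action, next_state=None):
    # Value that depends on where the transition ends.
    return {"s1": 1.0, "s2": 4.0}[next_state]


def expected_value(state, action, value_fn, depends_on_next_state):
    if not depends_on_next_state:
        # Shortcut: no need to enumerate or compute next states.
        return value_fn(state, action)
    outcomes = transition_probabilities[(state, action)]
    return sum(p * value_fn(state, action, ns) for ns, p in outcomes.items())


flat = expected_value("s0", "go", flat_cost, depends_on_next_state=False)
landing = expected_value("s0", "go", landing_cost, depends_on_next_state=True)
```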

# is_action Events

is_action(
  self,
event: D.T_event
) -> bool

Indicate whether an event is an action (i.e. a controllable event for the agents).

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain action space provided by Events.get_action_space(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.

# Returns

True if the event is an action (False otherwise).

# is_applicable_action Events

is_applicable_action(
  self,
action: StrDict[D.T_event],
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an action is applicable in the given memory (state or history), or in the internal one if omitted.

By default, Events.is_applicable_action() provides some boilerplate code and internally calls Events._is_applicable_action(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the action is applicable (False otherwise).

# is_enabled_event Events

is_enabled_event(
  self,
event: D.T_event,
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an uncontrollable event is enabled in the given memory (state or history), or in the internal one if omitted.

By default, Events.is_enabled_event() provides some boilerplate code and internally calls Events._is_enabled_event(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the event is enabled (False otherwise).

# is_goal Goals

is_goal(
  self,
state: D.T_state
) -> bool

Indicate whether a state belongs to the goals.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain goals space provided by Goals.get_goals(), but it can be overridden for faster implementations.

# Parameters

  • state: The state to consider.

# Returns

True if the state is a goal (False otherwise).

# is_observation PartiallyObservable

is_observation(
  self,
observation: StrDict[D.T_observation]
) -> bool

Check that an observation indeed belongs to the domain observation space.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain observation space provided by PartiallyObservable.get_observation_space(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation belongs to the domain observation space (False otherwise).

# is_terminal UncertainTransitions

is_terminal(
  self,
state: D.T_state
) -> bool

Indicate whether a state is terminal.

A terminal state is a state with no outgoing transition (except to itself with value 0).

# Parameters

  • state: The state to consider.

# Returns

True if the state is terminal (False otherwise).
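The definition of a terminal state given above can be checked directly on dictionary-based transitions. The sketch below is stand-alone and only mirrors the stated definition; the function `is_terminal`, the `cost_map` layout and the example graph are assumptions, not the library's implementation.

```python
def is_terminal(state, next_state_map, cost_map):
    """True if the state has no outgoing transition other than zero-value self-loops."""
    for action, next_state in next_state_map.get(state, {}).items():
        if next_state != state or cost_map[state][action] != 0:
            return False
    return True


next_state_map = {
    "A": {"go": "B"},    # leaves A: not terminal
    "B": {"stay": "B"},  # zero-value self-loop: terminal
    "C": {"stay": "C"},  # self-loop with non-zero value: not terminal
    "D": {},             # no outgoing transition: terminal
}
cost_map = {
    "A": {"go": 1.0},
    "B": {"stay": 0.0},
    "C": {"stay": 2.0},
    "D": {},
}
terminal_states = [s for s in next_state_map if is_terminal(s, next_state_map, cost_map)]
```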

# is_transition_value_dependent_on_next_state UncertainTransitions

is_transition_value_dependent_on_next_state(
  self
) -> bool

Indicate whether get_transition_value() requires the next_state parameter for its computation (cached).

By default, UncertainTransitions.is_transition_value_dependent_on_next_state() internally calls UncertainTransitions._is_transition_value_dependent_on_next_state_() the first time and automatically caches its value to make future calls more efficient (since the returned value is assumed to be constant).

# Returns

True if the transition value computation depends on next_state (False otherwise).

# merge GraphDomain

merge(
  self,
graph_domain: GraphDomain
)

Return a new graph domain merged from self and another instance of GraphDomain.

# reset Initializable

reset(
  self
) -> StrDict[D.T_observation]

Reset the state of the environment and return an initial observation.

By default, Initializable.reset() provides some boilerplate code and internally calls Initializable._reset() (which returns an initial state). The boilerplate code automatically stores the initial state into the _memory attribute and samples a corresponding observation.

# Returns

An initial observation.

# sample Simulation

sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Sample one transition of the simulator's dynamics.

By default, Simulation.sample() provides some boilerplate code and internally calls Simulation._sample() (which returns a transition outcome). The boilerplate code automatically samples an observation corresponding to the sampled next state.

TIP

Whenever an existing simulator needs to be wrapped instead of implemented fully in scikit-decide, it is recommended to override Simulation.sample() so that it calls the external simulator directly, without using the Simulation._sample() helper function.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The environment outcome of the sampled transition.

# set_memory Simulation

set_memory(
  self,
memory: Memory[D.T_state]
) -> None

Set internal memory attribute _memory to given one.

This can be useful to set a specific "starting point" before doing a rollout with successive Environment.step() calls.

# Parameters

  • memory: The memory to set internally.

# Example

# Set simulation_domain memory to my_state (assuming Markovian domain)
simulation_domain.set_memory(my_state)

# Start a 100-step rollout from here (applying my_action at every step)
for _ in range(100):
    simulation_domain.step(my_action)

# set_nodes_target GraphDomain

set_nodes_target(
  self,
targets
)

Change the targets attribute.

# set_sources_targets GraphDomain

set_sources_targets(
  self,
sources,
targets
)

Change the sources and targets attributes.

# step Environment

step(
  self,
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Run one step of the environment's dynamics.

By default, Environment.step() provides some boilerplate code and internally calls Environment._step() (which returns a transition outcome). The boilerplate code automatically stores next state into the _memory attribute and samples a corresponding observation.

TIP

Whenever an existing environment needs to be wrapped instead of implemented fully in scikit-decide (e.g. compiled ATARI games), it is recommended to override Environment.step() so that it calls the external environment directly, without using the Environment._step() helper function.

WARNING

Before calling Environment.step() the first time or when the end of an episode is reached, Initializable.reset() must be called to reset the environment's state.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The environment outcome of this step.
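The reset-then-step protocol from the WARNING above can be sketched with a toy environment. It is stand-alone and deliberately simplified: `ToyEnvironment` is hypothetical, observations equal states, and `step()` returns a plain (observation, terminated) tuple rather than an EnvironmentOutcome.

```python
class ToyEnvironment:
    """Hypothetical deterministic environment backed by a transition dictionary."""

    def __init__(self, next_state_map, initial_state, targets):
        self._next_state_map = next_state_map
        self._initial_state = initial_state
        self._targets = targets
        self._memory = None  # internal memory, unset until reset()

    def reset(self):
        # Store the initial state internally and return an initial observation.
        self._memory = self._initial_state
        return self._memory

    def step(self, action):
        if self._memory is None:
            raise RuntimeError("reset() must be called before step()")
        # Store the next state internally and return the matching observation.
        self._memory = self._next_state_map[self._memory][action]
        return self._memory, self._memory in self._targets


env = ToyEnvironment({"A": {"to_B": "B"}, "B": {"to_C": "C"}, "C": {}}, "A", {"C"})
observation = env.reset()
trajectory = [observation]
for action in ["to_B", "to_C"]:
    observation, terminated = env.step(action)
    trajectory.append(observation)
    if terminated:
        break
```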

# _check_value Rewards

_check_value(
  self,
value: Value[D.T_value]
) -> bool

Check that a value is compliant with its cost specification (must be positive).

TIP

This function calls PositiveCost._is_positive() to determine if a value is positive (can be overridden for advanced value types).

# Parameters

  • value: The value to check.

# Returns

True if the value is compliant (False otherwise).
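A positivity check of this kind can be applied to a whole attribute dictionary before it is used as a cost map. The sketch below is stand-alone; `is_positive` and `check_costs` are hypothetical helpers, and treating a zero cost as compliant is an assumption of this example rather than a statement about PositiveCost._is_positive().

```python
def is_positive(cost: float) -> bool:
    # Assumption for this sketch: zero counts as a compliant cost.
    return cost >= 0


def check_costs(next_state_attributes, attribute_weight="weight"):
    """Return the (state, action) pairs whose cost fails the check."""
    return [
        (state, action)
        for state, actions in next_state_attributes.items()
        for action, attributes in actions.items()
        if not is_positive(attributes[attribute_weight])
    ]


attributes = {
    "A": {"to_B": {"weight": 1.0}, "to_C": {"weight": -2.0}},
    "B": {"to_C": {"weight": 0.0}},
}
violations = check_costs(attributes)
```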

# _get_action_mask Events

_get_action_mask(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Mask]

Get action mask for the given memory or internal one if omitted.

An action mask is another (more specific) format for applicable actions, that has a meaning only if the action space can be iterated over in some way. It is represented by a flat array of 0's and 1's ordered as the actions when enumerated: 1 for an applicable action, and 0 for a not applicable action.

More precisely, this implementation assumes that each agent's action space is an EnumerableSpace, and internally calls self.get_applicable_actions().

The action mask is used, for instance, by RL solvers to mask out the logits associated with non-applicable actions in the output of their internal neural network.

# Parameters

  • memory: The memory to consider. If None, works on the internal memory of the domain.

# Returns

A numpy array (or, for multi-agent domains, a dict mapping each agent to a numpy array) of 0s and 1s indicating the applicability of each action (1 for applicable, 0 for not applicable).

# _get_action_space Events

_get_action_space(
  self
) -> StrDict[Space[D.T_event]]

Get the (cached) domain action space (finite or infinite set).

By default, Events._get_action_space() internally calls Events._get_action_space_() the first time and automatically caches its value to make future calls more efficient (since the action space is assumed to be constant).

# Returns

The action space.

# _get_action_space_ Events

_get_action_space_(
  self
) -> Space[D.T_event]

Get the domain action space (finite or infinite set).

This is a helper function called by default from Events._get_action_space(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention to remind users that its result should be constant.

# Returns

The action space.

# _get_applicable_actions Events

_get_applicable_actions(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> StrDict[Space[D.T_event]]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history), or in the internal one if omitted.

By default, Events._get_applicable_actions() provides some boilerplate code and internally calls Events._get_applicable_actions_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of applicable actions.

# _get_applicable_actions_from Events

_get_applicable_actions_from(
  self,
memory: D.T_state
) -> Space[D.T_event]

Get the space (finite or infinite set) of applicable actions in the given memory (state or history).

This is a helper function called by default from Events._get_applicable_actions(), the difference being that the memory parameter is mandatory here.

# Parameters

  • memory: The memory to consider.

# Returns

The space of applicable actions.
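The relationship between the two functions (optional memory with a fallback to `_memory`, versus mandatory memory) can be sketched as below. The class `ToyEvents` is hypothetical, applicable actions are returned as a sorted list rather than a Space, and the code does not depend on scikit-decide.

```python
class ToyEvents:
    """Hypothetical domain whose applicable actions are the keys of a transition map."""

    def __init__(self, next_state_map):
        self._next_state_map = next_state_map
        self._memory = None  # internal memory (a single state in this sketch)

    def _get_applicable_actions_from(self, memory):
        # Helper: the memory argument is mandatory here.
        return sorted(self._next_state_map[memory])

    def _get_applicable_actions(self, memory=None):
        # Boilerplate: fall back to the internal memory when none is given.
        if memory is None:
            memory = self._memory
        return self._get_applicable_actions_from(memory)


domain = ToyEvents({"A": {"to_B": "B", "to_C": "C"}, "B": {"to_C": "C"}})
domain._memory = "B"
from_internal = domain._get_applicable_actions()     # uses _memory ("B")
from_explicit = domain._get_applicable_actions("A")  # uses the given memory
```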

# _get_enabled_events Events

_get_enabled_events(
  self,
memory: Optional[Memory[D.T_state]] = None
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history), or in the internal one if omitted.

By default, Events._get_enabled_events() provides some boilerplate code and internally calls Events._get_enabled_events_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

The space of enabled events.

# _get_enabled_events_from Events

_get_enabled_events_from(
  self,
memory: Memory[D.T_state]
) -> Space[D.T_event]

Get the space (finite or infinite set) of enabled uncontrollable events in the given memory (state or history).

This is a helper function called by default from Events._get_enabled_events(), the difference being that the memory parameter is mandatory here.

# Parameters

  • memory: The memory to consider.

# Returns

The space of enabled events.

# _get_goals Goals

_get_goals(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) domain goals space (finite or infinite set).

By default, Goals._get_goals() internally calls Goals._get_goals_() the first time and automatically caches its value to make future calls more efficient (since the goals space is assumed to be constant).

WARNING

Goal states are assumed to be fully observable (i.e. observation = state) so that there is never uncertainty about whether the goal has been reached or not. This assumption guarantees that any policy that does not reach the goal with certainty incurs in infinite expected cost. - Geffner, 2013: A Concise Introduction to Models and Methods for Automated Planning

# Returns

The goals space.

# _get_goals_ Goals

_get_goals_(
  self
) -> Space[D.T_observation]

Get the domain goals space (finite or infinite set).

This is a helper function called by default from Goals._get_goals(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention to remind users that its result should be constant.

# Returns

The goals space.

# _get_initial_state DeterministicInitialized

_get_initial_state(
  self
) -> D.T_state

Get the (cached) initial state.

By default, DeterministicInitialized._get_initial_state() internally calls DeterministicInitialized._get_initial_state_() the first time and automatically caches its value to make future calls more efficient (since the initial state is assumed to be constant).

# Returns

The initial state.

# _get_initial_state_ DeterministicInitialized

_get_initial_state_(
  self
) -> D.T_state

Get the initial state.

This is a helper function called by default from DeterministicInitialized._get_initial_state(), the difference being that the result is not cached here.

# Returns

The initial state.

# _get_initial_state_distribution UncertainInitialized

_get_initial_state_distribution(
  self
) -> Distribution[D.T_state]

Get the (cached) probability distribution of initial states.

By default, UncertainInitialized._get_initial_state_distribution() internally calls UncertainInitialized._get_initial_state_distribution_() the first time and automatically caches its value to make future calls more efficient (since the initial state distribution is assumed to be constant).

# Returns

The probability distribution of initial states.

# _get_initial_state_distribution_ UncertainInitialized

_get_initial_state_distribution_(
  self
) -> Distribution[D.T_state]

Get the probability distribution of initial states.

This is a helper function called by default from UncertainInitialized._get_initial_state_distribution(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention to remind users that its result should be constant.

# Returns

The probability distribution of initial states.

# _get_memory_maxlen History

_get_memory_maxlen(
  self
) -> int

Get the (cached) memory max length.

By default, FiniteHistory._get_memory_maxlen() internally calls FiniteHistory._get_memory_maxlen_() the first time and automatically caches its value to make future calls more efficient (since the memory max length is assumed to be constant).

# Returns

The memory max length.

# _get_memory_maxlen_ FiniteHistory

_get_memory_maxlen_(
  self
) -> int

Get the memory max length.

This is a helper function called by default from FiniteHistory._get_memory_maxlen(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention to remind users that its result should be constant.

# Returns

The memory max length.

# _get_next_state DeterministicTransitions

_get_next_state(
  self,
memory: Memory[D.T_state],
event: D.T_event
) -> D.T_state

Get the next state given a memory and action.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • event: The action (event) taken in the given memory (state or history) triggering the transition.

# Returns

The deterministic next state.

# _get_next_state_distribution UncertainTransitions

_get_next_state_distribution(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> SingleValueDistribution[D.T_state]

Get the discrete probability distribution of next state given a memory and action.

TIP

In the Markovian case (memory only holds the last state $s$), given an action $a$, this function can be mathematically represented by $P(S'|s, a)$, where $S'$ is the next state random variable.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The discrete probability distribution of next state.

# _get_observation TransformedObservable

_get_observation(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> StrDict[D.T_observation]

Get the deterministic observation given a state and action.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The deterministic observation.

# _get_observation_distribution PartiallyObservable

_get_observation_distribution(
  self,
state: D.T_state,
action: Optional[StrDict[list[D.T_event]]] = None
) -> Distribution[StrDict[D.T_observation]]

Get the probability distribution of the observation given a state and action.

In mathematical terms (discrete case), given a state $s$ and an action $a$, this function represents $P(O|s, a)$, where $O$ is the random variable of the observation.

# Parameters

  • state: The state to be observed.
  • action: The last applied action (or None if the state is an initial state).

# Returns

The probability distribution of the observation.

# _get_observation_space PartiallyObservable

_get_observation_space(
  self
) -> StrDict[Space[D.T_observation]]

Get the (cached) observation space (finite or infinite set).

By default, PartiallyObservable._get_observation_space() internally calls PartiallyObservable._get_observation_space_() the first time and automatically caches its value to make future calls more efficient (since the observation space is assumed to be constant).

# Returns

The observation space.

# _get_observation_space_ PartiallyObservable

_get_observation_space_(
  self
) -> Space[D.T_observation]

Get the observation space (finite or infinite set).

This is a helper function called by default from PartiallyObservable._get_observation_space(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a convention to remind users that its result should be constant.

# Returns

The observation space.

# _get_transition_value UncertainTransitions

_get_transition_value(
  self,
memory: Memory[D.T_state],
event: D.T_event,
next_state: Optional[D.T_state] = None
) -> Value[D.T_value]

Get the value (reward or cost) of a transition.

The transition to consider is defined by the function parameters.

TIP

If this function never depends on the next_state parameter for its computation, it is recommended to indicate it by overriding UncertainTransitions._is_transition_value_dependent_on_next_state_() to return False. This information can then be exploited by solvers to avoid computing next state to evaluate a transition value (more efficient).

# Parameters

  • memory: The source memory (state or history) of the transition.
  • event: The event (action) taken in the given memory (state or history) triggering the transition.
  • next_state: The next state in which the transition ends (if needed for the computation).

# Returns

The transition value (reward or cost).
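
For a domain built from precomputed transitions, the value lookup could look roughly like the sketch below. It assumes the `next_state_map` layout described in the constructor parameters (state, then action, then next state, mapping to a `(proba, cost)` tuple). The function `transition_cost` is a hypothetical helper that returns a bare float, whereas the real method wraps the result in a `Value`.

```python
# Assumed layout: state -> action -> next_state -> (probability, cost).
next_state_map = {
    "s0": {"go": {"s1": (0.9, 1.0), "s0": (0.1, 2.0)}},
    "s1": {},
}


def transition_cost(state, action, next_state):
    """Return the precomputed cost of the transition (state, action, next_state)."""
    _proba, cost = next_state_map[state][action][next_state]
    return cost


cost_to_s1 = transition_cost("s0", "go", "s1")
cost_to_s0 = transition_cost("s0", "go", "s0")
```

In this toy table the cost differs between the two possible next states, so the lookup genuinely needs the `next_state` argument.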

# _init_memory History

_init_memory(
  self,
state: Optional[D.T_state] = None
) -> Memory[D.T_state]

Initialize memory (possibly with a state) according to its specification and return it.

This function is automatically called by Initializable._reset() to reinitialize the internal memory whenever the domain is used as an environment.

# Parameters

  • state: An optional state to initialize the memory with (typically the initial state).

# Returns

The new initialized memory.

# _is_action Events

_is_action(
  self,
event: D.T_event
) -> bool

Indicate whether an event is an action (i.e. a controllable event for the agents).

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain action space provided by Events._get_action_space(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.

# Returns

True if the event is an action (False otherwise).

# _is_applicable_action Events

_is_applicable_action(
  self,
action: StrDict[D.T_event],
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an action is applicable in the given memory (state or history), or in the internal one if omitted.

By default, Events._is_applicable_action() provides some boilerplate code and internally calls Events._is_applicable_action_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the action is applicable (False otherwise).
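
The fallback to the internal memory can be sketched as follows. `ApplicabilitySketch` and its `applicable` table are hypothetical stand-ins written for this illustration; only the two method names mirror the ones documented here.

```python
class ApplicabilitySketch:
    """Toy illustration of the 'fall back to _memory' boilerplate."""

    def __init__(self, applicable, memory):
        self._applicable = applicable  # hypothetical: state -> set of applicable actions
        self._memory = memory          # internal memory, here simply the current state

    def _is_applicable_action_from(self, action, memory):
        # Helper with a mandatory memory argument.
        return action in self._applicable.get(memory, set())

    def _is_applicable_action(self, action, memory=None):
        # Boilerplate: substitute the internal memory when none is given.
        if memory is None:
            memory = self._memory
        return self._is_applicable_action_from(action, memory)


domain = ApplicabilitySketch({"s0": {"go"}, "s1": {"stay"}}, memory="s0")
uses_internal = domain._is_applicable_action("go")               # checked against "s0"
uses_explicit = domain._is_applicable_action("go", memory="s1")  # checked against "s1"
```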

# _is_applicable_action_from Events

_is_applicable_action_from(
  self,
action: StrDict[D.T_event],
memory: Memory[D.T_state]
) -> bool

Indicate whether an action is applicable in the given memory (state or history).

This is a helper function called by default from Events._is_applicable_action(), the difference being that the memory parameter is mandatory here.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the space of applicable actions provided by Events._get_applicable_actions_from(), but it can be overridden for faster implementations.

# Parameters

  • action: The action to consider.
  • memory: The memory to consider.

# Returns

True if the action is applicable (False otherwise).
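
To illustrate what "overridden for faster implementations" might mean in practice, the sketch below contrasts a default check that builds a space and calls `contains()` with an override that uses a precomputed set. All classes here (`ListSpaceSketch`, `DefaultCheck`, `FastCheck`) are hypothetical and only imitate the documented method names.

```python
class ListSpaceSketch:
    """Hypothetical space whose contains() scans a list (linear time)."""

    def __init__(self, elements):
        self._elements = list(elements)

    def contains(self, x):
        return x in self._elements


class DefaultCheck:
    def __init__(self, applicable):
        self._applicable = applicable  # state -> list of applicable actions

    def _get_applicable_actions_from(self, memory):
        return ListSpaceSketch(self._applicable[memory])

    def _is_applicable_action_from(self, action, memory):
        # Default route: build the space, then ask it.
        return self._get_applicable_actions_from(memory).contains(action)


class FastCheck(DefaultCheck):
    def __init__(self, applicable):
        super().__init__(applicable)
        self._lookup = {s: frozenset(a) for s, a in applicable.items()}

    def _is_applicable_action_from(self, action, memory):
        # Override: set membership, with no space object built per call.
        return action in self._lookup[memory]


table = {"s0": ["go", "wait"], "s1": ["stay"]}
default_domain = DefaultCheck(table)
fast_domain = FastCheck(table)
```

Both variants answer the same question; the override simply avoids constructing a space on every call.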

# _is_enabled_event Events

_is_enabled_event(
  self,
event: D.T_event,
memory: Optional[Memory[D.T_state]] = None
) -> bool

Indicate whether an uncontrollable event is enabled in the given memory (state or history), or in the internal one if omitted.

By default, Events._is_enabled_event() provides some boilerplate code and internally calls Events._is_enabled_event_from(). The boilerplate code automatically passes the _memory attribute instead of the memory parameter whenever the latter is None.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider (if None, the internal memory attribute _memory is used instead).

# Returns

True if the event is enabled (False otherwise).

# _is_enabled_event_from Events

_is_enabled_event_from(
  self,
event: D.T_event,
memory: Memory[D.T_state]
) -> bool

Indicate whether an event is enabled in the given memory (state or history).

This is a helper function called by default from Events._is_enabled_event(), the difference being that the memory parameter is mandatory here.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the space of enabled events provided by Events._get_enabled_events_from(), but it can be overridden for faster implementations.

# Parameters

  • event: The event to consider.
  • memory: The memory to consider.

# Returns

True if the event is enabled (False otherwise).

# _is_goal Goals

_is_goal(
  self,
observation: StrDict[D.T_observation]
) -> StrDict[D.T_predicate]

Indicate whether an observation belongs to the goals.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain goals space provided by Goals._get_goals(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation is a goal (False otherwise).

# _is_observation PartiallyObservable

_is_observation(
  self,
observation: StrDict[D.T_observation]
) -> bool

Check that an observation indeed belongs to the domain observation space.

TIP

By default, this function is implemented using the skdecide.core.Space.contains() function on the domain observation space provided by PartiallyObservable._get_observation_space(), but it can be overridden for faster implementations.

# Parameters

  • observation: The observation to consider.

# Returns

True if the observation belongs to the domain observation space (False otherwise).

# _is_positive PositiveCosts

_is_positive(
  self,
cost: D.T_value
) -> bool

Determine if a value is positive (can be overridden for advanced value types).

# Parameters

  • cost: The cost to evaluate.

# Returns

True if the cost is positive (False otherwise).

# _is_terminal UncertainTransitions

_is_terminal(
  self,
state: D.T_state
) -> StrDict[D.T_predicate]

Indicate whether a state is terminal.

A terminal state is a state with no outgoing transition (except to itself with value 0).

# Parameters

  • state: The state to consider.

# Returns

True if the state is terminal (False otherwise).
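
The definition above can be checked against tabulated transitions. The sketch below assumes the `next_state_map` and `state_terminal` dictionaries described in the constructor parameters; `looks_terminal` is a hypothetical helper and not a library function.

```python
# Assumed layout: state -> action -> next_state -> (probability, cost).
next_state_map = {
    "s0": {"go": {"s1": (1.0, 1.0)}},
    "s1": {"stay": {"s1": (1.0, 0.0)}},  # zero-cost self-loop only
}
state_terminal = {"s0": False, "s1": True}


def looks_terminal(state):
    """True if every outgoing transition is a zero-cost self-loop (or there is none)."""
    for outcomes in next_state_map.get(state, {}).values():
        for next_state, (_proba, cost) in outcomes.items():
            if next_state != state or cost != 0.0:
                return False
    return True


# Compare the structural check with the flags supplied to the constructor.
consistent = all(looks_terminal(s) == flag for s, flag in state_terminal.items())
```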

# _is_transition_value_dependent_on_next_state UncertainTransitions

_is_transition_value_dependent_on_next_state(
  self
) -> bool

Indicate whether _get_transition_value() requires the next_state parameter for its computation (cached).

By default, UncertainTransitions._is_transition_value_dependent_on_next_state() internally calls UncertainTransitions._is_transition_value_dependent_on_next_state_() the first time and automatically caches its value to make future calls more efficient (since the returned value is assumed to be constant).

# Returns

True if the transition value computation depends on next_state (False otherwise).

# _is_transition_value_dependent_on_next_state_ UncertainTransitions

_is_transition_value_dependent_on_next_state_(
  self
) -> bool

Indicate whether _get_transition_value() requires the next_state parameter for its computation.

This is a helper function called by default from UncertainTransitions._is_transition_value_dependent_on_next_state(), the difference being that the result is not cached here.

TIP

The underscore at the end of this function's name is a conventional reminder that its result should be constant.

# Returns

True if the transition value computation depends on next_state (False otherwise).
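
When transitions are tabulated as in the `next_state_map` described in the constructor parameters, a domain author could decide the right return value by inspecting the table. The helper `value_depends_on_next_state` below is a hypothetical sketch of that check, not part of scikit-decide.

```python
def value_depends_on_next_state(next_state_map):
    """Return True if some (state, action) pair has costs that differ across next states."""
    for actions in next_state_map.values():
        for outcomes in actions.values():
            costs = {cost for _proba, cost in outcomes.values()}
            if len(costs) > 1:
                return True
    return False


# Same cost whatever the next state: the value does not depend on next_state.
uniform_costs = {"s0": {"go": {"s1": (0.9, 1.0), "s0": (0.1, 1.0)}}}
# Cost varies with the next state: the value depends on next_state.
varying_costs = {"s0": {"go": {"s1": (0.9, 1.0), "s0": (0.1, 2.0)}}}

uniform_result = value_depends_on_next_state(uniform_costs)
varying_result = value_depends_on_next_state(varying_costs)
```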

# _reset Initializable

_reset(
  self
) -> StrDict[D.T_observation]

Reset the state of the environment and return an initial observation.

By default, Initializable._reset() provides some boilerplate code and internally calls Initializable._state_reset() (which returns an initial state). The boilerplate code automatically stores the initial state into the _memory attribute and samples a corresponding observation.

# Returns

An initial observation.
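
The division of labour between the two reset functions can be sketched with a toy class. `ResetSketch` and its `_get_observation` method are hypothetical; the example assumes a fully observable setting where the observation is simply the state.

```python
class ResetSketch:
    """Toy environment showing the reset boilerplate (not scikit-decide code)."""

    def __init__(self, initial_state):
        self._initial_state = initial_state
        self._memory = None

    def _state_reset(self):
        # State-level helper: choose the initial state.
        return self._initial_state

    def _get_observation(self, state):
        # Hypothetical observation function; here the state is observed directly.
        return state

    def _reset(self):
        # Boilerplate: store the initial state, then derive an observation from it.
        state = self._state_reset()
        self._memory = state
        return self._get_observation(state)


env = ResetSketch("s0")
observation = env._reset()
```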

# _sample Simulation

_sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Sample one transition of the simulator's dynamics.

By default, Simulation._sample() provides some boilerplate code and internally calls Simulation._state_sample() (which returns a transition outcome). The boilerplate code automatically samples an observation corresponding to the sampled next state.

TIP

Whenever an existing simulator needs to be wrapped instead of implemented fully in scikit-decide, it is recommended to override Simulation._sample() to call the external simulator and not use the Simulation._state_sample() helper function.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The environment outcome of the sampled transition.
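
For precomputed transitions, sampling amounts to a weighted draw over the possible next states. The sketch below assumes the `next_state_map` layout from the constructor parameters; `sample_transition` is a hypothetical helper that returns a plain `(next_state, cost)` tuple instead of an `EnvironmentOutcome`.

```python
import random

# Assumed layout: state -> action -> next_state -> (probability, cost).
next_state_map = {"s0": {"go": {"s1": (0.9, 1.0), "s0": (0.1, 2.0)}}}


def sample_transition(state, action, rng):
    """Draw (next_state, cost) according to the tabulated probabilities."""
    outcomes = next_state_map[state][action]
    next_states = list(outcomes)
    weights = [outcomes[s][0] for s in next_states]
    next_state = rng.choices(next_states, weights=weights, k=1)[0]
    return next_state, outcomes[next_state][1]


rng = random.Random(42)  # seeded generator for reproducible draws
samples = [sample_transition("s0", "go", rng) for _ in range(1000)]
share_s1 = sum(1 for s, _cost in samples if s == "s1") / len(samples)
```

Over many draws, the share of transitions ending in `s1` should be close to the tabulated probability of 0.9.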

# _set_memory Simulation

_set_memory(
  self,
memory: Memory[D.T_state]
) -> None

Set the internal memory attribute _memory to the given one.

This can be useful to set a specific "starting point" before doing a rollout with successive Environment._step() calls.

# Parameters

  • memory: The memory to set internally.

# Example

# Set simulation_domain memory to my_state (assuming Markovian domain)
simulation_domain._set_memory(my_state)

# Start a 100-step rollout from here (applying my_action at every step)
for _ in range(100):
    simulation_domain._step(my_action)

# _state_reset Initializable

_state_reset(
  self
) -> D.T_state

Reset the state of the environment and return an initial state.

This is a helper function called by default from Initializable._reset(). It focuses on the state level, as opposed to the observation one for the latter.

# Returns

An initial state.

# _state_sample Simulation

_state_sample(
  self,
memory: Memory[D.T_state],
action: StrDict[list[D.T_event]]
) -> TransitionOutcome[D.T_state, StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Compute one sample of the transition's dynamics.

This is a helper function called by default from Simulation._sample(). It focuses on the state level, as opposed to the observation one for the latter.

# Parameters

  • memory: The source memory (state or history) of the transition.
  • action: The action taken in the given memory (state or history) triggering the transition.

# Returns

The transition outcome of the sampled transition.

# _state_step Environment

_state_step(
  self,
action: StrDict[list[D.T_event]]
) -> TransitionOutcome[D.T_state, StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Compute one step of the transition's dynamics.

This is a helper function called by default from Environment._step(). It focuses on the state level, as opposed to the observation one for the latter.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The transition outcome of this step.

# _step Environment

_step(
  self,
action: StrDict[list[D.T_event]]
) -> EnvironmentOutcome[StrDict[D.T_observation], StrDict[Value[D.T_value]], StrDict[D.T_predicate], StrDict[D.T_info]]

Run one step of the environment's dynamics.

By default, Environment._step() provides some boilerplate code and internally calls Environment._state_step() (which returns a transition outcome). The boilerplate code automatically stores the next state into the _memory attribute and samples a corresponding observation.

TIP

Whenever an existing environment needs to be wrapped instead of implemented fully in scikit-decide (e.g. compiled ATARI games), it is recommended to override Environment._step() to call the external environment and not use the Environment._state_step() helper function.

WARNING

Before calling Environment._step() the first time or when the end of an episode is reached, Initializable._reset() must be called to reset the environment's state.

# Parameters

  • action: The action taken in the current memory (state or history) triggering the transition.

# Returns

The environment outcome of this step.
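
The step boilerplate and the reset requirement from the warning can be sketched with a toy class. `StepSketch` and its deterministic `transitions` table are hypothetical; the explicit `RuntimeError` is an illustrative choice of this sketch and not necessarily how scikit-decide reacts to a missing reset.

```python
class StepSketch:
    """Toy environment showing the step boilerplate (not scikit-decide code)."""

    def __init__(self, transitions, initial_state):
        # Hypothetical deterministic table: (state, action) -> (next_state, cost).
        self._transitions = transitions
        self._initial_state = initial_state
        self._memory = None

    def _reset(self):
        self._memory = self._initial_state
        return self._memory

    def _state_step(self, action):
        # State-level helper: look up the outcome from the current memory.
        return self._transitions[(self._memory, action)]

    def _step(self, action):
        if self._memory is None:
            raise RuntimeError("call _reset() before the first _step()")
        next_state, cost = self._state_step(action)
        self._memory = next_state  # boilerplate: store the next state
        return next_state, cost    # the observation equals the state in this toy case


env = StepSketch({("s0", "go"): ("s1", 1.0), ("s1", "stay"): ("s1", 0.0)}, "s0")
env._reset()
outcome = env._step("go")
```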