Environment Interface

All CSuite environments adhere to the Python interface defined in the abstract base class csuite.Environment. This base class specifies the standard methods to implement in a CSuite environment, which are outlined below.

class csuite.Environment

Base class for continuing environments.

Observations and valid actions are described by the specs module in dm_env. Environment implementations should return specs as specific as possible.

Each environment will specify its own environment State, Configuration, and internal random number generator.

Loading a CSuite Environment

Environments in CSuite are specified by an identifying string and can be initialized using the load function.

import csuite

env = csuite.load('catch')

The list of available environments and their associated loading strings is given in the csuite.EnvName class.

class csuite.EnvName(value)

An enumeration.

API Methods

Start

After initialization, start must be called to set the environment state. Since all environments are continuing, start should only be called once, at the beginning of the agent-environment interaction.

abstract Environment.start(seed=None)

Starts (or restarts) the environment and returns an observation.

Return type: Any

Step

After start is called, step updates the environment by one timestep given the action taken. The resulting observation and reward are returned.

abstract Environment.step(action)

Takes a step in the environment, returning an observation and reward.

Return type: Tuple[Any, Any]
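
The start-then-step protocol described above can be sketched with a small stand-in environment. ToyCounterEnv and run are hypothetical names invented for this illustration and are not part of CSuite; only the start(seed) and step(action) signatures follow the interface documented here.

```python
class ToyCounterEnv:
    """Hypothetical stand-in that mirrors the start/step methods described above."""

    def __init__(self):
        self._position = None

    def start(self, seed=None):
        # This toy environment is deterministic, so the seed is accepted but unused.
        self._position = 0
        return self._position

    def step(self, action):
        # Move right (action 1) or left (any other action) on a ring of 5 cells.
        delta = 1 if action == 1 else -1
        self._position = (self._position + delta) % 5
        # Reward 1.0 whenever the agent lands on cell 0.
        reward = 1.0 if self._position == 0 else 0.0
        return self._position, reward


def run(env, num_steps, seed=0):
    """Calls start once, then steps without any episode boundary."""
    observation = env.start(seed=seed)
    total_reward = 0.0
    for _ in range(num_steps):
        action = 1  # A fixed policy for illustration: always move right.
        observation, reward = env.step(action)
        total_reward += reward
    return observation, total_reward


final_observation, total_reward = run(ToyCounterEnv(), num_steps=10)
```

Because the environment is continuing, the loop contains no reset or termination check; the only bound is the number of steps the caller chooses to run.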

Render

All CSuite environments are expected to provide a render method that returns an array visualizing the environment at the current timestep.

abstract Environment.render()

Returns an RGB (uint8) NumPy array to facilitate visualization.

The shape of this array should be (width, height, 3), where the last dimension is for red, green, and blue. The values are in [0, 255].

Return type: np.ndarray
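
As a rough illustration of the array format described above, the sketch below builds a frame for a small grid. The render_grid helper, the grid size, and the colour choice are assumptions made for this example and do not reflect how any CSuite environment actually renders.

```python
import numpy as np


def render_grid(agent_position, width=5, height=3):
    """Hypothetical helper: draws one red cell on a black (width, height, 3) frame."""
    # uint8 values lie in [0, 255]; zeros give a black background.
    frame = np.zeros((width, height, 3), dtype=np.uint8)
    x, y = agent_position
    frame[x, y] = (255, 0, 0)  # Red channel only.
    return frame


frame = render_grid((2, 1))
```

Some plotting tools expect the first axis to be height rather than width; if that applies, frame.transpose(1, 0, 2) swaps the two spatial axes without touching the colour channels.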

Get and Set State

The get_state and set_state methods allow the environment state to be retrieved and restored. They should be used only for reproducibility or checkpointing. Accordingly, the state they expose is expected to capture everything needed to reproduce the environment dynamics exactly, including, for example, the state of the internal random number generator where applicable.

abstract Environment.get_state()

Returns the environment state.

Return type: Any

abstract Environment.set_state(state)

Sets the environment state.
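
The checkpointing use case can be sketched as follows. NoisyWalkEnv is a hypothetical environment written for this example, and storing the state as a dictionary is an assumption; each CSuite environment defines its own State type. The point being illustrated is that the saved state includes the random number generator, so a restored environment replays the same stochastic transitions.

```python
import random


class NoisyWalkEnv:
    """Hypothetical environment whose dynamics depend on an internal RNG."""

    def __init__(self):
        self._rng = random.Random()
        self._position = 0

    def start(self, seed=None):
        self._rng = random.Random(seed)
        self._position = 0
        return self._position

    def step(self, action):
        # With probability 0.2 the chosen action (0 or 1) is flipped.
        if self._rng.random() < 0.2:
            action = 1 - action
        self._position += 1 if action == 1 else -1
        reward = 1.0 if action == 1 else 0.0
        return self._position, reward

    def get_state(self):
        # Save the RNG state too, so future randomness is reproduced exactly.
        return {'position': self._position, 'rng_state': self._rng.getstate()}

    def set_state(self, state):
        self._position = state['position']
        self._rng.setstate(state['rng_state'])


env = NoisyWalkEnv()
env.start(seed=42)
for _ in range(5):
    env.step(1)

# Checkpoint, roll forward, restore, and roll forward again.
checkpoint = env.get_state()
first_run = [env.step(1) for _ in range(20)]

env.set_state(checkpoint)
second_run = [env.step(1) for _ in range(20)]
```

If get_state returned only the position, the two rollouts would generally diverge, because the generator would continue from wherever the first rollout left it.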

Observation and Action Specs

Environments are expected to describe their observation and action spaces through the observation_spec and action_spec methods, respectively. These methods should return structures of dm_env Array specs that match the format of observations and actions exactly.

abstract Environment.observation_spec()

Describes the observation space of the environment.

May use a subclass of specs.Array that specifies additional properties such as min and max bounds on the values.

Return type: specs.Array

abstract Environment.action_spec()

Describes the valid action space of the environment.

May use a subclass of specs.Array that specifies additional properties such as min and max bounds on the values.

Return type: specs.Array