Rendering an Environment
It is often desirable to watch your agent interact with the environment (and it makes the whole
process more fun!). Currently, however, gym-anm does not support the rendering of arbitrary
environments. The only exception is the initial task ANM6Easy-v0, for which a web-based rendering
tool is available through the env.render() and env.close() calls.
In addition, the implementation was designed to make it easy for others to build variants of
ANM6Easy-v0 while benefiting from its rendering tool. This can be achieved by creating your
environment as a subclass of gym_anm.envs.anm6_env.anm6.ANM6. By doing so, you automatically
inherit the same 6-bus distribution power grid (defined by this network dictionary).
The code below, a slightly modified version of the example from the previous page, shows a custom
environment that inherits the 6-bus power grid used in ANM6Easy-v0 and can therefore be rendered
by its users.
"""
This file contains an example of a custom gym-anm environment that
inherits from ANM6.
Features:
* rendering is available (since ANM6 is inherited),
* it uses the same 6-bus 7-device power grid as the ANM6Easy-v0 task,
* the initial state s0 is randomly sampled (see `init_state()`),
* load demands and maximum generations are randomly sampled from
within their physical limits (see `next_vars()`),
* the same auxiliary variable as in ANM6Easy-v0 is used, indicating
the time of day as an index in [0, 96] (see `next_vars()`),
For more information, see https://gym-anm.readthedocs.io/en/latest/topics/rendering.html.
"""
import numpy as np
from gym_anm.envs import ANM6


class CustomANM6Environment(ANM6):
    """A gym-anm task built on top of the ANM6 grid."""

    def __init__(self):
        observation = "state"      # fully observable environment
        K = 1                      # 1 auxiliary variable
        delta_t = 0.25             # 15-minute intervals
        gamma = 0.9                # discount factor
        lamb = 100                 # penalty weighting hyperparameter
        aux_bounds = np.array([[0, 24 / delta_t - 1]])  # bounds on auxiliary variable
        costs_clipping = (1, 100)  # reward clipping parameters
        seed = 1                   # random seed

        super().__init__(observation, K, delta_t, gamma, lamb, aux_bounds,
                         costs_clipping, seed)

    def init_state(self):
        """Return a state vector with random values in [0, 1]."""
        n_dev = self.simulator.N_device         # number of devices
        n_des = self.simulator.N_des            # number of DES units
        n_gen = self.simulator.N_non_slack_gen  # number of non-slack generators
        s = np.random.rand(2 * n_dev + n_des + n_gen)  # random state vector

        # Let the auxiliary variable be a time-of-day index where increments
        # represent `self.delta_t` time durations.
        # Initial time: 00:00.
        aux = 0

        return np.hstack((s, aux))  # initial state vector s0

    def next_vars(self, s_t):
        """Generate the next stochastic variables and auxiliary variables."""
        next_var = []

        # Random demand for residential area in [-10, 0] MW.
        next_var.append(-10 * np.random.rand(1)[0])

        # Random PV max generation in [0, 30] MW.
        next_var.append(30 * np.random.rand(1)[0])

        # Random demand for industrial complex in [-30, 0] MW.
        next_var.append(-30 * np.random.rand(1)[0])

        # Random wind farm max generation in [0, 50] MW.
        next_var.append(50 * np.random.rand(1)[0])

        # Random load from EV charging station in [-30, 0] MW.
        next_var.append(-30 * np.random.rand(1)[0])

        # Auxiliary variable is the time-of-day index in [0, 95].
        aux = int((s_t[-1] + 1) % (24 / self.delta_t))
        next_var.append(aux)

        return np.array(next_var)


if __name__ == "__main__":
    env = CustomANM6Environment()
    env.reset()
    for t in range(10):
        a = env.action_space.sample()
        o, r, done, _ = env.step(a)
        print(f"t={t}, r_t={r:.3}")
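The time-of-day auxiliary variable used above can be exercised in isolation as a sanity check. The sketch below (standalone; `next_aux()` and `clock_time()` are illustrative helpers, not part of the gym-anm API) reproduces the update rule from `next_vars()` and shows that the index wraps back to 00:00 after one full day of 15-minute steps:

```python
delta_t = 0.25                     # 15-minute intervals, as in the environment above
steps_per_day = int(24 / delta_t)  # 96 steps per simulated day

def next_aux(aux):
    """The update rule used in next_vars(): advance one step, wrap at midnight."""
    return int((aux + 1) % (24 / delta_t))

def clock_time(aux):
    """Convert a time-of-day index into an HH:MM string (illustration only)."""
    minutes = int(aux * delta_t * 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

aux = 0
for _ in range(steps_per_day):     # simulate one full day
    aux = next_aux(aux)

assert aux == 0                    # the index wraps back to 00:00 after 96 steps
print(clock_time(95))              # last index of the day -> "23:45"
```

This is why the auxiliary variable stays in [0, 95] and why its bounds in `__init__()` are set to `[0, 24 / delta_t - 1]`.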