MPC DCOPF policy

In the original paper, we describe a policy based on Model Predictive Control (MPC) that solves a multi-timestep DC Optimal Power Flow (OPF).

The paper first describes the general MPC-based policy \(\pi_{MPC-N}\), which takes as input forecasts of the future loads \(P_l^{(dev)}\) and maximum generator outputs \(P_g^{(max)}\) over the optimization horizon \(\{t+1, \ldots, t+N\}\). Two particular cases are then considered (their forecast assumptions are written out after the list):

  1. \(\pi_{MPC-N}^{constant}\): \(P_l^{(dev)}\) and \(P_g^{(max)}\) are assumed constant throughout the optimization horizon,

  2. \(\pi_{MPC-N}^{perfect}\): \(P_l^{(dev)}\) and \(P_g^{(max)}\) are known (i.e., perfectly forecasted).
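In other words, at time \(t\) the two variants differ only in the forecasts they feed to the DC OPF over the horizon (the hat notation below is ours, introduced for clarity):

\[
\pi_{MPC-N}^{constant}: \quad \hat{P}_l^{(dev)}(t+k) = P_l^{(dev)}(t), \quad \hat{P}_g^{(max)}(t+k) = P_g^{(max)}(t), \qquad k = 1, \ldots, N,
\]

\[
\pi_{MPC-N}^{perfect}: \quad \hat{P}_l^{(dev)}(t+k) = P_l^{(dev)}(t+k), \quad \hat{P}_g^{(max)}(t+k) = P_g^{(max)}(t+k), \qquad k = 1, \ldots, N.
\]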

Constant forecast

The first approach can be run in any gym-anm environment. Note, however, that the implementation of \(\pi_{MPC-N}^{constant}\) accesses the internal state of the power grid simulator; in other words, it assumes the environment is fully observable.

A code example is provided below for the environment ANM6Easy-v0.

"""
This script shows how to run the MPC-based DC OPF policy
:math:`\\pi_{MPC-N}^{constant}` in an arbitrary gym-anm
environment.

This policy assumes constant demand and generation during the
optimization horizon.

For more information, see https://gym-anm.readthedocs.io/en/latest/topics/mpc.html#constant-forecast.
"""
import gym
from gym_anm import MPCAgentConstant


def run():
    env = gym.make("ANM6Easy-v0")
    o = env.reset()

    # Initialize the MPC policy.
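    # `planning_steps` is the optimization horizon N of :math:`\\pi_{MPC-N}`; `safety_margin`
    # is assumed here to add a conservative margin to the network constraints of the DC OPF.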
    agent = MPCAgentConstant(env.simulator, env.action_space, env.gamma, safety_margin=0.96, planning_steps=10)

    # Run the policy.
    for t in range(100):
        a = agent.act(env)
        obs, r, done, _ = env.step(a)
        print(f"t={t}, r_t={r:.3}")


if __name__ == "__main__":
    run()

Perfect forecast

The second approach can currently only be run in the environment ANM6Easy-v0. This is because the implementation of \(\pi_{MPC-N}^{perfect}\) accesses the fixed future loads and generator outputs stored in the environment object, and custom environments may not provide such fixed time series.

A code example is provided below. Note that the only difference with the example in the previous section is the agent class: MPCAgentPerfect is used instead of MPCAgentConstant.

"""
This script shows how to run the MPC-based DC OPF policy
:math:`\\pi_{MPC-N}^{perfect}` in the ANM6Easy-v0 environment.

This policy assumes perfect forecasts of demands and generations
over the optimization horizon.

For more information, see https://gym-anm.readthedocs.io/en/latest/topics/mpc.html#perfect-forecast.
"""
import gym
from gym_anm import MPCAgentPerfect


def run():
    env = gym.make("ANM6Easy-v0")
    o = env.reset()

    # Initialize the MPC policy.
    agent = MPCAgentPerfect(env.simulator, env.action_space, env.gamma, safety_margin=0.96, planning_steps=10)

    # Run the policy.
    for t in range(100):
        a = agent.act(env)
        obs, r, done, _ = env.step(a)
        print(f"t={t}, r_t={r:.3}")


if __name__ == "__main__":
    run()
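
Both policies can also be compared side by side by tracking the discounted return \(\sum_t \gamma^t r_t\) accumulated over an episode. The snippet below is a minimal sketch of such a comparison; it only reuses the objects already shown above (the two agent classes, env.gamma, and the standard interaction loop), and the 100-step episode length is an arbitrary choice.

"""
Minimal sketch: compare the constant-forecast and perfect-forecast MPC policies
on ANM6Easy-v0 by their discounted returns (episode length chosen arbitrarily).
"""
import gym
from gym_anm import MPCAgentConstant, MPCAgentPerfect


def discounted_return(agent_cls, n_steps=100):
    env = gym.make("ANM6Easy-v0")
    env.reset()

    # Same hyperparameters as in the examples above.
    agent = agent_cls(env.simulator, env.action_space, env.gamma, safety_margin=0.96, planning_steps=10)

    ret, discount = 0.0, 1.0
    for _ in range(n_steps):
        a = agent.act(env)
        _, r, done, _ = env.step(a)
        ret += discount * r
        discount *= env.gamma
        if done:
            break
    return ret


if __name__ == "__main__":
    for cls in (MPCAgentConstant, MPCAgentPerfect):
        print(f"{cls.__name__}: discounted return = {discounted_return(cls):.3f}")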