Basic Usage

This example demonstrates the basic usage and interface of a control environment in EPyT-Control.

[2]:

%pip install epyt-control --quiet

Note: you may need to restart the kernel to use updated packages.

[3]:

# Import a pre-defined control environment where the chlorine injection is to be controlled
from my_env import SimpleChlorineInjectionEnv

[4]:

# Create new instance of the control environment
env = SimpleChlorineInjectionEnv()

/tmp/Hanoi.inp: 100%|##########| 9.63k/9.63k [00:00<00:00, 11.4MB/s]
/tmp/weekPat_30min.mat: 100%|##########| 622/622 [00:00<00:00, 1.67MB/s]
/tmp/yearOffset_30min.mat: 100%|##########| 281/281 [00:00<00:00, 696kB/s]

Inspect the observation space (i.e. input to the agent/controller) by accessing the observation_space property:

[5]:

print(env.observation_space)  # Observations: 34-dimensional real-valued vector

Box(-inf, inf, (34,), float32)

Inspect the action space (i.e. output of the agent/controller) by accessing the action_space property:

[6]:

print(env.action_space)      # Action: A single scalar between 0 and 10000

Box(0.0, 10000.0, (1,), float32)
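A controller other than a random sampler may produce values outside these bounds. The following is a minimal sketch of clamping a scalar to the interval printed above, using only the standard library. The helper clip_action is hypothetical and not part of EPyT-Control; the default bounds 0 and 10000 are taken from the output above.

```python
def clip_action(value: float, low: float = 0.0, high: float = 10000.0) -> float:
    """Clamp a raw controller output to the closed interval [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, float(value)))


# One value below, one inside, and one above the bounds
clipped = [clip_action(v) for v in (-50.0, 2500.0, 12000.0)]
print(clipped)
```

Since the action space has shape (1,), the clipped scalar would presumably have to be wrapped in a one-element array before being passed to step().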

Reset the environment by calling the reset() function, which returns the initial observation together with a second value that is ignored here:

[7]:

obs, _ = env.reset()
print(obs)

[12556.066    11923.519     4822.2656    4733.9595    4261.0327
  3573.6365    2916.3572    2554.4258    2281.027     1386.6521
  1134.0741     746.6124     598.93304     97.082016   -21.494305
   334.78726   -837.3361   -1651.0421   -1678.5037    4849.304
   771.6065     346.30783   3345.3079    2037.8165    1415.335
  -633.45764   -128.32362     80.403206   438.5205     269.22598
    21.165922  -144.28528    164.87422    611.3063  ]

Execute some random actions by calling the step() function. A random action can be generated by calling the sample() function of the environment’s action space:

[8]:

# Run some iterations -- note that autoreset=True
for _ in range(20):
    # Pick a random action
    act = env.action_space.sample()

    # Apply the action, get a reward (to be maximized) and
    # new observations (i.e. sensor readings)
    obs, reward, terminated, _, _ = env.step(act)
    print(f"Action: {act}, Reward: {reward}")

Action: [2474.741], Reward: -9.50844955444336
Action: [2717.882], Reward: -9.431923866271973
Action: [7393.1963], Reward: -9.210000038146973
Action: [4436.3945], Reward: -9.08823013305664
Action: [4285.958], Reward: -8.941822052001953
Action: [1819.5602], Reward: -8.986867904663086
Action: [58.87161], Reward: -9.005390167236328
Action: [4916.792], Reward: -8.917482376098633
Action: [4793.775], Reward: -8.808938026428223
Action: [1181.165], Reward: -8.904912948608398
Action: [5630.688], Reward: -8.790962219238281
Action: [2118.5825], Reward: -8.896865844726562
Action: [7936.3438], Reward: -8.670406341552734
Action: [8404.205], Reward: -8.547883987426758
Action: [6086.2837], Reward: -8.462085723876953
Action: [5635.624], Reward: -8.378281593322754
Action: [3317.4182], Reward: -8.381725311279297
Action: [3939.8386], Reward: -8.341522216796875
Action: [3874.364], Reward: -8.501297950744629
Action: [1320.8661], Reward: -8.629948616027832
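The loop above prints each reward but keeps no summary and ignores the termination flag. A minimal sketch of collecting the rewards and stopping once an episode ends could look as follows. StubEnv and StubActionSpace are hypothetical stand-ins so that the snippet runs without EPyT-Control; they only mimic the five-value return of step() seen above, which is assumed to follow the Gymnasium convention (observation, reward, terminated, truncated, info). The reward formula is made up for illustration.

```python
import random


class StubActionSpace:
    """Hypothetical stand-in for an action space with a sample() method."""

    def __init__(self, low: float, high: float, seed: int = 0):
        self.low, self.high = low, high
        self._rng = random.Random(seed)

    def sample(self) -> list:
        # One-element action, uniformly drawn from [low, high]
        return [self._rng.uniform(self.low, self.high)]


class StubEnv:
    """Hypothetical environment; the reward grows as the action approaches 5000."""

    def __init__(self, horizon: int = 10):
        self.action_space = StubActionSpace(0.0, 10000.0)
        self.horizon = horizon
        self._t = 0

    def step(self, action):
        self._t += 1
        reward = -abs(action[0] - 5000.0) / 1000.0  # made-up reward
        terminated = self._t >= self.horizon
        # Assumed layout: observation, reward, terminated, truncated, info
        return [0.0], reward, terminated, False, {}


def random_rollout(env, max_steps: int) -> list:
    """Apply random actions and return the rewards collected until the episode ends."""
    rewards = []
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        rewards.append(float(reward))
        if terminated or truncated:
            break
    return rewards


rewards = random_rollout(StubEnv(horizon=10), max_steps=20)
print(f"steps: {len(rewards)}, mean reward: {sum(rewards) / len(rewards):.3f}")
```

With an environment that resets itself automatically, the early exit is a design choice rather than a necessity: it keeps the statistics confined to a single episode.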

Do not forget to close the environment by calling the close() function:

[9]:

env.close()
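If an exception is raised between creating and closing the environment, the close() call above is never reached. One way to guard against this, sketched here with the standard library's contextlib.closing, is to wrap the environment in a with block. StubEnv is a hypothetical stand-in that only exposes close(); whether this pattern suits a real EPyT-Control environment is an assumption.

```python
from contextlib import closing


class StubEnv:
    """Hypothetical stand-in for an environment that exposes close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


demo_env = StubEnv()
try:
    # closing() calls demo_env.close() when the block exits, even on error
    with closing(demo_env):
        raise RuntimeError("simulated failure inside the control loop")
except RuntimeError:
    pass

print(demo_env.closed)
```

A plain try/finally block with close() in the finally clause would achieve the same effect.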