BeerFMTEnv

Beer is one of the oldest and most widely consumed alcoholic drinks in the world, and there are many ways to produce it. This class provides a typical (and simple enough) simulation of the industrial-scale beer fermentation process. The only input that the simulation takes is the reaction temperature. The goal in this simulation is to reach the stop condition (finish production) within a certain time limit, the quicker the better.

To assist control algorithms, we also provide a canonical production process for this simulation. Please consult ‘profile_industrial’ in the BeerFMT section for more details.

BeerFMTEnv module

Following the description above, we provide the APIs below:

BeerFMT simulates the Beer Fermentation process.

class smpl.envs.beerfmtenv.BeerFMTEnvGym(dense_reward=True, normalize=True, debug_mode=False, action_dim=1, observation_dim=8, reward_function=None, done_calculator=None, max_observations=[15, 15, 15, 150, 150, 10, 10, 200], min_observations=[0, 0, 0, 0, 0, 0, 0, 0], max_actions=[16.0], min_actions=[9.0], observation_name=['X_A', 'X_L', 'X_D', 'S', 'EtOH', 'DY', 'EA', 'time'], action_name=['temperature'], np_dtype=<class 'numpy.float32'>, max_steps=200, error_reward=-200.0)[source]

Bases: smpl.envs.utils.smplEnvBase
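
The observation bounds in the signature above can be used to scale raw observations. The following is a minimal sketch, assuming that normalize=True corresponds to a linear map of each component from [min, max] to [-1, 1]; this page does not confirm that convention. The helpers normalize_observation and denormalize_observation are hypothetical names and are not part of smpl.

```python
# Bounds and names copied from the BeerFMTEnvGym signature defaults.
OBSERVATION_NAMES = ['X_A', 'X_L', 'X_D', 'S', 'EtOH', 'DY', 'EA', 'time']
MAX_OBSERVATIONS = [15, 15, 15, 150, 150, 10, 10, 200]
MIN_OBSERVATIONS = [0, 0, 0, 0, 0, 0, 0, 0]


def normalize_observation(obs, lows=MIN_OBSERVATIONS, highs=MAX_OBSERVATIONS):
    """Map each component linearly from [low, high] to [-1, 1] (assumed convention)."""
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x, lo, hi in zip(obs, lows, highs)]


def denormalize_observation(norm_obs, lows=MIN_OBSERVATIONS, highs=MAX_OBSERVATIONS):
    """Inverse of normalize_observation: map [-1, 1] back to [low, high]."""
    return [lo + (x + 1.0) * (hi - lo) / 2.0 for x, lo, hi in zip(norm_obs, lows, highs)]


# Illustrative raw values only; they are not taken from a real simulation run.
raw = [7.5, 0.0, 15.0, 75.0, 0.0, 5.0, 10.0, 100.0]
scaled = normalize_observation(raw)
restored = denormalize_observation(scaled)
```

Under this assumed convention, a component at its lower bound maps to -1, one at its upper bound maps to 1, and the midpoint maps to 0.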

done_calculator_standard(current_observation, step_count, reward, update_prev_biomass=False, done=None, done_info=None)[source]
Check whether the current episode is considered finished.

Returns a boolean value indicating whether the episode is done, and a dictionary with information. In done_calculator_standard, done_info looks like {“terminal”: boolean, “timeout”: boolean}, where “timeout” is true when the episode ends because it reached the maximum episode length, and “terminal” is true when “timeout” is true or when the episode ends due to a termination condition, such as an environment error (basically, done).

Parameters
  • current_observation ([np.ndarray]) – This is the denormalized observation, as usual.

  • step_count ([int]) – The number of steps taken so far in the episode.

  • reward ([float]) – The reward for the current step.

  • done ([bool], optional) – Defaults to None.

  • done_info ([dict], optional) – How the episode finished. Defaults to None.

Returns

done and done_info.

Return type

[(bool, dict)]
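
As an illustration of the done_info layout described above, the sketch below builds the dictionary from two flags. It is not the library's implementation: the helper make_done_info and the error_encountered flag are hypothetical, the `>=` comparison against the step limit is an assumption, and the limit of 200 is taken from the max_steps default in the class signature.

```python
MAX_STEPS = 200  # default max_steps from the BeerFMTEnvGym signature


def make_done_info(step_count, error_encountered=False, max_steps=MAX_STEPS):
    """Return (done, done_info) following the documented dictionary layout."""
    # "timeout": the episode reached the maximum episode length (assumed comparison).
    timeout = step_count >= max_steps
    # "terminal": timeout, or another termination condition such as an env error.
    terminal = timeout or error_encountered
    done_info = {"terminal": terminal, "timeout": timeout}
    return terminal, done_info


done, done_info = make_done_info(step_count=200)
```

With these assumptions, an episode that stops early because of an error reports "terminal" as true and "timeout" as false, which lets a caller tell the two cases apart.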

reset(initial_state=None, normalize=None)[source]

Required by gym. This function resets the environment and returns an initial observation.

reward_function_standard(previous_observation, action, current_observation, reward=None)[source]

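Computes the reward for one transition (s, a, s'): the previous observation, the action taken, and the resulting current observation.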

Parameters
  • previous_observation ([np.ndarray]) – This is the denormalized observation, as usual.

  • action ([np.ndarray]) – This is the denormalized action, as usual.

  • current_observation ([np.ndarray]) – This is the denormalized observation, as usual.

  • reward ([float]) – If a reward is provided, it is returned directly.

Returns

reward.

Return type

[float]
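
To make the signature above concrete, here is a minimal sketch of a function with the same parameters. It is not the reward used by reward_function_standard, whose formula this page does not give. The function time_penalty_reward is hypothetical; it penalizes elapsed simulation time, in the spirit of "the quicker the better", and assumes that 'time' is the last observation entry, as listed in observation_name. Whether a callable of this shape can be passed as reward_function to the constructor is also an assumption.

```python
def time_penalty_reward(previous_observation, action, current_observation, reward=None):
    """Hypothetical reward: negative elapsed time between two observations."""
    # Documented behaviour: if a reward is provided, return it directly.
    if reward is not None:
        return reward
    # Assumption: 'time' is the last component, per observation_name in the signature.
    elapsed = current_observation[-1] - previous_observation[-1]
    return -float(elapsed)


# Illustrative denormalized observations (8 components each) and a 1-D action.
previous = [1.0, 0.5, 0.1, 120.0, 5.0, 0.2, 0.1, 3.0]
current = [1.1, 0.6, 0.1, 118.0, 6.0, 0.2, 0.1, 4.0]
step_reward = time_penalty_reward(previous, [12.0], current)
```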

sample_initial_state()[source]

step(action, normalize=None)[source]

Required by gym. This function performs one step within the environment and returns the observation, the reward, whether the episode is finished, and debug information, if any.
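
The reset and step methods described above are typically driven by an episode loop. The sketch below shows one such loop with a constant-temperature policy. To stay runnable without smpl installed, it uses StubBeerEnv, a hypothetical stand-in that only mimics the four return values of step; its observations and rewards are placeholders and do not model fermentation.

```python
class StubBeerEnv:
    """Hypothetical stand-in exposing reset()/step() like a gym environment."""

    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return [0.0] * 8  # 8 components, matching observation_dim=8

    def step(self, action):
        self.step_count += 1
        # Placeholder observation: only the 'time' entry (last) changes.
        observation = [0.0] * 7 + [float(self.step_count)]
        reward = -1.0  # placeholder, not the library's reward
        done = self.step_count >= self.max_steps
        info = {}
        return observation, reward, done, info


def run_episode(env, temperature=12.0):
    """Roll out one episode, applying the same temperature at every step."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = [temperature]  # action_dim=1: the reaction temperature
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


total_reward, steps = run_episode(StubBeerEnv(max_steps=5))
```

With smpl installed, an instance of BeerFMTEnvGym could presumably be passed to run_episode in place of the stub. Whether step expects a raw or a normalized temperature depends on the normalize setting, so check that before reusing the constant 12.0, which lies inside the raw action range [9.0, 16.0] from the signature.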

smpl.envs.beerfmtenv.beer_ode(points, t, sets)[source]

Beer fermentation process