Browse Source

reward model docu

refactoring
Sebastian Junges 7 years ago
parent
commit
de2c4ad8e5
  1. 57
      doc/source/reward_models.rst
  2. 30
      examples/reward_models/01-reward-models.py

57
doc/source/reward_models.rst

@ -2,6 +2,63 @@
Reward Models
**************
In :doc:`getting_started`, we mainly looked at probabilities in the Markov models and properties that refer to these probabilities.
In this section, we discuss reward models.
Exploring reward models
------------------------
.. seealso:: `01-reward-models.py <https://github.com/moves-rwth/stormpy/blob/master/examples/reward_models/01-reward-models.py>`_
We consider the die again, but with another property which talks about the expected reward::
>>> import stormpy
>>> import stormpy.examples
>>> import stormpy.examples.files
>>> program = stormpy.parse_prism_program(stormpy.examples.files.prism_dtmc_die)
>>> prop = "R=? [F \"done\"]"
>>> properties = stormpy.parse_properties_for_prism_program(prop, program, None)
>>> model = stormpy.build_model(program, properties)
>>> assert len(model.reward_models) == 1
The model now has a reward model, as the property talks about rewards.
We can do model checking analogous to probabilities::
>>> initial_state = model.initial_states[0]
>>> result = stormpy.model_checking(model, properties[0])
>>> print("Result: {}".format(result.at(initial_state)))
Result: 3.6666666666666665
The reward model has a name which we can obtain as follows::
>>> reward_model_name = list(model.reward_models.keys())[0]
>>> print(reward_model_name)
coin_flips
We discuss later how to work with multiple reward models.
Rewards come in multiple fashions, as state rewards, state-action rewards and as transition rewards.
In this example, we only have state-action rewards. These rewards are a vector, over which we can trivially iterate::
>>> assert not model.reward_models[reward_model_name].has_state_rewards
>>> assert model.reward_models[reward_model_name].has_state_action_rewards
>>> assert not model.reward_models[reward_model_name].has_transition_rewards
>>> for reward in model.reward_models[reward_model_name].state_action_rewards:
... print(reward)
1.0
1.0
1.0
1.0
1.0
1.0
1.0
0.0
0.0
0.0
0.0
0.0
0.0

30
examples/reward_models/01-reward-models.py

@ -0,0 +1,30 @@
import stormpy
import stormpy.core
import stormpy.examples
import stormpy.examples.files
def example_reward_models_01():
program = stormpy.parse_prism_program(stormpy.examples.files.prism_dtmc_die)
prop = "R=? [F \"done\"]"
properties = stormpy.parse_properties_for_prism_program(prop, program, None)
model = stormpy.build_model(program, properties)
initial_state = model.initial_states[0]
result = stormpy.model_checking(model, properties[0])
print("Result: {}".format(result.at(initial_state)))
assert len(model.reward_models) == 1
reward_model_name = list(model.reward_models.keys())[0]
print(reward_model_name)
assert reward_model_name == "coin_flips"
assert not model.reward_models[reward_model_name].has_state_rewards
assert model.reward_models[reward_model_name].has_state_action_rewards
for reward in model.reward_models[reward_model_name].state_action_rewards:
print(reward)
assert not model.reward_models[reward_model_name].has_transition_rewards
if __name__ == '__main__':
example_reward_models_01()
Loading…
Cancel
Save