{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reward Models\n",
"\n",
"In [Getting Started](../getting_started.ipynb), we mainly looked at probabilities in the Markov models and properties that refer to these probabilities.\n",
"In this section, we discuss reward models."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring reward models\n",
"\n",
"[01-reward-models.py](https://github.com/moves-rwth/stormpy/blob/master/examples/reward_models/01-reward-models.py)\n",
"\n",
"We consider the die again, but with another property which talks about the expected reward:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> import stormpy\n",
">>> import stormpy.examples\n",
">>> import stormpy.examples.files\n",
">>> program = stormpy.parse_prism_program(stormpy.examples.files.prism_dtmc_die)\n",
">>> prop = \"R=? [F \\\"done\\\"]\"\n",
"\n",
">>> properties = stormpy.parse_properties(prop, program, None)\n",
">>> model = stormpy.build_model(program, properties)\n",
">>> assert len(model.reward_models) == 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The model now has a reward model, as the property talks about rewards.\n",
"When [Building Models](building_models.ipynb) from explicit sources, the reward model is always included if it is defined in the source.\n",
"We can do model checking analogous to probabilities:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> initial_state = model.initial_states[0]\n",
">>> result = stormpy.model_checking(model, properties[0])\n",
">>> print(\"Result: {}\".format(round(result.at(initial_state), 6)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reward model has a name which we can obtain as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> reward_model_name = list(model.reward_models.keys())[0]\n",
">>> print(reward_model_name)"
]
},
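{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch (not part of the original example, and assuming the reward model is named in the PRISM file): if a model defines several reward models, Storm's property syntax can select one explicitly via the `R{\"<name>\"}` operator. Using the name obtained above, the earlier query can be written as:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> # Sketch: select the reward model by name in the property (name taken from above).\n",
">>> named_prop = \"R{{\\\"{}\\\"}}=? [F \\\"done\\\"]\".format(reward_model_name)\n",
">>> named_properties = stormpy.parse_properties(named_prop, program, None)\n",
">>> named_result = stormpy.model_checking(model, named_properties[0])\n",
">>> print(round(named_result.at(initial_state), 6))"
]
},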
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We discuss later how to work with multiple reward models.\n",
"Rewards come in multiple fashions, as state rewards, state-action rewards and as transition rewards.\n",
"In this example, we only have state-action rewards. These rewards are a vector, over which we can trivially iterate:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> assert not model.reward_models[reward_model_name].has_state_rewards\n",
">>> assert model.reward_models[reward_model_name].has_state_action_rewards\n",
">>> assert not model.reward_models[reward_model_name].has_transition_rewards\n",
">>> for reward in model.reward_models[reward_model_name].state_action_rewards:\n",
"... print(reward)"
]
}
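,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small additional sketch (not part of the original example): in a DTMC every state has exactly one choice, so the state-action reward vector has one entry per state and can be paired with the states by index:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide-output": false
},
"outputs": [],
"source": [
">>> # Sketch: for a DTMC, choice index i corresponds to state i.\n",
">>> rewards = model.reward_models[reward_model_name].state_action_rewards\n",
">>> for state in model.states:\n",
"...     print(\"State {}: reward {}\".format(state.id, rewards[state.id]))"
]
}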
],
"metadata": {
"date": 1598188121.7157953,
"filename": "reward_models.rst",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
},
"title": "Reward Models"
},
"nbformat": 4,
"nbformat_minor": 4
}