{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Reward Models\n",
    "\n",
    "In [Getting Started](../getting_started.ipynb), we mainly looked at probabilities in Markov models and at properties that refer to these probabilities.\n",
    "In this section, we discuss reward models."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exploring reward models\n",
    "\n",
    "[01-reward-models.py](https://github.com/moves-rwth/stormpy/blob/master/examples/reward_models/01-reward-models.py)\n",
    "\n",
    "We consider the die again, this time with a property that refers to the expected reward:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import stormpy\n",
    "import stormpy.examples\n",
    "import stormpy.examples.files\n",
    "\n",
    "# Parse the PRISM program that describes the die.\n",
    "program = stormpy.parse_prism_program(stormpy.examples.files.prism_dtmc_die)\n",
    "# Reward property: expected reward until a state labelled 'done' is reached.\n",
    "prop = \"R=? [F \\\"done\\\"]\"\n",
    "\n",
    "properties = stormpy.parse_properties(prop, program, None)\n",
    "# Build the model with respect to the given properties.\n",
    "model = stormpy.build_model(program, properties)\n",
    "assert len(model.reward_models) == 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model now has a reward model, because the property refers to rewards.\n",
    "When [Building Models](building_models.ipynb) from explicit sources, the reward model is always included if it is defined in the source.\n",
    "Model checking works analogously to the probability case:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "initial_state = model.initial_states[0]\n",
    "# Check the reward property and report the value for the initial state.\n",
    "result = stormpy.model_checking(model, properties[0])\n",
    "print(\"Result: {}\".format(round(result.at(initial_state), 6)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The reward model has a name, which we can obtain as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# Take the name of the first (and here only) reward model.\n",
    "reward_model_name = list(model.reward_models.keys())[0]\n",
    "print(reward_model_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We discuss later how to work with multiple reward models.\n",
    "Rewards come in several forms: state rewards, state-action rewards, and transition rewards.\n",
    "In this example, we only have state-action rewards. These rewards are stored as a vector, which we can iterate over directly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# Only state-action rewards are present in this reward model.\n",
    "assert not model.reward_models[reward_model_name].has_state_rewards\n",
    "assert model.reward_models[reward_model_name].has_state_action_rewards\n",
    "assert not model.reward_models[reward_model_name].has_transition_rewards\n",
    "for reward in model.reward_models[reward_model_name].state_action_rewards:\n",
    "    print(reward)"
   ]
  }
 ],
 "metadata": {
  "date": 1598188121.7157953,
  "filename": "reward_models.rst",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  },
  "title": "Reward Models"
 },
 "nbformat": 4,
 "nbformat_minor": 4
}