diff --git a/doc/source/advanced_topics.rst b/doc/source/advanced_topics.rst index ec89554..df9f41e 100644 --- a/doc/source/advanced_topics.rst +++ b/doc/source/advanced_topics.rst @@ -12,8 +12,8 @@ This guide is a collection of examples meant to bridge the gap between the getti doc/building_models doc/building_dtmcs doc/building_ctmcs - doc/building_mas doc/building_mdps + doc/building_mas doc/engines doc/exploration doc/reward_models diff --git a/doc/source/doc/building_ctmcs.rst b/doc/source/doc/building_ctmcs.rst index 27ecc0b..ee20264 100644 --- a/doc/source/doc/building_ctmcs.rst +++ b/doc/source/doc/building_ctmcs.rst @@ -30,23 +30,22 @@ In this example, we build the transition matrix using a numpy array:: The following function call returns a sparse matrix with default row groups:: >>> transition_matrix = stormpy.build_sparse_matrix(transitions) - >>> print(transition_matrix)< - + >>> print(transition_matrix) 0 1 2 3 ---- group 0/3 ---- - 0 ( 0 1.5 0 0 ) 0 + 0 ( 0 1.5 0 0 ) 0 ---- group 1/3 ---- - 1 ( 3 0 1.5 0 ) 1 + 1 ( 3 0 1.5 0 ) 1 ---- group 2/3 ---- - 2 ( 0 3 0 1.5 ) 2 + 2 ( 0 3 0 1.5 ) 2 ---- group 3/3 ---- - 3 ( 0 0 3 0 ) 3 + 3 ( 0 0 3 0 ) 3 0 1 2 3 Labeling ================ -The state labeling is created analogously to the previous example:: +The state labeling is created analogously to the previous example in :doc:`building_dtmcs`:: >>> state_labeling = stormpy.storage.StateLabeling(4) >>> state_labels = {'empty', 'init', 'deadlock', 'full'} @@ -58,7 +57,7 @@ The state labeling is created analogously to the previous example:: Exit Rates ==================== -Lastly, we equip every state with an exit rate:: +Lastly, we initialize a list to equip every state with an exit rate:: >>> exit_rates = [1.5, 4.5, 4.5, 3.0] diff --git a/doc/source/doc/building_dtmcs.rst b/doc/source/doc/building_dtmcs.rst index 9def2f1..a2dc7bf 100644 --- a/doc/source/doc/building_dtmcs.rst +++ b/doc/source/doc/building_dtmcs.rst @@ -9,29 +9,30 @@ As described in :doc:`../getting_started`, Storm can be used to translate a model description e.g. in form of a prism file into a Markov chain. Here, we use Stormpy to create the single components, to build a DTMC without parsing a model description. -We consider the previous example of the die. +We consider the previous example of the dice. .. seealso:: `01-building-dtmcs.py ` -In the following we create transition matrix, the state labeling and the reward models of a DTMC. +In the following we create the transition matrix, the state labeling and the reward models of a DTMC. First, we import stormpy:: ->>> import stormpy + >>> import stormpy Transition Matrix ===================== We begin by creating the matrix representing the transitions in the model in terms of probabilities. For constructing the transition matrix, we use the SparseMatrixBuilder:: ->>> builder = stormpy.SparseMatrixBuilder(rows = 0, columns = 0, entries = 0, force_dimensions = False, has_custom_row_grouping = False) + >>> builder = stormpy.SparseMatrixBuilder(rows = 0, columns = 0, entries = 0, force_dimensions = False, has_custom_row_grouping = False) Here, we start with an empty matrix to later insert more entries. If the number of rows, columns and entries is known, the matrix can be constructed using these values. For DTMCs each state has at most one outgoing probability distribution. -Thus, we create matrix with trivial row grouping where each group contains one row representing the state action. +Thus, we create a matrix with trivial row grouping where each group contains one row representing the state action. +In :doc:`building_mdps` we will revisit the example of the die, but extend the model with nondeterministic choice. -We specify the transitions of the model, by adding values to the matrix where the column represents the target state. +We specify the transitions of the model by adding values to the matrix where the column represents the target state. All transitions are equipped with a probability defined by the value:: >>> builder.add_next_value(row = 0, column = 1, value = 0.5) @@ -55,12 +56,12 @@ Lastly, we add a self-loop with probability one to the final states:: ... builder.add_next_value(s, s, 1) -Finally, we can build the matrix with updated row and columns count that both coincide with the number of states:: +Finally, we can build the matrix:: - >>> transition_matrix = builder.build(13, 13) + >>> transition_matrix = builder.build() It should be noted that entries can only be inserted in ascending order, i.e. row by row and column by column. -Stormpy provides the possibility to build a sparse matrix using the numpy library +Stormpy provides the possibility to build a sparse matrix using the numpy library (https://numpy.org/ ) Instead of using the SparseMatrixBuilder, a sparse matrix can be build from a numpy array via the method stormpy.build_sparse_matrix. Labeling @@ -77,9 +78,11 @@ In order to specify the state labeling we create an empty labeling for the given Labels can be asociated with states. As an example, we label the state 0 with 'init':: + >>> state_labeling.add_label_to_state('init', 0) >>> print(state_labeling.get_states('init')) bit vector(1/13) [0] + Next, we set the associations between the remaining labels and states.:: >>> state_labeling.add_label_to_state('one', 7) diff --git a/doc/source/doc/building_mas.rst b/doc/source/doc/building_mas.rst index cdc3034..f9dae29 100644 --- a/doc/source/doc/building_mas.rst +++ b/doc/source/doc/building_mas.rst @@ -6,9 +6,11 @@ Markov automata (MAs) Background ===================== -... works similar as explained in as in :doc:`building_dtmcs` +We already saw the process of building CTMCs with MDPs via Stormpy in :doc:`building_ctmcs` and :doc:`building_mdps`. +In this section, we build a small Markarov automata with five states from which the first four are markovian. +Since we covered labeling and exit rates already in the previous examples we omit the description of these components. +The full example can be found here: -Build simple ma example .. seealso:: `01-building-mas.py ` First, we import Stormpy:: @@ -17,20 +19,72 @@ First, we import Stormpy:: Transition Matrix ================== - - - -Labeling -================ - - -Reward models +In :doc:`building_mdps`, we used the SparseMatrixBuilder to create a matrix with a custom row grouping. +In this example, we use the numpy library: + +In the beginning, we create a numpy array that will be used to build the transition matrix of our model.:: + + >>> import numpy as np + >>> transitions = np.array([ + ... [0, 1, 0, 0, 0], + ... [0.8, 0, 0.2, 0, 0], + ... [0.9, 0, 0, 0.1, 0], + ... [0, 0, 0, 0, 1], + ... [0, 0, 0, 1, 0], + ... [0, 0, 0, 0, 1]], dtype='float64') + +When building the matrix we define a custom row grouping by passing a list containing the starting row of each row group in ascending order:: + + >>> transition_matrix = stormpy.build_sparse_matrix(transitions, [0, 2, 3, 4, 5]) + >>> print(transition_matrix) + + 0 1 2 3 4 + ---- group 0/4 ---- + 0 ( 0 1 0 0 0 ) 0 + 1 ( 0.8 0 0.2 0 0 ) 1 + ---- group 1/4 ---- + 2 ( 0.9 0 0 0.1 0 ) 2 + ---- group 2/4 ---- + 3 ( 0 0 0 0 1 ) 3 + ---- group 3/4 ---- + 4 ( 0 0 0 1 0 ) 4 + ---- group 4/4 ---- + 5 ( 0 0 0 0 1 ) 5 + 0 1 2 3 4 + + +Markovian States ================== +Feature of MA: Markovian states. +Those are represented by a BitVector that containing the respective states:: - -Exit Rates -==================== + >>> markovian_states = stormpy.BitVector(5, [1, 2, 3, 4]) Building the Model ==================== - +Now, we can collect all components:: + + >>> components = stormpy.SparseModelComponents(transition_matrix=transition_matrix, state_labeling=state_labeling, + markovian_states=markovian_states) + >>> components.choice_labeling = choice_labeling + >>> components.exit_rates = exit_rates + +Finally, we can build the model:: + + >>> ma = stormpy.storage.SparseMA(components) + >>> print(ma) + -------------------------------------------------------------- + Model type: Markov Automaton (sparse) + States: 5 + Transitions: 8 + Choices: 6 + Markovian St.: 4 + Max. Rate.: 12 + Reward Models: none + State Labels: 2 labels + * deadlock -> 0 item(s) + * init -> 1 item(s) + Choice Labels: 2 labels + * alpha -> 1 item(s) + * beta -> 1 item(s) + -------------------------------------------------------------- diff --git a/doc/source/doc/building_mdps.rst b/doc/source/doc/building_mdps.rst index eefd770..232b514 100644 --- a/doc/source/doc/building_mdps.rst +++ b/doc/source/doc/building_mdps.rst @@ -2,10 +2,11 @@ Discrete-time Markov decision processes (MDPs) *********************************************** - Background ===================== -Same example of die knuth as in dtmc but mdp (model with nondeterminsm) + +In :doc:`building_dtmcs` we modelled Knuth's model of a fair die by the means of a DTMC. +In the following we extend this model with nondeterministic choice by building a Markov Decision process. .. seealso:: `01-building-mdps.py ` @@ -15,25 +16,115 @@ First, we import Stormpy:: Transition Matrix ===================== -more choices than states! +Since we want build a nondeterminstic model, we create a transition matrix with a custom row group for each state:: + + >>> builder = stormpy.SparseMatrixBuilder(rows=0, columns=0, entries=0, force_dimensions=False, has_custom_row_grouping=True, row_groups=0) + +We need more than one row for the transitions starting in state 0 because a nondeterministic choice over the actions is available. +Therefore, we start a new group that will contain the rows representing actions of state 0. +Note that the row group needs to be added before any entries are added to the group:: + + >>> builder.new_row_group(0) + >>> builder.add_next_value(0, 1, 0.5) + >>> builder.add_next_value(0, 2, 0.5) + >>> builder.add_next_value(1, 1, 0.2) + >>> builder.add_next_value(1, 2, 0.8) +For the remaining states, we need to specify the starting rows of row groups:: + + >>> builder.new_row_group(2) + >>> builder.add_next_value(2, 3, 0.5) + >>> builder.add_next_value(2, 4, 0.5) + >>> builder.new_row_group(3) + >>> builder.add_next_value(3, 5, 0.5) + >>> builder.add_next_value(3, 6, 0.5) + >>> builder.new_row_group(4) + >>> builder.add_next_value(4, 7, 0.5) + >>> builder.add_next_value(4, 1, 0.5) + >>> builder.new_row_group(5) + >>> builder.add_next_value(5, 8, 0.5) + >>> builder.add_next_value(5, 9, 0.5) + >>> builder.new_row_group(6) + >>> builder.add_next_value(6, 10, 0.5) + >>> builder.add_next_value(6, 11, 0.5) + >>> builder.new_row_group(7) + >>> builder.add_next_value(7, 2, 0.5) + >>> builder.add_next_value(7, 12, 0.5) + + >>> for s in range(8, 14): + ... builder.new_row_group(s) + ... builder.add_next_value(s, s - 1, 1) + +Build:: + + >>> transition_matrix = builder.build() Labeling ================ +We have seen the construction of a state labeling in previous examples. Therefore we omit the description here. + +Instead we focus on the choices. +Since in state 0 a nondeterministic choice over two actions is available +The number of choices is 14. +To distinguish those we can define a choice labeling:: + + >>> choice_labeling = stormpy.storage.ChoiceLabeling(14) + >>> choice_labels = {'a', 'b'} + + >>> for label in choice_labels: + ... choice_labeling.add_label(label) + +We assign the label 'a' to the first action of state 0 and 'b' to the second. +Recall that those actions where defined in row one and two of the transition matrix respectively:: + + >>> choice_labeling.add_label_to_choice('a', 0) + >>> choice_labeling.add_label_to_choice('b', 1) + >>> print(choice_labeling) + Choice 2 labels + * a -> 1 item(s) + * b -> 1 item(s) Reward models ================== +Reward models, length of vector coincides with number of choices:: - -Exit Rates -==================== + >>> reward_models = {} + >>> action_reward = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] + >>> reward_models['coin_flips'] = stormpy.SparseRewardModel(optional_state_action_reward_vector=action_reward) Building the Model ==================== +Collect components:: + >>> components = stormpy.SparseModelComponents(transition_matrix=transition_matrix, state_labeling=state_labeling, reward_models=reward_models, rate_transitions=False) + >>> components.choice_labeling = choice_labeling -POMDPS -==================== -Give -observations as .. \ No newline at end of file +Build the model:: + + >>> mdp = stormpy.storage.SparseMdp(components) + >>> print(mdp) + Model type: MDP (sparse) + States: 13 + Transitions: 22 + Choices: 14 + Reward Models: coin_flips + State Labels: 9 labels + * one -> 1 item(s) + * six -> 1 item(s) + * three -> 1 item(s) + * four -> 1 item(s) + * done -> 6 item(s) + * init -> 1 item(s) + * five -> 1 item(s) + * deadlock -> 0 item(s) + * two -> 1 item(s) + Choice Labels: 2 labels + * a -> 1 item(s) + * b -> 1 item(s) + +Partially observable Markov decision process (POMDPs) +======================================================== + +To build a partially observable Markov decision process, +components.observations can be set to a list of numbers that defines the status of the observables in each state. diff --git a/examples/building_dtmcs/01-building-dtmcs.py b/examples/building_dtmcs/01-building-dtmcs.py index 1548b3d..d7d22da 100644 --- a/examples/building_dtmcs/01-building-dtmcs.py +++ b/examples/building_dtmcs/01-building-dtmcs.py @@ -27,7 +27,7 @@ def example_building_dtmcs_01(): builder.add_next_value(s, s, 1) # Build matrix - transition_matrix = builder.build(13, 13) + transition_matrix = builder.build() print(transition_matrix) # State labeling @@ -38,10 +38,11 @@ def example_building_dtmcs_01(): for label in labels: state_labeling.add_label(label) - # Set label to state + # Set label of state 0 state_labeling.add_label_to_state('init', 0) print(state_labeling.get_states('init')) + # Set remaining labels state_labeling.add_label_to_state('one', 7) state_labeling.add_label_to_state('two', 8) state_labeling.add_label_to_state('three', 9) diff --git a/examples/building_mas/01-building-mas.py b/examples/building_mas/01-building-mas.py index f543ceb..26bf220 100644 --- a/examples/building_mas/01-building-mas.py +++ b/examples/building_mas/01-building-mas.py @@ -1,9 +1,54 @@ import stormpy +import numpy as np -# hybrid_states example def example_building_mas_01(): - print('todo') + # Building the transition matrix using numpy + transitions = np.array([ + [0, 1, 0, 0, 0], + [0.8, 0, 0.2, 0, 0], + [0.9, 0, 0, 0.1, 0], + [0, 0, 0, 0, 1], + [0, 0, 0, 1, 0], + [0, 0, 0, 0, 1] + ], dtype='float64') + + # Build matrix and define indices of row groups (ascending order) + transition_matrix = stormpy.build_sparse_matrix(transitions, [0, 2, 3, 4, 5]) + print(transition_matrix) + + # StateLabeling + state_labeling = stormpy.storage.StateLabeling(5) + # Add labels + state_labels = {'init', 'deadlock'} + # Set labeling of states + for label in state_labels: + state_labeling.add_label(label) + state_labeling.add_label_to_state('init', 0) + + # Choice labeling + choice_labeling = stormpy.storage.ChoiceLabeling(6) + # Add labels + choice_labels = {'alpha', 'beta'} + # Set labeling of choices + for label in choice_labels: + choice_labeling.add_label(label) + choice_labeling.add_label_to_choice('alpha', 0) + choice_labeling.add_label_to_choice('beta', 1) + + exit_rates = [0.0, 10.0, 12.0, 1.0, 1.0] + + markovian_states = stormpy.BitVector(5, [1, 2, 3, 4]) + + # Collect components + components = stormpy.SparseModelComponents(transition_matrix=transition_matrix, state_labeling=state_labeling, + markovian_states=markovian_states) + components.choice_labeling = choice_labeling + components.exit_rates = exit_rates + + # Build the model + ma = stormpy.storage.SparseMA(components) + print(ma) if __name__ == '__main__': diff --git a/examples/building_mdps/01-building-mdps.py b/examples/building_mdps/01-building-mdps.py index 9eee6f2..1cc510b 100644 --- a/examples/building_mdps/01-building-mdps.py +++ b/examples/building_mdps/01-building-mdps.py @@ -1,13 +1,15 @@ import stormpy import numpy as np + # Knuth's model of a fair die using only fair coins def example_building_mdps_01(): nr_states = 13 nr_choices = 14 - # Building the transition matrix using - builder = stormpy.SparseMatrixBuilder(rows = 0, columns = 0, entries = 0, force_dimensions = False, has_custom_row_grouping = True, row_groups = 0) + # Transition matrix with custom row grouping: nondeterministic choice over the actions available in states + builder = stormpy.SparseMatrixBuilder(rows=0, columns=0, entries=0, force_dimensions=False, + has_custom_row_grouping=True, row_groups=0) # New row group, for actions of state 0 builder.new_row_group(0) @@ -15,48 +17,46 @@ def example_building_mdps_01(): builder.add_next_value(0, 2, 0.5) builder.add_next_value(1, 1, 0.2) builder.add_next_value(1, 2, 0.8) - # New row group, for actions of state 1 + # State 1 builder.new_row_group(2) builder.add_next_value(2, 3, 0.5) builder.add_next_value(2, 4, 0.5) - # New row group, for actions of state 2 + # State 2 builder.new_row_group(3) builder.add_next_value(3, 5, 0.5) builder.add_next_value(3, 6, 0.5) - # New row group, for actions of state 3 + # State 3 builder.new_row_group(4) builder.add_next_value(4, 7, 0.5) builder.add_next_value(4, 1, 0.5) - # New row group, for actions of state 4 + # State 4 builder.new_row_group(5) builder.add_next_value(5, 8, 0.5) builder.add_next_value(5, 9, 0.5) - # New row group, for actions of state 5 + # State 5 builder.new_row_group(6) builder.add_next_value(6, 10, 0.5) builder.add_next_value(6, 11, 0.5) - # New row group, for actions of state 6 + # State 6 builder.new_row_group(7) builder.add_next_value(7, 2, 0.5) builder.add_next_value(7, 12, 0.5) - # final states - for s in range(8,14): + # Add transitions for the final states + for s in range(8, 14): builder.new_row_group(s) - builder.add_next_value(s, s-1, 1) - - # Build transition matrix, set overridden_row_count = nr_states, no row group count since nondet otherwise set overrridenRowGroupCount - transition_matrix = builder.build(nr_choices, nr_states) + builder.add_next_value(s, s - 1, 1) + transition_matrix = builder.build() # State labeling state_labeling = stormpy.storage.StateLabeling(nr_states) # Add labels - labels = {'init','one', 'two', 'three', 'four', 'five', 'six', 'done', 'deadlock'} + labels = {'init', 'one', 'two', 'three', 'four', 'five', 'six', 'done', 'deadlock'} for label in labels: state_labeling.add_label(label) - # Set label for single states + # Set labeling of states state_labeling.add_label_to_state('init', 0) state_labeling.add_label_to_state('one', 7) state_labeling.add_label_to_state('two', 8) @@ -75,18 +75,20 @@ def example_building_mdps_01(): for label in choice_labels: choice_labeling.add_label(label) - # Set label for single choice + # Set labels choice_labeling.add_label_to_choice('a', 0) choice_labeling.add_label_to_choice('b', 1) + print(choice_labeling) # Reward models reward_models = {} - # Create a vector representing the state-action rewards. + # Create a vector representing the state-action rewards action_reward = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] - reward_models['coin_flips'] = stormpy.SparseRewardModel(optional_state_action_reward_vector = action_reward) + reward_models['coin_flips'] = stormpy.SparseRewardModel(optional_state_action_reward_vector=action_reward) # Collect components - components = stormpy.SparseModelComponents(transition_matrix=transition_matrix, state_labeling=state_labeling, reward_models=reward_models, rate_transitions = False) + components = stormpy.SparseModelComponents(transition_matrix=transition_matrix, state_labeling=state_labeling, + reward_models=reward_models, rate_transitions=False) components.choice_labeling = choice_labeling # Build the model @@ -94,6 +96,5 @@ def example_building_mdps_01(): print(mdp) - if __name__ == '__main__': - example_building_mdps_01() \ No newline at end of file + example_building_mdps_01()