Improving Parallel State-Space Exploration Using Genetic Algorithms

Description of this page
Benchmark Analysis
Benchmark Description
How to replay the benchmark?

Description of this page

This page presents the benchmark used in our paper.

You can download the fork of Spot we use for our experiments [here].

You can download the CSV containing all our experiments for reachability [here].

You can download the CSV containing all our experiments for deadlock detection [here].

You can download all the logs, models and script used for the benchmark at [here].

Benchmark Analysis

This section presents a more detailled analysis of the benchark presented in the paper.

Reachability

First of all we decided to evaluate the footprint overhead for the best fitness function, i.e. the lessthan function. The previous chart was obtained by generating (for each model) a new population. Then, for each state of this new generation we run an exploration algorithm. During this exploration, we compute the number of states that are reachable from the initial state. The chart below presents this result.

With the previous chart, results are present cumulated: this can mask some of the results. The chart below present, for each artificial state, the percent of reachable states (i.e. reachable from the real initial state) that have been discovered during the exploration. The results are presented model per model with a boxplot. We can observe that all almost all models have at least one artificial state where all its successors are reachable from the initial state. Moreover on third of the models have more than a half of their artificial initial state with more than 50 percents of successors that are reachable.

States Comparison (normalized) — Fig1. - Variations in term of reachable states.

Finally the scatter plots below allows to compare the performances for the various settings of the algorithm. All these algorithms are referenced through their command line options. See below for a transcript from options to algorithms (note that options are separated by __)

Fitness Function. In the paper we only report few fitness functions. Nonetheless, during our experiments we tried other functions:

equal: this corresponds to the equality fitness function mentionned in the paper. A state is selected iff the number of its successors is equal to the average number of successors in the initial population.
lessthan: this corresponds to the lessthan fitness function mentionned in the paper. A state is selected iff the number of its successors is less or equal to the average number of successors in the initial population.
lessstrict: a state is selected iff the number of its successors is less (strict) to the average number of successors in the initial population.
greaterthan: this corresponds to the greaterthan fitness function mentionned in the paper. A state is selected iff the number of its successors is greater or equal to the average number of successors in the initial population.
greaterstrict: a state is selected iff the number of its successors is greater (strict) to the average number of successors in the initial population.
greaterbounded: a state is selected iff the number of its successors is greater (strict) to the average number of successors in the initial population.

swarmed-gp-dfs=fitnessfunction is the genetic algorithm presented in the paper; swarmed-dfs is the reachability algorithm presented in the paper.
- -I specify the size of the initial population
- -T specify the thresehold (default is 0.999)
- -G specify the number of generation we use (default is 3)
- -P specify the size of each new generation
- -p specify the number of threads to use

Safety Properties

The scatter plot below allows to compare some variations about the deadlock detection. Note that finding a deadlock can stop the exploration before visiting all states.

Benchmark Description

This benchmark uses 38 models extracted from the BEEM benchmark that can be downloaded [Here]. The table above present all the models we use. We observe that that every type of model of the classification of Pelanek is represented.

*Model*	*Classification*	*Type*
adding.6	puzzle	toy
anderson.3	mutex	simple
at.5	mutex	simple
bakery.6	mutex	simple
blocks.3	ps	toy
brp.6	communication-protocol	simple
brp2.6	communication-protocol	simple
cambridge.7	communication-protocol	complex
collision.4	communication-protocol	simple
elevator.4	controller	simple
elevator_planning.2	ps	toy
extinction.4	leader-election	simple
firewire_tree.5	leader-election	complex
fischer.5	mutex	simple
frogs.5	puzzle	toy
hanoi.3	puzzle	toy
iprotocol.6	communication-protocol	complex
lamport.8	mutex	simple
lamport_nonatomic.4	mutex	simple
lann.6	mutex	simple
leader_election.5	leader-election	simple
lifts.7	controller	complex
lup.4	other	simple
mcs.5	mutex	simple
needham.4	other-protocol	complex
peterson.5	mutex	simple
phils.5	mutex	toy
plc.4	controller	complex
production_cell.6	controller	simple
public_subscribe.4	other-protocol	complex
resistance.2	other	simple
rether.6	communication-protocol	simple
rushhour.4	puzzle	toy
schedule_world.2	ps	simple
sorter.4	controller	simple
synapse.7	other-protocol	complex
szymanski.5	mutex	simple
telephony.4	other-protocol	simple

Replay Benchmark

Before installing Spot, you need to install the DiVine tool patched by the LTSmin Team, since we rely on it to generate the state space of each model. You just have to follow instructions given in section 10 of [LTSmin Homepage]. Nonetheless, this tools does not handle out-of-bound exceptions. Since our algorithms may generate states that are not reachable, we provided two patches to apply to this tool to handle this problem just apply patch1 and patch2

Download and install our fork of Spot thought classic steps:

autoreconf -vfi
./configure
make

Once all previous steps have been done, you should find a binary file modelcheck in the spot/tests/ltsmin directory. This is the binary file we use for our experiments.