BPMN leaks-when analysis

The input to BPMN leaks-when analysis is a BPMN model whose tasks are annotated with pseudocode scripts that spell out how the different components of the model are related.

Annotating the model

The BPMN leaks-when analysis works on pseudocode scripts whose syntax is specified in the Syntax section below. It expects most tasks to have a script, and it expects the tasks that use the input data objects to specify the fields of the input data.

BPMN leaks-when analysis is available through the PE-BPMN editor, which allows running the analysis as well as attaching scripts to tasks.

For an example please see this model in the BPMN leaks-when view in Pleak.

Running the analysis

BPMN leaks-when analysis is accessible from the Disclosure editor and can be activated by clicking the respective button.

The analysis output is a table that summarizes which components of the inputs flow to the outputs. Each cell summarizes the conditions and filters that the data passes through. The cell value "always" means that the data always flows to the output and that there were no restrictive filters in the flow. The cell value "never" means that this output is not affected by that input. Finally, the cell value "if" indicates that the flow is conditional; the passed filters can be seen by hovering over the cell. In addition, everything sent through the network (message flows) is summarized in a separate row for the potential network observer.

Syntax

Computation scripts are only added to tasks. Most tasks are expected to have the respective output data object defined by the scripts. The only exceptions are sending tasks, which have no script, and the task before an exclusive gateway, which defines the predicate for the gateway. In addition, if PE-BPMN stereotypes are used, some of them (e.g. encryption tasks) result in default scripts.

The main functions of interest are named filter_<filterName> (or hash_<filterName>), where the prefix filter (or hash) marks them as filtering functions that are collected into the analysis output together with the predicate data.

In general, the lines of code look like:

  <output_data_name>.<field_in_output_data> = function(<input_data_name>.<field_in_input_data>, …, <input_data_name>.<field_in_input_data>)

Instead of the function it is also possible to give constant values to the fields or to copy some field of the input:

  <output_data_name>.<field_in_output_data>=<input_data_name>.<field_in_input_data>
  <output_data_name>.<field_in_output_data>=<constant>

Constants can be either numeric or “strings” (with quotation marks). For filters the function must be called filter_<filterName> (or hash_<filterName>) so that the analyzer recognizes it as a filter.

For the predicate tasks (before the starting exclusive gateway) the script is just is_<Predicate_name>(<input_data_name>.<field_in_input_data>, …, <input_data_name>.<field_in_input_data>) where the predicate must start with “is_” and there is no output data object.
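As a rough illustration of the line shapes described above, the following Python sketch classifies a single script line. The regular expressions, the category names and the function classify are assumptions made for this example only; this is not the analyzer's parser, and the real grammar may accept more or less than what is matched here.

```python
import re

# Assumed shape of <data_name>.<field>; the analyzer's real grammar may differ.
FIELD = r"[A-Za-z_]\w*\.[A-Za-z_]\w*"
# One or more comma-separated field references.
ARGS = rf"\s*{FIELD}\s*(?:,\s*{FIELD}\s*)*"
# Numeric constants or "strings" in quotation marks.
CONST = r'(?:-?\d+(?:\.\d+)?|"[^"]*")'

# Order matters: filters are tried before general function calls.
PATTERNS = [
    ("predicate", re.compile(rf"^is_\w+\({ARGS}\)$")),
    ("filter", re.compile(rf"^{FIELD}\s*=\s*(?:filter|hash)_\w+\({ARGS}\)$")),
    ("function", re.compile(rf"^{FIELD}\s*=\s*[A-Za-z_]\w*\({ARGS}\)$")),
    ("copy", re.compile(rf"^{FIELD}\s*=\s*{FIELD}$")),
    ("constant", re.compile(rf"^{FIELD}\s*=\s*{CONST}$")),
]


def classify(line):
    """Return the (hypothetical) category of one script line."""
    text = line.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(text):
            return kind
    return "unknown"
```

For instance, classify("report.age_ok = filter_adult(person.age)") yields "filter", while classify("is_Adult(person.age)") yields "predicate". The data object and field names in these calls are made up for the illustration.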

In addition to filters, lists have a special syntax that allows new objects to be appended and retrieved later.

Lists

Association lists are a GADT “(keytype * valuetype) assoclist” with the following six operations:

  • nil : (keytype * valuetype) assoclist
  • add : keytype → valuetype → (keytype * valuetype) assoclist → (keytype * valuetype) assoclist
  • update : keytype → valuetype → (keytype * valuetype) assoclist → (keytype * valuetype) assoclist
  • find : keytype → (keytype * valuetype) assoclist → valuetype
  • endsWith : keytype → (keytype * valuetype) assoclist → bool /* note that this is probably used in the opposite direction in the model */
  • contains : keytype → (keytype * valuetype) assoclist → bool

They satisfy the following equalities:

  • endsWith(_, nil) = false
  • contains(_, nil) = false
  • endsWith(K, add(K', _, L)) = (K == K')
  • endsWith(K, update(_, _, L)) = endsWith(K, L)
  • contains(K, add(K', _, L)) = (K == K') || contains(K, L)
  • contains(K, update(_, _, L)) = contains(K, L)
  • find(_, nil) = bottom
  • find(K, add(K', V, L)) = if K == K' then V else find(K, L)
  • find(K, update(K', V, L)) = if contains(K, L) then (if K == K' then V else find(K, L)) else bottom
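The equalities above can be read as a small executable model. The following Python sketch is one possible reading, not the analyzer's implementation: lists are represented as nested tuples, the sentinel BOTTOM stands for bottom, and the names NIL, BOTTOM and ends_with are choices made for this example.

```python
BOTTOM = object()  # stands for "bottom", the result of a failed lookup
NIL = ("nil",)     # the empty association list


def add(key, value, lst):
    return ("add", key, value, lst)


def update(key, value, lst):
    return ("update", key, value, lst)


def ends_with(key, lst):
    if lst[0] == "nil":
        return False
    tag, k, _value, rest = lst
    if tag == "add":
        return key == k
    return ends_with(key, rest)  # update does not change the last added key


def contains(key, lst):
    if lst[0] == "nil":
        return False
    tag, k, _value, rest = lst
    if tag == "add":
        return key == k or contains(key, rest)
    return contains(key, rest)  # update never introduces a new key


def find(key, lst):
    if lst[0] == "nil":
        return BOTTOM
    tag, k, value, rest = lst
    if tag == "add":
        return value if key == k else find(key, rest)
    # update: only keys already present in the rest of the list can be found
    if not contains(key, rest):
        return BOTTOM
    return value if key == k else find(key, rest)
```

Note that, under these equalities, update on a key that is not yet in the list has no visible effect: contains stays false and find returns bottom.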

Note that currently the list predicates contains and endsWith are not considered predicates in the BPMN leaks-when output. If the answer is clear, only the respective branch is analyzed. If the predicate cannot be simplified to a boolean value, both branches are analyzed and the predicate simply does not show up in the analysis outcome.

Examples of List Operations in Pleak models

Model Restrictions

  • No cycles in the model
  • Only one regular start event (other pools must start with a message start event). Additional start events may be present in subprocesses.
  • The task before a starting exclusive gateway has no output and can only contain a predicate script
  • A sending task also has no output and no script; the sent data is the input to the task
  • Data is received through message catch events
  • All other tasks should have some data inputs and at least one output
  • All data objects on the model must have unique names
    • Except the data sent over the network: uniqueness is still important for the analyzer, but there is a translator component that can add it
    • Except copies of secret keys (data with the same name will be treated as the same data)
  • Data object names must not contain spaces (use _ instead, for example)
  • All message flows can carry one data object
  • All pools must have names
  • Each data object is written at most once (has one incoming data association)
    • Except when some constant value is written in branching (see this model for an example)
    • Or when some other value is written in branching in the same way, with both branches writing the same fields (however, these cases may prove difficult for the analyzer)
  • There can be a message flow from party A to party B only if either A started B (sent the first message to B's message start event) or vice versa.
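Some of these restrictions are easy to check mechanically before running the analysis. The sketch below covers only three of them (unique names, no spaces in names, no cycles) and assumes a deliberately simplified model representation: a list of data object names and a list of (source, target) sequence flows. Both function names and this representation are hypothetical and are not part of Pleak.

```python
from collections import Counter


def check_data_names(names, exempt=()):
    """Report names that contain spaces or are duplicated.

    `exempt` lists names that are allowed to repeat, e.g. copies of a
    secret key.
    """
    problems = []
    for name, count in Counter(names).items():
        if " " in name:
            problems.append(f"name contains a space: {name!r}")
        if count > 1 and name not in exempt:
            problems.append(f"duplicate data object name: {name!r}")
    return problems


def has_cycle(flows):
    """Detect a cycle among (source, target) flows with a depth-first search."""
    graph = {}
    for src, dst in flows:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {}  # node -> "active" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "active"
        for nxt in graph[node]:
            if state.get(nxt) == "active":
                return True  # back edge: the model contains a cycle
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)
```

A model export would first have to be converted into these two lists; how to do that depends on the export format and is left out here.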

Tips for Successful Usage

  • From the analyzer's perspective, it is important to make sure that each new data field is computed using all the fields in the input data that are relevant
  • From the user's perspective, it is important to give all filters and predicates good names that reflect their purpose (since these appear in the analysis output)
  • Stereotypes only apply to *.data fields of the data objects. All other fields are not considered protected. If more fields need protection then additional steps to copy and rename fields must be taken.
    • Stereotypes and lists do not mix well (e.g. protecting lists). It should be OK to add protected elements to lists.
  • The restriction to having only one start event can be worked around by choosing one real start event and replacing the other start events with message start events. The process with the real start event should then start by sending dummy messages to the introduced message start events.
  • The restriction on the network structure can be lifted by introducing a new pool for a network relay that relays all the communication and starts all the parties but does nothing else.

Stereotype support

BPMN leaks-when analysis has limited support for PE-BPMN stereotypes.

  • Encryption
    • SK/PK/ABE Encrypt/Decrypt
  • SecureChannel
  • Protect/OpenConfidentiality
  • PETComputation

Using the stereotypes means that the script in BPMN leaks-when is left empty by the user and there is a PE-BPMN stereotype attached to the task in the PE-BPMN editor. The roles of the inputs (e.g. plaintext and the key) are determined based on the stereotype settings.

For encryption stereotypes we assume that the input data has a field called “data” and the key data object has a field called “key”. Hence, the encryption task encrypts the field “data” with the key from field “key” and puts the result into the output field “data”. The rest of the fields in the plaintext input are copied to the output with the same names as they had for the input. Decryption works analogously with the fields with the same name. Decryption only succeeds when the right key is used.
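The field handling described above can be sketched symbolically. The Python functions below are an illustration for the symmetric-key case only, not the default script that Pleak generates: data objects are plain dictionaries, a ciphertext is modelled as the tuple ("enc", payload, key), and returning None on a failed decryption is an assumption of this example.

```python
def encrypt(plaintext, key_object):
    """Encrypt the "data" field symbolically; copy all other fields as they are."""
    output = dict(plaintext)
    output["data"] = ("enc", plaintext["data"], key_object["key"])
    return output


def decrypt(ciphertext, key_object):
    """Restore the "data" field if the key matches, otherwise return None."""
    data = ciphertext["data"]
    is_enc = isinstance(data, tuple) and len(data) == 3 and data[0] == "enc"
    if not is_enc or data[2] != key_object["key"]:
        return None  # assumed failure value; wrong key or not a ciphertext
    output = dict(ciphertext)
    output["data"] = data[1]
    return output
```

With a hypothetical record {"data": "diagnosis", "patient_id": 7} and key object {"key": "k1"}, encryption leaves patient_id untouched and replaces only data, and decrypting with the same key object returns the original record.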

For Protect/OpenConfidentiality and PETComputation stereotypes all fields of the output data marked as private are considered to have the protection.

Note that a model that correctly uses stereotypes for BPMN leaks-when analysis may not be valid for PE-BPMN analysis, as the latter does not work with the inner structure of the data.

Also note that an input data object should not be directly used as a plaintext input to encryption - it should be preceded by some (potentially dummy) data processing task to ensure that the fields of the encryption input are defined.

In the analysis results for the network messages the data protected by a stereotype does not appear, unless the data needed to open it (e.g. the decryption key) also appears. If both the key and the ciphertext are available then we assume that the data is seen on the network.
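The rule for the network observer can be illustrated with a small Python sketch. It assumes that each message is a hashable value, that protected data is modelled as the tuple ("enc", payload, key), and that a key sent over the network appears as a plain value; the function names are hypothetical and the sketch does not reflect how the analyzer computes its result.

```python
def is_ciphertext(value):
    return isinstance(value, tuple) and len(value) == 3 and value[0] == "enc"


def observer_view(messages):
    """Return the set of values a network observer is assumed to see."""
    visible = {m for m in messages if not is_ciphertext(m)}
    ciphertexts = [m for m in messages if is_ciphertext(m)]
    changed = True
    while changed:  # repeat, since an opened payload may itself be a key
        changed = False
        for _tag, payload, key in ciphertexts:
            if key in visible and payload not in visible:
                visible.add(payload)
                changed = True
    return visible
```

Sending only the ciphertext ("enc", "salary", "k1") reveals nothing about "salary", while sending the ciphertext together with "k1" makes both "k1" and "salary" visible.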

The stereotypes that are not supported by the analysis can be present on the model, but the tasks with these stereotypes are considered as regular tasks by the analyzer and unsupported stereotypes do not affect the analysis outcome.

Source code

The source code of the analysis tool is available in the pleak-leaks-when-analysis repository. The user interface of the analysis tool is accessible through the pleak-sql-editor and the PE-BPMN editor.

bpmn-leaks-when-analysis.txt · Last modified: 2021/02/04 14:39 by pullonen