====== sql-derivative-sensitivity-analyser_demo ======

Last modified: 2021/06/14 11:22 by alisa
  
<code>
u1 = lp 2.0 latitude longitude;
u2 = scaleNorm 0.2 u1;
</code>
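Read literally, the norm above measures a change of a ship's location by the Euclidean (l_2) distance between the old and the new coordinates, scaled by 0.2. The following Python sketch spells out that reading; the function name and the interpretation of ''scaleNorm'' as plain multiplication are our assumptions, not analyzer output.

```python
import math

def u2_norm(d_latitude, d_longitude):
    """Sketch of the composite norm above: u1 is the l2 (Euclidean) norm of
    the location change, and u2 scales it by 0.2, so a location change of
    5 units counts as distance 1 under the norm."""
    u1 = math.hypot(d_latitude, d_longitude)  # u1 = lp 2.0 latitude longitude
    return 0.2 * u1                           # u2 = scaleNorm 0.2 u1

# A ship that moved 4 units in latitude and 3 in longitude: u1 = 5, u2 = 1.
print(u2_norm(4.0, 3.0))  # 1.0
```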
  
==== Setting up tasks ====
Click on the task ''Estimate the first arrival''. We need to insert here a query, together with a schema of the table that results from executing that query. Let the query return the earliest time at which some ship arrives at the port located at the point (0,0). We assume that each ship starts moving at its maximum speed.

The output table schema defines a table that contains just a single column.
<code>
create table min_time(cnt INT8);
</code>
  
The output table query describes how the arrival time is computed from a ship's location and its speed.

<code>
SELECT
    MIN ((ship.latitude ^ 2 + ship.longitude ^ 2) ^ 0.5 / ship.max_speed) AS cnt
FROM
    ship
;
</code>
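The query's arithmetic can be mimicked outside the database. The sketch below recomputes the same minimum arrival time over a hypothetical in-memory ship table (the sample values are ours, not from the demo data).

```python
import math

# Hypothetical ship table: (latitude, longitude, max_speed).
ships = [
    (30.0, 40.0, 10.0),   # distance to (0,0) is 50, arrival time 5.0
    (-60.0, 80.0, 25.0),  # distance to (0,0) is 100, arrival time 4.0
]

def min_arrival_time(ships):
    """What the query computes: each ship heads straight for the port at
    (0, 0) at its maximum speed; return the earliest arrival time."""
    return min(math.hypot(lat, lon) / speed for lat, lon, speed in ships)

print(min_arrival_time(ships))  # 4.0
```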
  
==== Running sensitivity analysis ====
  
We are now ready to run the analysis. Click the blue button //Analyze//. Let us first set ε = 1 and β = 0.1, and set the slider "Confidence level of estimated noise" to 90%. Click the green button //Run Analysis//. The most interesting value in the output is the //relative error//. It can be interpreted as an upper bound on the relative distance of the noisy output from the actual output, which holds with probability 90%. There is unfortunately no strict upper bound on the additive noise: it can be arbitrarily large, though only with negligible probability. Hence, we can only give a probabilistic upper bound on the noise.
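To see where such a probabilistic bound comes from, assume the noise is Laplace-distributed with some scale b, as is standard in differential privacy (the analyzer's exact mechanism may differ). Then |noise| ≤ b·ln(1/(1−p)) holds with probability p; the sketch below checks this empirically for p = 90%.

```python
import math
import random

def laplace_sample(scale, rng):
    """Draw one sample from Laplace(0, scale) by inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

# For Laplace noise with scale b, |noise| <= b * ln(1 / (1 - p)) holds with
# probability p; for p = 0.90 the bound is b * ln(10).
rng = random.Random(0)
scale = 1.0
bound = scale * math.log(10.0)

n = 100_000
hits = sum(abs(laplace_sample(scale, rng)) <= bound for _ in range(n))
print(hits / n)  # close to 0.90
```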
  
We can now play around with the model and see how the error can be reduced.
==== Running guessing advantage analysis ====
  
Our model is also ready for [[sql-guessing-advantage-analyser|guessing advantage analysis]]. Open the model in GA analyzer mode (e.g. by clicking the button //Change analyzer//). We see that the table schemas, the data, and the query are the same as in the combined sensitivity analyzer. However, there is no tab for the table norm anymore.
  
=== Table constraints ===

From the data table, we can already infer possible values of ship locations. Let both ''latitude'' and ''longitude'' be bounded by the range (-300,300). We insert the following code into the //Table constraints// tab, which becomes visible after clicking the table ''ship''.
<code>
latitude range -300 300;
longitude range -300 300;
</code>
  
=== Attacker goal ===

Click the blue button //Guessing Advantage analysis//. We need to specify what the attacker already knows and what he is trying to guess.

If the attacker guesses the location precisely, this is bad; however, even an approximate guess can be bad if it is precise enough. Let us assume that we want to avoid guessing within 5 units of precision. The attacker goal is stated in the form of an SQL query with some additional special syntax for approximation. We insert the following code into the window that opens after clicking the //Attacker goal// button.
<code>
SELECT
ship.longitude approx 5 AND
ship.latitude approx 5
FROM ship;
</code>
  
In this example, the attacker wins if he guesses both the latitude and the longitude within 5 units of precision. Hence, the set of successful guesses looks like a 10-unit square centered around the actual location. Intuitively, we would like it to be a circle. We can state the attacker's goal as approximating the location w.r.t. Euclidean distance, i.e. the l_2-norm.
  
<code>
SELECT
(ship.longitude, ship.latitude) approxWrtLp(2) 5
FROM ship;
</code>
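The difference between the two goal formulations can be illustrated with a guess near the corner of the square: it wins under the coordinate-wise goal but not under the Euclidean one. The helper names below are ours for illustration, not analyzer syntax.

```python
import math

def wins_box(guess, actual, r=5.0):
    """Coordinate-wise goal: each coordinate within r of the truth,
    i.e. the winning region is a square with side 2r."""
    return all(abs(g - a) <= r for g, a in zip(guess, actual))

def wins_disc(guess, actual, r=5.0):
    """Euclidean (l2) goal: distance to the truth within r,
    i.e. the winning region is a disc of radius r."""
    return math.dist(guess, actual) <= r

actual = (0.0, 0.0)
corner = (4.0, 4.0)  # inside the square, but at Euclidean distance ~5.66
print(wins_box(corner, actual))   # True
print(wins_disc(corner, actual))  # False
```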
  
It is possible that not all ships are private, and the attacker is only interested in some of them. We can add a filter to the attacker's goal. Let us select only those ships that carry sufficiently much cargo.
  
<code>
SELECT
(ship.longitude, ship.latitude) approxWrtLp(2) 5
FROM ship
WHERE cargo >= 50;
</code>
  
=== Experiments ===

We can play around with the model and see how the error depends on different parameters.
  * Increasing the allowed guessing advantage decreases the error. In the extreme cases, we get error ∞ if we want advantage 0%, and error 0 if we allow advantage 100% (more precisely, if we allow posterior probability 100%, which already happens for a smaller advantage).
  * Try to decrease the allowed precision of guessing (e.g. change it from 5 to 1). In general, it becomes more difficult for the attacker to make a guess, so the error decreases.
  * Try to increase and decrease the initially known ranges on latitude and longitude. While this directly affects the prior probability (which can be viewed by clicking //View more// in the analysis result), the upper bound on the posterior probability may change less. Technically, differential privacy makes the "sensitive area" similar to its neighbouring surroundings, not to the entire set of possible values, so increasing the range may have little effect on the posterior probability. As a result, if the advantage level is kept the same, increasing the range may also increase the error.
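For the last experiment, the prior probability itself is easy to estimate by hand: under a uniform prior on the known range, an l_2-guess of radius r succeeds with probability equal to the area of the disc divided by the area of the square. This is a back-of-the-envelope sketch; the analyzer's own prior computation may differ in details.

```python
import math

def prior_guess_probability(radius, lo, hi):
    """Prior probability that a fixed l2-guess of the given radius succeeds,
    assuming the location is uniform on the square (lo, hi)^2 and the disc
    fits entirely inside the square: disc area over square area."""
    side = hi - lo
    return math.pi * radius ** 2 / side ** 2

# With latitude and longitude uniform on (-300, 300) and guessing radius 5:
print(prior_guess_probability(5.0, -300.0, 300.0))  # about 0.000218
```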