Academic year: 2021


Silke Eckstein Andreas Kupfer

Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

Relational Database Systems 2

12. Security

12.1 Security in databases

12.2 Access control

12.3 Statistical database security

Datenbanksysteme 2 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig

12 Security

Database security comprises a set of measures, policies, and mechanisms

–To provide secrecy, integrity, and availability of data

–To combat threats to the system, both malicious and accidental

Secrecy (or confidentiality)

–Protection of data from unauthorized disclosure

Integrity

–Only authorized users should be allowed to modify data

Availability

–Making data available to the authorized users and application programs


12.1 Security in Databases

–“[…] the many public and painful disclosures, especially security breaches that have dramatically affected brand image and the financial health of many public companies. IT risk, specifically data security, has truly become a board-level discussion.”

AMR Research: “Governance, Risk and Compliance Spending Report 2008-2009”, 2008

–“21% of enterprises are worried about a decline in stock price [resulting from a security breach]”

Forrester Research: “Aligning Data Protection Priorities With Risks”, April 2006


12.1 Security in Databases

Database design has to consider

–The possible attacks and vulnerabilities

–The risks to which the data is exposed

The protection which security gives is usually directed against two classes of users

–Stop users without database access from having any form of access

–Stop users with database access from performing actions on the database which are not required to perform their duties


12.1 Security in Databases

–“The most prevalent attack style, responsible for 39% of data thefts, was authorized users exploiting their privileges.”

Forrester Research: “Aligning Data Protection Priorities With Risks”, April 2006

–“According to the 2007 Annual Study: Cost of a Data Breach: Data breach incidents cost companies $197 per compromised customer record in 2007, compared to $182 in 2006.”

Ponemon Press: “Ponemon Study Shows Data Breach Costs Continue to Rise”, November 2007


12.1 Security in Databases

“You have zero privacy anyway. Get over it!”

–Scott McNealy (Jan 1999), Chairman and Co-Founder, Sun Microsystems, Inc.


12.1 What to do?

Social Security Number Data Theft at University of Texas, Austin

Chronology

Mar 02, 2003: Initial observation of high-volume database access from off-campus

Mar 03, 2003: Law enforcement contacted

Mar 04, 2003: Evidence points to UT student

Mar 05, 2003: Two residences searched: Austin, Houston

Mar 05, 2003: Austin American-Statesman breaks story

Mar 14, 2003: UT undergraduate student charged

Sep 06, 2005: The student was sentenced to five years probation and ordered to pay $170,056 restitution for accessing protected computers without authorization, and possession of stolen social security numbers (misuse of the numbers could not be proven)


12.1 What to do?

Restrict access to the physical location of the data

–Administrative and external control measures to prevent access to the physical resources

Rooms, storage facilities, terminals,…

–Does not prevent misuse by authorized personnel

Access restrictions are very difficult to uphold in the case of Web-accessible databases


12.1 Basic Measures

Data encryption

–Often it is hard to prevent people from copying the database and then hacking into the copy at another location

–It is easier to simply make copying the data a useless activity by encrypting the data

Authentication

–Verify the user’s identity before allowing access, based on something the user knows or on physical characteristics of the user

Passwords, codes, fingerprints, signature,…


12.1 OS/DBMS-level Protection

Audit trails

–If someone does penetrate the DBMS, it is useful to find out how they did it and what was accessed or altered

Audit trails can be set up selectively to minimize disk usage, identify system weaknesses, and pinpoint malicious users

Logging phase: all requests and their respective results are logged for each user

Reporting phase: the information collected in the log is checked to detect possible violations or attacks

Trails can even detect violation attempts executed through sequences of queries


12.1 Auditing Mechanisms

Access control (authorization) ensures that all direct accesses to database objects occur exclusively according to the modes and rules given by security policies


12.2 Access Control

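[Figure: an access request reaches the authorization system of the DBMS, which sits in front of the data and the other DBMS components; its control procedures check the request against the access rules and security policies, and access is either permitted or denied]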


• Access control policies specify if and how users can access each database object

–In closed systems only explicitly authorized accesses are allowed

–In open systems all accesses that are not explicitly forbidden are allowed

–In multi-level protection systems access is defined using several classification levels to allow/limit access

Data can be, e.g., unclassified, confidential, secret, or top secret, and users are assigned a certain security clearance

• The policies also specify if and how access rights can be transferred


12.2 Access Control Policies

Besides authentication, access control may also include access limitation

Minimum privilege policy (‘need to know’ policy)

All users should access only the minimum quantity of information needed for their activity

Sometimes this is hard to predict and overly restrictive

Maximum privilege policy (‘maximum availability’ policy)

All data of a certain type can be accessed, thus the sharing is maximized


12.2 Access Control Policies

The granularity in specifying access control in the database can be

–The entire database

–A set of relations

–An individual relation

–A set of records in a relation

–An individual record

–A set of attributes of all records

–An attribute of an individual record


12.2 Access Control Policies

Restricting the granularity is usually performed by creating specific views containing only the data that should be visible

–CREATE VIEW addresses AS
   SELECT name, address
   FROM employee
   WHERE department = 'finance'

–Access to this view means vertically and horizontally restricted access on the employee table
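The effect of the view can be tried out with a minimal Python sketch using the standard `sqlite3` module. The sample rows and the extra `salary` column are assumptions for illustration; only the view definition comes from the slide. SQLite has no GRANT mechanism, so the sketch only shows what the view exposes, not how privileges on it would be enforced.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, address TEXT, department TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('Karl', 'Braunschweig', 'finance', 50000),
        ('Anna', 'Hannover', 'sales', 60000);
    CREATE VIEW addresses AS
        SELECT name, address FROM employee WHERE department = 'finance';
""")

cursor = conn.execute("SELECT * FROM addresses")
columns = [d[0] for d in cursor.description]  # vertical restriction: fewer columns
rows = cursor.fetchall()                      # horizontal restriction: fewer rows
print(columns, rows)
```

Only the `name` and `address` columns of finance employees are visible through the view.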


12.2 Access Control Policies

• Access can be granted to individual users, groups, application programs, etc.

• The administration of access control policies and access rights can either be

Centralized, where all rights are controlled by the DBA

Decentralized, where different DBAs are responsible for different database instances

Cooperative, where a predefined group of users has to agree on granted access

Based on ownership, where the creator of a database object as default owner can control the respective access rights


12.2 Access Control Policies

• In most commercial DBMS, there is a two layer approach to naming relations

The DBMS has a number of database instances, for which the DBA has permission to create and delete databases, and to grant users access to databases

Each database is a flat name space: users with the necessary permission can create tables and views in a database

• Because it is a flat name space, all table names must be unique within a database

The database login name is often taken as the username

Table and view names are prefixed with the name of the user who created them


12.2 Access Control Policies


Discretionary Access Control

–Grants privileges to users, including the capability to access specific data files, records, or fields in a specific mode

Mandatory Access Control

–Classifies users and data into multiple levels of security, and then enforces appropriate rules

Role-based Access Control

–Access privileges are associated with the role of the person in the organization


12.2 Types of Access Control

Discretionary policies require that for each user authorization rules are defined specifying the privileges owned on the database objects

–Access requests are checked against the granted privileges

–Discretionary means that users have the possibility to grant/revoke rights (usually based on ownership)

–Through grants, access privileges can be propagated through the system


12.2 Discretionary Access Control

• The SQL GRANT and REVOKE statements can be used to grant privileges to users and to withdraw them again

GRANT privileges ON table(s)/column(s) TO grantees [WITH GRANT OPTION]

REVOKE privileges ON table(s)/column(s) FROM grantees

• Possible privileges are:

–SELECT: user can retrieve data

–UPDATE: user can modify existing data

–DELETE: user can remove data

–INSERT: user can insert new data

–REFERENCES: user can make references to the table


12.2 Discretionary Access Control

The WITH GRANT OPTION permits the propagation of rights to other users

–Allows other users to manage permissions for certain tables

E.g., allowing a manager to control access to a table for their subordinates

The list of grantees does not need to be (a set of) usernames

–It is permitted to specify PUBLIC, which means that the privileges are granted to everyone


12.2 Discretionary Access Control

Checking discretionary access control is often implemented by an authorization matrix

–The rows represent users

–The columns represent the database objects

–The fields contain the respective privileges

Similar concept in file security


12.2 Discretionary Access Control

• The authorization matrix model can be extended by predicates that have to be satisfied in order to use the authorization

–Data-dependent: e.g., constraints on the values of the accessed data (access only employee records where salary < 100,000)

–Time-dependent: e.g., authorized access only between 9:00 am and 5:00 pm

–Context-dependent: e.g., a user might have read rights on individual columns, but not on joins between them

–History-dependent: constraints dependent on previously performed accesses
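An authorization matrix with attached predicates could be sketched in Python roughly as follows. The user names, the `ctx` dictionary, and the function `is_authorized` are hypothetical and only illustrate the idea; a real DBMS keeps this information in its catalog. Only the data-dependent and time-dependent predicates from the list are shown.

```python
from datetime import time

# Authorization matrix: rows are users, columns are database objects,
# fields hold the set of granted privileges (all names are hypothetical).
matrix = {
    "alice": {"employee": {"SELECT", "UPDATE"}},
    "bob": {"employee": {"SELECT"}},
}

# Optional predicates attached to individual (user, object, privilege) entries.
predicates = {
    ("bob", "employee", "SELECT"): [
        lambda ctx: ctx["record"]["salary"] < 100_000,        # data-dependent
        lambda ctx: time(9, 0) <= ctx["now"] <= time(17, 0),  # time-dependent
    ],
}

def is_authorized(user, obj, privilege, ctx):
    """Closed-system check: deny unless the matrix grants the privilege
    and every predicate attached to that entry holds."""
    if privilege not in matrix.get(user, {}).get(obj, set()):
        return False
    return all(p(ctx) for p in predicates.get((user, obj, privilege), []))
```

For example, `is_authorized("bob", "employee", "SELECT", {"record": {"salary": 50_000}, "now": time(10, 0)})` is permitted, while the same request for a record with a salary of 150,000 is denied.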


12.2 Discretionary Access Control


Problem: revocation of propagated privileges

–Access to data might be needed only for a limited period of time

Solution: temporarily grant some privileges to a user

–In SQL a REVOKE command is included to cancel privileges

–If a privilege is granted with GRANT option to an account, this account can also grant that privilege on the relation to other accounts


12.2 Problems

• Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third account C, also with GRANT OPTION


12.2 Problems

[Figure: grant chain for relation R among the data owner and the accounts A, B, and C; edges labeled “read R”]

• Privileges on R can propagate to other accounts without the knowledge of the owner of R!


12.2 Problems

[Figure: grant chain for relation R starting at the data owner; three edges labeled “read R”]

• If the owner now revokes the privilege granted to B, all the propagated privileges should automatically be revoked by the system


12.2 Problems

[Figure: the same grant chain for relation R; all three edges are now labeled “revoke read R”]

• If a user received a privilege from two or more sources, the user will continue to have the privilege until all the sources revoke the privilege
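The propagation and revocation behavior described on these slides can be sketched in Python as a reachability problem on a grant graph. This is a simplified model under stated assumptions: a single privilege on a single relation, every grant given WITH GRANT OPTION, and no grant timestamps; the function names `holders` and `revoke` are hypothetical.

```python
def holders(owner, grants):
    """Accounts holding the privilege: everything reachable from the owner
    by following (grantor, grantee) edges."""
    reached, frontier = {owner}, [owner]
    while frontier:
        current = frontier.pop()
        for grantor, grantee in grants:
            if grantor == current and grantee not in reached:
                reached.add(grantee)
                frontier.append(grantee)
    return reached

def revoke(owner, grants, grantor, grantee):
    """Remove one grant, then drop every grant whose grantor no longer
    holds the privilege (cascading revocation)."""
    remaining = grants - {(grantor, grantee)}
    valid = holders(owner, remaining)
    return {(g, r) for g, r in remaining if g in valid}

# C received the privilege from two sources, A and B.
grants = {("owner", "A"), ("owner", "B"), ("A", "C"), ("B", "C")}
grants = revoke("owner", grants, "owner", "B")
print(sorted(holders("owner", grants)))
```

After the owner revokes the grant to B, the grant from B to C disappears as well, but C keeps the privilege through A until that source is revoked too.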


12.2 Problems

[Figure: grant chain for relation R starting at the data owner; one edge is labeled “revoke read R”, three edges remain labeled “read R”]

Problem: the flow of information from some database object into a less secure database object

–Discretionary access models do not impose any restriction on the usage once data has been obtained by a user

–The dissemination of data is not controlled

Users with a read privilege can copy read data to their own table, on which they have full rights

Maliciousness within the system can occur via Trojan horses


12.2 Problems


• Consider a malicious user having only a privilege to create tables in a database


12.2 Problems

[Figure: Trojan horse attack on a database with relations R and R’; labels: “create R’ and read from R’”, “grant read on R”, “corrupted application”, “read R”, “write to R’”]

A solution to this problem is so-called flow controls, which regulate the distribution of information among accessible objects

–A flow between two database objects A and B occurs when a statement reads from A and writes into B

–Flow controls check that information contained in some objects does not flow explicitly (by copy) or implicitly (via intermediate objects) into less protected objects

Otherwise a user might get something from the less protected object that he/she would not have gotten from the original object


12.2 Problems

Mandatory Access Control maps objects onto a classification of the respective sensitivity

–All system data has to be classified, and users are assigned a certain clearance level by some central authority

–Access to data is determined by a mandatory policy through the comparison of requester level and item level

The most prominent example is the Bell-LaPadula model (1973), which formalizes the U.S. Department of Defense multilevel security policy


12.2 Mandatory Access Control

Secrecy is expressed as a set of rules (axioms) that must be satisfied at all times

The control is based on security levels for each database item (object) and clearances for users (subjects) consisting of

–A classification from an ordered set

E.g., top secret, secret, confidential, unclassified

–A set of categories from a non-hierarchical set

E.g., administration, finance, human resources, etc.


12.2 Mandatory Access Control

The set of security levels thus forms a lattice

–The lattice is partially ordered according to a dominance relationship

–A security level (class1, categories1) dominates a security level (class2, categories2) if and only if class1 ≥ class2 and categories1 ⊇ categories2, where categories1 and categories2 are the respective sets of categories

–E.g.,


12.2 Mandatory Access Control

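[Figure: example security levels ordered by dominance]

(secret, {finance, marketing}) dominates (confidential, {finance}) and (confidential, {marketing})

(confidential, {finance}) dominates (unclassified, {finance})

(top secret, {administration}) dominates (unclassified, {administration})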
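The dominance relationship can be written as a small Python function. This is a sketch under the assumption that a security level is a pair of a classification name and a set of categories; the names `LEVELS` and `dominates` are hypothetical.

```python
# Classifications in ascending order of sensitivity.
LEVELS = ["unclassified", "confidential", "secret", "top secret"]

def dominates(a, b):
    """True if level a = (classification, categories) dominates level b."""
    (class_a, cats_a), (class_b, cats_b) = a, b
    higher_or_equal = LEVELS.index(class_a) >= LEVELS.index(class_b)
    return higher_or_equal and set(cats_a) >= set(cats_b)

sec = ("secret", {"finance", "marketing"})
conf_fin = ("confidential", {"finance"})
top_adm = ("top secret", {"administration"})

print(dominates(sec, conf_fin))      # True
print(dominates(top_adm, conf_fin))  # False: categories are not a superset
print(dominates(conf_fin, top_adm))  # False: the two levels are incomparable
```

The last two calls show why the order is only partial: a higher classification alone is not enough if the category sets do not match.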

Subjects are active elements of the system

–As in the discretionary case, object owners can grant/revoke privileges to/from subjects

Privileges are stored in an access matrix

–Subjects can execute actions (read, write, update,…) only with respect to the subject’s clearance and the object’s security level

–When entering the system each subject logs on with a certain current level, where current level ≤ clearance always holds


12.2 Mandatory Access Control


• Secrecy is maintained if three axioms are satisfied

–Simple security property: Any subject may have

Read or write access to an object, only if the clearance of the subject dominates the security level of the object

–*-property: An untrusted subject may have

Append (insert) access on an object, only if the security level of the object dominates the current level of the subject

Write access on an object, only if the object’s security level and the subject’s current level are equal

Read access on an object, only if the security level of the object is dominated by the current level of the subject
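The simple security property and the *-property can be combined into one check, sketched here in Python. The representation of levels as (classification, frozenset of categories) pairs and the function `may_access` are assumptions for illustration; the third axiom (the discretionary check against the access matrix) is deliberately left out.

```python
LEVELS = ["unclassified", "confidential", "secret", "top secret"]

def dominates(a, b):
    """True if level a = (classification, categories) dominates level b."""
    return LEVELS.index(a[0]) >= LEVELS.index(b[0]) and a[1] >= b[1]

def may_access(mode, clearance, current, obj, trusted=False):
    """Check the simple security property and the *-property for one access."""
    # Simple security property: read/write requires the clearance to dominate.
    if mode in ("read", "write") and not dominates(clearance, obj):
        return False
    if trusted:
        return True  # the *-property constrains untrusted subjects only
    if mode == "append":
        return dominates(obj, current)   # append only at or above the current level
    if mode == "write":
        return obj == current            # write only at exactly the current level
    if mode == "read":
        return dominates(current, obj)   # read only at or below the current level
    return False

fin = frozenset({"finance"})
clearance, current = ("secret", fin), ("confidential", fin)
print(may_access("read", clearance, current, ("unclassified", fin)))   # True
print(may_access("write", clearance, current, ("unclassified", fin)))  # False
```

The second call is the "no write-down" case: a subject working at level confidential cannot write into an unclassified object.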


12.2 Mandatory Access Control

Discretionary security property:

Every current access has to be present in the access matrix, i.e., a subject can only perform accesses it is actually authorized for

Moreover, security classifications cannot simply be changed

Tranquility principle:

No subject can modify the classification of an active object


12.2 Mandatory Access Control

The secrecy is maintained by the simple security property (no read-up)

–An object with higher security level can be neither read nor modified (except for appending data)

The star property (no write-down) enforces a simple flow control

–Although lower security objects can be read, their data cannot be written to any object that has a level lower than the current level

This prevents Trojan horse attacks


12.2 Mandatory Access Control

• The Bell-LaPadula model succeeds in achieving secrecy, but cannot protect a system from unauthorized modifications of information

–A principle similar to the one for data secrecy can also be applied to data integrity (e.g., the Biba model (1977))

• There are also several models combining both data secrecy and integrity

–The Dion model (1981) basically combines the principles of controlling secrecy of the Bell-LaPadula model with the principles of strict integrity of the Biba model

–The SeaView security model (1990) adapted the policies specifically for use in relational databases


12.2 Mandatory Access Control

The advantages of mandatory models derive from their suitability for environments where users and objects can be classified

–‘Mandatory’ implies that systems should be able to enforce an access control policy that is mandated by some regulation and must be absolutely enforced

E.g., in 1995, US President Bill Clinton signed Executive Order 12958, which created new standards for the process of classifying government documents

However, they often are overly strict


12.2 Mandatory Access Control

Organizations often rely on role-based access control

–Each role is created by the administrator

–The permissions to perform certain operations are assigned to specific roles

–Each user is granted/revoked roles

Role-based access control differs from traditional access control systems

–It assigns permissions to specific operations with meaning in the organization, rather than to low level data objects


12.2 Role-Based Access Control


• An example operation might be to create a ‘credit account’ transaction in a financial application and assign it to the role of ‘bank clerk’

–The assignment of permission to perform a particular operation is meaningful, because the operations are fine-grained and have meaning within the application

–In contrast, traditional access control is used to grant or deny write access to a particular system file, but it cannot say in what ways that file could be changed
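Role-based access control reduces to two mappings, from roles to permitted operations and from users to roles. The following Python sketch uses the ‘bank clerk’ example; the user names, the operation identifier, and the function `can_perform` are hypothetical.

```python
# Permissions are attached to roles, expressed as application-level operations.
role_permissions = {
    "bank clerk": {"create_credit_account_transaction"},
}

# Users are granted (or revoked) roles, never individual permissions.
user_roles = {
    "carol": {"bank clerk"},
    "dave": set(),
}

def can_perform(user, operation):
    """A user may perform an operation if any of their roles permits it."""
    return any(
        operation in role_permissions.get(role, set())
        for role in user_roles.get(user, set())
    )

print(can_perform("carol", "create_credit_account_transaction"))  # True
print(can_perform("dave", "create_credit_account_transaction"))   # False
```

Granting the role to `dave` (`user_roles["dave"].add("bank clerk")`) is enough to give him every operation of a bank clerk, which is what makes administration simpler than per-user grants.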


12.2 Role-Based Access Control

A database can also be used for statistical purposes without granting access to individual records

–Statistical operations allow a view on the actual data

–Special protection techniques have to be applied to protect the individual data records

‘Reengineering’ of actual individual values is sometimes possible

Statistical inference, especially taking advantage of sequences of statistical queries, must be prevented


12.3 Statistical Database Security

• In any case a statistical filter for queries is needed

–Permits only statistical queries, while preventing access to individual records

E.g., allow a ‘COUNT’ query for the number of employees whose salary is higher than $100,000, but deny queries selecting individuals having that characteristic

• But statistical filters are not sufficient to prevent inference

–E.g., first get the average salary of employees with job description ‘manager’ and then count their number

If the number is 1, you know exactly how much your manager earns…
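The manager example can be reproduced with a few lines of Python. The employee records and the function `statistical_query` are made-up illustrations of a statistical filter, not part of any real DBMS interface.

```python
# Hypothetical employee table with exactly one manager.
employees = [
    {"name": "Otto", "job": "manager", "salary": 120_000},
    {"name": "Karl", "job": "clerk", "salary": 40_000},
    {"name": "Anna", "job": "clerk", "salary": 45_000},
]

def statistical_query(aggregate, predicate):
    """Statistical filter: answers aggregates only, never individual rows."""
    salaries = [e["salary"] for e in employees if predicate(e)]
    if aggregate == "COUNT":
        return len(salaries)
    if aggregate == "AVG":
        return sum(salaries) / len(salaries) if salaries else None
    raise PermissionError("only statistical queries are permitted")

def is_manager(e):
    return e["job"] == "manager"

count = statistical_query("COUNT", is_manager)
average = statistical_query("AVG", is_manager)
if count == 1:
    print("the manager earns", average)
```

Both queries pass the filter, yet together they disclose an individual salary because the query set contains a single record.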


12.3 Statistical Database Security

Security measures have to be taken on top of the authorization measures presented before

A statistical database is

Positively compromised, if a user finds out that an individual has a specific characteristic (value)

Negatively compromised, if a user finds out that a given individual does not have a certain characteristic

Also a simple anonymization of data does not suffice to protect individuals


12.3 Statistical Database Security

“Your data is safe with us” – The tale of the anonymous dataset

–Example: The life of AOL user #4417749

Setting: AOL Search

–One of the major web search and content portals

–AOL serves millions of searches per day


12.3 Anonymization

• AOL has a privacy policy promising they won’t publish your identity

• However, internally records are kept of all user searches

–Search records are very valuable for improving algorithms

• On 4th August 2006, an anonymous dataset was published for free use by the IR research community

–Contained searches of 650,000 users over a 3-month period


12.3 Anonymization


• The data set contained

–Anonymous user ID: just an incrementing number

–Query text: as the user typed it

–Query time and date

–Result rank: rank of the result the user clicked on

–Result URL

• AOL acted in good conscience to support free research on search algorithms

–But…


12.3 Anonymization

The data set spread very fast

Unfortunately, anonymizing data is not that easy

–The New York Times, among others, reconstructed single users’ identities and personal profiles

Cross-matched all records and combined them with publicly available sources

Phonebooks, business directories, classified ads


12.3 Anonymization

Most prominent example: User #4417749

–Thelma Arnold, 62 years old, widowed, lives in Lilburn, Georgia

–Is looking for a new partner in his 60s

–Has at least one dog randomly urinating on furniture

–Has problems with trembling fingers and aches in her back

–Is worried about the safety of her neighborhood

–Wonders about problems of the world, like hunger in Africa or children in war-torn Iraq


12.3 Anonymization

AOL immediately removed the dataset

–But it is still around on various mirrors and databases

“Browse others’ AOL data – hours of fun guaranteed”

In September 2006, a class action lawsuit was filed

–Case still running

–Seeks at least $5,000 for each person involved

650,000 users × $5,000 = $3.25 billion!

What to learn?

Proper data anonymization IS very important!


12.3 Anonymization

Anonymization: Typical (Bad) Cases

Removal of personal identifiers

–Safe?


12.3 Anonymization

Public Data

Name  Age  Sex  Zip
Karl  19   M    38114
Anna  21   F    30167
Otto  33   M    38005

“Anonymous” Hospital Data

Age  Sex  Zip    Disease    Cure
19   M    38114  Hepatitis  Yes
21   F    30167  Hepatitis  Yes
33   M    38005  AIDS       No

Real identity: no matching should be possible
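Why removing the names is not safe can be shown with a short linking attack in Python. The two lists hold the rows of the tables on this slide; the function `link` and the choice of (age, sex, zip) as join attributes are illustrative assumptions.

```python
public = [
    {"name": "Karl", "age": 19, "sex": "M", "zip": "38114"},
    {"name": "Anna", "age": 21, "sex": "F", "zip": "30167"},
    {"name": "Otto", "age": 33, "sex": "M", "zip": "38005"},
]
hospital = [
    {"age": 19, "sex": "M", "zip": "38114", "disease": "Hepatitis"},
    {"age": 21, "sex": "F", "zip": "30167", "disease": "Hepatitis"},
    {"age": 33, "sex": "M", "zip": "38005", "disease": "AIDS"},
]

def link(public, hospital, keys=("age", "sex", "zip")):
    """Re-identify hospital records whose key attributes match exactly one person."""
    identified = {}
    for person in public:
        matches = [h for h in hospital if all(h[k] == person[k] for k in keys)]
        if len(matches) == 1:
            identified[person["name"]] = matches[0]["disease"]
    return identified

print(link(public, hospital))
```

Every person in the public table is matched to exactly one hospital record, so all three diagnoses are disclosed even though the hospital data contains no names.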

Anonymization: Typical (Bad) Cases

Removing data details

–Safe??


12.3 Anonymization

Public Data

Name  Age  Sex  Zip
Karl  19   M    38114
Anna  21   F    30167
Otto  33   M    38005

“Anonymous” Hospital Data

Age    Sex  Zip   Disease    Cure
18-20  M    381*  Hepatitis  Yes
18-20  F    301*  Hepatitis  Yes
30-35  M    380*  AIDS       No


• How to protect private content, but preserve useful context?

–Compromise between encryption and plain data sharing

–Algorithmic techniques to separate content & context

• With the proliferation of data collection devices, privacy is disappearing

–Scale

–Insider vs. outsider protection

–Some data mining is useful, some is harmful

–E.g., the recent release of AOL search traces


12.3 Anonymization

Why? Applications!

Scenarios:

Network content anonymization

Also traffic and connection statistics

–Share network traces with packet payloads, enable home troubleshooting or malicious content detection (e.g., worms)

Online behavior shared analysis: efficiency, self-improvement

–Voice anonymization, image/video anonymization

–Medical, biological, sensor data, …


12.3 Anonymization

Approaches for Privacy Preservation

Fight Data Mining Approaches

–Modify data in such a way that certain rules cannot be inferred

Cryptographic / probabilistic approaches

–Query responses just give probabilistic results

–Multiple public keys for a single user allow aggregation of data only in certain cases

Statistical Approaches


12.3 Anonymization

Example Idea: Slice data into tiny content blocks

–Statistical approach

–Reconstructing the data is computationally hard

Data analysis still possible

Frequency statistics

Word frequencies of the UN Charter

Short pattern matching


12.3 Anonymization

The majority of inference protection techniques can be classified as

Conceptual techniques

Involve the conceptual level of the underlying database

Restriction-based techniques

Deny statistical queries working on too small or too large subsets of the data

Perturbation-based techniques

Introduce modifications to the data that change individual values, but should have hardly any effect on the statistics


12.3 Inference Protection Techniques

A good example for conceptual techniques is the lattice model

–Statistics over relational tables can be represented as a lattice, where vertexes reflect different combinations of attributes

–E.g., lattice for table T with three attributes A, B, and C

–By aggregating over some attribute, lower-dimensional tables are obtained


12.3 Conceptual Techniques

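[Figure: lattice for table T with vertexes T_ABC, then T_AB, T_AC, T_BC, then T_A, T_B, T_C, and finally T_all; aggregation leads from T_ABC toward T_all]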


• The lattice can be used to study inference protection mechanisms

–A statistic is considered to be harmful if the n-respondent, k%-dominance criterion applies

i.e., n or fewer records represent more than k% of the total, with n and k being fixed but secret values

–Consequently, for any vertex of the lattice, e.g., a ‘count’ statistic holding a query set of size 1 is harmful

By using operations involving vertexes at different levels the user can disclose sensitive statistics

–Generally, it is possible to permit a statistic in a vertex of the lattice, if the individual is not identified in some parent table in the lattice
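The n-respondent, k%-dominance criterion can be checked with a short Python function. The sketch assumes that every record contributes a non-negative value to the statistic, so it suffices to look at the n largest contributions; the function name `is_harmful` is hypothetical.

```python
def is_harmful(contributions, n, k):
    """True if n or fewer records account for more than k% of the total.

    Assumes non-negative contributions, so the n largest ones are the
    worst case for any group of at most n records.
    """
    total = sum(contributions)
    if total == 0:
        return False
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return top_n > total * k / 100

print(is_harmful([900, 50, 50], n=1, k=80))         # True: one record dominates
print(is_harmful([100, 100, 100, 100], n=1, k=80))  # False
print(is_harmful([1], n=1, k=80))                   # True: count over a query set of size 1
```

The last call corresponds to the ‘count’ statistic with a query set of size 1 mentioned above: the single record represents 100% of the total.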


12.3 Conceptual Techniques

The general aim of these techniques is to restrict statistical queries that could compromise the database

–The simplest restriction technique controls the size of the query set associated with a query

Suppose that for some individual a user knows a certain characteristic ‘Ai = x’ and the respective count statistic is 1; then more information can be disclosed by issuing queries COUNT(Ai = x AND Aj = y) to find out about the Aj value, etc.

For some secret parameter k, a statistical query is only permitted if the size of the query set is both larger than k and smaller than (database_size - k)
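The query set size restriction can be sketched in Python as a guard in front of a COUNT query. The record layout and the function `restricted_count` are assumptions for illustration.

```python
def restricted_count(records, predicate, k):
    """Answer COUNT only if k < |query set| < database_size - k."""
    size = sum(1 for r in records if predicate(r))
    if not (k < size < len(records) - k):
        raise PermissionError("query set size outside the permitted range")
    return size

# Hypothetical database with 10 records and the secret parameter k = 2.
records = [{"dept": "a"}] * 1 + [{"dept": "b"}] * 5 + [{"dept": "c"}] * 4

print(restricted_count(records, lambda r: r["dept"] == "b", k=2))  # 5 is permitted
```

A query for department "a" (query set size 1) is rejected, and so is its complement (size 9), since a too-large query set would reveal the small one by subtraction from the total.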


12.3 Restriction-based Techniques

• However, this simple technique is not safe e.g., against tracker-based attacks

–A tracker is a set of formulas to pad out small query sets with additional records to fulfill the size restriction

Assume an individual can be uniquely identified by the characteristics (Ai = x AND Aj = y AND Ak = z)

A tracker could be (Ai = x), (Ai = x AND NOT (Aj = y AND Ak = z))

The forbidden statistic COUNT(Ai = x AND Aj = y AND Ak = z) could be calculated as COUNT(Ai = x) – COUNT(Ai = x AND NOT (Aj = y AND Ak = z))

With that statistic, more information can be obtained
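A tracker attack against the query set size restriction can be reproduced in Python. The database contents, the attribute names `ai`, `aj`, `ak`, and the parameter value k = 2 are made up for illustration; the tracker subtracts the records that match Ai = x but not the full characteristic of the target.

```python
def restricted_count(records, predicate, k):
    """Answer COUNT only if k < |query set| < database_size - k."""
    size = sum(1 for r in records if predicate(r))
    if not (k < size < len(records) - k):
        raise PermissionError("query set size outside the permitted range")
    return size

# 12 hypothetical records; exactly one matches (ai = x AND aj = y AND ak = z).
records = (
    [{"ai": "x", "aj": "y", "ak": "z"}]          # the targeted individual
    + [{"ai": "x", "aj": "y", "ak": "w"}] * 2
    + [{"ai": "x", "aj": "v", "ak": "z"}] * 2
    + [{"ai": "u", "aj": "y", "ak": "z"}] * 7
)
K = 2

def has_x(r):
    return r["ai"] == "x"

def tracker(r):
    return r["ai"] == "x" and not (r["aj"] == "y" and r["ak"] == "z")

# Both queries are permitted (query set sizes 5 and 4), their difference is not.
inferred = restricted_count(records, has_x, K) - restricted_count(records, tracker, K)
print(inferred)
```

The direct query for the target would be denied because its query set has size 1, but the difference of the two permitted counts yields exactly that value.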


12.3 Restriction-based Techniques

• One way to deal with trackers is to generalize the query set size criterion to all logical combinations

–For a query on (A1 = a AND A2 = b AND … AND An = z) all 2^n combinations

(NOT A1 = a AND A2 = b AND … AND An = z), (A1 = a AND NOT A2 = b AND … AND An = z), …

have to fulfill the query set size restriction


12.3 Restriction-based Techniques

• A prime example for perturbation-based techniques is data swapping

The idea is to exchange attribute values between the records of the original database in such a way that

The new database has no common records with the original database

While the statistics (up to a certain number of attributes involved in the statistics) stay correct

• A second technique is random sample queries, which are performed only on a random sample of the database

• Another technique is result rounding, where the response is perturbed

Before being released, the response values are rounded up or down to the nearest multiple of a certain base b

Users can then deduce the true value only within some interval
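Result rounding is easy to sketch in Python for integer responses. The functions `rounded` and `interval` are hypothetical helpers; ties are rounded up here, which is one possible convention and not prescribed by the slides.

```python
def rounded(value, base):
    """Round a non-negative integer to the nearest multiple of base (ties go up)."""
    return base * ((value + base // 2) // base)

def interval(released, base):
    """Smallest and largest integer that would have been released as this value."""
    return released - base // 2, released + (base - 1) // 2

print(rounded(1234, 100))    # 1200
print(interval(1200, 100))   # (1150, 1249)
```

A user who sees the released value 1200 with base 100 only learns that the true value lies between 1150 and 1249.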


12.3 Perturbation-based Techniques

Specific attacks, however, can still disclose information

–E.g., consider a user who knows that an individual record matches a characteristic Ai = x and that the relative frequency of having that value is 1/database_size

–The attacker can now discover whether the record shows the additional characteristic Aj = y by requesting the relative frequency of (Ai = x AND Aj = y)

–If the value is still 1/database_size, the record has the characteristic; if the value is 0, it does not…


12.3 Perturbation-based Techniques
