NÓBREGA, Thiago Pereira da.
Resumo:
The Privacy Preserve Record Linkage (PPRL) aims to identify entities, that can not
have their information disclosed (e.g., Medical Records), which correspond to the same
real-world object across different databases. It is crucial to the PPRL tasks that it is executed without revealing any information between the participants (database owners) during the PPRL task, to preserve the privacy of the original data. At the end of a PPRL task, each participant identifies which entities in its database are present in the databases of the other participants. Thus, before starting the PPRL task, the participants must agree on the entity and its attributes, to be compared in the task. In general, this agreement requires that participants have to expose their schemas, sharing (meta-)information that can be used to break the privacy of the data. This work proposes a semiautomatic approach to identify similar attributes (attribute pairing) to identify the entities attributes. The approach is inserted as a preliminary step of the PPRL (Handshake), and its result (similar attributes) can be used by subsequent steps (Blocking and Comparison). In the proposed approach, the participants generate a privacy-preserving representation (Data Signatures) of the attributes values that are sent to a trusted third-party to identify similar attributes from different data sources. Thus, by eliminating the need to share information about their schemas, consequently, improving the security and privacy of the PPRL task. The evaluation of the approach points out that the quality of attribute pairing is equivalent to a solution that does not consider data privacy, and is capable of preserving data privacy.