Bioinformatics is essential for the management of data in modern biology and medicine. By finding more secure solutions we will be creating greater opportunity and more efficiency in medical research. ~Audrey Bentley

Privacy-Preserving String Searching...

In my previous blogs I touched on the privacy concerns and risks associated with genetic testing, and a bit on the use of Homomorphic Encryption (HE) to increase privacy protection. HE is a form of encryption that allows data to be encrypted, outsourced to a commercial cloud, and processed there, all while remaining encrypted. HE is not the only tool available, though. I would also include an efficient string data structure such as the Burrows-Wheeler Transform (BWT). String searching is an integral function in genomic bioinformatics. It is sometimes called “string matching”, and these algorithms try to find the places where one or several strings (think of patterns) occur within a larger text. In addition to the BWT, I would also recommend an approach called Oblivious Transfer (which can be built on additive HE) to mask both the sequence query and the particular region of genomic interest in positional queries.

Burrows-Wheeler Transform

BWT is a data transformation algorithm that rearranges the characters of a string so that the result is easier to compress. The transform does not shrink the data by itself; it prepares it for a compressor. It is especially useful in the biological sciences because genomes, which are long strings over the ACGT alphabet (A = adenine, C = cytosine, G = guanine, T = thymine), don’t have a lot of runs but do have many repeats. The idea of the BWT is to build a matrix whose rows are all the cyclic shifts of the input string, sorted in dictionary order, and then return the last column of that matrix, which often has long runs of identical characters. This is beneficial because the string becomes more compressible for other algorithms. Another huge benefit of the Burrows-Wheeler Transform is that it’s reversible with minimal data overhead: an end-of-string marker such as $ is enough to recover the original.



This image shows the BWT transform for banana$.
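To make the transform concrete, here is a minimal Python sketch of the naive "sort all cyclic shifts" construction, applied to banana$. The function names (bwt, inverse_bwt, count_occurrences) are my own and are not taken from any library. The sketch assumes the input ends with a unique $ marker that sorts before every other character. It is only meant for short strings: as far as I know, real genome indexes build the BWT from a suffix array and keep precomputed rank tables instead of sorting every rotation.

```python
def bwt(text: str) -> str:
    """Naive BWT: sort all cyclic shifts and return the last column.

    Assumes `text` ends with a unique '$' that sorts before all other symbols.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(row[-1] for row in rotations)


def inverse_bwt(last_column: str) -> str:
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        # Prepend the last column to every row, then re-sort the rows.
        table = sorted(ch + row for ch, row in zip(last_column, table))
    # The original string is the rotation that ends with the '$' marker.
    return next(row for row in table if row.endswith("$"))


def count_occurrences(last_column: str, pattern: str) -> int:
    """Count matches of `pattern` using backward search over the BWT.

    Illustrative only: rank() rescans the string on every call, whereas a
    real index would look the value up in a precomputed table.
    """
    first_column = sorted(last_column)
    first_row = {}  # first row of the sorted table that starts with each symbol
    for i, ch in enumerate(first_column):
        first_row.setdefault(ch, i)

    def rank(ch: str, i: int) -> int:
        # Number of occurrences of ch in last_column[:i].
        return last_column[:i].count(ch)

    lo, hi = 0, len(last_column)  # current range of matching rows [lo, hi)
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        lo = first_row[ch] + rank(ch, lo)
        hi = first_row[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    transformed = bwt("banana$")
    print(transformed)                            # annb$aa
    print(inverse_bwt(transformed))               # banana$
    print(count_occurrences(transformed, "ana"))  # 2
```

The last function is where the BWT connects back to string searching: the pattern is matched from its last character to its first, and only the transformed string is consulted, never the original text.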


Oblivious Transfer

The Oblivious Transfer (OT) protocol is extremely helpful when it comes to security and preserving the privacy of data, because the sender does not learn which of its inputs the receiver selected, and the receiver learns nothing about the sender’s inputs except for the one that is selected. There are two parties involved: the Sender (P1) and the Receiver (P2). One approach to OT is for the receiver to generate two public keys but to hold a valid private key for only one of them. The receiver then sends both public keys to the sender, who encrypts each of its inputs with the corresponding public key and sends the ciphertexts to the receiver. The receiver will only be able to properly decrypt one of the ciphertexts, and the sender cannot tell which one.
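The key trick described above can be sketched in Python as a toy 1-out-of-2 OT. This is my own illustration, not a vetted protocol implementation. It assumes textbook ElGamal encryption over a small prime field, and it adds one detail the description leaves out: the sender publishes a random value c, and the two public keys must multiply to c, which is what stops the receiver from knowing both private keys. This is similar in spirit to the construction usually credited to Bellare and Micali. All names (sender_setup, receiver_keys, and so on) and the parameters P and G are hypothetical choices of mine, and the parameters are nowhere near secure.

```python
import secrets

# Toy parameters (assumptions for illustration only, NOT secure):
P = 2**127 - 1   # a Mersenne prime used as the field modulus
G = 3            # base element for the ElGamal keys


def random_exponent() -> int:
    return secrets.randbelow(P - 2) + 1


def sender_setup() -> int:
    """Sender publishes a random group element c."""
    return pow(G, random_exponent(), P)


def receiver_keys(c: int, choice: int):
    """Receiver builds one real key and one key with no known private key.

    Returns (pk0, k): pk0 is sent to the sender, k stays private.
    """
    k = random_exponent()
    real_pk = pow(G, k, P)                 # receiver knows the private key k
    other_pk = c * pow(real_pk, -1, P) % P  # forced by c; private key unknown
    pk0 = real_pk if choice == 0 else other_pk
    return pk0, k


def elgamal_encrypt(pk: int, message: int):
    r = random_exponent()
    return pow(G, r, P), message * pow(pk, r, P) % P


def sender_encrypt(c: int, pk0: int, m0: int, m1: int):
    """Sender derives pk1 from c and pk0, then encrypts each input."""
    pk1 = c * pow(pk0, -1, P) % P
    return elgamal_encrypt(pk0, m0), elgamal_encrypt(pk1, m1)


def receiver_decrypt(ciphertexts, choice: int, k: int) -> int:
    """Receiver can open only the ciphertext matching its choice bit."""
    c1, c2 = ciphertexts[choice]
    shared = pow(c1, k, P)
    return c2 * pow(shared, -1, P) % P


if __name__ == "__main__":
    m0, m1 = 42, 1337          # sender's two inputs, encoded as integers < P
    choice = 1                 # receiver wants the second input
    c = sender_setup()
    pk0, k = receiver_keys(c, choice)
    ciphertexts = sender_encrypt(c, pk0, m0, m1)
    print(receiver_decrypt(ciphertexts, choice, k))  # 1337
```

The sender only ever sees pk0, which looks like a random group element whichever input the receiver wants, so the choice stays hidden. The receiver, in turn, has no private key for the other public key, so the second ciphertext stays closed.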


Science is fun. Science is curiosity. We all have natural curiosity. Science is a process of investigating. It’s posing questions and coming up with a method. It’s delving in.
— Sally Ride
