Chinese researchers develop fuzzy search algorithm for encrypted cloud data

By Darren Pauli, 6 Oct 2014

Chinese researchers from Nanjing University of Information Science have developed an encrypted search mechanism that they say is both more productive and more secure than existing systems.

Existing systems can search encrypted data only for exact keyword matches, not for similar terms. Their authors can add fuzzy matching to suggest alternative phrases (such as “did you mean ***”), but at the expense of accuracy.

Li Chen, Xingming Sun, Zhihua Xia, and Qi Liu of the Nanjing University of Information Science said their system, which is built on latent semantic analysis (LSA), spat out both exact matching files and those close to them through fuzzy searches.

"For example, when the user inputs the keyword “automobile” to search files, the proposed method returns not only the files containing 'automobile', but also the files including the term 'car'," the quartet wrote in their paper An Efficient and Privacy-Preserving Semantic Multi-Keyword Ranked Search over Encrypted Cloud Data (PDF).

"The proposed scheme could return not only the exact matching files, but also the files including the terms latent semantically associated [sic] to the query keyword."

Performance was measured with the F-measure, a standard metric that combines recall and precision. The system's F-measure left existing exact matching systems in the dust, the team said.
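
For readers who want the arithmetic, here is a minimal sketch of how an F-measure is usually computed, as the weighted harmonic mean of precision and recall. The file IDs and result lists are invented for illustration and are not figures from the paper; the function name and the `beta` parameter are likewise assumptions.

```python
def f_measure(retrieved, relevant, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)  # share of returned files that are relevant
    recall = hits / len(relevant)      # share of relevant files that were returned
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical file IDs: files 1 to 4 are the ones the user actually wants.
relevant = {1, 2, 3, 4}
exact_only = [1, 2]      # an exact-match search that misses synonyms
semantic = [1, 2, 3, 5]  # a semantic search: one more hit, one false positive

print(f_measure(exact_only, relevant))
print(f_measure(semantic, relevant))
```

With these made-up lists the exact-match search has perfect precision but only 50 per cent recall, giving an F-measure of about 0.67, while the semantic search trades a little precision for more recall and reaches 0.75.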

"For a clear comparison, our proposed scheme attains score higher than the original MRSE (multi-keyword ranked search) in F-measure ... our method achieves remarkable result."

The search was secure in that queries made against the encrypted data could not be read in clear text or inferred, and were protected with dummy values.

"Taking security and privacy into consideration, we employ a secure splitting k-NN (nearest neighbour) technique to encrypt the index and the queried vector, so that we can obtain the accurate ranked results and protect the confidence of the data well,” added the researchers.

Semantic search sycophants can consult the technical paper (PDF) for further details, including the formulae used. ®

