Why your Encrypted DB is not secure

FRIDAY, JANUARY 10, 2021

Presentation slides I created for this paper when taking UC Berkeley's CS 294-171 Privacy Preserving Systems class (Fall 2020), taught by Professor Natacha Crooks.

This paper contextualizes the efficacy of encrypted databases when it comes to ensuring privacy and security. In general, the paper finds that even with the latest research, delivering practical and provable security guarantees within database systems is very difficult. Often, better protections make intrusion attempts more difficult but by no means make a system unbreakable. The paper references two previous papers showing that when an attacker can observe enough queries and responses, reconstruction attacks, which recover secret information, are entirely possible even with encryption. Porting existing applications to work well with encrypted databases without unintentionally leaking information is not as straightforward as it sounds. The main focus of this paper is that much research targets weaker guarantees against “snapshot attackers,” who only have a single observation of a compromised system. The paper demonstrates how even security primitives in this context can be broken, often because the theoretical abstractions do not completely capture practical systems with actual information.
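To make the reconstruction claim a bit more concrete, here is a toy sketch of my own. It is not the algorithm from the referenced papers. It assumes the attacker never sees a plaintext value and only learns which opaque record IDs match each range query. The record IDs, the values, and the function names `leak_access_pattern` and `reconstruct_order` are all made up for illustration, and the sketch assumes every record holds a distinct value.

```python
from collections import defaultdict
import random


def leak_access_pattern(records, queries):
    """Return what the attacker sees: one set of record IDs per range query.

    `records` maps an opaque record ID to its secret plaintext value.
    Only the IDs of matching rows leak, never the values themselves.
    """
    return {
        frozenset(rid for rid, value in records.items() if lo <= value <= hi)
        for lo, hi in queries
    }


def reconstruct_order(responses):
    """Recover the sorted order of record IDs, up to reversal.

    With distinct values, a response containing exactly two IDs must be a
    pair of neighbours in sorted order, so those pairs form a path.
    Returns None unless the observed pairs form one path; with too few
    queries that path may cover only some of the records.
    """
    neighbours = defaultdict(set)
    for response in responses:
        if len(response) == 2:
            a, b = response
            neighbours[a].add(b)
            neighbours[b].add(a)

    # A path has exactly two endpoints (IDs with a single neighbour).
    ends = sorted(rid for rid, adj in neighbours.items() if len(adj) == 1)
    if len(ends) != 2:
        return None

    # Walk the path from one endpoint to the other.
    order, previous, current = [ends[0]], None, ends[0]
    while True:
        following = [rid for rid in neighbours[current] if rid != previous]
        if not following:
            return order
        previous, current = current, following[0]
        order.append(current)


if __name__ == "__main__":
    # Hypothetical secret data: the attacker never reads these values.
    secret = {"r1": 50, "r2": 10, "r3": 30, "r4": 20, "r5": 40}
    rng = random.Random(0)
    queries = [
        tuple(sorted((rng.randint(1, 60), rng.randint(1, 60))))
        for _ in range(500)
    ]
    print(reconstruct_order(leak_access_pattern(secret, queries)))
```

Even this crude version recovers the relative order of the records without decrypting anything, although it cannot tell ascending from descending. The attacks in the literature are considerably more sophisticated; the point here is only that observed responses carry structure on their own.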

The authors start the paper by explaining what snapshot attacks are and surveying existing encrypted databases, notably CryptDB and Seabed. Snapshot attacks target specific components of a running DBMS, primarily the volatile DB state in RAM and CPU registers, the persistent disk state, and the volatile and persistent OS state. The authors point out possible attacks against each of these components. For instance, disk theft involves grabbing information from non-volatile storage, although this only succeeds if full-disk encryption is not used. The well-known SQL injection vulnerability can be used to create exploits that either read information or hand control of a process over to the attacker. Other attack vectors include VM image leaks and full-system compromises. The authors also demonstrate how log files can be used to recreate past operations and the order in which they occurred. Further information leaks are pointed out in in-memory data structures and diagnostic tables. Finally, the authors pose an open-ended question with a disappointing answer: even when a database is designed from the ground up with security in mind, design choices inherent to caching and performance may make certain security primitives unachievable.
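As a rough illustration of the log-file point, here is a minimal sketch of how a single leaked log could be turned back into an ordered history. The log format, the `patients` table, and the hex strings standing in for ciphertexts are all hypothetical; real DBMS logs use their own formats, and this is not the authors' code.

```python
import re
from datetime import datetime

# Hypothetical log line: "<ISO timestamp> <connection id> Query <SQL text>"
LOG_LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<conn>\d+)\s+Query\s+(?P<sql>.+)$")

# Made-up snapshot of a query log; quoted values stand in for ciphertexts.
SAMPLE_LOG = """\
2021-01-10T09:15:07 12 Query UPDATE patients SET diagnosis = 'a3f19c' WHERE id = '77be02'
2021-01-10T09:15:02 12 Query INSERT INTO patients VALUES ('77be02', 'c41d9e')
server started
2021-01-10T09:15:09 14 Query SELECT * FROM patients WHERE diagnosis = 'a3f19c'
"""


def recover_history(log_text):
    """Parse a leaked log into (timestamp, connection, sql) tuples, oldest first."""
    events = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:  # skip lines that are not query entries
            events.append((
                datetime.fromisoformat(match["ts"]),
                int(match["conn"]),
                match["sql"],
            ))
    return sorted(events, key=lambda event: event[0])


def summarize(events):
    """List which statement type touched which table, in chronological order."""
    summary = []
    for _, _, sql in events:
        words = sql.split()
        verb = words[0].upper()
        # In these simple statements the table name follows INTO, FROM, or UPDATE.
        table = next(
            (words[i + 1] for i, word in enumerate(words[:-1])
             if word.upper() in {"INTO", "FROM", "UPDATE"}),
            None,
        )
        summary.append((verb, table))
    return summary


if __name__ == "__main__":
    for verb, table in summarize(recover_history(SAMPLE_LOG)):
        print(verb, table)
```

Note that the values in the sample are already encrypted, yet the statement types, the table names, and the order of operations are all still readable. In this made-up log the same ciphertext also appears in two statements, which would let an attacker link them if the encryption were deterministic.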

While I understand that the code may not have been released to discourage malicious users, I think it would’ve been interesting if the actual code for carrying out the listed attacks had been included for a dummy database. Code samples would validate a lot of the authors’ arguments while potentially providing an extra facet of insight to spur more research into resolving such attacks. When the paper discusses what information such attacks can retrieve, it is simply described as “important” or “private”. I’m curious to know whether this refers to the actual records themselves, the metadata, or database configurations. Again, more real code would help. Finally, stating how feasible these attacks are, or how much effort they would require, would help quantify the urgency of such threats.

I think this paper opens up a lot of questions, and in its entirety it can be treated as a set of prompts for potential extensions. The authors present a very grim outlook at the end, mentioning that achieving full confidentiality would be difficult even with a database designed from the ground up. I think it would be worthwhile to look into some of the attacks and understand which parts of the design of encrypted databases lead to such vulnerabilities. Furthermore, while a traditional database designed from scratch might run into unsolvable issues, perhaps rethinking the design of a DBMS itself could open doors for architectures that are more conducive to enforcing security and privacy primitives that were originally unachievable. An example of this is Write-Only ORAM, where, put simply, excluding memory reads made obliviousness enforceable.