PIR with compressed queries and amortized query processing

TUESDAY. JANUARY 19, 2021 •

While PIR techniques serve as key building blocks for privacy preservation, the cost associated with such algorithms can become very expensive. As seen in previous work, these methods are often layered and used in repeated manners, which can easily a buildup in run time. As PIR gains popularity, reducing the communication cost would make its usage in production systems much more feasible. This paper in particular targets two problems. The first is that the query / request could be very large, and the second is that the computation of such a query could be very expensive.

The paper makes two contributions to resolve the aforementioned problem. They first propose a technique for compressing queries and reducing associated network costs. Next, with the help of encoding, the amount of CPU costs for query processing can be reduced by batching queries together. The general idea is that the server will be able to perform the encryption and decryption of the indices corresponding to certain results without having to know the actual index itself. The authors first bring up fully homomorphic encryption which, while correct, require extremely expensive functions. Multiplication also requires larger security parameters. Now the goal becomes implementing decompression without needing the multiplication operation. Adopting the “lattice cryptosystems support substitution” operation, the authors design a function called Extract which is orders of magnitude cheaper than multiplication of ciphertexts.

The paper evaluates the system with respect to three questions. First, does the compression reduce network cost? Second, does batching work well in practical scenarios? Third, what is the end-to-end impact? I thought part one was addressed quite well, with both benchmarks and graphics. However, I thought the second two parts were relatively weaker. The system wasn’t really tested on anything in production (unlike the Popcorn paper) and the end-to-end evaluation did not seem complete.

The overall server processing time still has a high overall cost that seems tied to the intrinsic PIR scheme. It would be interesting to see if parallelization or some rearchitecting of the query processing manager could lead to faster times with the same PIR building blocks. I think the authors’ original problem, while worthwhile, does not feel fully addressed in this paper. They say that they want to close the gap of a large performance disparity between CPIR systems and widespread adoption. Actually deploying SealPIR to a real system and providing validation numbers beyond sandboxed tests would be really interesting to check whether this system delivers on what it promised.