• [digest] 2024 Week 30 (2/2)

    From IACR ePrint Archive@21:1/5 to All on Mon Jul 29 02:26:25 2024
    [continued from previous message]

    In this paper, we investigate to derive a higher performance for the pairing computation on GG22D7-457. We first propose novel formulas of the super-optimal pairing on this curve by utilizing a $2$-isogeny as GLV-endomorphism. Besides, this tool
    can be generalized to more generic families of pairing-friendly curves with $n$-isogenies as endomorphisms. In our paper, we provide the explicit formulas for the super-optimal pairings exploiting $2, 3$-isogenies. Finally, we make a concrete
    computational cost analysis and implement the pairing computations on curve GG22D7-457 using our approaches. In terms of Miller function evaluation, employing the techniques in this paper obtain a saving of $24.44\% $ in $\mathbb{F}_p$-
    multiplications compared to the optimal ate pairing. As for the running time, the experimental results illustrate that the Miller loop on GG22D7-457 by utilizing our methods is $26.0\%$ faster than the state-of-the-art. Additionally, the performance
    of the super-optimal pairing on GG22D7-457 is competitive compared to the well-known pairing-friendly curves at the 192-bit security level. These results show that GG22D7-457 becomes an attractive candidate for the pairing-based protocols. Furthermore,
    our techniques have the potential to enhance the applications of super-optimal pairings on more pairing-friendly curves.



    ## 2024/1196

    * Title: Client-Aided Privacy-Preserving Machine Learning
    * Authors: Peihan Miao, Xinyi Shi, Chao Wu, Ruofan Xu
    * [Permalink](https://eprint.iacr.org/2024/1196)
    * [Download](https://eprint.iacr.org/2024/1196.pdf)

    ### Abstract

    Privacy-preserving machine learning (PPML) enables multiple distrusting parties to jointly train ML models on their private data without revealing any information beyond the final trained models. In this work, we study the client-aided two-server setting
    where two non-colluding servers jointly train an ML model on the data held by a large number of clients. By involving the clients in the training process, we develop efficient protocols for training algorithms including linear regression, logistic
    regression, and neural networks. In particular, we introduce novel approaches to securely computing inner product, sign check, activation functions (e.g., ReLU, logistic function), and division on secret shared values, leveraging lightweight computation
    on the client side. We present constructions that are secure against semi-honest clients and further enhance them to achieve security against malicious clients. We believe these new client-aided techniques may be of independent interest.

    We implement our protocols and compare them with the two-server PPML protocols presented in SecureML (Mohassel and Zhang, S&P'17) across various settings and ABY2.0 (Patra et al., Usenix Security'21) theoretically. We demonstrate that with the assistance
    of untrusted clients in the training process, we can significantly improve both the communication and computational efficiency by orders of magnitude. Our protocols compare favorably in all the training algorithms on both LAN and WAN networks.



    ## 2024/1197

    * Title: Optimizing Rectangle and Boomerang Attacks: A Unified and Generic Framework for Key Recovery
    * Authors: Qianqian Yang, Ling Song, Nana Zhang, Danping Shi, Libo Wang, Jiahao Zhao, Lei Hu, Jian Weng
    * [Permalink](https://eprint.iacr.org/2024/1197)
    * [Download](https://eprint.iacr.org/2024/1197.pdf)

    ### Abstract

    The rectangle attack has shown to be a very powerful form of cryptanalysis against block ciphers. Given a rectangle distinguisher, one expects to mount key recovery attacks as efficiently as possible. In the literature, there have been four algorithms
    for rectangle key recovery attacks. However, their performance varies from case to case. Besides, numerous are the applications where the attacks lack optimality. In this paper, we delve into the rectangle key recovery and propose a unified and generic
    key recovery algorithm, which supports any possible attacking parameters. Not only does it encompass the four existing rectangle key recovery algorithms, but it also reveals five new types of attacks that were previously overlooked. Further, we put
    forward a counterpart for boomerang key recovery attacks, which supports any possible attacking parameters as well. Along with these new key recovery algorithms, we propose a framework to automatically determine the best parameters for the attack. To
    demonstrate the efficiency of the new key recovery algorithms, we apply them to \serpent, \aes-192, \craft, \skinny, and \deoxysbc-256 based on existing distinguishers, yielding a series of improved attacks.



    ## 2024/1198

    * Title: ECO-CRYSTALS: Efficient Cryptography CRYSTALS on Standard RISC-V ISA
    * Authors: Xinyi Ji, Jiankuo Dong, Junhao Huang, Zhijian Yuan, Wangchen Dai, Fu Xiao, Jingqiang Lin
    * [Permalink](https://eprint.iacr.org/2024/1198)
    * [Download](https://eprint.iacr.org/2024/1198.pdf)

    ### Abstract

    The field of post-quantum cryptography (PQC) is continuously evolving. Many researchers are exploring efficient PQC implementation on various platforms, including x86, ARM, FPGA, GPU, etc. In this paper, we present an Efficient CryptOgraphy CRYSTALS (ECO-
    CRYSTALS) implementation on standard 64-bit RISC-V Instruction Set Architecture (ISA). The target schemes are two winners of the National Institute of Standards and Technology (NIST) PQC competition: CRYSTALS-Kyber and CRYSTALS-Dilithium, where the two
    most time-consuming operations are Keccak and polynomial multiplication. Notably, this paper is the first to deploy Kyber and Dilithium on the 64-bit RISC-V ISA. Firstly, we propose a better scheduling strategy for Keccak, which is specifically tailored
    for the 64-bit dual-issue RISC-V architecture. Our 24-round Keccak permutation (Keccak-$p$[1600,24]) achieves a 59.18% speed-up compared to the reference implementation. Secondly, we apply two modular arithmetic (Montgomery arithmetic and Plantard
    arithmetic) in the polynomial multiplication of Kyber and Dilithium to get a better lazy reduction. Then, we propose a flexible dual-instruction-issue scheme of Number Theoretic Transform (NTT). As for the matrix-vector multiplication, we introduce a row-
    to-column processing methodology to minimize the expensive memory access operations. Compared to the reference implementation, we obtain a speedup of 53.85%$\thicksim$85.57% for NTT, matrix-vector multiplication, and INTT in our ECO-CRYSTALS. Finally,
    our ECO-CRYSTALS implementation for key generation, encapsulation, and decapsulation in Kyber achieves 399k, 448k, and 479k cycles respectively, achieving speedups of 60.82%, 63.93%, and 65.56% compared to the NIST reference implementation. Similarly,
    our ECO-CRYSTALS implementation for key generation, sign, and verify in Dilithium reaches 1,364k, 3,191k, and 1,369k cycles, showcasing speedups of 54.84%, 64.98%, and 57.20%, respectively.



    ## 2024/1199

    * Title: On degrees of carry and Scholz's conjecture
    * Authors: Theophilus Agama
    * [Permalink](https://eprint.iacr.org/2024/1199)
    * [Download](https://eprint.iacr.org/2024/1199.pdf)

    ### Abstract

    Exploiting the notion of carries, we obtain improved upper bounds for the length of the shortest addition chains $\iota(2^n-1)$ producing $2^n-1$. Most notably, we show that if $2^n-1$ has carries of degree at most $$\kappa(2^n-1)=\frac{1}{2}(\iota(n)-\
    lfloor \frac{\log n}{\log 2}\rfloor+\sum \limits_{j=1}^{\lfloor \frac{\log n}{\log 2}\rfloor}\{\frac{n}{2^j}\})$$ then the inequality $$\iota(2^n-1)\leq n+1+\sum \limits_{j=1}^{\lfloor \frac{\log n}{\log 2}\rfloor}\bigg(\{\frac{n}{2^j}\}-\xi(n,j)\bigg)+\
    iota(n)$$ holds for all $n\in \mathbb{N}$ with $n\geq 4$, where $\iota(\cdot)$ denotes the length of the shortest addition chain producing $\cdot$, $\{\cdot\}$ denotes the fractional part of $\cdot$ and where $\xi(n,1):=\{\frac{n}{2}\}$ with $\xi(n,2)=\{\
    frac{1}{2}\lfloor \frac{n}{2}\rfloor\}$ and so on.



    ## 2024/1200

    * Title: Depth-Aware Arithmetization of Common Primitives in Prime Fields
    * Authors: Jelle Vos, Mauro Conti, Zekeriya Erkin
    * [Permalink](https://eprint.iacr.org/2024/1200)
    * [Download](https://eprint.iacr.org/2024/1200.pdf)

    ### Abstract

    A common misconception is that the computational abilities of circuits composed of additions and multiplications are restricted to simple formulas only. Such arithmetic circuits over finite fields are actually capable of computing any function, including
    equality checks, comparisons, and other highly non-linear operations. While all those functions are computable, the challenge lies in computing them efficiently. We refer to this search problem as arithmetization. Arithmetization is a key problem in
    secure computation, as techniques like homomorphic encryption and secret sharing compute arithmetic circuits rather than the high-level programs that programmers are used to. The objective in arithmetization has typically been to minimize the number of
    multiplications (multiplicative size), as multiplications in most secure computation techniques are significantly more expensive to compute than additions. However, the multiplicative depth of a circuit arguably plays an even more important role in
    deciding the computational cost: For homomorphic encryption, it strongly affects the choice of cryptographic parameters and the number of bootstrapping operations required, which are orders of magnitude more expensive to compute than multiplications. In
    fact, if we can limit the multiplicative depth of a circuit such that we do not need to perform any bootstrapping, we can omit the large bootstrapping keys required to perform them all together. We argue that arithmetization should be treated as a multi-
    objective minimization problem, in which a trade-off can be made between a circuit's multiplicative size and depth. We present efficient depth-aware arithmetization methods for many primitive operations such as exponentiation, univariate functions,
    equality checks, comparisons, and ANDs and ORs, which take into account that squaring can be cheaper than arbitrary multiplications, and we study how to compose them.



    ## 2024/1201

    * Title: Designing a General-Purpose 8-bit (T)FHE Processor Abstraction
    * Authors: Daphné Trama, Pierre-Emmanuel Clet, Aymen Boudguiga, Renaud Sirdey * [Permalink](https://eprint.iacr.org/2024/1201)
    * [Download](https://eprint.iacr.org/2024/1201.pdf)

    ### Abstract

    Making the most of TFHE programmable bootstrapping to evaluate functions or operators otherwise difficult to perform with only the native addition and multiplication of the scheme is a very active line of research. In this paper, we systematize this
    approach and apply it to build an 8-bit FHE processor abstraction, i.e., a software entity that works over FHE-encrypted 8-bits data and presents itself to the programmer by means of a conventional-looking assembly instruction set. In doing so, we
    provide several homomorphic LUT dereferencing operators based on variants on the tree-based method and show that they are the most efficient option for manipulating encryptions of 8-bit data (optimally represented as two base 16 digits). We then
    systematically apply this approach over a set of around 50 instructions, including, notably, conditional assignments, divisions, or fixed-point arithmetic operations. We then conclude the paper by testing the approach on several simple algorithms,
    including the execution of a neuron with a sigmoid activation function over 16-bit precision. Finally, this work reveals that a very limited set of functional bootstrapping patterns is versatile and efficient enough to achieve general-purpose FHE
    computations beyond the boolean circuit approach. As such, these patterns may be an appropriate target for further works on advanced software optimizations or hardware implementations.



    ## 2024/1202

    * Title: Prover - Toward More Efficient Formal Verification of Masking in Probing Model
    * Authors: Feng Zhou, Hua Chen, Limin Fan
    * [Permalink](https://eprint.iacr.org/2024/1202)
    * [Download](https://eprint.iacr.org/2024/1202.pdf)

    ### Abstract

    In recent years, formal verification has emerged as a crucial method for assessing security against Side-Channel attacks of masked implementations, owing to its remarkable versatility and high degree of automation. However, formal verification still
    faces technical bottlenecks in balancing accuracy and efficiency, thereby limiting its scalability. Former efficient tools like \textsf{maskVerif} and CocoAlma are often inaccurate when verifying schemes utilizing properties of Boolean functions. Later,
    SILVER addressed the accuracy issue, albeit at the cost of significantly reduced speed and scalability compared to \textsf{maskVerif}. Consequently, there is a pressing need to develop formal verification tools that are both efficient and accurate for
    designing secure schemes and evaluating implementations.
    This paper's primary contribution lies in proposing several approaches to develop a more efficient and scalable formal verification tool called \textsf{Prover}, which is built upon SILVER.
    Firstly, inspired by the auxiliary data structures proposed by Eldib et al. and optimistic sampling rule of maskVerif, we introduce two reduction rules aimed at diminishing the size of observable sets and secret sets in statistical independence
    checks. These rules substantially decrease, or even eliminate, the need for repeated computation of probability distributions using Reduced Ordered Binary Decision Diagrams (ROBDDs), a time-intensive procedure in verification.
    Subsequently, we integrate one of these reduction rules into the uniformity check to mitigate its complexity.
    Secondly, we identify that variable ordering significantly impacts efficiency and optimize it for constructing ROBDDs, resulting in much smaller representations of investigated functions. Lastly, we present the algorithm of \textsf{Prover}, which
    efficiently verifies the security and uniformity of masked implementations in probing model with or without the presence of glitches.
    Experimental results demonstrate that our proposed tool
    \textsf{Prover} offers a superior balance between efficiency and accuracy compared to other state-of-the-art tools (CocoAlma, maskVerif, and SILVER). It successfully verifies a design that SILVER could not complete within the allocated time, whereas
    CocoAlma and maskVerif encounter issues with false positives.



    ## 2024/1203

    * Title: Preservation of Speculative Constant-time by Compilation
    * Authors: Santiago Arranz Olmos, Gilles Barthe, Lionel Blatter, Benjamin Grégoire, Vincent Laporte
    * [Permalink](https://eprint.iacr.org/2024/1203)
    * [Download](https://eprint.iacr.org/2024/1203.pdf)

    ### Abstract

    Compilers often weaken or even discard software-based countermeasures commonly used to protect programs against side-channel attacks; worse, they may also introduce vulnerabilities that attackers can exploit. The solution to this problem is to develop
    compilers that preserve these countermeasures. Prior work establishes that (a mildly modified version of) the CompCert and Jasmin formally verified compilers preserve constant-time, an information flow policy that ensures that programs are protected
    against cache side-channel attacks. However, nothing is known about preservation of speculative constant-time, a strengthening of the constant-time policy that ensures that programs are protected against Spectre v1 attacks. We first show that
    preservation of speculative constant-time fails in practice by providing examples of secure programs whose compilation is not speculative constant-time using GCC (GCC -O0 and GCC -O1) and Jasmin. Then, we define a proof-of-concept compiler that distills
    some of the critical passes of the Jasmin compiler and use the Coq proof assistant to prove that it preserves speculative constant-time. Finally, we patch the Jasmin speculative constant-time type checker and demonstrate that all cryptographic
    implementations written in Jasmin can be fixed with minimal impact.



    ## 2024/1204

    * Title: A fast heuristic for mapping Boolean circuits to functional bootstrapping
    * Authors: Sergiu Carpov
    * [Permalink](https://eprint.iacr.org/2024/1204)
    * [Download](https://eprint.iacr.org/2024/1204.pdf)

    ### Abstract

    Functional bootstrapping in FHE schemes such as FHEW and TFHE allows the evaluation of a function on an encrypted message, in addition to noise reduction. Implementing programs that directly use functional bootstrapping is challenging and error-prone. In
    this paper, we propose a heuristic that automatically maps Boolean circuits to functional bootstrapping instructions. Unlike other approaches, our method does not limit the encrypted data plaintext space to a power-of-two size, allowing the instantiation
    of functional bootstrapping with smaller parameters. Furthermore, the negacyclic property of functional bootstrapping is exploited to extend the plaintext space. Despite the inherently greedy nature of the heuristic, experimental results show that the
    mapped circuits exhibit a significant reduction in evaluation time. Furthermore, our heuristic was able to achieve a $45\%$ improvement in evaluation time when applied to manually implemented Trivium and Kreyvium circuits.



    ## 2024/1205

    * Title: Analysis of One Scheme for User Authentication and Session Key Agreement in Wireless Sensor Network Using Smart Card
    * Authors: Zhengjun Cao, Lihua Liu
    * [Permalink](https://eprint.iacr.org/2024/1205)
    * [Download](https://eprint.iacr.org/2024/1205.pdf)

    ### Abstract

    We show that the Chunka-Banerjee-Goswami authentication and
    key agreement scheme [Wirel. Pers. Commun., 117, 1361-1385, 2021] fails to keep user anonymity, not as claimed. It only keeps pseudonymity. Anonymous actions are designed to be unlinkable to any entity, but pseudonymous actions can be traced back to a
    certain entity. We also find the scheme is insecure against offline dictionary attack.



    ## 2024/1206

    * Title: Applying Post-Quantum Cryptography Algorithms to a DLT-Based CBDC Infrastructure: Comparative and Feasibility Analysis
    * Authors: Daniel de Haro Moraes, Joao Paulo Aragao Pereira, Bruno Estolano Grossi, Gustavo Mirapalheta, George Marcel Monteiro Arcuri Smetana, Wesley Rodrigues, Courtnay Nery Guimarães Jr., Bruno Domingues, Fábio Saito, Marcos Simplício
    * [Permalink](https://eprint.iacr.org/2024/1206)
    * [Download](https://eprint.iacr.org/2024/1206.pdf)

    ### Abstract

    This article presents an innovative project for a Central Bank Digital Currency (CBDC) infrastructure. Focusing on security and reliability, the proposed architecture: (1) employs post-quantum cryptography (PQC) algorithms for long-term security, even
    against attackers with access to cryptographically-relevant quantum computers; (2) can be integrated with a Trusted Execution Environment (TEE) to safeguard the confidentiality of transaction contents as they are processed by third-parties; and (3) uses
    Distributed Ledger Technology (DLT) to promote a high level of transparency and tamper resistance for all transactions registered in the system. Besides providing a theoretical discussion on the benefits of this architecture, we experimentally evaluate
    its components. Namely, as PQC algorithms, we consider three signature schemes being standardized by the National Institute of Standards and Technology (NIST), CRYSTALS-Dilithium, Falcon, and SPHINCS+. Those algorithms are integrated into the Hyperledger
    Besu (DLT) and executed both inside and outside an Intel SGX TEE environment. According to our results, CRYSTALS-Dilithium-2 combined with classical secp256k1 signatures leads to the shortest execution times when signing blocks in the DLT, reaching 1.
    68ms without the TEE, and 2.09ms with TEE. The same combination also displays the best results for signature verifications, achieving 0.5ms without a TEE and 1.98ms with a TEE. We also describe the main aspects of the evaluation methodology and the next
    steps in validating the proposed infrastructure. The conclusions drawn from our experiments is that the combination of PQC and TEE promises highly secure and effective DLT-based CBDC scenarios, ready to face the challenges of the digital financial future
    and potential quantum threats.



    ## 2024/1207

    * Title: What Have SNARGs Ever Done for FHE?
    * Authors: Michael Walter
    * [Permalink](https://eprint.iacr.org/2024/1207)
    * [Download](https://eprint.iacr.org/2024/1207.pdf)

    ### Abstract

    In recent years, there have been several constructions combining FHE with SNARGs to add integrity guarantees to FHE schemes. Most of these works focused on improving efficiency, while the precise security model with regards to client side input privacy
    has remained understudied. Only recently it was shown by Manulis and Nguyen (Eurocrypt'24) that this combination does not yield IND-CCA1 security. So an interesting open question is: does the SNARG actually add any meaningful security to input privacy?
    We address this question in this note and give a security definition that meaningfully captures the security of the FHE plus SNARG construction.



    ## 2024/1208

    * Title: Hᴇᴋᴀᴛᴏɴ: Horizontally-Scalable zkSNARKs via Proof Aggregation
    * Authors: Michael Rosenberg, Tushar Mopuri, Hossein Hafezi, Ian Miers, Pratyush Mishra
    * [Permalink](https://eprint.iacr.org/2024/1208)
    * [Download](https://eprint.iacr.org/2024/1208.pdf)

    ### Abstract

    Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in private and easily-verifiable manner. These properties make zkSNARKs a powerful tool for
    adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to time and space complexity requirements for the
    prover algorithm. As a result, they cannot handle real-world instances of the aforementioned applications.

    In this work, we introduce Hᴇᴋᴀᴛᴏɴ, a zkSNARK that overcomes these barriers and can efficiently handle arbitrarily large computations. We construct Hᴇᴋᴀᴛᴏɴ via a new "distribute-and-aggregate" framework that breaks up large
    computations into small chunks, proves these chunks in parallel in a distributed system, and then aggregates the resulting chunk proofs into a single succinct proof. Underlying this framework is a new technique for efficiently handling data that is
    shared between chunks that we believe could be of independent interest.

    We implement a distributed prover for Hᴇᴋᴀᴛᴏɴ, and evaluate its performance on a compute cluster. Our experiments show that Hᴇᴋᴀᴛᴏɴ achieves strong horizontal scalability (proving time decreases linearly as we increase the number of
    nodes in the cluster), and is able to prove large computations quickly: it can prove computations of size $2^{35}$ gates in under an hour, which is much faster than prior work.

    Finally, we also apply Hᴇᴋᴀᴛᴏɴ to two applications of real-world interest: proofs of batched insertion for a verifiable key directory and proving correctness of RAM computations. In both cases, Hᴇᴋᴀᴛᴏɴ is able to scale to handle
    realistic workloads with better efficiency than prior work.



    ## 2024/1209

    * Title: Collaborative CP-NIZKs: Modular, Composable Proofs for Distributed Secrets
    * Authors: Mohammed Alghazwi, Tariq Bontekoe, Leon Visscher, Fatih Turkmen
    * [Permalink](https://eprint.iacr.org/2024/1209)
    * [Download](https://eprint.iacr.org/2024/1209.pdf)

    ### Abstract

    Non-interactive zero-knowledge (NIZK) proofs of knowledge have proven to be highly relevant for securely realizing a wide array of applications that rely on both privacy and correctness. They enable a prover to convince any party of the correctness of a
    public statement for a secret witness. However, most NIZKs do not natively support proving knowledge of a secret witness that is distributed over multiple provers. Previously, collaborative proofs [51] have been proposed to overcome this limitation. We
    investigate the notion of composability in this setting, following the Commit-and-Prove design of LegoSNARK [17]. Composability allows users to combine different, specialized NIZKs (e.g., one arithmetic circuit, one boolean circuit, and one for range
    proofs) with the aim of reducing the prove generation time. Moreover, it opens the door to efficient realizations of many applications in the collaborative setting such as mutually exclusive prover groups, combining collaborative and single-party proofs
    and efficiently implementing publicly auditable MPC (PA-MPC).

    We present the first, general definition for collaborative commit-and-prove NIZK (CP-NIZK) proofs of knowledge and construct distributed protocols to enable their realization. We implement our protocols for two commonly used NIZKs, Groth16 and
    Bulletproofs, and evaluate their practicality in a variety of computational settings. Our findings indicate that composability adds only minor overhead, especially for large circuits. We experimented with our construction in an application setting, and
    when compared to prior works, our protocols reduce latency by 18–55× while requiring only a fraction (0.2%) of the communication.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)