17 results sorted by ID
Possible spell-corrected query: barreto multiplication
Multiplying Polynomials without Powerful Multiplication Instructions (Long Paper)
Vincent Hwang, YoungBeom Kim, Seog Chung Seo
Implementation
We improve the performance of lattice-based cryptosystems Dilithium on Cortex-M3 with expensive multiplications. Our contribution is two-fold: (i) We generalize Barrett multiplication and show that the resulting shape-independent modular multiplication performs comparably to long multiplication on some platforms without special hardware when precomputation is free. We call a modular multiplication “shape-independent” if its correctness and efficiency depend only on the magnitude of moduli...
A Fast and Efficient SIKE Co-Design: Coarse-Grained Reconfigurable Accelerators with Custom RISC-V Microcontroller on FPGA
Jing Tian, Bo Wu, Lang Feng, Haochen Zhang, Zhongfeng Wang
Implementation
This paper proposes a fast and efficient FPGA-based hardware-software co-design for the supersingular isogeny key encapsulation (SIKE) protocol controlled by a custom RISC-V processor. Firstly, we highly optimize the core unit, the polynomial-based field arithmetic logic unit (FALU), with the proposed fast convolution-like multiplier (FCM) to significantly reduce the resource consumption while still maintaining low latency and constant time for all the four SIKE parameters. Secondly, we pack...
Split Gröbner Bases for Satisfiability Modulo Finite Fields
Alex Ozdemir, Shankara Pailoor, Alp Bassa, Kostas Ferles, Clark Barrett, Işil Dillig
Implementation
Satisfiability modulo finite fields enables automated verification for cryptosystems. Unfortunately, previous solvers scale poorly for even some simple systems of field equations, in part because they build a full Gröbner basis (GB) for the system. We propose a new solver that uses multiple, simpler GBs instead of one full GB. Our solver, implemented within the cvc5 SMT solver, admits specialized propagation algorithms, e.g., for understanding bitsums. Experiments show that it solves...
A Survey of Polynomial Multiplications for Lattice-Based Cryptosystems
Vincent Hwang
Implementation
We survey various mathematical tools used in software works multiplying polynomials in
\[
\frac{\mathbb{Z}_q[x]}{\left\langle {x^n - \alpha x - \beta} \right\rangle}.
\]
In particular, we survey implementation works targeting polynomial multiplications in lattice-based cryptosystems Dilithium, Kyber, NTRU, NTRU Prime, and Saber with instruction set architectures/extensions Armv7-M, Armv7E-M, Armv8-A, and AVX2.
There are three emphases in this paper: (i) modular arithmetic, (ii)...
Barrett Multiplication for Dilithium on Embedded Devices
Vincent Hwang, YoungBeom Kim, Seog Chung Seo
Implementation
We optimize the number-theoretic transforms (NTTs) in Dilithium — a digital signature scheme recently standardized by the National Institute of Standards and Technology (NIST) — on Cortex-M3 and 8-bit AVR. The core novelty is the exploration of micro-architectural insights for modular multiplications. Recent work [Becker, Hwang, Kannwischer, Yang and Yang, Volume 2022 (1), Transactions on Cryptographic Hardware and Embedded Systems, 2022] found a correspondence between Montgomery and Barrett...
Homomorphic Polynomial Public Key Cryptography for Quantum-secure Digital Signature
Randy Kuang, Maria Perepechaenko, Mahmoud Sayed, Dafu Lou
Cryptographic protocols
In their 2022 study, Kuang et al. introduced the Multivariable Polynomial Public Key (MPPK) cryptography, a quantum-safe public key cryptosystem leveraging the mutual inversion relationship between multiplication and division. MPPK employs multiplication for key pair construction and division for decryption, generating public multivariate polynomials. Kuang and Perepechaenko expanded the cryptosystem into the Homomorphic Polynomial Public Key (HPPK), transforming product polynomials over...
Schnorr protocol in Jasmin
José Bacelar Almeida, Denis Firsov, Tiago Oliveira, Dominique Unruh
Implementation
We implement the Schnorr protocol in assembler via the Jasmin toolchain, and prove the security (proof-of-knowledge and zero-knowledge properties) and the absence of leakage through timing side-channels of that implementation in EasyCrypt.
In order to do so, we provide a semantic characterization of leakage-freeness for probabilistic Jasmin programs (that are not constant-time). We design a library for multiple-precision integer arithmetic in Jasmin -- the "libjbn'' library. Among others,...
Improved Plantard Arithmetic for Lattice-based Cryptography
Junhao Huang, Jipeng Zhang, Haosong Zhao, Zhe Liu, Ray C. C. Cheung, Çetin Kaya Koç, Donglong Chen
Implementation
This paper presents an improved Plantard's modular arithmetic (Plantard arithmetic) tailored for Lattice-Based Cryptography (LBC). Based on the improved Plantard arithmetic, we present faster implementations of two LBC schemes, Kyber and NTTRU, running on Cortex-M4. The intrinsic advantage of Plantard arithmetic is that one multiplication can be saved from the modular multiplication of a constant. However, the original Plantard arithmetic is not very practical in LBC schemes because of the...
Faster Kyber and Dilithium on the Cortex-M4
Amin Abdulrahman, Vincent Hwang, Matthias J. Kannwischer, Amber Sprenkels
Implementation
This paper presents faster implementations of the lattice-based schemes Dilithium and Kyber on the Cortex-M4. Dilithium is one of the three signature finalists in the NIST post-quantum project (NIST PQC), while Kyber is one of the four key-encapsulation mechanism (KEM) finalists.
Our optimizations affect the core polynomial arithmetic using the number-theoretic transform (NTT) of both schemes. Our main contributions are threefold: We present a faster signed Barrett reduction for Kyber,...
Neon NTT: Faster Dilithium, Kyber, and Saber on Cortex-A72 and Apple M1
Hanno Becker, Vincent Hwang, Matthias J. Kannwischer, Bo-Yin Yang, Shang-Yi Yang
Implementation
We present new speed records on the Armv8-A architecture for the lattice-based schemes Dilithium, Kyber, and Saber. The core novelty in this paper is the combination of Montgomery multiplication and Barrett reduction resulting in “Barrett multiplication” which allows particularly efficient modular one-known-factor multiplication using the Armv8-A Neon vector instructions. These novel techniques combined with fast two-unknown-factor Montgomery multiplication, Barrett reduction sequences, and...
Timing attacks and local timing attacks against Barrett’s modular multiplication algorithm
Johannes Mittmann, Werner Schindler
Implementation
Montgomery’s and Barrett’s modular multiplication algorithms are widely used in modular exponentiation algorithms, e.g. to compute RSA or ECC operations. While Montgomery’s multiplication algorithm has been studied extensively in the literature and many side-channel attacks have been detected, to our best knowledge no thorough analysis exists for Barrett’s multiplication algorithm. This article closes this gap. For both Montgomery’s and Barrett’s multiplication algorithm, differences of the...
Efficient Software Implementation of the SIKE Protocol Using a New Data Representation
Jing Tian, Piaoyang Wang, Zhe Liu, Jun Lin, Zhongfeng Wang, Johann Großschädl
Implementation
Thanks to relatively small public and secret keys, the Supersingular Isogeny Key Encapsulation (SIKE) protocol made it into the third evaluation round of the post-quantum standardization project of the National Institute of Standards and Technology (NIST). Even though a large body of research has been devoted to the efficient implementation of SIKE, its latency is still undesirably long for many real-world applications. Most existing implementations of the SIKE protocol use the Montgomery...
Efficient reductions in cyclotomic rings - Application to R-LWE based FHE schemes
Jean-Claude Bajard, Julien Eynard, Anwar Hasan, Paulo Martins, Leonel Sousa, Vincent Zucca
Implementation
With Fully Homomorphic Encryption (FHE), it is possible to process encrypted data without having an access to the private-key. This has a
wide range of applications, most notably the offloading of
sensitive data processing. Most research on FHE has focused on the
improvement of its efficiency, namely by introducing schemes based on
the Ring-Learning With Errors (R-LWE) problem, and techniques such as batching, which allows for the encryption of
multiple messages in the same ciphertext. Much...
Efficient Finite field multiplication for isogeny based post quantum cryptography
Angshuman karmakar, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede
Public-key cryptography
Isogeny based post-quantum cryptography is one of the most
recent addition to the family of quantum resistant cryptosystems. In
this paper, we propose an efficient modular multiplication algorithm for
primes of the form $p = 2 \cdot 2^a \cdot 3^b - 1$ with b even, typically used in such
cryptosystem. Our modular multiplication algorithm exploits the special structure present in such primes. We compare the efficiency of our
technique with Barrett reduction and Montgomery multiplication. Our
C...
Double-Speed Barrett Moduli
Rémi Géraud, Diana Maimut, David Naccache
Public-key cryptography
Modular multiplication and modular reduction are the atomic constituents of most public-key cryptosystems. Amongst the numerous algorithms for performing these operations, a particularly elegant method was proposed by Barrett. This method builds the operation $a \bmod b$ from bit shifts, multiplications and additions in $\mathbb{Z}$. This allows building modular reduction at very marginal code or silicon costs by leveraging existing hardware or software multipliers.
This paper presents a...
Modular Hardware Architecture for Somewhat Homomorphic Function Evaluation
Sujoy Sinha Roy, Kimmo Järvinen, Frederik Vercauteren, Vassil Dimitrov, Ingrid Verbauwhede
Implementation
We present a hardware architecture for all building blocks required in polynomial ring based fully homomorphic schemes and use it to instantiate the somewhat homomorphic encryption scheme YASHE. Our implementation is the first FPGA implementation that is designed for evaluating functions on homomorphically encrypted data (up to a certain multiplicative depth) and we illustrate this capability by evaluating the SIMON-64/128 block cipher in the encrypted domain. Our implementation provides a...
Accelerating Fully Homomorphic Encryption over the Integers with Super-size Hardware Multiplier and Modular Reduction
Xiaolin Cao, Ciara Moore, Maire O’Neill, Elizabeth O’Sullivan, Neil Hanley
Implementation
A fully homomorphic encryption (FHE) scheme is envisioned as being a key cryptographic tool in building a secure and reliable cloud computing environment, as it allows arbitrarily evaluation of a ciphertext without revealing the plaintext. However, existing FHE implementations remain impractical due to their very high time and resource costs. Of the proposed schemes that can perform FHE to date, a scheme known as FHE over the integers has the ad-vantage of comparatively simpler theory, as...
We improve the performance of lattice-based cryptosystems Dilithium on Cortex-M3 with expensive multiplications. Our contribution is two-fold: (i) We generalize Barrett multiplication and show that the resulting shape-independent modular multiplication performs comparably to long multiplication on some platforms without special hardware when precomputation is free. We call a modular multiplication “shape-independent” if its correctness and efficiency depend only on the magnitude of moduli...
This paper proposes a fast and efficient FPGA-based hardware-software co-design for the supersingular isogeny key encapsulation (SIKE) protocol controlled by a custom RISC-V processor. Firstly, we highly optimize the core unit, the polynomial-based field arithmetic logic unit (FALU), with the proposed fast convolution-like multiplier (FCM) to significantly reduce the resource consumption while still maintaining low latency and constant time for all the four SIKE parameters. Secondly, we pack...
Satisfiability modulo finite fields enables automated verification for cryptosystems. Unfortunately, previous solvers scale poorly for even some simple systems of field equations, in part because they build a full Gröbner basis (GB) for the system. We propose a new solver that uses multiple, simpler GBs instead of one full GB. Our solver, implemented within the cvc5 SMT solver, admits specialized propagation algorithms, e.g., for understanding bitsums. Experiments show that it solves...
We survey various mathematical tools used in software works multiplying polynomials in \[ \frac{\mathbb{Z}_q[x]}{\left\langle {x^n - \alpha x - \beta} \right\rangle}. \] In particular, we survey implementation works targeting polynomial multiplications in lattice-based cryptosystems Dilithium, Kyber, NTRU, NTRU Prime, and Saber with instruction set architectures/extensions Armv7-M, Armv7E-M, Armv8-A, and AVX2. There are three emphases in this paper: (i) modular arithmetic, (ii)...
We optimize the number-theoretic transforms (NTTs) in Dilithium — a digital signature scheme recently standardized by the National Institute of Standards and Technology (NIST) — on Cortex-M3 and 8-bit AVR. The core novelty is the exploration of micro-architectural insights for modular multiplications. Recent work [Becker, Hwang, Kannwischer, Yang and Yang, Volume 2022 (1), Transactions on Cryptographic Hardware and Embedded Systems, 2022] found a correspondence between Montgomery and Barrett...
In their 2022 study, Kuang et al. introduced the Multivariable Polynomial Public Key (MPPK) cryptography, a quantum-safe public key cryptosystem leveraging the mutual inversion relationship between multiplication and division. MPPK employs multiplication for key pair construction and division for decryption, generating public multivariate polynomials. Kuang and Perepechaenko expanded the cryptosystem into the Homomorphic Polynomial Public Key (HPPK), transforming product polynomials over...
We implement the Schnorr protocol in assembler via the Jasmin toolchain, and prove the security (proof-of-knowledge and zero-knowledge properties) and the absence of leakage through timing side-channels of that implementation in EasyCrypt. In order to do so, we provide a semantic characterization of leakage-freeness for probabilistic Jasmin programs (that are not constant-time). We design a library for multiple-precision integer arithmetic in Jasmin -- the "libjbn'' library. Among others,...
This paper presents an improved Plantard's modular arithmetic (Plantard arithmetic) tailored for Lattice-Based Cryptography (LBC). Based on the improved Plantard arithmetic, we present faster implementations of two LBC schemes, Kyber and NTTRU, running on Cortex-M4. The intrinsic advantage of Plantard arithmetic is that one multiplication can be saved from the modular multiplication of a constant. However, the original Plantard arithmetic is not very practical in LBC schemes because of the...
This paper presents faster implementations of the lattice-based schemes Dilithium and Kyber on the Cortex-M4. Dilithium is one of the three signature finalists in the NIST post-quantum project (NIST PQC), while Kyber is one of the four key-encapsulation mechanism (KEM) finalists. Our optimizations affect the core polynomial arithmetic using the number-theoretic transform (NTT) of both schemes. Our main contributions are threefold: We present a faster signed Barrett reduction for Kyber,...
We present new speed records on the Armv8-A architecture for the lattice-based schemes Dilithium, Kyber, and Saber. The core novelty in this paper is the combination of Montgomery multiplication and Barrett reduction resulting in “Barrett multiplication” which allows particularly efficient modular one-known-factor multiplication using the Armv8-A Neon vector instructions. These novel techniques combined with fast two-unknown-factor Montgomery multiplication, Barrett reduction sequences, and...
Montgomery’s and Barrett’s modular multiplication algorithms are widely used in modular exponentiation algorithms, e.g. to compute RSA or ECC operations. While Montgomery’s multiplication algorithm has been studied extensively in the literature and many side-channel attacks have been detected, to our best knowledge no thorough analysis exists for Barrett’s multiplication algorithm. This article closes this gap. For both Montgomery’s and Barrett’s multiplication algorithm, differences of the...
Thanks to relatively small public and secret keys, the Supersingular Isogeny Key Encapsulation (SIKE) protocol made it into the third evaluation round of the post-quantum standardization project of the National Institute of Standards and Technology (NIST). Even though a large body of research has been devoted to the efficient implementation of SIKE, its latency is still undesirably long for many real-world applications. Most existing implementations of the SIKE protocol use the Montgomery...
With Fully Homomorphic Encryption (FHE), it is possible to process encrypted data without having an access to the private-key. This has a wide range of applications, most notably the offloading of sensitive data processing. Most research on FHE has focused on the improvement of its efficiency, namely by introducing schemes based on the Ring-Learning With Errors (R-LWE) problem, and techniques such as batching, which allows for the encryption of multiple messages in the same ciphertext. Much...
Isogeny based post-quantum cryptography is one of the most recent addition to the family of quantum resistant cryptosystems. In this paper, we propose an efficient modular multiplication algorithm for primes of the form $p = 2 \cdot 2^a \cdot 3^b - 1$ with b even, typically used in such cryptosystem. Our modular multiplication algorithm exploits the special structure present in such primes. We compare the efficiency of our technique with Barrett reduction and Montgomery multiplication. Our C...
Modular multiplication and modular reduction are the atomic constituents of most public-key cryptosystems. Amongst the numerous algorithms for performing these operations, a particularly elegant method was proposed by Barrett. This method builds the operation $a \bmod b$ from bit shifts, multiplications and additions in $\mathbb{Z}$. This allows building modular reduction at very marginal code or silicon costs by leveraging existing hardware or software multipliers. This paper presents a...
We present a hardware architecture for all building blocks required in polynomial ring based fully homomorphic schemes and use it to instantiate the somewhat homomorphic encryption scheme YASHE. Our implementation is the first FPGA implementation that is designed for evaluating functions on homomorphically encrypted data (up to a certain multiplicative depth) and we illustrate this capability by evaluating the SIMON-64/128 block cipher in the encrypted domain. Our implementation provides a...
A fully homomorphic encryption (FHE) scheme is envisioned as being a key cryptographic tool in building a secure and reliable cloud computing environment, as it allows arbitrarily evaluation of a ciphertext without revealing the plaintext. However, existing FHE implementations remain impractical due to their very high time and resource costs. Of the proposed schemes that can perform FHE to date, a scheme known as FHE over the integers has the ad-vantage of comparatively simpler theory, as...