Bootstrapping Optimization Techniques for the FINAL Fully Homomorphic Encryption Scheme


1. Introduction

FHE is an encryption scheme that evaluates encrypted data without decrypting it, and the decrypted results are the same as the outputs of the corresponding operations on the plaintext. Fully homomorphic encryption can achieve ciphertext protection of the whole life cycle of data. It has a wide range of application requirements in artificial intelligence, privacy computing, cloud computing, blockchain security, anonymous authentication and other fields.

In 2009, Gentry presented the first feasible homomorphic encryption scheme, and also presented a theoretical framework for constructing homomorphic encryption schemes. The theoretical framework constructs a somewhat homomorphic encryption scheme, and then implements fully homomorphic encryption through bootstrapping. Considering the security of the scheme, some homomorphic encryption schemes need to introduce noise. The noise reduces the security of the encryption scheme to computationally hard mathematical problems (e.g., LWE), thereby preventing the secret key from being easily compromised. The noise grows slightly during homomorphic addition, and explosively during homomorphic multiplication, and once the noise exceeds a certain threshold, a decryption error will occur. Therefore, we must carry out noise reduction operations to ensure proper decryption. Bootstrapping is the most effective operation to reduce ciphertext noise. In his groundbreaking work, Gentry established the following fundamental theorem: If a scheme can homomorphically evaluate its own decryption circuit (i.e., supports bootstrapping), then it can be transformed into a fully homomorphic encryption (FHE) scheme. This theorem demonstrates that bootstrapping serves as a sufficient condition for achieving FHE. Critically, no known techniques—such as modulus-switching or key-switching—can fully eliminate noise accumulation without relying on bootstrapping. However, bootstrapping computation is expensive and becomes the bottleneck of computation performance of all homomorphic encryption algorithms.

There are two major lines of FHE schemes. The first is BGV and its variants [1,2,3,4,5]. These works encrypt a vector of N messages using an RLWE ciphertext. The somewhat homomorphic encryption scheme used in the schemes enable leveled fully homomorphic encryption (leveled FHE) by introducing noise reduction techniques (modulus-switching and the rescale technique), that is, the circuit can homomorphically execute any bounded depth of the circuit with the circuit depth as the input parameter. Therefore, in the process of bootstrapping, the security assumption of the sparse subsets sum is no longer relied on to compress the decryption circuit. In this way, the security of schemes are based on lattice problems with quasi-polynomial approximation factors. Since the ciphertext of the schemes is a vector form, it is necessary to introduce the “evaluation key” in the process of homomorphic evaluation. Moreover, this kind of scheme supports packaging and parallel operation, enabling the simultaneous processing of multiple plaintext bits, thereby possessing significant application potential. At present, the open-source libraries HElib, SEAL and HEAAN based on BGV, BFV and CKKS, are widely used.
The second line of work starts with FHEW [6] and TFHE [7,8] and is later improved by FINAL [9]. These works are based on GSW and its variants and support bootstrapping in tens of milliseconds. The noise growth in the homomorphic evaluation is asymmetric. And this makes previous noise reduction techniques no longer applicable. Therefore, Gentry et al. [10] proposed a flattening technique to achieve leveled fully homomorphic encryption. Moreover, according to the characteristics of noise growth, Alperin-Sheriff et al. designed a fast bootstrapping algorithm [11]. At present, the open source libraries designed based on this kind of scheme mainly include FHEW and TFHE. Among them, TFHE only takes 13 ms to bootstrap one bit of information. In 2022, Charlotte Bonte et al. [9] proposed an NTRU-based GSW-like scheme (NGS) and constructed a FINAL fully homomorphic encryption scheme. The FINAL scheme outperformed TFHE with 28% faster bootstrapping.

At present, bootstrapping is the only way to achieve homomorphic encryption, and it is also the main reason for the low efficiency of homomorphic encryption. Therefore, bootstrapping is the core and key of the whole homomorphic encryption scheme.

Notably, schemes such as TFHE and FHEW present stronger optimization incentives compared to BGV-type frameworks. This is primarily due to their inherently frequent bootstrapping requirements in practical applications (e.g., deep circuit evaluations), where even marginal reductions in the per-bootstrap overhead yield significant cumulative gains. Furthermore, TFHE’s and FHEW’s noise growth profiles are more sensitive to parameter adjustments, amplifying the impact of noise optimizations on overall feasibility and performance. In contrast, BGV’s leveled architecture inherently limits bootstrapping frequency, diminishing the relative urgency of optimizing its bootstrap mechanics. Critically, the FINAL bootstrapping scheme—specifically designed as an optimized variant of TFHE’s bootstrapping—demonstrates even greater potential for refinement.

In this work, our contributions are as follows.

(1) We adopt the ellipsoidal Gaussian sampling method to select keys f and g for the NGS basic encryption scheme. In the original FINAL scheme, the standard deviations of keys f and g are the same, while the bootstrapping noise is more dependent on the standard deviation of key g. The larger the standard deviation of g, the larger the bootstrapping noise. Hence, in the optimization scheme, the bootstrapping optimization technique based on the ellipsoidal Gaussian is used to make the standard deviation of key f and g samples expand and shrink by the same multiple, respectively, and as the standard deviation of g becomes smaller, the bootstrapping noise becomes smaller. Therefore, we can better adjust the parameters of the FINAL scheme within a certain noise bound.

(2) We adjust the parameters of the FINAL scheme to analyze the influence of different decomposition bases and their weights on the bootstrapping noise and efficiency, and propose an optimization scheme based on the FINAL scheme. The optimization scheme was 33.3% faster than the original FINAL scheme.

The organizational structure of this paper is as follows. Section 3 introduces preliminary knowledge. Section 4 describes the bootstrapping procedure of the FINAL scheme and provides a more detailed analysis of the bootstrapping noise. Then, we propose the bootstrapping optimization technique based on the ellipsoidal Gaussian. Section 5 explains the effects of different parameters on the FINAL scheme and how to optimize the parameters of the FINAL scheme, and introduces a bootstrapping optimization technique based on the ellipsoidal Gaussian. Section 6 shows the security analysis of the scheme, and Section 7 summarizes this paper.

2. Related Work

In recent years, research on lattice-based fully homomorphic encryption (FHE) schemes has flourished and numerous schemes have been proposed, and there have been improvements and innovations based on them.

Gentry [12] constructed the first fully homomorphic encryption algorithm in 2009, which was truly groundbreaking and opened the curtain on research into fully homomorphic encryption. In 2013, Gentry et al. [10] proposed a GSW for fully homomorphic encryption based on approximate eigenvectors without the key-switching technique for dimensional reduction. Brakerski et al. [2] proposed the BGV fully homomorphic encryption algorithm in 2014, based on the learning with errors (LWE) assumption, which uses dimension reduction and modulus reduction to control noise, breaking the original Gentry compression circuit to control noise. In 2015, Ducas and Micciancio [6] proposed the FHEW scheme, which uses a variant of the GSW scheme as an accumulator and uses this accumulator to refresh the (bootstrapping) ciphertexts, which improves the bootstrapping efficiency. Chillotti et al. [7] proposed the TFHE scheme at Asiacrypt 2016, which replaced the product of TGSW ciphertext (matrix) and TGSW ciphertext (matrix) in the FHEW scheme with the external product of TGSW ciphertext (matrix) and TLWE ciphertext (vector). The time required for bootstrapping decreases from less than 1 s to less than 0.1 s, and the size of the bootstrapping key reduces from 1 GB to 24 MB. In 2017, Chillotti et al. [8] proposed a blind rotation algorithm and used two different techniques (called horizontal packing and vertical packing) to improve the evaluation of parity circuits. Charlotte Bonte et al. [9] proposed an NTRU-based GSW-like scheme (NGS) and constructed a FINAL fully homomorphic encryption scheme by combining the LWE encryption scheme with the NGS bootstrapping scheme in 2022. The FINAL scheme outperforms TFHE with 28% faster bootstrapping and 45% smaller bootstrapping and key-switching keys.
Based on a recent, more detailed analysis of the FINAL scheme [9], we find that the bootstrapping noise of the FINAL scheme is more dependent on the standard deviation of key g of the NGS scheme, and the scheme still has space for optimization in the choice of parameters. Based on this, we can perform optimization to improve the efficiency of bootstrapping of the FINAL scheme.

3. Preliminaries

3.1. Symbols and Parameters

The mathematical notations used in this paper are described in Table 1.

3.2. Gaussian Distribution and Sub-Gaussian Distribution

Definition 1 

(Discrete Gaussian distribution [13]). For c R , c is the center of the Gaussian distribution, and the Gaussian parameter is σ R + . The discrete Gaussian distribution is defined as follows:

ρ σ , c ( x ) = exp ( x c 2 2 · σ 2 )

where σ , c R > 0 is the standard deviation, and c is the mean. Therefore, ρ σ , c ( Z ) = i = ρ σ , c ( i ) . The probability of x Z is given by ρ σ , c ( x ) / ρ σ , c ( Z ) , and if c = 0 , then the distribution is represented by χ σ .

Definition 2 

(Sub-Gaussian distribution). The random variable V over R is α sub-Gaussian when it satisfies the condition of

E [ exp ( t · V ) ] 1 2 exp ( α 2 · t 2 )

where t R . From the definition, it can be seen that V a r ( V ) α 2 .

Lemma 1 

([14]). If x is a discrete random vector over R n , let the component of x be x i , and  x i be α i sub-Gaussian; then, x is said to be an α sub-Gaussian distribution, where α = max i [ n ] α i .

3.3. Digital Decomposition

Given the integer q and the decomposition base B, let l = log B q , and define g q , B = ( B 0 , , B l 1 ) ; for  k Z q , define its decomposition under base B as g 1 ( k ) = ( k 0 , , k l 1 ) , and obtain

g 1 ( k ) · g = k 0 · B 0 + + k l 1 · B l 1 = k

For f R Q , define g 1 ( f ) = i = 0 N 1 g 1 ( f i ) X i , and obtain

g 1 f · g = i = 0 N 1 g 1 f i · g · X i = i = 0 N 1 f i , 0 , , f i , l 1 · B 0 , , B l 1 · X i = i = 0 N 1 f i · X i = f

The digital decomposition g 1 can be deterministic or randomized [14,15].

3.4. NTRU Problems

Definition 3 

(NTRU [9]). Let N > 0 and Q > 1 be integers, R : = Z X / < X N + 1 > . Let the real number σ > 0 , g , f χ σ N and f be invertible in R Q .

The ( n , q , σ ) NTRU computational problem aims to recover f and g under a given h = g · f 1 mod Q .

The ( n , q , σ ) NTRU decision problem aims to distinguish between χ σ N distribution and uniform random distribution on R Q .

Damien Stehlé and Ron Steinfeld [16] showed that if the secret key polynomials are selected by rejection from discrete Gaussians, then the public key, which is their ratio, is statistically indistinguishable from uniform over its domain at Eurocrypt 2011. Pellet-Mary et al. [17] reduced the worst-case approximate Shortest Vector Problem over ideal lattices to an average-case search variant of the NTRU problem at Asiacrypt 2021. Therefore, the cryptographic scheme based on the NTRU problem also has provable security.

3.5. LWE Problems

Definition 4 

(LWE [18]). Let N > 0 and q > 1 be integers, and χ be an error probability distribution on Z q . Secret vector s Z q n , random vector a Z q n , error vector e χ , and output LWE distribution: ( a , b = a · s + e mod q ) Z q n × Z q .

The ( n , q , χ , m ) LWE computational problem aims to solve the secret vector s given ( a , b ) .

The ( n , q , χ , m ) LWE decision problem aims to distinguish between LWE distribution and uniform distribution on Z q n × Z q .

More detailed descriptions are as follows.

Error Distribution ( χ ): Typically a discrete Gaussian distribution centered at 0 with standard deviation σ . The error magnitude is bounded to ensure decryptability in encryption schemes.

Parameters and Hardness: Dimension (n): Higher n increases security but reduces efficiency. Modulus (q): A prime or power-of-two, balancing noise growth and security. Hardness Reduction: LWE is as hard as solving worst-case lattice problems (e.g., GapSVP, SIVP).

Decision-to-Search Reduction: The decisional LWE problem is polynomially equivalent to the search variant under certain parameter conditions [18]. This equivalence underpins the semantic security of LWE-based cryptosystems.

3.6. LWE-Based Encryption Scheme

LWE . SercretKeygen ( 1 λ ) : randomly and uniformly select the vector s Z q n , and output key s .

LWE . Enc ( s , m ) : m 0 , 1 , randomly and uniformly select the vector a Z q n , select the error vector e χ , calculate b a · s e , and output c 1 = a , c 2 = ( Δ · m b ) Z q , i.e., ( c 1 , c 2 ) . Among them, Δ : = q / 4 .

LWE . Dec ( s , c ) : input ( s , c ) , and output 4 q ( ( c 2 + c 1 · s ) mod q ) mod 2 .

Here, represents the rounding symbol.

3.7. FHE

The essence of fully homomorphic encryption is homomorphic evaluation, i.e., the results of plaintext calculations are equivalent to the results of ciphertext calculations after decryption. A homomorphic encryption scheme consists of four algorithms: a key generation algorithm, an encryption algorithm, a decryption algorithm and an evaluation algorithm.

KeyGen ( 1 λ ) : Input security parameter λ , and output public key pk, evaluation key evk for ciphertext calculation, and private key sk.

Enc ( pk , m ) : Given m 0 , 1 , use the public key to encrypt message m and output ciphertext c.

Dec ( sk , c ) : Use private key sk to decrypt ciphertext c and recover message m.

Eval ( evk , F , c 1 , , c l ) : Use the evaluation key evk, and input c 1 , , c l and F to evaluation algorithm Eval, where F : 0 , 1 l 0 , 1 , and output ciphertext c.

3.8. Bootstrapping

All common FHE schemes are based on noise encryption (noise guarantees the security of the fresh encryption), where the evaluation of homomorphic operations increases the noise magnitude and reduces the quality, i.e., calculates the budget. The primary use of bootstrapping is to convert exhausted ciphertexts into “equivalent” refreshed ciphertexts. Exhausted ciphertexts contain high noise and cannot be operated on further, whereas refreshed ciphertexts can support further homomorphic operations.

As shown in Figure 1, in a fully homomorphic encryption scheme, we deal with a homomorphic evaluation of the decryption procedure, i.e., bootstrapping, which uses an encrypted secret key and a ciphertext to generate an equivalent ciphertext that we can further compute on. The encrypted secret key, also called a bootstrapping or refreshing key, is provided by the secret key holder as part of the public key material. Bootstrapping strategies are divided into gate bootstrapping [6], circuit bootstrapping [8] and programmable bootstrapping [19].
Gate bootstrapping refers to the fact that this fast bootstrapping is performed to refresh the ciphertext after every gate evaluation. And circuit bootstrapping allows us to execute a larger leveled circuit between each bootstrapping [8]. This paper focuses exclusively on gate bootstrapping.

3.9. Key Circular and Key Circular Security

Encrypt key s k to generate a bootstrapping key, and if the key is still secure after being encrypted by the corresponding public key, then the encryption scheme can achieve circular security. The circular secure encryption scheme can resist related-key attacks. Given the security parameter λ , let K be the key space and f be the function from K to C . A homomorphic encryption scheme H E = K e y G e n , E n c , D e c , E v a l about f satisfies circular security if for all polynomial-time adversaries A , the probability of adversary A winning the following game is negligible:

(1) The challenger computes p k , s k , e v k K e y G e n λ and randomly chooses a bit b 0 , 1 .

(2) Let f + : M × M M be a function, and compute f + x , y = x + y M . The challenger computes challenge ciphertext c* and sends it to adversary A .

c * = E v a l e v k f + , E n c p k 0 , f s k b = 0 E n c p k 0 C otherwise

(3) Adversary A guesses a bit then outputs b 0 , 1 . The advantage of adversary A is defined as A d v A = Pr b = b 1 / 2 . For the FHE scheme based on LWE, E v a l e v k f + , E n c p k 0 , f s k can be viewed as the ciphertext of f s k [20,21].

This paper assumes that the improvement scheme based on FINAL is key circular security.

3.10. NGS: NTRU-Based GSW-like Scheme

The NGS scheme, inspired by the simplified variant of GSW proposed in [6], has the same quasi-additive noise growth as the GSW scheme and is used as a bootstrapping accumulator. Meanwhile, in order to accelerate bootstrapping, the NGS scheme defines an external product that is the product of scalar NTRU ciphertexts and vector NTRU ciphertexts. Due to the fact that the NGS ciphertexts are never decrypted in the bootstrapping pipeline, we omit the decryption procedure.
The NGS [9] encryption scheme based on NTRU has two encryption functions. The first one encrypts a plaintext m which is a ternary polynomial as an element of R Q , and the other encrypts it as a vector over R Q . To simplify the noise analysis, the scheme assumes that all vector ciphertexts encrypted messages belong to the monomial set M .

3.10.1. Algorithm Description

NGS . ParamGen ( 1 λ ) : Input security parameter λ , and output ( N , Q , ς , B , l ), where B is the decomposition base of the decomposed ciphertext, l = log B ( Q ) .

NGS . KeyGen : Input ( N , Q , ς , B , l ), and output sk : = f = 1 + 4 · f , where f = 1 + 4 · f , f χ ς N and f 1 R Q ;

NGS . EncS ( m , sk ) : Input ( m , sk ), and output c = g / f + Δ · m R Q , where m is the plaintext of the ternary polynomial to be encrypted, g χ ς N , Δ : = Q / 4 , and c denotes a scalar encryption of m. NGS . EncVec ( m , sk ) : Input ( m , sk ), and output c = g / f + g · m R Q l , where m M = ± b · X k : b 0 , 1 , k N is a monomial, defined as g : = ( g 0 , , g l 1 ) , g i χ ς N ( 0 i l 1 ), and c is a vector encryption of m.

3.10.2. External Product

Let a scalar ciphertext be c = g / f + Δ · μ R Q , where μ is a ternary polynomial, and a vector ciphertext is c = g / f + g · ν R Q l , where ν M defines the external product of c and c :

c c = g 1 ( c ) · c R Q

i.e.,  c c = g 1 ( c ) · g / f + g 1 ( c ) · g · ν = ( g 1 ( c ) · g + g · ν ) / f + Δ · μ · ν .

3.10.3. Noise Analysis

Assuming that for all a R Q , g q , B 1 ( a ) is γ sub-Gaussian, where γ = O ( B ) .

Definition 5 

(Noise of scalar ciphertext [9]). Let a scalar ciphertext be c = g / f + Δ · m R Q , and define the noise of c as e r r ( c ) = c · f Δ · m R Q .
Definition 6 

(Noise of vector ciphertext [9]). Let a vector ciphertext be c = g / f + g · m R Q l , and define the noise of c as e r r ( c ) = c · f g · m · f R Q , which is treated as a polynomial with coefficients in [ Q / 2 , Q / 2 ] over Z X .
Lemma 2 

(Bound on the noise of a scalar ciphertext [9]). Let a scalar ciphertext be c = g / f + Δ · m R Q if m is a monomial in the form of ± b · X k (where b 0 , 1 ); then,

V a r ( e r r ( c ) ) V a r ( g ) + 4 · ς 2

If m is a ternary polynomial of N 1 degree, then

V a r ( e r r ( c ) ) V a r ( g ) + 4 · N · ς 2

Lemma 3 

(Bound on the noise of a vector ciphertext [9]). Let a vector ciphertext be c = g / f + g · m R Q l ,

e r r ( c ) = ( g / f + g · m ) · f g · m · f = g

Then, according to Lemma 1, we have V a r ( c ) σ g 2 .

Lemma 4 

(Noise growth of external product [9]). Assuming that a scalar ciphertext is c = g / f + Δ · μ R Q and a vector ciphertext is c = g / f + g · ν R Q l , ν M , define c m u l t = c c ; then,

V a r ( e r r ( c m u l t ) ) N · l · γ 2 · V a r ( g ) + 4 · ς 2 + ν 2 2 · V a r ( g ) N · l · γ 2 · V a r ( e r r ( c ) ) + V a r ( c )

Lemma 5 

(Noise of a sequence of external products [9]). Let c = g / f + Δ · m R Q (m is a ternary polynomial). Let c i = g i / f + g · m i R Q l ( m i M ). If  c = c i = 1 k c i , then

V a r ( e r r ( c ) ) N · l · γ 2 · i = 1 k V a r ( g i ) + V a r ( g ) + 4 · ς 2

3.10.4. Modulus-Switching

Definition 7 

(ModSwitch). Let Q , q Z and 1 < q < Q , and the randomized rounding function · Q : q : Z Q Z q , i.e.,  z Q : q : = q · z / Q + B , where B 0 , 1 is a Bernoulli random variable, Pr B = 1 = ( q · z / Q ) q · z / Q [ 0 , 1 ] . Note that ε : = z Q : q ( q · z / Q ) is a 1-Gaussian distribution. The above definition can be extended to the form of polynomials, vectors and matrices, i.e.,

ModSwitch ( c ) = i = 0 N 1 [ c i ] Q : q X i R q

Lemma 6 

([9]). If c = g / f + Δ · m R Q , then ModSwitch ( c ) is a scalar encryption of m under R q and

V a r ( e r r ( ModSwitch ( c ) ) ) ( q / Q ) 2 · V a r ( e r r ( c ) ) + 1 + 16 · N · ς 2

3.10.5. Key-Switching from the NGS to the Base Scheme

After modulus-switching, the generated NTRU ciphertext c = g / f + ε + Δ · m R q needs to conduct key-switching in order to obtain the ciphertext in LWE form. Key-switching is divided into a key generation algorithm and a key-switching algorithm.

Key-switching key generation algorithm: Let ( A , b ) be the LWE sample after the encryption of key s , and  c = g / f + ε + Δ · m R Q be the scalar NGS ciphertext after the encryption of key f, where ε is the rounding error after modulus-switching. The key-switching key is defined as follows:

ks k NTRU LWE : = ( A , b : = A · s + e + P · f 0 )

where A Z q ( N · L ) × n , e χ σ e N · L , f 0 : = col 0 ( Φ ( f ) ) Z N , P : = I N g q , B k s k Z ( N · L ) × N .

Key-switching algorithm: During key-switching from NTRU to LWE, i.e., given the NTRU ciphertext c = g / f + ε + Δ · m R Q , the ciphertext is converted into LWE ciphertext through key-switching— KeySwitc h NTRU LWE ( c , ks k NTRU LWE ) :

  • ks k NTRU LWE : = ( A , b : = A · s + e + P · f 0 ) .

  • a KeySwitch ( c , A )

  • b KeySwitch ( c , b )

  • output c : = ( a , b )

That is, we decompose the coefficient vector of c and multiply it by both components of ks k NTRU LWE . Thus, we define y : = g 1 ( ϕ ( c ) ) Z N · L , i.e., c : = ( a , b ) = ( y · A , y · b ) Z q n + 1 ,

b = a · s + y · e + g o + ε · ( ( 1 , 0 ) + 4 · ϕ ( f ) ) + 4 · ε · ϕ ( m ) · ϕ ( f ) + Δ · m 0

where ε 1 / 2 , 1 / 2 , m 0 is the constant term of m.

3.11. Ellipsoidal Discrete Gaussian Sampling

Yu Yang et al. [22] proposed that it is beneficial to replace spherical discrete Gaussian sampling with suitable ellipsoidal discrete Gaussian sampling when obtaining GPV signatures at Crypto 2022. Making the half that actually occurs in signatures shorter reduces signature size at essentially no security loss (in a suitable range of parameters).

For the NTRU encryption scheme, ellipsoidal Gaussian sampling is compared to spherical sampling by changing the original standard deviation ( σ f , σ g ) of the key ( f , g ) sample to ( β · σ f , β 1 · σ g ) , where σ f = σ g , β R and β > 1 . This makes the effect on security after ellipsoidal Gaussian sampling 4 or 5 contrasts lower than that of spherical Gaussian sampling, which is negligible, and the second component of key ( f , g ) is shortened to the original β 1 . Where β is the distortion factor and twisted paradigm for ellipsoidal Gaussian sampling.

4. Bootstrapping Optimization of FINAL Scheme

In FINAL [9], two fully homomorphic encryption schemes are mentioned, one based on the MNTRU basic encryption scheme and the other based on the LWE basic encryption scheme, both of which use the NGS scheme to bootstrap, where the latter is more efficient than the former.

Therefore, this paper focuses on a fully homomorphic encryption scheme based on the LWE basic encryption scheme and denoted by the FINAL scheme.

Figure 2 shows a graphical representation of the FINAL bootstrapping, with a detailed description as follows.

The LWE key is binary. During the bootstrapping, by using a binary CMux encryption gate, at the end of the main loop of the refresh procedure, we obtain NTRU ciphertexts in the form of c = g / f + ε + Δ · m R Q . Then, the NTRU ciphertexts are converted into LWE ciphertexts through modulus-switching and key-switching. Below is the definition of the binary CMux gate ( ε is the rounding error after modulus-switching).

Definition 8 

(The binary CMux gate). For a given f i 0 , 1 , we define key b s k i :

s i = 0 b s k i : = N G S . E n c V e c ( 0 ) s i = 1 b s k i : = N G S . E n c V e c ( 1 ) .

Then, our CMux gate is defined as

CMux i ( c i ) = 1 + ( X c i 1 ) · b s k i .

4.1. Bootstrapping

Bootstrapping consists of two parts: the bootstrapping key generation algorithm (Algorithm 1) and the bootstrapping algorithm (Algorithm 2). The algorithms are described as follows.

Algorithm 1 Bootstrapping Key Generation

Input: Secret key s Z q n of LWE basic encryption scheme

Output: Bootstrapping key b s k

1:

( s 0 , , s n 1 ) ϕ ( s )

2:

for  i 0 to n 1  do

3:

    Compute b s k i according to Equation (1)

4:

end for

5:

Return b s k : = b s k i : 0 i n 1

Algorithm 2 Bootstrapping Algorithm

Input: LWE ciphertext ct = ( ct . a , ct . b ) Z q n + 1 encrypting m 0 , 1 , bootstrapping keys b s k i 0 i n 1 and key-switching key ksk NTRU LWE .

Output: LWE ciphertext ct Z q n + 1 encrypting the same m.

  1:

( c 0 , , c n 1 , c n ) 2 · N · ct q

  2:

A C C 0 Q 8 · X N / 2 · i = 0 N 1 X i

  3:

for  i 0 to n 1  do

  4:

     c Mux CMux i ( c i )

  5:

     A C C A C C c Mux

  6:

end for

  7:

A C C A C C + Q 8 · i = 0 N 1 X i

  8:

A C C ModSwitch ( A C C )

  9:

ct KeySwitch ( A C C , ksk NTRU LWE )

10:

return 
ct

4.2. Bootstrapping Noise Analysis

The noise generated by bootstrapping affects the correctness of decryption. When the noise exceeds this noise bound (≥ q / 8 ), the scheme cannot decrypt properly. The analysis of the noise generated by bootstrapping is as follows.

The first step of bootstrapping, i.e., ( c 0 , , c n 1 , c n ) 2 · N · ct q , is to change the components of ct Z q n + 1 from module q to module 2 · N , and use c t 2 N to represent the result vector, i.e., c t 2 N = 2 · N · ct q ,

ct · ( ϕ ( s ) , 1 ) q 2 · N · c t 2 N · ( ϕ ( s ) , 1 ) = q 2 · N · ε · ct · ( ϕ ( s ) , 1 ) ,

where ε 1 / 2 . Therefore,

ct · ( ϕ ( s ) , 1 ) q 2 · N · c t 2 N · ( ϕ ( s ) , 1 ) q 4 · N · ε · ct · ( ϕ ( s ) , 1 ) .

The second step of bootstrapping is to assign values to the test vector, A C C 0 = Q 8 · X N / 2 · i = 0 N 1 X i , without involving error analysis.

The third step of bootstrapping is to perform the following loop on i from 0 to n 1 : c Mux CMux i ( c i ) , A C C A C C c Mux . Among them, the noise generated by c Mux CMux i ( c i ) , combined with Equation (2), can be obtained as follows:

Var ( err ( CMux i ) ) X c i 1 2 2 · Var ( err ( b s k i ) ) 2 · Var ( err ( b s k ) ) ,

where err ( CMux i ) represents the noise in the CMux gate and err(bsk) represents the noise in the NGS vector ciphertext. Regarding A C C A C C c Mux , from Lemma 5 and Equation (20), the noise introduced by this sequence of external products can be obtained as follows:

V a r ( e r r ( A C C ) ) n · N · l · γ 2 · V a r ( e r r ( c Mux ) + V a r ( e r r ( A C C 0 ) ) .

And from A C C 0 = Q 8 · X N / 2 · i = 0 N 1 X i and Lemma 2, it follows that

V a r ( e r r ( A C C ) ) 2 n · N · l · γ 2 · V a r ( e r r ( b s k ) ) + 4 · N · σ f 2 .

where σ f is the standard deviation of f.

The fourth step of bootstrapping involved performing polynomial addition without involving the generation of noise.

The fifth step of bootstrapping, A C C M o d S w i t c h ( A C C ) : From Lemma 6 and Equation (22), the noise after modulus-switching is as follows: V a r ( e r r ( A C C ) 2 · ( q / Q ) 2 · n · N · l · γ 2 · V a r ( e r r ( b s k ) ) + 2 · ( q / Q ) 2 · N · ς 2 + 1 + 16 · N · ς 2 . Then, from Lemma 3, we obtain

V a r ( e r r ( A C C ) 2 · ( q / Q ) 2 · N · σ f 2 + 1 + 2 n · ( q / Q ) 2 · N · l · γ 2 · σ g 2 + 16 · N · σ f 2 .

In the sixth step of bootstrapping, c KeySwitc h NTRU LWE ( A C C , ks k NTRU LWE ) : we see that the noise of the resulting LWE encryption equals y · e + g o + ε + 4 · ε · ϕ ( f ) + 4 · ε · ϕ ( m ) · ϕ ( f ) , as defined in [7]. Combined with Equation (23), the variance of the noise is as follows:

V a r ( e r r ( c ) ) = V a r ( y · e ) + V a r ( g o ) + V a r ( 4 · ε · ϕ ( f ) ) + V a r ( 4 · ε · ϕ ( m ) · ϕ ( f ) ) + V a r ( ε ) N · L · V a r ( y ) · V a r ( e ) + V a r ( g ) + 1 + 16 · N · V a r ( ε ) · V a r ( f ) + 4 · m 2 2 · σ f 2 2 · ( q / Q ) 2 · N · σ f 2 + 1 + 16 · N · σ f 2 + N · L · γ B k s k 2 · σ e 2 + 2 · ( q / Q ) 2 · n · N · l · γ 2 · σ g 2 .

where σ f , σ g and σ e are the standard deviations of f, g and e sampling, respectively, and γ B k s k = O ( B k s k ) . Due to the use of two decomposition bases B 1 , B 2 in the bootstrapping procedure, the noise becomes

V a r ( e r r ( ct ) ) N L γ B k s k 2 σ e 2 + 2 ( q / Q ) 2 N σ f 2 + 1 + 2 ( q / Q ) 2 n 1 N l 1 γ B 1 2 σ g 2 + 16 N σ f 2 + 2 ( q / Q ) 2 n 2 N l 2 γ B 2 2 σ g 2 .

where γ B i = O ( B i ) and l i = log B i ( Q ) , i 1 , 2 .

According to Equation (24), we find that the larger the decomposition bases, the larger the bootstrapping noise. And the noise introduced by the bootstrapping depends more on the standard deviation of g than on the standard deviation of f. However, the standard deviation of f and g is the same in the FINAL scheme. Hence, we introduce a bootstrapping optimization technique based on the ellipsoidal Gaussian to optimize the FINAL scheme.

4.3. Bootstrapping Optimization Technique Based on Ellipsoidal Gaussian

Combined with the noise analysis in the above section and reference [22], the sampling of f and g selects an ellipsoidal Gaussian distribution, i.e., the standard deviations of the Gaussian distribution of f and g sampling are denoted as β · σ f and β 1 · σ g , respectively. This greatly reduces the noise generated in the bootstrapping procedure while ensuring that the impact on safety can be ignored. Combined with Equation (24), the bootstrapping noise becomes:

V a r ( e r r ( ct ) ) N L γ B k s k 2 σ e 2 + 16 N β 2 σ f 2 + 2 ( q / Q ) 2 n 1 N l 1 γ B 1 2 β 2 σ g 2 + 2 ( q / Q ) 2 n 2 N l 2 γ B 2 2 β 2 σ g 2 + 2 ( q / Q ) 2 N β 2 σ f 2 + 1 .

We calculate the bootstrapping noise when β is different. We refer to the parameter settings in the FINAL scheme as shown in Table 2. σ f = σ g = 2 3 , σ e = 4.39 , B k s k = 3 , L : = log B k s k ( q ) = log 3 92683 = 11 , n 1 = 140 , n 2 = 470 , l i : = log B i ( Q ) , l 1 = 7 , l 2 = 5 , B 1 = 8 , B 2 = 16 . Because γ = O ( B ) , assuming γ B 1 = B 1 = 8 , γ B 2 = B 2 = 16 and γ B k s k = B k s k = 3 . The bootstrapping noise at different β values is obtained as follows.
As can be seen from Figure 3 and Table 3, when other parameters are the same, the larger the disturbance factor β , the less noise is introduced by the bootstrapping and the less noise reduction is caused by the bootstrapping. When β changes from 3 to 4, the size of bootstrapping noise reduction is already small, and when β changes from 4 to 5, the reduction ratio of the variance of bootstrapping noise is less than 1%. The significance of optimization is minimal. However, with the increase in β , key g becomes more sparse, and the scheme becomes more vulnerable to key recovery attacks. Therefore, in this paper, we set β = 4 as the optimal case for analysis.

The distortion factor β is introduced during bootstrapping; it only reduces the size of the bootstrapping noise, and does not affect the noise bound, i.e., the optimization scheme can be decrypted normally as long as the bootstrapping noise is within the noise bound.

From this section and Equation (25), we know that the optimization scheme can adopt larger decomposition bases than the original FINAL scheme. In the next section, we analyze the impact of different parameters on the bootstrapping noise and efficiency of the scheme, where the impacts of the decomposition bases are analyzed in detail.

5. Parameter Optimization of FINAL Scheme

5.1. Computation Overhead

The performance of the fully homomorphic encryption scheme is impacted by polynomial multiplications, and the number of polynomial multiplications in the FINAL scheme [9] is closely related to the decomposition bases of the bootstrapping, as follows:

In the FINAL scheme, formula ( ( X c i 1 ) · A C C ) b s k i + A C C is used to compute A C C CMux i ( c i ) in the bootstrapping algorithm (Algorithm 2). Assuming that the bootstrapping keys are fast Fourier transformed in advance, the external product requires l i + 1 FFTs and l i Hadamard vector products, where l i is the length of b s k i .

Let n 1 = 140 , n 2 = 470 , l 1 = 7 , l 2 = 5 ; then, we require l 1 · n 1 + l 2 · n 2 = 3330 ring multiplications and ( l 1 + 1 ) · n 1 + ( l 2 + 1 ) · n 2 = 3940 FFTs per bootstrapping, where the ring is a polynomial ring.

From the relationship between the total number of ring multiplications and the total number of FFTs, it is clear that the total number of FFTs is always n = n 1 + n 2 = 610 more than the total number of ring multiplications (therefore, in this paper, we only consider the total number of ring multiplications).

5.2. Memory Overhead

In terms of the memory overhead, the bootstrapping key consists of blind rotation keys (n NGS vector ciphertexts) and a key-switching key. The optimized scheme modifies only the dimensions of the blind rotation keys, which can be represented by the number of ring polynomials ( R Q )—specifically quantified as n 1 × l 1 + n 2 × l 2 . Since this dimension aligns with the number of ring polynomial multiplications required for computation, the memory overhead of blind rotation keys can equivalently be characterized by the count of such ring polynomial multiplications.

5.3. Optimization of Parameters

The smaller l 1 , l 2 are, the smaller the total number of ring multiplications for fixed n 1 , n 2 . And l i : = log B i ( Q ) , so when n 1 and n 2 are fixed, the larger the decomposition bases B 1 , B 2 are, the smaller the total number of ring multiplications.

Similarly, when the decomposition bases B 1 , B 2 are fixed, i.e., l 1 , l 2 are fixed, the smaller n 1 , n 2 are, the smaller the total number of ring multiplications, and the more efficiently the scheme runs (where n 1 , n 2 is the number of NGS ciphertexts corresponding to the decomposition bases B 1 , B 2 , respectively). Therefore, we need to choose B 1 , B 2 and n 1 , n 2 values that make the total number of polynomial products as small as possible.

In summary, in order to improve the efficiency of bootstrapping, we choose the the largest decomposition bases B 1 , B 2 and n 2 as possible within the noise bound (where B 1 < B 2 ).

This section proposes the adaptive adjustment technique on the decomposition bases, combined with the bootstrapping optimization technique based on the ellipsoidal Gaussian mentioned in Section 3; finally, we can obtain the optimization scheme. Firstly, the disturbance factor β was introduced to reduce the noise introduced by the bootstrapping, and then, based on β , adaptively adjusted the decomposition bases B 1 , B 2 and the corresponding n 1 , n 2 , respectively. Finally, the efficiency of the scheme was improved.

The decomposition basis of the blind rotation key in this article is chosen as a power of 2. The core motivation for this choice lies in the synergy between hardware-friendly design and simplification in number theory. By reducing the complexity of modular arithmetic, optimizing storage architecture and leveraging inherent binary properties, they significantly enhance the practical efficiency of integer factorization algorithms.

5.4. Result

When β = 1 , i.e., under the original FINAL scheme, we run 1000 tests, and the scheme runs normally when the noise generated by bootstrapping is < q / 8 . And when the noise exceeds this noise bound (≥ q / 8 ), the scheme cannot decrypt properly.

As can be seen from Figure 4, when β = 1 , B 1 = 8 , B 2 = 16 , 32 , 64 , the noise is almost always within the noise bound, regardless of the value of n 1 , n 2 . When n 1 = 140 , n 2 = 470 , B 1 = 8 , B 2 = 16 , the average bootstrapping time is 41.4 ms, and the total number of ring multiplications is 3330; when n 1 = 140 , n 2 = 470 , B 1 = 8 , B 2 = 64 , the average bootstrapping time is 38.3 ms, and the total number of ring multiplications is 2860.
From Figure 5, when β = 1 , B 1 = 8 , B 2 = 128 , n 1 = 470 , n 2 = 140 , the bootstrapping noise is within the noise bound, so the scheme can use these parameters, and the average bootstrapping time is 44.8 ms, and the total number of ring multiplications is 3710. On the contrary, when n 1 = 140 , n 2 = 470 , the bootstrapping noise exceeds the noise bound, and the scheme does not decrypt properly.
From Figure 6, it can be seen that when β = 2 , B 1 = 8 , B 2 = 128 , whatever the value of n 1 , n 2 is, it is within the noise bound. The average bootstrapping times are 44.5 ms and 33.1 ms when n 1 = 470 , n 2 = 140 and n 1 = 140 , n 2 = 470 , respectively.
Since the decomposition size l i is the same when the decomposition base B i is 128, 256 and 512, it can be considered to be one case. From Figure 7, when β = 2 , B 1 = 128 , B 2 = 1024 , the optimal value of n i is n 1 = 605 , n 2 = 5 . Meanwhile, the average time is 27.85 ms, and the total number of ring multiplications is 1825. When β = 3 , B 1 = 128 , B 2 = 1024 , the optimal value of n i is n 1 = 590 , n 2 = 20 , the average running time is 27.81 ms and the total number of ring multiplications is 1810 at this time. When β = 4 , B 1 = 128 , B 2 = 1024 , the optimal value of n i is n 1 = 550 , n 2 = 60 , the average running time is 27.6 ms and the total number of ring multiplications is 1770.

In summary, we can see that changing the value of β only affects the noise introduced by bootstrapping, and the larger β , the smaller the bootstrapping noise. The optimization scheme reduces the bootstrapping noise by increasing β , and then selects a larger decomposition base B i and its corresponding n i to reduce the total number of ring multiplications and improve the bootstrapping efficiency of the scheme.

Table 4 illustrates the time required for bootstrapping schemes with different parameters and their corresponding optimization ratios, measured on a 13th Gen Intel(R) Core(TM) i7-13700F processor within an Ubuntu 20.04 virtual machine environment. The experiment shows that in the case of β = 4 , when B 1 = 128 , B 2 = 1024 , n 1 = 553 and n 2 = 57 , the bootstrapping noise is just within the noise bound. In view of the bootstrapping noise we calculated being larger than the actual value, we take n 1 = 550 as the optimal parameter. That is to say, when n 1 > 550 , the total number of ring multiplications becomes larger and the efficiency becomes lower. When n 1 < 550 , the bootstrapping noise exceeds the noise bound and does not meet the requirements. To sum up, in the case of β = 4 , when B 1 = 128 , B 2 = 1024 , n 1 = 550 and n 2 = 60 , the bootstrapping of the optimization scheme is the fastest, at 33.3% faster than the original FINAL scheme, and the memory overhead of blind rotation keys have been optimized by 47%.

6. Security Analysis

The optimized FINAL scheme retains the architectural framework of its original counterpart, modifying only the parameter configurations and the standard deviation criteria for key selection. Crucially, since the original FINAL scheme demonstrates circular security—a property ensuring resilience against key-dependent cryptographic attacks—the optimized variant inherently preserves this security guarantee. This preservation arises directly from the structural invariance of the framework, where parameter adjustments and key distribution refinements operate within the original scheme’s security-proven boundaries, thereby avoiding perturbations to its cyclic security invariants. The following sections analyze different attacks [22,23,24,25,26,27].

6.1. Key-Recovery Attacks

The most efficient attacks come from lattice reduction, aiming at finding a relatively short vector in the spanned lattice. Lattice reduction consists of constructing the algebraic lattice over R spanned by the vectors ( Q , 0 ) and ( g / f , 1 ) . After using lattice reduction on this basis, we enumerate all lattice points in a ball of radius 2 N · σ f , g , centered on the origin. With significant probability, we are therefore able to find ( f , g ) .

Classically, sieving on this projected lattice will recover all vectors of a norm smaller than 4 / 3 · λ , where λ is the norm of the ( 2 N b ) th Gram–Schmidt vector of the reduced basis, where b represents the BKZ blocksize.

The projection of the key is among them; when

b · σ f , g 4 / 3 · λ .

we can recover the secret key vector ( f , g ) with high probability.

For the best known lattice reduction algorithm, DBKZ [28], we obtain

λ = b 2 π e 1 N / b q .

and

( b / 2 π e ) 1 N / b q = ( 3 / 4 ) · b · σ f , g .

It is then easy to deduce the BKZ blocksize b and, therefore, to derive the security level.

However, in reference [22], when we introduce the ellipsoid Gaussian sampling technique, the standard deviation of key g becomes β 1 · σ g , so the sparsity of key g can be used for further attacks, and we can guess the position of some zeros in the vector with a reasonable probability.

If such a guess of positions, say, L 1 , , 2 n , appears to be correct, we can intersect the NTRU lattice with Z L ¯ , where L ¯ refers to the complement of the set L in the overset 1 , , 2 n . The dimension of the lattice changes from 2 n to 2 n | L | . As such, this final lattice reduction part is now easier and thus faster.

The volume in the twisted norm · β : the volume is scaled by the determinant of the intersected Gram matrix G = D i a g ( L ¯ ) Q β D i a g ( L ¯ ) , which is exactly β k (where k = | L | , Q β = T β t T β and T β = β I d β 1 I d ).

All in all, the (normalized) volume of the intersected lattice for the twisted norm is q d 2 d k β k 2 d k .

Therefore, Equation (28) becomes

( b / 2 π e ) 1 N / b q d 2 d k β k 2 d k = ( 3 / 4 ) · b · σ f , g

The concrete security is shown in Table 5.
k denotes the number of guessed zero positions, and there exists a trade-off between the probability of guessing correctly (the more zeroes there are to guess, the harder it becomes to correctly guess their positions) and the time required by the lattice reduction [22]. Therefore, the value of k should be chosen within a moderate range, and k should neither be too small nor excessively large. For experimental validation, we suggest testing k = 40 and k = 60 , respectively. As can be seen from Table 5, the ellipsoidal Gaussian sampling method, that is, adding disturbance factor β to samples f and g, the bootstrapping noise is greatly reduced at essentially no security loss.
Table 5 demonstrates that the optimized FINAL scheme resists key-recovery attacks. Further rigorous analysis in Section 6.2 and Section 6.3 establishes its resilience against algebraic attacks, as well as dense, high-rank sublattice, and subfield lattice attacks, thereby addressing its adversarial robustness in practical settings.

6.2. Dense, High-Rank Sublattice and Subfield Lattice Attacks

The overstretched parameters of NTRU are vulnerable against subfield lattice attacks. Ducas and van Woerden [29] showed that the concrete prediction put the fatigue point at q 0.004 · n 2.484 for n > 100 at Asiacrypt 2021. In this paper, we take Q = 912,829, N = 1024 , satisfying Q < 0.004 · N 2.484 . In our optimization scheme, the NTRU parameters lie outside the overstretched range.

6.3. Algebraic Attacks

The bootstrapping schemes are defined over algebraic lattices, which have a rich structure that could, in principle, lead to improved attacks. However, there is no known way to improve attack algorithms for their general lattice equivalent by more than polynomial factors in an asymptotic sense, and they do not affect our concrete security levels.

6.4. The Choice of Polynomial Rings

In this paper, we choose rings of the form R Q = R / Q R = Z Q [ X ] / < X N + 1 > , with N being a power of 2, and do not choose prime cyclotomic rings of the form Z Q [ X ] / < X n 1 + + n + 1 > , with n being a prime. This is because compared with prime cyclotomic rings, although Z Q [ X ] / < X N + 1 > rings are very scarce, there are limitations on the selection of rings, but the operations of the rings are simpler and more efficient, so we choose rings of the form Z Q [ X ] / < X N + 1 > .

6.5. Security Comparisons with TFHE Bootstrapping

The optimized FINAL scheme integrates NTRU (approximate GCD problem over polynomial rings) and LWE (learning with errors), with security relying on the NTRU assumption of the computational hardness of decomposing short vectors in polynomial lattices, and the LWE/RLWE assumption of Indistinguishability between noisy linear equations and random ones. The advantage is that NTRU’s compact key properties may help reduce the risks of key exposure.

TFHE hinges on RLWE (Ring-LWE) and RGSW (ring-based Gentry–Sahai–Waters encryption), with security derived from RLWE hardness and the resilience of blind rotation against cryptanalysis. Its advantage is that it has been extensively studied with no critical vulnerabilities reported.

The optimized FINAL bootstrapping and TFHE bootstrapping each have their own pros and cons in terms of security. FINAL’s dual assumptions (NTRU + LWE) theoretically enhance security by requiring adversaries to compromise both. TFHE’s reliance on a single RLWE assumption ensures clearer security bounds but lacks NTRU’s key compactness benefits.

7. Conclusions

In this paper, we analyzed the FINAL scheme and optimized it accordingly.

Firstly, we optimized the bootstrapping procedure using the ellipsoidal Gaussian. By analyzing the noise introduced by bootstrapping, it was found that the standard deviation of the Gaussian distribution of key g has a greater impact on the bootstrapping noise compared to that of key f, where f and g are the secret keys of the basic NGS encryption scheme. Therefore, ellipsoidal Gaussian sampling was performed on f and g, where the standard deviations of the Gaussian distribution of f and g were denoted as β · σ f and β 1 · σ g . Meanwhile, from Table 5 and Figure 2, we find that upon adding the disturbance factor β to samples f and g, the security of the scheme is only reduced by a few bits, but the noise generated by bootstrapping is greatly reduced.

Secondly, we performed the adaptive adjustment technique on the decomposition bases. Since the efficiency of the fully homomorphic encryption scheme is mainly limited by ring multiplication and FFTs, the FINAL scheme changes the number of ring multiplications and FFTs by changing the decomposition bases B 1 , B 2 of NGS and their corresponding values of n 1 , n 2 , thus affecting the efficiency of the bootstrapping. Meanwhile, the decomposition bases of the FINAL scheme are limited by the bootstrapping noise. Therefore, we combine the bootstrapping optimization technique based on the ellipsoidal Gaussian to reduce the bootstrapping noise and expand the selection range of decomposition bases by selecting an appropriate β value. Finally, based on β , we adaptively adjust the decomposition bases B 1 , B 2 and n 2 , so that they are as large as possible within the noise bound. This method reduces the total number of ring multiplications and FFTs. This greatly improves the efficiency of the bootstrapping, with 33.3% faster bootstrapping than the original FINAL scheme.



Source link

Meng Wu www.mdpi.com