Privacy-Preserving and Regular Language Search Over Encrypted Cloud Data

This document summarizes a research article that proposes a new technique called privacy-preserving functional encryption based search to allow regular language searches over encrypted cloud data. Regular language searches provide a more expressive and precise way to retrieve data compared to keyword searches. The technique aims to guarantee privacy of both the search contents and outsourced data, while not adding additional local search burden to the user. It analyzes the security and performance of the proposed system, showing it is provably secure and more efficient than some existing searchable encryption systems with high expressiveness.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIFS.2016.2581316, IEEE Transactions on Information Forensics and Security

Privacy-Preserving and Regular Language Search over Encrypted Cloud Data
Kaitai Liang, Xinyi Huang, Fuchun Guo, and Joseph K. Liu

K. Liang is with the Department of Computer Science, Aalto University, Finland.
X. Huang is the corresponding author, with Nanjing University of Information Science and Technology, Nanjing, Jiangsu, 210044, China; with School of Mathematics and Computer Science, Fujian Normal University, Fuzhou, Fujian, China, 350117.
F. Guo is with School of Computer Science and Software Engineering, University of Wollongong, NSW 2522, Australia.
J. K. Liu is with Monash University, Australia.

Abstract: Using cloud-based storage services, users can remotely store their data in clouds and also enjoy high-quality data retrieval services, without the tedious and cumbersome work of local data storage and maintenance. However, a storage service alone cannot satisfy all desirable requirements of users. Over the last decade, privacy-preserving search over encrypted cloud data has been a meaningful and practical research topic for outsourced data security. Because users of a remote cloud storage service cannot have full physical possession of their data, private data search is a formidable task. A naive solution is to delegate a trusted party to access the stored data and fulfill a search task. This, nevertheless, does not scale well in practice, as full data access may easily harm user privacy. To securely introduce an effective solution, we should guarantee the privacy of search contents, i.e. what a user wants to search, and of returned results, i.e. what a server returns to the user. Furthermore, we also need to guarantee privacy for the outsourced data, and bring no additional local search burden to the user. In this paper, we design a novel privacy-preserving functional encryption based search mechanism over encrypted cloud data. A major advantage of our new primitive compared to the existing public key based search systems is that it supports an extremely expressive search mode, regular language search. Our security and performance analysis shows that the proposed system is provably secure and more efficient than some searchable systems with high expressiveness.
Keywords: Regular language, secure data search, cloud.

I. INTRODUCTION

Much like portable personal electronic devices, cloud storage services have been booming over the last decade. Their outstanding advantages, such as considerable storage space, flexible accessibility and convenient data retrieval, strongly attract Internet users. Accordingly, both individuals and industries now prefer to store their data remotely on cloud servers, so that they can get rid of the burden of local data management and maintenance. This gives cloud storage services a large market share in the field of data management, even in the era of big data.
Remote data storage delivers convenience to Internet users and, meanwhile, brings security concerns. The fact that users cannot have full physical possession of their data immediately raises two serious practical questions: how to guarantee the confidentiality of the data, and how to retrieve the data. We usually tackle the first question by leveraging existing cryptographic encryption mechanisms, such that all outsourced data are encrypted and inaccessible to cloud servers. Encryption undoubtedly enables us to protect the confidentiality of the data. However, it limits the flexibility of data retrieval to some extent. The premise of encryption is to prevent a ciphertext holder from gaining access to the underlying data. Without any knowledge related to the data, it looks impossible for a cloud server to fulfill any data retrieval task. A naive solution for data retrieval is to allow the server to fully access the data, locate the data and then return it to the user. Nevertheless, this defeats the purpose of encryption.
To support data retrieval without loss of confidentiality, Searchable Encryption (SE) mechanisms (e.g. [27], [9]) have been proposed in the literature. SE has been studied and widely employed in real-world applications where data search is outsourced to untrusted cloud servers. SE allows a server to search in encrypted data on behalf of a data owner without accessing the information of the data and the search query contents. In an SE system, a user encrypts a file database and its search keywords, and then uploads them to a cloud server. When retrieving a file, the user delivers a token related to the keyword to the server so that the server can locate the corresponding encrypted file in the encrypted database. The flexibility and scalability of an SE system mainly depend on how we design the search token as well as the search keyword.
From a practical point of view, a more expressive search query yields more precise data retrieval. We take an Electronic Health Records (EHRs) search as an example. In an EHRs system, a patient’s medical record is usually encrypted and stored in a storage system. Suppose there is an encrypted record of a patient Alice, tagged with the keyword index “Alice”. To search for the medical record of Alice in its storage system, a hospital needs to find a file matching the keyword “Alice”. However, “Alice”, the search index, is a common name. There are probably 10,000 patients associated with the same keyword. This definitely increases the workload of the hospital in locating the real “Alice” file it needs among the other encrypted records (with the same keyword).
To enhance the search expressiveness, one may replace a single keyword index with an access formula, such as (“Alice” AND “1990” AND “CrystalLake”) or (“Alice” AND “Age < 20” AND “Student ∈ NYU”). Actually, the most expressive way to represent a search query is to leverage regular language. Using regular language to describe data to be encrypted is extremely common in daily life. For instance,

1556-6013 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
a Facebook user may directly write down a description, e.g., “my birthday party with best friends Bob and Kate”, for an uploaded photo. Furthermore, suppose a tax form is encrypted and archived in some tax authority. The authority may need to search for one of the tax forms based on an exact sentence or paragraph of the tax form, such as “Alice has paid $8,000 tax in total in 2014”, in which the number is encrypted.
Some more recent applications of regular language search are online genetic relatedness tests and chemical compound search. Suppose a language space only contains “A,G,C,T”; a search querier may upload a masked search pattern “ACGGTTCT” to an encrypted genetic database to request the server to return all possible matching encrypted DNA sequences.
Unfortunately, there is no SE supporting regular language search in the literature. Designing flexible and scalable regular language search without loss of data confidentiality and query privacy is the main motivation of our work.
Searchable Symmetric Encryption (SSE) and Public key Encryption with Keyword Search (PEKS) are two types of SE. SSE generally enjoys better search efficiency than PEKS, but it provides a limited level of expressiveness for search. It is not difficult to see that this limitation is inherited from the original design limitations of symmetric encryption¹, such that it is difficult for SSE to support expressive search queries (e.g. formula search, subset queries). Therefore, we deal with the case of PEKS to achieve more search expressiveness in this paper.
¹ It is undeniable that symmetric key encryption focuses more on protecting the confidentiality of the data than on the expressiveness of data sharing.
PEKS allows anyone to encrypt searchable contents, but only a defined user group can generate a search trapdoor. Moreover, it separates search contents from search keywords, which offers flexibility for search expressiveness. The notion of PEKS was initially defined by Boneh et al. [5] in EUROCRYPT 04. Afterwards, many PEKS systems have been proposed. We mainly concentrate on the following two systems. In TCC 07, Boneh and Waters [6] proposed a PEKS system supporting conjunctive, subset, and range search queries. It can be regarded as the most expressive PEKS scheme in the literature. However, it does not guarantee the privacy of search contents, i.e. a cloud server will know what a user wants to search. We will further explain the design limitation of [6] later. A more recent work on PEKS was proposed in INFOCOM 14 by Zheng et al. [31]. In [31], the authors extend PEKS to the context of Attribute-Based Encryption (ABE). Although their systems (note that they designed more than one system in [31]) achieve data and search contents privacy, they only support single keyword equality search, which cannot fully achieve our goal, regular language search.

A. Our Contributions

We first define a new notion called searchable deterministic finite automata-based functional encryption. The notion is a general notion for PEKS. We next design a concrete construction satisfying the notion. In our construction, any system user can describe data to be shared with a regular language in an encrypted form, where the language description can be of arbitrary length (e.g. an English sentence, or a paragraph). A valid data receiver can generate a search token represented as a Deterministic Finite Automaton (DFA) and deliver it to a cloud server, such that the cloud server can locate the corresponding ciphertexts and return them to the data receiver. In the search phase, the server knows nothing about the search contents and the underlying data. We further present an extensive evaluation of our system to show its security, and its efficiency compared to the two most related works [6], [31].
Supporting regular language search is a great advantage of our system and makes it the first of its type, to the best of our knowledge. It is undeniable that SSE (e.g. [12]) usually enjoys better efficiency in data searching compared to public key based searchable encryption. However, our novel system can support search over any arbitrary alphabet/regular language, so that search keyword design is more human-readable. Besides, the system provides a verifiable (data integrity) check for system users (due to its public-key based feature). Moreover, our system does not require a data owner to pick special keywords before constructing keyword index structures (e.g., the least frequent keyword [12]); it only leverages a DFA structure to embed flexible search expressiveness, e.g., “AND, OR, NOT”, unlike [12], which is limited to expressions of the form “a keyword AND (formula)”.

B. Related Work

Song et al. [27] introduced the notion of SE, in which full text search over encrypted data is allowed. Following the notion, many SE systems have been proposed in the literature. All existing systems can be categorized into two types: SSE (e.g. [13], [14], [28], [7]) and PEKS (e.g. [5]). Although SSE generally enjoys better search efficiency than PEKS, it provides limited expressiveness for search, e.g., single keyword match [16], [21], [22], conjunctive keyword search [12], [18], [11], and fuzzy search [8], [25], [10]. To achieve more expressiveness for search queries, this paper deals with the case of PEKS. A more important reason for us to start with public key technology is that the loose binding between the keyword index structure and the (uploaded) data (in an SSE system) does not support integrity checks. This may give a malicious server a chance to completely change the whole search index structure, such that the corresponding data owner fails to correctly search its data.
Boneh et al. [5] introduced the notion of PEKS, and designed a concrete scheme from identity-based encryption (IBE). Later on, Abdalla et al. [1] presented a generic construction from anonymous IBE to SE. After that, some variants of PEKS supporting single keyword search have been proposed, such as [2], [3]. More practical and search-expressive PEKS schemes have been constructed, e.g., authorized keyword search [19], verifiable keyword search [4], fuzzy (single) keyword search [30], [19], [18] with conjunctive keyword search, and [26] with range queries.
In TCC 07, Boneh and Waters [6] designed the most expressive PEKS, supporting not only conjunctive but also keyword subset/range queries by leveraging the hidden vector encryption technique. But it is built in a composite order group that

seriously affects its efficiency. We note that [20] introduces a way to convert the system into a prime order group. Most importantly, the system cannot protect the privacy of search contents. Recently, Zheng, Xu and Ateniese [31] introduced the notion of attribute-based keyword search, in which they bring keyword search into the ABE setting without loss of search content privacy. Nevertheless, their research outcomes only support single keyword search.
Below we compare our work with [6] and [31] in terms of functionality and security in Table I. We leave the efficiency comparison to Section V. To the best of our knowledge, our scheme is the first to achieve privacy-preserving regular language keyword search. We note that our system is proved secure under the asymmetric l-Expanded BDHE assumption (which will be introduced later), while [6] relies on the composite 3-party Diffie-Hellman assumption and the bilinear Diffie-Hellman assumption, and [31] relies on the decisional linear assumption.

TABLE I: Comparison with [6], [31]

Sch. | Search | Security | Standard Model | Pairings Group
[6] | Subset, range, conjunctive | CPA | ✓ | composite order
[31] | Single keyword | CPA | ✗ | symmetric
Ours | Regular language | CPA | ✓ | asymmetric

II. PROBLEM STATEMENT

A. System Entities

We consider a cloud-based data store and search service involving four different entities, as illustrated in Fig. 1.
• A data encryptor can mark data with a specified keyword description, and upload the encrypted data to a cloud server. It can be a data receiver as well (indicating that a user encrypts data for itself).
• A data receiver can download encrypted data from the cloud server, has full decryption rights over the encrypted data, and can construct a search token with the help of a fully trusted authority.
• A fully trusted authority takes charge of generating the public parameters for the system, initializing the system, and issuing a partial search token to a data receiver to help the data receiver construct a keyword search token.
• A cloud server: given a search token associated with a search policy² and a ciphertext tagged with an unknown keyword description, it verifies whether the ciphertext and the token match. If there is a match, it outputs 1 and returns the corresponding ciphertext; otherwise, it outputs 0 and returns ⊥.
² We use a DFA to represent a search policy in this paper. We refer the reader to [29] for the definition of DFA.

B. System Algorithms

Definition 1: A Searchable DFA-Based Functional Encryption (S-DFA-FE) system consists of the following algorithms:
1) $(mpk, msk) \leftarrow Setup(1^n, \Sigma)$: on input a security parameter $1^n$ and the description of a finite alphabet $\Sigma$, output a master public key $mpk$ and a master secret key $msk$.
2) $CT \leftarrow Enc(mpk, W = (w_1, w_2, ..., w_l))$: on input $mpk$ and an $l$-length string $W$ used for keyword description, output a ciphertext $CT$.
3) $tk \leftarrow TokenGen(msk, M = (Q, T, q_0, F))$: on input $msk$ and a DFA description $M = (Q, T, q_0, F)$, a fully trusted authority interacts with a data receiver to generate a search token $tk$. Specifically, the authority delivers a partial search token $ptk$ to the receiver, and the receiver then constructs a full search token $tk$. In this paper we require that the communication channel between the authority and the receiver be secure (e.g. SSL).
4) $1/0 \leftarrow Test(tk, CT)$: on input a search token $tk$ and a ciphertext $CT$, output 1 if the DFA of the token accepts the string $W$ embedded into the ciphertext, and 0 otherwise.
Note. The system does not provide a decryption algorithm that uses a user’s secret key to decrypt a ciphertext $CT$ and output a message. This capability can be achieved by leveraging a standard public key system. From the point of view of encrypted search functionality, there is no need to explicitly support this capability.

C. System Architecture

We depict how our system works in Fig. 1.
• A fully trusted authority runs the setup algorithm $Setup$ to generate the public parameter $mpk$ for system users and a cloud server, initializes the system and keeps the master secret key $msk$ secret.
• A data encryptor runs the encryption algorithm $Enc$ to generate different ciphertexts $(CT_{W_1}, ..., CT_{W_n})$ for the data receiver with “masked-and-unknown” keyword description strings $(W_1, ..., W_n)$, and uploads the ciphertexts to the cloud.
• When needing the help of the trusted authority to construct a search token for some specified keyword description, the data receiver first builds up a secure channel (e.g. SSL) with the authority, and then sends a request with the corresponding DFA search policy to the authority. The authority then generates a “partial” search token and returns the token to the data receiver via the same secure channel. We will improve the system later such that the fully trusted authority no longer needs to participate in the generation of the search token.
• After receiving the partial token from the authority, the data receiver generates a full search token $tk_{W_i}$, and forwards the token to the cloud server over a public communication channel.
• The cloud server verifies whether there exists a $CT_{W_i}$ matching the search token $tk_{W_i}$. If there is a match, the server outputs 1 and returns the corresponding ciphertext $CT_{W_i}$. Otherwise, the server outputs 0 and returns ⊥.
Note. One might question why we need a trusted authority in the generation of the search token. From a practical point of view, a

Fig. 1: System Architecture

valid system user holding a secret key is capable of generating a search token for his/her encrypted files. Our system can be improved to achieve this requirement. We will discuss this after the description of our construction.

D. Threat Model

The adversary model defined below captures whether a PPT adversary can tell if a ciphertext is associated with a given keyword description string, where all characters of the keyword string come from a publicly known finite alphabet. The model is meant to rule out attacks in which, given a ciphertext (resp. a search token), an invalid data receiver (i.e. one without any decryption rights) compromises the corresponding keyword description field.
Assumption. We assume that the cloud server and data receiver are semi-honest (i.e. honest-but-curious), while the authority is fully trusted. By semi-honest we mean that a party honestly runs a protocol by following its specification, but curiously collects interesting information while the protocol is running.
Definition 2: An S-DFA-FE system achieves keyword privacy if the advantage $Adv^{KP}_{\mathcal{A}}(1^n, \Sigma)$ is negligible for any PPT adversary $\mathcal{A}$ in the following experiment.
$$\Big|\Pr\big[\, b = b' : (W_0^*, W_1^*, state_1) \leftarrow \mathcal{A}(\Sigma);\ (mpk, msk) \leftarrow Setup(1^n, \Sigma);\ (state_2) \leftarrow \mathcal{A}^{O}(mpk, state_1);\ b \in_R \{0, 1\};\ CT^* \leftarrow Enc(mpk, W_b^*);\ b' \leftarrow \mathcal{A}^{O}(CT^*, state_2) \,\big] - \frac{1}{2}\Big|,$$
where $state_1$, $state_2$ are the state information, $W_0^*$, $W_1^*$ are two distinct keyword strings, and $O = \{O_{tg}, O_{test}\}$. For $O_{test}$, the oracle takes $(M, CT)$ as input and outputs a bit of value 0 or 1. However, the oracle only returns ⊥ for a query $(M, CT)$ where $CT$ is the challenge ciphertext and $M$ accepts $CT$’s keyword string. For $O_{tg}$, the oracle takes $M$ as input and outputs a search token $tk$ associated with $M$, with the exception that $M$ cannot accept $W_b^*$, where $b \in \{0, 1\}$.

E. Design Goals

In this paper, our protocol design achieves the following functionality and security guarantees.
• Public description ability: to allow every system user to produce a ciphertext associated with some keyword string for others.
• Search token generation ability: to allow any valid system user holding the decryption rights of some data to generate a search token for the corresponding encrypted data search.
• Test verifiability: to allow a cloud server to locate one (or more) matching ciphertext(s) by a given search token.
• Privacy-preserving ability: to guarantee the following aspects of privacy.
1) Given a search token, a cloud server does not learn anything about the keyword(s) embedded into the DFA of the token³.
2) Given a ciphertext, a cloud server does not learn any information about the keyword string tagged with the ciphertext.
³ We only give the token to the server for guessing the embedded keyword(s). Of course, the server can construct a ciphertext with a keyword $K$ to test whether a search token with another keyword $K'$ matches the ciphertext or not. This is called an offline keyword-guessing attack.

III. PRELIMINARIES

A. Asymmetric Pairings

Let $BSetup$ be an algorithm that, on input the security parameter $n$, outputs the parameters of a bilinear map as $(p, g, \hat{g}, G_1, G_2, G_T, e)$, where $G_1$, $G_2$ and $G_T$ are multiplicative cyclic groups of prime order $p$, where $|p| = n$, $g$ is a random generator of $G_1$, and $\hat{g}$ is a random generator of $G_2$.

B. Complexity Assumptions

Definition 3: (Asymmetric) l-Expanded BDHE Assumption. We say that an algorithm $\mathcal{A}$ has advantage $Adv^{A\text{-}l\text{-}BDHE}_{\mathcal{A}} = \epsilon$ in solving the asymmetric l-Expanded BDHE problem in $(G_1, G_2)$ if $|\Pr[\mathcal{A}(\hat{X}, e(g, \hat{g})^{a^{l+1}bs}) = 0] - \Pr[\mathcal{A}(\hat{X}, T) = 0]| \geq \epsilon$, where the probability is over the random choice of generators $g \in G_1$, $\hat{g} \in G_2$, the random choice of exponents

$a, b, c_0, ..., c_{l+1}, d \in Z_p^*$, $T \in_R G_T$, the random bits used by $\mathcal{A}$, and $\hat{X}$ is a set of the following elements:
$$g,\ \hat{g},\ \hat{g}^{a},\ g^{a},\ \hat{g}^{b},\ \hat{g}^{ab/d},\ g^{ab/d},\ g^{b/d},\ \hat{g}^{b/d},$$
$$\forall i \in [0, 2l+1],\ i \neq l+1,\ j \in [0, l+1]:\quad g^{a^i s},\ g^{a^i bs/c_j},$$
$$\forall i \in [0, l+1]:\quad g^{a^i b/c_i},\ \hat{g}^{a^i b/c_i},\ \hat{g}^{c_i},\ \hat{g}^{a^i d},\ \hat{g}^{abc_i/d},\ \hat{g}^{bc_i/d},$$
$$\forall i \in [0, 2l+1],\ j \in [0, l+1]:\quad \hat{g}^{a^i bd/c_j},$$
$$\forall i, j \in [0, l+1],\ i \neq j:\quad \hat{g}^{a^i bc_j/c_i}.$$
We say the asymmetric l-Expanded BDHE assumption holds in $(G_1, G_2)$ if no PPT algorithm has a non-negligible advantage in solving the asymmetric l-Expanded BDHE problem in $(G_1, G_2)$.
We can show that the above extended complexity assumption still holds in the generic group model by employing the same proof technique introduced in [29]. Specifically, we can see from the set $\hat{X}$ that the elements in $G_1$ are $g$, $g^{b/d}$, $g^{a}$, $g^{ab/d}$, $g^{a^i s}$, $g^{a^i bs/c_j}$ and $g^{a^i b/c_i}$. Here, we show that these elements cannot help an adversary to compute the exponent value $a^{l+1}bs$. For the element $g$, it is easy to see that there does not exist a $\hat{g}^{a^{l+1}bs}$ in $\hat{X}$. Similarly, we need the $G_2$ elements $\hat{g}^{a^{l+1}ds}$, $\hat{g}^{a^{l}ds}$, $\hat{g}^{a^{l}bs}$, $\hat{g}^{a^{z}b}$, $\hat{g}^{a^{z}c_j}$ and $\hat{g}^{a^{z}sc_i}$ for the $G_1$ elements $g^{b/d}$, $g^{a}$, $g^{ab/d}$, $g^{a^i s}$, $g^{a^i bs/c_j}$ and $g^{a^i b/c_i}$, respectively, where $i + z = l + 1$. It is not difficult to see that the above $G_2$ elements cannot be provided by the set $\hat{X}$.

IV. AN EXPRESSIVE KEYWORD SEARCH MECHANISM

From a lemma stated in [5], we know that a keyword search mechanism can be built on top of an anonymous identity-based encryption system. Based on this valuable observation, we construct an expressive keyword search mechanism from an anonymous attribute-based encryption scheme. The anonymous attribute-based encryption can be regarded as a combination of a masking technique and an extension of Waters’ functional encryption [29].

A. An Intuition for Our Basic Construction

We first take an expressive functional encryption system [29] as a starting point. We treat [29] from a different point of view, specifically a searchable encryption view. We regard the description string associated with a ciphertext as a description for future search, and the secret key of a user as a search token. If the token matches the ciphertext (namely, a successful decryption occurs from the functional encryption point of view), a cloud server will return the ciphertext.
Nevertheless, we cannot simply turn [29] into a searchable encryption system. In [29], one (i.e. a data encryptor) needs to include the description string (for a ciphertext) as one of the outputs of the ciphertext, i.e. to show the string in plain format. In this case, however, the privacy of the string cannot be protected. We thus extend the system into the asymmetric pairing group such that we can guarantee the privacy of the search description string associated with a given ciphertext. That is, given a ciphertext tagged with some search description, a cloud server does not know what the corresponding description is. Furthermore, we also need to require the data encryptor to publicly show “the length of the string”. Accordingly, we leak only the length of the string but not the plain string itself. One might wonder about the reason for this kind of information leak. We will explain it later.
To achieve the privacy of a search query, we require a trapdoor generator (i.e. a data receiver) to “hide” each character of a description string as well as each piece of state information when constructing a DFA. The search query is now represented as the DFA, where a DFA is much like a directed graph that includes a unique initial node (resp. state), many edges and ending nodes (resp. states). For each (directed) edge connecting two nodes (resp. states), there is a transaction, $(x, y, \sigma)$, where $x$ is the origin node, $y$ is the destination node, and $\sigma$ is a character of a string, say $W$. We note that all the characters are from a finite alphabet $\Sigma$. To hide the information of the character and states, $\sigma$ as well as $x$ and $y$ should be masked in some way such that each transaction only shows the direction. For the initial state and accept states, the cloud server will not know their exact values, but will be notified which ones are the initial state and the accept states, respectively. We will introduce a specific way to hide the information of characters and states later.
We assume that a DFA (graph) has many different directed paths, and each of them has the same initial state, i.e. the root of the graph, and a unique accept state, i.e. the end node of the path. We define the term “full path” below. If there exists a path from an initial state to an accept state taking a string $W$ as input, we say that this path is a full path with $|W|$ edges, where $|W|$ is the length of the string $W$.
The system construction technique of [29] makes the above information hiding possible. First of all, [29] embeds each character of an alphabet into individual public elements, and each character for encryption is represented as a single ciphertext component as well. Secondly, a successful decryption relies on whether there exists a full path in a DFA (associated with a secret key) that matches a string (associated with a ciphertext). After our anonymous conversion for the ciphertext, no one will know which character a given single ciphertext component corresponds to. Moreover, a masked DFA only delivers a graph with unknown states and characters to a cloud server. The only requirement for the cloud server (when doing search matching) is to find the full path(s) in a DFA. For instance, suppose a masked DFA includes 2 full paths, one with length 3 and the other with length 4. The cloud server will need to first find the paths as well as their respective lengths, and choose from its encrypted storage backend all ciphertexts associated with unknown description strings of length 3 and 4. The cloud server then performs some specific computation using the DFA search token and the ciphertext so as to verify whether they match. The construction technique of [29] guarantees that only a length-matching pair of ciphertext and search token can possibly yield a search match. Thus, the premise of our search technique is that we need to let the cloud server know the graph’s direction and the corresponding full path(s) in a DFA search token, and the length of the description string in a ciphertext.

B. Scheme Notations

We summarize the notations used in our system in Table II.
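As a rough illustration of the “full path” idea from the intuition above, the following Python sketch computes which full-path lengths a masked DFA exposes and uses them to pre-select ciphertexts by description length. It is a plaintext toy under stated assumptions: the function names `full_path_lengths` and `candidates_by_length`, the `max_len` bound, the integer state labels and the ciphertext ids are all hypothetical and not part of the paper's construction, and no cryptography is involved.

```python
from collections import defaultdict


def full_path_lengths(transitions, start, accept, max_len):
    """Lengths n <= max_len for which some n-edge path runs from start to an accept state.

    transitions: iterable of (x, y, sigma) triples (the paper's "transactions").
    The character sigma is ignored on purpose: a masked DFA is assumed to
    reveal only the direction of each edge, not its label.
    """
    successors = defaultdict(set)
    for x, y, _sigma in transitions:
        successors[x].add(y)

    lengths = set()
    frontier = {start}  # states reachable with exactly n edges
    if start in accept:
        lengths.add(0)
    for n in range(1, max_len + 1):
        frontier = {y for x in frontier for y in successors[x]}
        if frontier & accept:
            lengths.add(n)
    return lengths


def candidates_by_length(ciphertext_lengths, lengths):
    """Keep ciphertext ids whose public description length equals a full-path length."""
    return [cid for cid, l in ciphertext_lengths.items() if l in lengths]


# Hypothetical masked DFA with two full paths, of length 3 and 4.
MASK = "*"
masked_transitions = [
    (0, 1, MASK), (1, 2, MASK), (2, 3, MASK),                # 0 -> 3, three edges
    (0, 4, MASK), (4, 5, MASK), (5, 6, MASK), (6, 7, MASK),  # 0 -> 7, four edges
]
lengths = full_path_lengths(masked_transitions, start=0, accept={3, 7}, max_len=6)
stored = {"ct1": 3, "ct2": 5, "ct3": 4}  # ciphertext id -> leaked string length
selected = candidates_by_length(stored, lengths)
```

In this toy setting only the ciphertexts in `selected` would go on to the actual match computation; the length filter itself uses nothing beyond the edge directions and the publicly shown string lengths.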

TABLE II: Frequently Used Notations

Notation                 Meaning
$1^n$                    security parameter
$\Sigma$                 a finite alphabet
$H(\psi)$                a Target Collision Resistant (TCR) hash function [15] with input $\psi$
$a \in_R A$              $a$ is randomly chosen from the field $A$
$Z_p^*$                  the non-zero integers modulo a prime $p$
$[1, l]$                 all integers from 1 to $l$
$M = (Q, T, q_0, F)$     a DFA description including:
                         $Q$ is a set of all states, a state is $q_{i \in [1,|Q|]}$
                         $T$ is a transition function: $Q \times \Sigma \to Q$
                         a transition is denoted as $t_i = (q_{i-1}, q_i, w_i) \in T$
                         $q_0 \in Q$ is the start state
                         $F \subseteq Q$ is a set of accept states
$\forall i \in I$        for all indexes $i$ belonging to the set $I$
$mpk$                    the master public key
$msk$                    the master secret key
$W = (w_1, ..., w_l)$    a keyword string with length $l$, i.e. $|W| = l$
$CT$                     a ciphertext associated with a keyword string
$tk$                     a keyword search token

C. Details of Our Basic Construction

• Setup($1^n$, $\Sigma$): The setup algorithm is run by a fully trusted authority who will initialize the system, publish the system public parameters, and store the master secret key secretly.
  1) Choose $\beta_f, \beta_{end}, \beta_{start}, \alpha, \xi \in_R Z_p^*$ and for each $\sigma$ in $\Sigma$ choose $\beta_\sigma \in_R Z_p^*$. Choose a TCR hash function $H: G_T \to G_1$.
  2) Choose $g \in G_1$ and $\hat{g} \in G_2$. Set $h_{start} = g^{\beta_{start}}$, $\hat{h}_{start} = \hat{g}^{\beta_{start}}$, $h_{end} = g^{\beta_{end}}$, $\hat{h}_{end} = \hat{g}^{\beta_{end}}$, $f = g^{\beta_f}$, $\hat{f} = \hat{g}^{\beta_f}$, $z = g^{\xi}$, $\hat{z} = \hat{g}^{\xi}$, and set a pair of elements corresponding to each $\sigma$ in $\Sigma$ as $h_\sigma = g^{\beta_\sigma}$ and $\hat{h}_\sigma = \hat{g}^{\beta_\sigma}$.
  3) Set the master secret key as $msk = (\hat{g}^{-\alpha}, \hat{z})$, and the master public key as $mpk = (e(g, \hat{g})^{\alpha}, g, \hat{g}, z, f, \hat{f}, h_{start}, h_{end}, \hat{h}_{start}, \hat{h}_{end}, \forall \sigma \in \Sigma: h_\sigma, \hat{h}_\sigma)$.

• Enc($mpk$, $W = (w_1, ..., w_l)$): The encryption algorithm is run by a data encryptor who would like to send data with an arbitrary length string $W$.
  1) Choose random elements $s_0, ..., s_l \in_R Z_p^*$.
  2) Set the ciphertext $CT$ as: $C_1 = H(e(g, \hat{g})^{\alpha s_l})$, $C_{start1} = C_{0,1} = g^{s_0}$, $C_{start2} = (h_{start})^{s_0}$; for $i = 1$ to $l$: $C_{i,1} = g^{s_i}$, $C_{i,2} = h_{w_i}^{s_i} z^{s_{i-1}}$, $C_{i,3} = f^{s_i}$; finally $C_{end1} = C_{l,1} = g^{s_l}$, $C_{end2} = (h_{end})^{s_l}$.
  3) Output $CT = (l, C_1, C_{start1}, C_{start2}, \{C_{i,1}, C_{i,2}, C_{i,3}\}_{i \in [1,l]}, C_{end1}, C_{end2})$.

• TokenGen($msk$, $M = (Q, T, q_0, F)$):
  1) The partial token generation algorithm is run by the fully trusted authority. The description of $M$ includes a set $Q$ of states $q_0, ..., q_{|Q|-1}$ and a set of transitions $T$ where each transition $t \in T$ is a triple $(x, y, \sigma) \in Q \times Q \times \Sigma$. $q_0$ is designated as a unique start state and $F \subseteq Q$ is the set of accept states. The algorithm chooses $\hat{D}_0, \hat{D}_1, ..., \hat{D}_{|Q|-1} \in_R G_2$ (associating $\hat{D}_i$ with $q_i$), for each $t \in T$ it chooses $r_t \in_R Z_p^*$, and $\forall q_x \in F$ it chooses $r_{end_x} \in_R Z_p^*$. The trusted authority constructs the partial search token as follows. First it sets:
       $K_{start1} = \hat{D}_0 \cdot (\hat{h}_{start})^{r_{start}}$, $K_{start2} = \hat{g}^{r_{start}}$;
     For each transition $t = (x, y, \sigma) \in T$, set:
       $K_{t,1} = \hat{D}_x^{-1} \hat{z}^{r_t}$, $K_{t,2} = \hat{g}^{r_t}$, $K_{t,3} = \hat{D}_y (\hat{h}_\sigma)^{r_t}$;
     For each $q_x \in F$ it computes:
       $K_{end_x,1} = \hat{g}^{-\alpha} \cdot \hat{D}_x \cdot (\hat{h}_{end})^{r_{end_x}}$, $K_{end_x,2} = \hat{g}^{r_{end_x}}$.
     Finally, set the partial search token as
       $(K_{start1}, K_{start2}, \forall t \in T: (K_{t,1}, K_{t,2}, K_{t,3}), \forall q_x \in F: (K_{end_x,1}, K_{end_x,2}))$.
     We note that the partial search token is delivered to the corresponding receiver via a secure communication channel (such as SSL).
  2) The data receiver then constructs the full search token as follows. It chooses new random elements $\bar{D}_0, \bar{D}_1, ..., \bar{D}_{|Q|-1} \in_R G_2$, $\theta_1, \theta_2 \in_R Z_p^*$, and $\forall q_x \in F$ it chooses $\theta_{3,x} \in_R Z_p^*$. First it sets:
       $K_{start1} = \hat{D}_0 \cdot (\hat{h}_{start})^{r_{start}} \cdot \bar{D}_0 \cdot (\hat{h}_{start})^{\theta_1}$,
       $K_{start2} = \hat{g}^{r_{start}} \cdot \hat{g}^{\theta_1}$, $K_{start3} = \hat{g}^{\theta_2}$;
     For each transition $t = (x, y, \sigma) \in T$, set:
       $K_{t,1} = \hat{D}_x^{-1} \hat{z}^{r_t} \bar{D}_x^{-1}$, $K_{t,2} = \hat{g}^{r_t}$, $K_{t,3} = \hat{D}_y (\hat{h}_\sigma)^{r_t} \bar{D}_y \hat{f}^{\theta_2}$;
     For each $q_x \in F$ it computes:
       $K_{end_x,1} = \hat{g}^{-\alpha} \cdot \hat{D}_x \cdot (\hat{h}_{end})^{r_{end_x}} \cdot \bar{D}_x \cdot (\hat{h}_{end})^{\theta_{3,x}}$,
       $K_{end_x,2} = \hat{g}^{r_{end_x}} \cdot \hat{g}^{\theta_{3,x}}$.
     Finally, set the full search token as
       $tk = (M^*, K_{start1}, K_{start2}, K_{start3}, \forall t \in T: (K_{t,1}, K_{t,2}, K_{t,3}), \forall q_x \in F: (K_{end_x,1}, K_{end_x,2}))$,
     where $M^*$ is identical to $M$ except that each state and string of the DFA is turned into a masked symbol $\ast$.

• Test($tk$, $CT$): A cloud server runs the following test to verify whether a given ciphertext $CT$ matches a search token $tk$ or not. If yes, output 1; otherwise, output 0. In the view of the cloud server, $CT$ is associated with an unknown string $\ast = (\ast_1, ..., \ast_l)$ and the search token $tk$ is associated with a search pattern represented by a DFA $M^* = (Q, T, q_0, F)$. The cloud server first checks whether there exists a sequence of $l + 1$ states $\ast_0, \ast_1, ..., \ast_l$ and $l$ transitions $t_1, ..., t_l$ fitting an unknown string with length $l$ (note that by fitting we mean they only match in length), where $\ast_0$ is a starting state and $\ast_l \in F$, and for $i = 1, ..., l$, we have $t_i = (\ast_{i-1}, \ast_i, \ast_i) \in T$. If there does not exist the corresponding sequence, the cloud server outputs 0 indicating a mismatch. Otherwise, the server proceeds. First compute:
    $B_0 = e(C_{start1}, K_{start1}) \cdot e(C_{start2}, K_{start2})^{-1}$
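As a quick sanity check of this first step, here is a sketch that assumes only the definitions of $C_{start1}$, $C_{start2}$, $K_{start1}$ and $K_{start2}$ given above, the relations $h_{start} = g^{\beta_{start}}$ and $\hat{h}_{start} = \hat{g}^{\beta_{start}}$ from Setup, and the bilinearity of $e$. Under those assumptions the $\hat{h}_{start}$ terms cancel:

```latex
\begin{align*}
B_0 &= e\big(g^{s_0},\ \hat{D}_0 \bar{D}_0\, \hat{h}_{start}^{\,r_{start}+\theta_1}\big)
       \cdot e\big(h_{start}^{\,s_0},\ \hat{g}^{\,r_{start}+\theta_1}\big)^{-1} \\
    &= e(g, \hat{D}_0 \bar{D}_0)^{s_0}
       \cdot e(g, \hat{g})^{\beta_{start} s_0 (r_{start}+\theta_1)}
       \cdot e(g, \hat{g})^{-\beta_{start} s_0 (r_{start}+\theta_1)} \\
    &= e(g, \hat{D}_0 \bar{D}_0)^{s_0}.
\end{align*}
```

So $B_0$ depends only on the start-state values $\hat{D}_0$, $\bar{D}_0$ and on $s_0$.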

  For $i = 1$ to $l$, compute:
    $B_i = B_{i-1} \cdot e(C_{(i-1),1}, K_{t_i,1}) \cdot e(C_{i,2}, K_{t_i,2})^{-1} \cdot e(C_{i,1}, K_{t_i,3})$.
  Since $M$ accepts $w$, we have that $u_l = q_x$ for some $q_x \in F$ and $B_l = e(g, \hat{D}_x \cdot \bar{D}_x)^{s_l} \cdot e(g^{\sum_{v=1}^{l} s_v}, \hat{f}^{\theta_2})$. Further compute
    $B_{end} = B_l \cdot e(C_{end_x,1}, K_{end_x,1})^{-1} \cdot e(C_{end_x,2}, K_{end_x,2}) \cdot e(C_{end_x,3}, K_{start3}) = e(g, \hat{g})^{\alpha \cdot s_l} \cdot e(g^{\sum_{v=1}^{l} s_v}, \hat{f}^{\theta_2})$.
  If
    $C_1 = H\big(B_{end} / e(\prod_{i=1}^{l} C_{i,3}, K_{start3})\big) = H\Big(\frac{e(g, \hat{g})^{\alpha \cdot s_l} \cdot e(g^{\sum_{v=1}^{l} s_v}, \hat{f}^{\theta_2})}{e(\prod_{i=1}^{l} f^{s_i}, \hat{g}^{\theta_2})}\Big) = H(e(g, \hat{g})^{\alpha \cdot s_l})$,
  output 1 (indicating a match); otherwise, output 0.

D. Improvement for Our Basic Construction

Hide DFA. In our basic construction, we require that the description of the DFA associated with a given search token should be masked with respect to the corresponding keyword strings as well as the state information. To hide this information, we may leverage an efficient and secure technique, i.e. a pseudorandom function [17]. For each DFA, a data receiver can choose two random pseudorandom keys, say $key^{(1)}$ and $key^{(2)}$, and further replace each keyword character $w_i$ and each state $q_j$ with the values $\xi_i^{(1)} = PRF(key^{(1)}, w_i, i)$ and $\xi_j^{(2)} = PRF(key^{(2)}, q_j, j)$, where $PRF$ is a pseudorandom function, $i$ is the $i$-th position of the string $W$, and $j$ is the $j$-th position of a successful path. Therefore, the DFA description $M^*$ includes a set of masked states $Q^* = \{\xi_j^{(2)} | j \in [1, |Q|]\}$, a set of transitions $T^* = \{\forall t = (\xi_i^{(2)}, \xi_{i+1}^{(2)}, \xi_i^{(1)}) \in T\}$, the start state $\xi_0^{(2)}$ and a set of accept states $F^* = \{\xi_j^{(2)} | j \in [1, |F^*|]\}$.

One might think that it is easy to apply our masking technique to some existing ABE systems so as to obtain another expressive searchable encryption system. However, this is not always true. We can take [6] as an example. We may choose to hide a predicate $I$ in a random value set as well, where $I$ is the search predicate and also an output of a ciphertext. By masking the predicate, we guarantee that no one (except the pseudorandom key holder) knows the values in $I$. Nevertheless, this indeed limits the capability of search. It is not difficult to see that in the search test algorithm of [6], a cloud server needs to know the information of a subset $S$ (which is the set of non-wildcard values of $I$). Now $I$ cannot be seen by the server, such that $S$ is unknown as well. This will lead to a failed search test. From this example, we can see that not all existing ABE systems can be extended into a searchable encryption system.

Generate Search Token. From the description of our basic construction, we can see that the construction is based on a somewhat "ABE" infrastructure where there exists a fully trusted authority working as a private key generator with knowledge of $msk$. With the help of this trusted authority, a data receiver is able to generate a search token for a specified DFA. In this "ABE" infrastructure, the secure interaction with the trusted authority is necessary for the generation of a search token. To eliminate the interaction, we may choose to extend the construction in the identical way as [6]. Namely, we allow a user to become a trusted authority for itself and generate $msk$ as its personal secret key. Specifically, we let each system user generate its own $msk = (\hat{g}^{-\alpha}, \hat{z})$ as a personal secret key component, and meanwhile $e(g, \hat{g})^{\alpha}, z$ will become the corresponding public key component for the same user. We state that this additional secret key $msk$ is only for the user who generates it, and other users cannot gain access to it. This is identical to a public key based system, in which other users cannot reach one's secret key. It is clear that a data receiver (i.e. a valid decryptor) can generate his/her own search token with knowledge of its secret key $msk$, such that we do not need the help of the trusted authority here. This reduces the communication cost of the system. We state that this slight revision of the construction will not have any effect on the keyword matching test and data encryption phases.

Offline Keyword-Guessing. We state that our system (as well as [6], [31]) cannot withstand this attack due to their natural public key based property. How to protect public key based searchable encryption against the attack is an interesting unsolved problem.

V. EVALUATION

A. Security Analysis

Theorem 1: Our S-DFA-FE system achieves keyword privacy under the asymmetric $l$-Expanded BDHE assumption.

Proof: Here we first present an analysis from a practical point of view, and next deliver a theoretical analysis.

Practical Analysis. We take a close look at a ciphertext $CT$ associated with a search description string $W$. Recall that we embed each character $w_i$ of the string into the $i$-th element of the ciphertext: $C_{i,1} = g^{s_i}$, $C_{i,2} = h_{w_i}^{s_i} z^{s_{i-1}}$ and $C_{i,3} = f^{s_i}$. Given these elements, a PPT adversary may choose to make brute force pairing calculations on them so as to find out which element corresponds to which character. The adversary can first try to calculate $e(C_{i,2}, \hat{g}) = e(h_{w_i}^{s_i} z^{s_{i-1}}, \hat{g}) = e(h_{w_i}^{s_i}, \hat{g}) \cdot e(z^{s_{i-1}}, \hat{g})$. It is easy to see that this is equal to $e(\hat{h}_{w_i}, g^{s_i}) \cdot e(\hat{z}, g^{s_{i-1}})$. Here all components are known by the adversary except $\hat{z}$. Without knowledge of $\hat{z}$, the adversary cannot make a successful pairing match, such that it cannot tell whether $e(h_{w_i}^{s_i} z^{s_{i-1}}, \hat{g})$ contains $\hat{h}_{w_i}$ or not. Furthermore, it is not difficult to see that even if the adversary replaces $\hat{g}$ in its first calculation with $\hat{h}_{start}$, $\hat{h}_{end}$ or $\hat{f}$, it still cannot obtain a successful pairing match.

On the other hand, the adversary may choose to guess all characters of the string $W$ as it knows the length $l$ of $W$. Since we allow all system insiders or outsiders to know the finite alphabet $\Sigma$, the adversary has knowledge of $\Sigma$ and $|\Sigma|$. If the adversary tries to make a successful guess for this $l$ length

string, its probability is $|\Sigma|^{-l}$. We state that when $|\Sigma|$ and $l$ are sufficiently large, the probability tends to be negligible.

Theoretical Analysis. In the theoretical proof, we will reduce the keyword privacy to the hard problem of asymmetric $l$-Expanded BDHE. The proof here shares many similarities with that of [29]; hence we mainly present the differences in our proof below. We note that the game challenger is $B$, given the problem instance of asymmetric $l$-Expanded BDHE.

• Initialization. The adversary outputs $W_0^*$ and $W_1^*$ with equal length $l^*$.

• Setup. $B$ sets $w^* = W_b^*$, where $b \in \{0, 1\}$. $B$ chooses random elements $v_z, v_{start}, v_{end}, \beta_f \in Z_p^*$, and $\forall \sigma \in \Sigma$ chooses $v_\sigma \in Z_p^*$. It then sets
    $e(g, \hat{g})^{\alpha} = e(g^a, \hat{g}^b)$, $g = g$, $\hat{g} = \hat{g}$,
    $z = g^{v_z} g^{ab/d}$, $\hat{z} = \hat{g}^{v_z} \hat{g}^{ab/d}$,
    $h_{start} = g^{v_{start}} \prod_{j \in [1, l^*]} g^{-a^j b / c_j}$, $\hat{h}_{start} = \hat{g}^{v_{start}} \prod_{j \in [1, l^*]} \hat{g}^{-a^j b / c_j}$,
    $h_{end} = g^{v_{end}} \prod_{j \in [2, l^*+1]} g^{-a^j b / c_j}$, $\hat{h}_{end} = \hat{g}^{v_{end}} \prod_{j \in [2, l^*+1]} \hat{g}^{-a^j b / c_j}$,
    $\forall \sigma \in \Sigma$: $h_\sigma = g^{v_\sigma} g^{-b/d} \prod_{j \in [0, l^*+1] \text{ s.t. } w_j^* \neq \sigma} g^{-a^{(l^*+1-j)} b / c_{(l^*+1-j)}}$,
    $\hat{h}_\sigma = \hat{g}^{v_\sigma} \hat{g}^{-b/d} \prod_{j \in [0, l^*+1] \text{ s.t. } w_j^* \neq \sigma} \hat{g}^{-a^{(l^*+1-j)} b / c_{(l^*+1-j)}}$,
    $f = g^{\beta_f}$ and $\hat{f} = \hat{g}^{\beta_f}$.
  $B$ outputs $mpk = (e(g, \hat{g})^{\alpha}, g, \hat{g}, z, f, \hat{f}, h_{start}, h_{end}, \hat{h}_{start}, \hat{h}_{end}, \forall \sigma \in \Sigma: h_\sigma, \hat{h}_\sigma)$.

• Phase 1.
  1) Search Token Queries. $B$ constructs the search token much like the secret key construction in the proof of [29]. $\hat{D}_k$ will be represented as $\prod_{i \in S_k} \hat{g}^{a^{i+1} b}$ below. For each $q_k \in Q$, we need a set $S_k$ of indices between 0 and $l^*$. For $i = 0, 1, ..., l^*$, we put $i \in S_k$ if and only if $w^{*(i)}$ matches $M_k$, where $k \in [0, |Q| - 1]$, $M_k = (Q, T, q_k, F)$ and $w^{*(i)}$ denotes the last $i$ symbols of $w^*$. $B$ starts by implicitly setting $r_{start} = \sum_{i \in S_0} c_{i+1}$, such that
       $K_{start2} = \hat{g}^{r_{start}} \hat{g}^{\theta_1} = (\prod_{i \in S_0} \hat{g}^{c_{i+1}}) \hat{g}^{\theta_1}$,
       $K_{start1} = \hat{D}_0 (\hat{h}_{start})^{r_{start}} \bar{D}_0 (\hat{h}_{start})^{\theta_1} = (K_{start2})^{v_{start}} (\prod_{j \in [1, l^*], i \in S_0, j \neq i+1} \hat{g}^{-a^j b c_{i+1} / c_j}) \bar{D}_0 (\hat{h}_{start})^{\theta_1}$,
       $K_{start3} = \hat{g}^{\theta_2}$,
     where $\theta_1$, $\theta_2$ and $\bar{D}_0$ are chosen by $B$. For all $q_x \in F$, $B$ implicitly sets $r_{end_x} = \sum_{i \in S_x, i \neq 0} c_{i+1}$, such that
       $K_{end_x,2} = \hat{g}^{r_{end_x}} \hat{g}^{\theta_{3,x}} = (\prod_{i \in S_x, i \neq 0} \hat{g}^{c_{i+1}}) \hat{g}^{\theta_{3,x}}$,
       $K_{end_x,1} = \hat{g}^{-\alpha} \hat{D}_x (\hat{h}_{end})^{r_{end_x}} \bar{D}_x (\hat{h}_{end})^{\theta_{3,x}} = (K_{end_x,2})^{v_{end}} (\prod_{j \in [2, l^*+1], i \in S_x, i \neq 0, j \neq i+1} \hat{g}^{-a^j b c_{i+1} / c_j}) \bar{D}_x (\hat{h}_{end})^{\theta_{3,x}}$,
     where $\theta_{3,x}$ and $\bar{D}_x$ are chosen by $B$. The construction of the components for each $t = (x, y, \sigma) \in T$ is similar to that of [29] except that $B$ will additionally multiply $\bar{D}_x^{-1}$ into $K_{t,1,i}$, and $\bar{D}_y$ and $\hat{f}^{\theta_2}$ into $K_{t,3,i}$.
  2) Test queries. $B$ first generates the search token as in the previous step, and next runs the Test algorithm to output 1 or 0.

• Challenge Phase. $B$ implicitly sets $s_i = s a^i$ and sets $C_1 = H(T)$. It next sets $C_{start1} = g^s$, $C_{start2} = (g^s)^{v_{start}} \prod_{j \in [1, l^*]} g^{-a^j b s / c_j}$; for $i = 1$ to $l^*$:
    $C_{i,1} = g^{a^i s}$,
    $C_{i,2} = (g^{a^i s})^{v_{w_i^*}} (g^{a^{i-1} s})^{v_z} \prod_{j \in [0, l^*+1] \text{ s.t. } w_j^* \neq w_i^*} g^{(-a^{l^*+1-j+i}) b s / c_j}$,
    $C_{i,3} = g^{a^i s \beta_f}$;
  finally it sets $C_{end1} = g^{a^{l^*} s}$ and $C_{end2} = (g^{a^{l^*} s})^{v_{end}} \prod_{j \in [2, l^*+1]} g^{-a^{l^*+j} b s / c_j}$.

• Phase 2. Same as Phase 1.

• Guess. $A$ outputs a guess bit $b'$. If $b' = b$, $B$ outputs 1 (guessing $T = e(g, \hat{g})^{a^{l+1} b s}$); else, it outputs 0 (guessing $T \in_R G_T$).

This completes the simulation. For the challenge ciphertext, if $T = e(g, \hat{g})^{a^{l+1} b s}$, the ciphertext is a valid ciphertext of $W_b^*$, such that the probability of the adversary outputting $b = b'$ is $1/2 + \mu$. Otherwise, $T$ is a random element in $G_T$ such that the ciphertext is a random one. The probability of the adversary guessing a correct $b$ is $1/2$. Therefore, the probability that $B$ correctly decides $T$ for the problem instance is $\frac{1}{2}(1/2 + \mu + 1/2) = 1/2 + \mu/2$. $B$ can solve the asymmetric $l$-Expanded BDHE problem with advantage $\mu/2$ if the adversary can win the keyword privacy game with advantage $\mu$.

Theorem 2: Given a search token, a PPT adversary cannot compromise the information of characters embedded into the DFA under the assumption that the pseudorandom function (which we use in masking the DFA) is secure.

Proof: We suppose an adversary can break the security of the pseudorandom function with probability $adv^{PRF}$. Since all characters are masked as $\xi_i^{(1)}$, the adversary either needs to break the $PRF$ or correctly guess the pseudorandom key $key^{(1)}$, where $key^{(1)}$ belongs to a valid key space $\{0, 1\}^{poly(1^n)}$. Therefore, the adversary may know the embedded characters with probability $adv^{PRF} + 2^{-poly(1^n)}$. On the other hand, it is possible for the adversary to guess the characters by unfolding the information of all states. We recall that inside a DFA, given an origin state and a character as input, it deterministically outputs a destination state. Besides, the alphabet $\Sigma$ is known by the adversary. The adversary hence can try all unknown characters from $\Sigma$ to determine a transition $t_i = (x_i, y_i, w_i)$. Similarly, the adversary can reveal all state information with probability $adv^{PRF} + 2^{-poly(1^n)}$, where $2^{-poly(1^n)}$ is the probability of guessing the pseudorandom key $key^{(2)}$. In summary, the adversary can obtain the knowledge of characters embedded into the DFA with probability $adv = 2 adv^{PRF} + 2^{1-poly(1^n)}$. Since we assume the pseudorandom function is secure, we have that $adv^{PRF}$ is negligible. Therefore, $adv$ is negligible, as $2^{1-poly(1^n)}$ is negligible with a sufficiently large $n$.

B. Performance Analysis

Boneh and Waters [6] propose an expressive searchable encryption supporting conjunctive, subset and range search, while Zheng et al. [31] introduce a key-policy attribute-based system supporting keyword search. Here, we make a comprehensive comparison among our scheme, [6] and [31] in terms of computation and communication cost. Table IV and Table V show the theoretical comparison of efficiency.

We now define the notations used in the Tables. Let $|G|$ and $|G_T|$ denote the bit-length of an element in groups $G$ and $G_T$, $l$ denote the length of a keyword search string, $|TL|$ denote the number of leaves in an access tree, $|S|$ denote the number of attributes in an attribute set, and $c_p$, $c_{ex}^{(1)}$, $c_{ex}^{(2)}$ denote the computation cost of a bilinear pairing, an exponentiation in

$G$, and an exponentiation in $G_T$, respectively. Note that we regard $G_1$ and $G_2$ (of our system) as a single $G$ in the theoretical analysis for simplicity. However, we will treat the two groups differently in the practical analysis that follows. Similarly, the subgroups $G_p$ and $G_q$, $G_{T,p}$ and $G_{T,q}$ (of [6], whereby $n = pq$) will be simply seen as $G$ and $G_T$, respectively, in the theoretical analysis only.

To make a fair comparison, we further make the following assumptions. Since [6], [31] and our system provide different functionalities in keyword search, we need to define a common function among them. We hence base the comparison on a single equality keyword match. We assume our system shares the same $l$ with [6], namely, the length of the keyword string in our system is equal to that of the keyword searchable field in [6]. Here we also have $|S| = l$ for an equality match test in [6]. In [31], we will set $|TL|$ (i.e. the number of attributes in an access tree) and $|Attr|$ (i.e. the number of attributes embedded into a ciphertext) to be equal to $l$ as well, although they are defined by a specific access policy and not by any information of a search keyword. We note that it can be seen from this point of view that the efficiency of keyword search of [31] mainly depends on the complexity of the access policy. If a complex access policy is used, the search efficiency must be low. This can be regarded as one of the limitations of [31]. In our construction, a single keyword match indicates that a designed DFA needs one and only one successful path from an initial state to a unique accept state.

TABLE III: Comparison with [6], [31]

Schemes   Complexity Assumption                 Pairings Group
[6]       Composite 3-party Diffie-Hellman      composite order
[31]      Decisional Linear                     symmetric prime order
Ours      Asymmetric l-Expanded BDHE            asymmetric prime order

Before proceeding to the theoretical complexity analysis, we first present the complexity assumption comparison among our scheme, [6] and [31] in Table III. From Table I and Table III, we can see that the efficiency of [31] may be better than that of [6] and ours, since [31] is built in the random oracle model with symmetric pairings and a prime order group. This fact will be further confirmed in our practical analysis.

TABLE IV: Computation Comparison with [6], [31]

Schemes   Enc                                       Token Gen             Test
[6]       $O(l)c_{ex}^{(1)} + O(1)c_{ex}^{(2)}$     $O(l)c_{ex}^{(1)}$    $O(l)c_p$
[31]      $O(l)c_{ex}^{(1)}$                        $O(l)c_{ex}^{(1)}$    $O(l^2)c_p + O(l)c_{ex}^{(2)}$
Ours      $O(l)c_{ex}^{(1)} + O(1)c_{ex}^{(2)}$     $O(l)c_{ex}^{(1)}$    $O(l)c_p$

TABLE V: Communication Comparison with [6], [31] (Size/Length)

Schemes   Ciphertext                  Token
[6]       $O(1)|G_T| + O(l)|G|$       $O(l)|G|$
[31]      $O(l)|G|$                   $O(l)|G|$
Ours      $O(l)|G|$                   $O(l)|G|$

From Table IV, we observe that our scheme shares the same efficiency with [31], while [6] suffers from a higher cost in the test phase, i.e. $O(l^2)$ pairings. It is worth mentioning that our scheme supports more powerful search functionality than [31]. Table V shows that the systems have similar complexity in the size of the search token, while [6] needs an extra $G_T$ element in the ciphertext metric.

In conclusion, our scheme achieves regular language keyword search without requiring a great amount of additional computation and communication cost.

C. Practical Analysis

We leverage the Java Pairing Based Cryptography Library [23] to calculate the system running time shown in Table VI. Our testbed is: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz, 3 GB RAM, Ubuntu 10.04. For the fairness of the practical comparison, we use different pairing types: one is Type a with 160-bit group order (the embedding degree of the curve is 2) for the implementation of [31]'s KP-ABKS scheme; one is Type a1 with a 1024-bit base field size and $k = 2$ for the implementation of [6]'s hidden vector encryption construction; and one is Type d with a 159-bit base field and $k = 6$ for the implementation of our system (in which we assume a search token only has one final successful state). The above pairing types are chosen based on the recommendation introduced in [24], and all the data is without preprocessing. We suppose all schemes listed in the Tables must at least achieve a security level comparable to a symmetric key cryptosystem with an 80-bit key. That is, an elliptic curve cryptosystem with around a 160-bit key is needed. Therefore, we set $n = 160$ bits, the group elements in $G_{\xi_1}$ are set to be 160 bits, and the group elements from $G_T$ and $G_{T,\xi_2}$ are set to be 1024 bits, respectively, where $\xi_1 \in \{1, 2, p, q\}$ and $\xi_2 \in \{p, q\}$. We further set the following four experimental samples: Test 1: $l = 10$; Test 2: $l = 30$; Test 3: $l = 60$; Test 4: $l = 100$. Table VII is the comparison of concrete communication cost.

TABLE VI: Comparison in System Running Time (ms)

Schemes   Sample    Enc           Token Gen     Test
[6]       Test 1    20628.458     16989.681     6830.796
[6]       Test 2    60491.738     50302.781     19841.836
[6]       Test 3    120286.658    100272.431    39358.396
[6]       Test 4    200013.218    166898.631    65380.476
[31]      Test 1    260.232       446.112       344.208
[31]      Test 2    631.992       1189.632      974.008
[31]      Test 3    1189.632      2304.912      1918.708
[31]      Test 4    1933.152      3791.952      3178.308
Ours      Test 1    242.204       1151.185      1224.16
Ours      Test 2    664.204       3124.645      3322.72
Ours      Test 3    1297.204      6084.835      6470.56
Ours      Test 4    2141.204      10031.755     10667.68

To clearly show the comparison, we use line charts to depict the experimental results. Fig. 2, 3 and 4 show the running time in encryption, search token generation and

TABLE VII: Comparison in Communication Cost (component length in bits)

Schemes   Sample    Public Key    Ciphertext    Token
[6]       Test 1    11264         7744          3360
[6]       Test 2    30464         20544         9760
[6]       Test 3    59264         39744         19360
[6]       Test 4    97664         65344         32160
[31]      Test 1    640           2080          3520
[31]      Test 2    640           5280          9920
[31]      Test 3    640           10080         19520
[31]      Test 4    640           16480         32320
Ours      Test 1    5664          5600          5600
Ours      Test 2    12064         15200         15200
Ours      Test 3    21664         29600         29600
Ours      Test 4    34464         48800         48800

Fig. 2: Running Time Comparison in Encryption (horizontal axis: Length of Search Keyword (character); vertical axis: log2(Running Time (ms)); curves: [6], [31], Ours)

Fig. 3: Running Time Comparison in Token Generation (horizontal axis: Length of Search Keyword (character); vertical axis: log2(Running Time (ms)); curves: [6], [31], Ours)

Fig. 4: Running Time Comparison in Test (horizontal axis: Length of Search Keyword (character); vertical axis: log2(Running Time (ms)); curves: [6], [31], Ours)

keyword matching test, while Fig. 5, 6 and 7 present the comparison results in the length of public key, ciphertext and search token. We note that the unit of the vertical axis of Fig. 2, 3 and 4 is millisecond (on a log2 scale), and that of the others is bit.

From the running time experimental results, we can see that all systems experience a climbing trend with the increase of $l$. Specifically, [6] suffers from a significant increase of running time in the encryption, keyword token generation and test algorithms. However, our system and [31] show only a small increase. In addition, the systems share approximately the same running cost in each metric.

As illustrated in the running time figures, when the search keyword length is at most 100, our system achieves efficient and acceptable complexity in encryption, decryption and keyword search. Based on this fact, we set some limitations for our system when it is implemented in practice. To avoid heavy complexity in search queries, we may impose a limitation on the design of the DFA so that each valid path (with direction inside the DFA) takes in fewer than 100 symbols for each search query. This constraint actually limits the expressiveness of AND/Not gate(s) search queries. Recall that one direction path is seen as a query with AND/Not gates in our system. Besides, it is better to allow at most 5-6 OR gates in the design of the DFA.

From the size of public key comparison (Fig. 5), we can see that [31] enjoys a constant cost as $l$ increases. This is because it only supports a single keyword equality match but not expressive search. [6] and our system aim to provide more expressive search queries, such that we need to represent each character element individually. This basically explains why more expressiveness requires more complexity. Fig. 6 shows that although our system requires more space for the storage of the ciphertext as compared to [31], it outperforms [6]. In Fig. 7, we see that our system suffers from the largest storage cost for the search token, while the other systems have similar space cost (the two lines almost overlap). Our system needs larger space for the search token because the DFA associated with the token provides a fine-grained character (from a given alphabet) match pattern. In the design of the DFA, each character is seen as an individual component. The pattern and the structure are the premise that allows us to achieve regular language search. However, this premise incurs a drawback in search token storage cost. How to reduce the size of the search token without loss of search expressiveness is an interesting open problem.

ACKNOWLEDGEMENTS

K. Liang is supported by privacy-aware retrieval and modelling of genomic data (PRIGENDA, No. 13283250), the Academy of Finland. X. Huang is supported by National Natural Science Foundation of China (61472083, U1405255), Program for New Century Excellent Talents in Fujian University (JA14067), Distinguished Young Scholars Fund of Fujian (2016J06013), and the CICAEET fund and the PAPD fund. F. Guo is supported by National Natural Science Foundation of China (No. 61572390).

Fig. 5: Comparison for Public Key Size (horizontal axis: Length of Search Keyword (character); vertical axis: Size (bits); curves: [6], [31], Ours)

Fig. 7: Comparison for Token Size (horizontal axis: Length of Search Keyword (character); vertical axis: Size (bits); curves: [6], [31], Ours)

Fig. 6: Comparison for Ciphertext Size (horizontal axis: Length of Search Keyword (character); vertical axis: Size (bits); curves: [6], [31], Ours)

REFERENCES

[1] M. Abdalla, M. Bellare, D. Catalano, E. Kiltz, T. Kohno, T. Lange, J. Malone-Lee, G. Neven, P. Paillier, and H. Shi. Searchable encryption revisited: Consistency properties, relation to anonymous ibe, and extensions. J. Cryptology, 21(3):350–391, 2008.
[2] J. Baek, R. Safavi-Naini, and W. Susilo. On the integration of public key data encryption and public key encryption with keyword search. In ISC, vol. 4176 of LNCS, pp. 217–232. Springer, 2006.
[3] M. Bellare, A. Boldyreva, and A. O'Neill. Deterministic and efficiently searchable encryption. In CRYPTO, vol. 4622 of LNCS, pp. 535–552. Springer, 2007.
[4] S. Benabbas, R. Gennaro, and Y. Vahlis. Verifiable delegation of computation over large datasets. In CRYPTO, vol. 6841 of LNCS, pp. 111–131. Springer, 2011.
[5] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano. Public key encryption with keyword search. In EUROCRYPT, vol. 3027 of LNCS, pp. 506–522. Springer, 2004.
[6] D. Boneh and B. Waters. Conjunctive, subset, and range queries on encrypted data. In TCC, vol. 4392 of LNCS, pp. 535–554. Springer, 2007.
[7] C. Bösch, A. Peter, B. Leenders, H. W. Lim, Q. Tang, H. Wang, P. H. Hartel, and W. Jonker. Distributed searchable symmetric encryption. In PST, pp. 330–337. IEEE, 2014.
[8] C. Bösch, Q. Tang, P. H. Hartel, and W. Jonker. Selective document retrieval from encrypted database. In ISC, vol. 7483 of LNCS, pp. 224–241. Springer, 2012.
[9] J. Camenisch, M. Kohlweiss, A. Rial, and C. Sheedy. Blind and anonymous identity-based encryption and authorised private searches on public key encrypted data. In PKC, vol. 5443 of LNCS, pp. 196–214. Springer, 2009.
[10] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou. Privacy-preserving multi-keyword ranked search over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst., 25(1):222–233, 2014.
[11] D. Cash, J. Jaeger, S. Jarecki, C. S. Jutla, H. Krawczyk, M. Rosu, and M. Steiner. Dynamic searchable encryption in very-large databases: Data structures and implementation. In NDSS. The Internet Society, 2014.
[12] D. Cash, S. Jarecki, C. S. Jutla, H. Krawczyk, M. Rosu, and M. Steiner. Highly-scalable searchable symmetric encryption with support for boolean queries. In CRYPTO, vol. 8042 of LNCS, pp. 353–373. Springer, 2013.
[13] Y. Chang and M. Mitzenmacher. Privacy preserving keyword searches on remote encrypted data. In ACNS, vol. 3531 of LNCS, pp. 442–455, 2005.
[14] M. Chase and S. Kamara. Structured encryption and controlled disclosure. In ASIACRYPT, vol. 6477 of LNCS, pp. 577–594. Springer, 2010.
[15] R. Cramer and V. Shoup. Design and analysis of practical public-key encryption schemes secure against adaptive chosen ciphertext attack. SIAM J. Comput., 33(1):167–226, January 2004.
[16] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky. Searchable symmetric encryption: improved definitions and efficient constructions. In CCS, pp. 79–88. ACM, 2006.
[17] O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. J. ACM, 33(4):792–807, 1986.
[18] P. Golle, J. Staddon, and B. R. Waters. Secure conjunctive keyword search over encrypted data. In ACNS, vol. 3089 of LNCS, pp. 31–45. Springer, 2004.
[19] Y. Hwang and P. Lee. Public key encryption with conjunctive keyword search and its extension to a multi-user system. In Pairing, vol. 4575 of LNCS, pp. 2–22. Springer, 2007.
[20] V. Iovino and G. Persiano. Hidden-vector encryption with groups of prime order. In Pairing, vol. 5209 of LNCS, pp. 75–88. Springer, 2008.
[21] S. Kamara, C. Papamanthou, and T. Roeder. Dynamic searchable symmetric encryption. In CCS, pp. 965–976. ACM, 2012.
[22] K. Kurosawa and Y. Ohtaki. Uc-secure searchable symmetric encryption. In FC, vol. 7397 of LNCS, pp. 285–298. Springer, 2012.
[23] JPBC Library. http://gas.dia.unisa.it/projects/jpbc/benchmark.html#.U5FXwZS1bLd/, 2013. Online; accessed 18-March-2015.
[24] B. Lynn. On the Implementation of Pairing-based Cryptosystems. PhD thesis, Department of Computer Science, Stanford University, June 2007.
[25] E. Shen, E. Shi, and B. Waters. Predicate privacy in encryption systems. In TCC, vol. 5444 of LNCS, pp. 457–473. Springer, 2009.
[26] E. Shi, J. Bethencourt, H. T. Chan, D. X. Song, and A. Perrig. Multi-dimensional range query over encrypted data. In S&P, pp. 350–364. IEEE Computer Society, 2007.
[27] D. X. Song, D. Wagner, and A. Perrig. Practical techniques for searches on encrypted data. In S&P, pp. 44–55. IEEE Computer Society, 2000.
[28] C. Wang, K. Ren, S. Yu, and K. M. R. Urs. Achieving usable and privacy-assured similarity search over outsourced cloud data. In INFOCOM, pp. 451–459. IEEE, 2012.
[29] B. Waters. Functional encryption for regular languages. In CRYPTO, vol. 7417 of LNCS, pp. 218–235. Springer, 2012.
[30] P. Xu, H. Jin, Q. Wu, and W. Wang. Public-key encryption with fuzzy keyword search: A provably secure scheme under keyword guessing attack. IEEE Trans. Computers, 62(11):2266–2277, 2013.
[31] Q. Zheng, S. Xu, and G. Ateniese. VABKS: verifiable attribute-based keyword search over outsourced encrypted data. In INFOCOM, pp. 522–530. IEEE, 2014.

