A Simple Substring-Knowledge Password Recovery in the Challenge-Response Setting

Client-Server Password Recovery (Extended Abstract)

Abstract

Human memory is not perfect – people constantly memorize new facts and forget old ones. One example is forgetting a password, a common problem raised at IT help desks. We present several protocols that allow a user to automatically recover a password from a server using partial knowledge of the password. These protocols can be easily adapted to the personal entropy setting [7], where a user can recover a password only if he can answer a large enough subset of personal questions.

We introduce client-server password recovery methods, in which the recovery data are stored at the server, and the recovery procedures are integrated into the login procedures. These methods apply to two of the most common types of password based authentication systems. The security of these solutions is significantly better than the security of presently proposed password recovery schemes. Our protocols are based on a variation of threshold encryption [17, 8, 5] that may be of independent interest.

Keywords:
password recovery, threshold encryption scheme, private computing, personal entropy

1 Introduction

People constantly memorize new facts, but also forget old ones. One quite common example is forgetting a password, which is one of the most common problems raised at IT help desks. Therefore, many systems for password recovery (PR) have been built. The common aim of all these systems is to provide reliable solutions for legitimate users to recover lost passwords or to receive a new password (i.e., resetting the old password), without significantly increasing the vulnerability against attackers.

The simplest way to authenticate the user is to use an out-of-band channel, like a phone call, or to have the user show up in person at the system administrator’s desk. This is costly and cumbersome, however. More user-friendly, but less secure, is the common method used by many websites that store the password of the user in the clear and resend it to the user’s email address on request. Sometimes websites require a user to answer some personal question, like “what is your mother’s maiden name?”. However, this method is insecure because a password sent in cleartext can be easily intercepted and it is relatively easy to answer such a single question.

Another widely used method to cope with forgetting passwords is a password reset system. In this system when a user forgets the password then the server sets a new password and emails the new password to the client (again maybe after answering a personal question). Now the legitimate user can regain system access easily. However, the security of this system depends heavily on the security of the email server, and therefore, this system is uninteresting from our point of view.

There is quite a lot of research on more sophisticated PR methods that do not fully trust the server. One approach is to use secret sharing [18, 2]. This solution divides a password into shares (that are stored on trusted servers) in such a way that for the reconstruction, it is necessary to collect at least a threshold of these shares. However, the user still needs to authenticate somehow to the servers, and therefore this system does not fully solve our problem.

In [7] a PR system, based on personal entropy, is proposed. In this system, a user is asked some questions about his personal history during password registration. The system generates a random secret key, and encrypts the real password with it. Subsequently, the answers given by the user are used to “encrypt” the random secret key. The user then stores the questions, the “encryption” of the secret value, and the encryption of the password on his computer. A secret sharing scheme is used to enable password recovery, even if some questions are answered incorrectly. The drawback of this scheme is the lack of a rigorous security analysis. In fact, [3] demonstrates a serious weakness of this scheme: with the parameters recommended for a security level of , the system is in fact vulnerable to an attack that requires only operations.

The ideas from [7] were improved in [9]. This improved password recovery uses error-correcting codes instead of a secret sharing scheme. A rigorous security analysis is performed in the chosen model. The solution of [9] uses techniques that are very close to secure sketches.

Secure sketches and fuzzy extractors (described e.g., in [6]), and their robust versions [15, 12], are cryptographic tools useful for turning noisy information into cryptographic keys and securely authenticating biometric data. They may also be used to solve the password recovery problem. However, contrary to intuition, it seems hard to use these cryptographic primitives to solve password recovery in our most secure model, as shown in Section 3.

We believe that [7, 9] are a significant step towards a practical PR solution. However, such so-called local PR systems are vulnerable to attackers that steal the recovery data from the user’s machine (which is quite often inadequately secured) and then mount an offline brute force attack to recover the password. To avoid this scenario, we introduce client-server password recovery, in which the recovery data should be stored at the server, and PR should be integrated into the login procedure. In such a setting (under the more reasonable assumption that the recovery data cannot be stolen from the secure server) an attacker can only perform an online brute force attack. Security then can be increased by limiting the number of tries per account, or increasing the response time.

Our contributions are the following. Firstly, we introduce the password recovery problem and the client-server PR security model, together with a short analysis of password authentication systems, in Section 2. All our client-server PR systems apply to a simple (low entropy) password login system. In all these PR systems, the client is stateless, and all recovery data is stored at the server. Our solutions reduce the entropy somewhat, but are still more secure than other approaches. Moreover, our ideas can be straightforwardly applied to the personal entropy system, as shown in Subsection 2.2, making the recovery phase more secure. We elaborate on using secure sketches and fuzzy extractors for PR in Section 3. Subsequently, we present a new algorithm (Section 4) for local PR that is based on the intractability assumption (Assumption 2) from [14]. In Section 5, we introduce a new variant of threshold encryption [17, 8, 5], called equivocable threshold encryption, that does not provide validity proofs for the decryption shares. Combining these two, we present protocols for client-server PR integrated into two classes of systems for password based login: the most common, hash based one in which the server keeps hashes of passwords but receives cleartext passwords during the login phase (Section 6), and the most secure solution, based on challenge response, in which the server never sees passwords in clear at all (Section 7). Moreover, in Appendix A we briefly present a simple substring-knowledge PR working in the challenge response setting. Furthermore, all our password recovery systems can be easily modified to work as password reset systems. Due to space constraints we omit these easy transformations.

Due to space constraints in this version of the paper, proofs of security and correctness of the presented protocols are short and informal.

2 Password Recovery Model

In this section we discuss the kinds of password authentication (PA) systems for which we consider password recovery, define exactly what we mean by password recovery, and talk about the kinds of adversaries our protocols need to withstand.

2.1 Password Authentication (PA) Systems

PASSWORD REGISTRATION:
Client (, ; ): Server (database DT):
1) Chooses a cyclic group with generator , like in Section 5.
2) STORE(DT, )
LOG IN:
Client (, ; ): Server (database DT):
1) 2) LOOK-UP(DT, );
3) Chooses random and sends it.
4) 5) If then ACCEPT else REJECT
Figure 1: challenge-response password authentication system

Two kinds of participants are involved in PA systems: users (also called clients) and servers. Clients have a username (also called login) and a password , where and is the domain of characters of passwords ( is usually small, e.g., ). For simplicity, we assume that clients always remember their logins, and that the length of the password is fixed for all users.

Initially, a client registers himself (in the registration phase) with the server by submitting a username and an authenticator (derived from the password), which the server stores in its database. Subsequently, the client can authenticate (in the log in phase) to the server using his username and a proof of knowledge of the password. The server, using the authenticator from the database, and the proof of knowledge, can efficiently verify that the user knows the corresponding password. We distinguish three different PA schemes with respect to the security requirements. These systems differ in the way that authenticators and proofs of knowledge are defined: an authenticator can be equal to a password, a proof can be equal to a password (this is the case in hash based systems, where the server stores hashes of passwords), or neither of the above (which is the case for challenge-response type systems, an example of which is presented in Figure 1).

The password recovery for the first system is trivial (because the server stores passwords in clear), and we omit it in this paper. The PR solutions for the other two PA systems are presented in Sections 6 and 7, respectively.
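To make the third kind of PA system concrete, the following is a minimal sketch of a challenge-response login in the spirit of Figure 1. It assumes that the authenticator is a group element derived from a hash of the password and that the server's challenge is a fresh random power of the generator; the toy group parameters and all names (`P_MOD`, `Q_ORD`, `G`, `H`, `Server`) are illustrative assumptions and are not taken from the figure.

```python
import hashlib
import secrets

# Toy parameters (assumption): P_MOD = 2 * Q_ORD + 1 with both prime, and
# G = 4 generates the subgroup of quadratic residues of order Q_ORD.
# Real deployments need cryptographically large parameters.
P_MOD, Q_ORD, G = 2039, 1019, 4


def H(password: str) -> int:
    """Hash the password to an exponent in Z_q (modelled as a random oracle)."""
    digest = hashlib.sha256(password.encode()).digest()
    return int.from_bytes(digest, "big") % Q_ORD


def client_authenticator(password: str) -> int:
    """Registration: the client submits g^H(p), never the password itself."""
    return pow(G, H(password), P_MOD)


def client_response(password: str, challenge: int) -> int:
    """Log in: raise the server's challenge to the hashed password."""
    return pow(challenge, H(password), P_MOD)


class Server:
    def __init__(self):
        self.db = {}       # login -> authenticator
        self.pending = {}  # login -> secret exponent of the open challenge

    def register(self, login: str, authenticator: int) -> None:
        self.db[login] = authenticator

    def challenge(self, login: str) -> int:
        b = secrets.randbelow(Q_ORD - 1) + 1  # random exponent in [1, q-1]
        self.pending[login] = b
        return pow(G, b, P_MOD)

    def verify(self, login: str, response: int) -> bool:
        # Accept iff response == authenticator^b, i.e. g^(b * H(p)).
        b = self.pending.pop(login)
        return response == pow(self.db[login], b, P_MOD)
```

In this sketch the server only ever handles the authenticator and the response, so the cleartext password never leaves the client.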

2.2 Client-Server Password Recovery (PR)

A system for client-server PR has the same participants and log in routine as a PA system. Moreover, it provides an additional routine called password recovery (PR), in which the client tries to recover the lost password. The password registration is also modified: besides submitting the login, and the authenticator, it also submits the recovery data. The client’s input in the PR phase is login and a perturbed (incorrect) password , while the server’s input is the database with the logins and the registration data. Local password recovery is similar to client-server password recovery, except that the recovery data is stored locally at the client, and the recovery protocol is run locally at the client.

The requirement is that the client recovers the password, if and only if, is similar to the password corresponding to his login. To be precise, we define similarity between strings and as ( matches ), if and only if, . We assume that the parameters and are public.
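As an illustration of the recovery condition, the sketch below assumes that two strings are similar when they agree in at least a threshold number of positions, which is the reading suggested by the threshold-of-answers analogy in this section. The function name `matches` and the parameter `t` are hypothetical.

```python
def matches(guess: str, password: str, t: int) -> bool:
    """Return True if guess and password agree in at least t positions.

    Assumption: both strings have the same fixed length, as stated for
    passwords in Section 2.1; strings of different length never match.
    """
    if len(guess) != len(password):
        return False
    agreeing = sum(a == b for a, b in zip(guess, password))
    return agreeing >= t
```

For example, `matches("passw0rd", "password", 7)` holds because the two strings differ in a single position, while the same pair fails for a threshold of 8.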

Note that having partial knowledge of the password is a very similar recovery condition to the personal entropy one [7, 9]. In the personal entropy system the client needs to answer some threshold of questions (i.e., out of questions) to recover the password. The answers to the questions can be considered as an additional password, where every single answer can be treated as a letter. It is easy to transform our systems to work with an auxiliary password, and therefore, with personal questions. We skip these straightforward transformations in this paper.

We develop our protocols based on the following assumptions. We assume existence of the secure channels between the server and clients (which can be achieved using TLS connections). We work in the Random Oracle Model (ROM) [1], which means that we assume that hash functions work like random functions. Moreover, we use keyed hash functions, also called message authentication codes (MACs), of the form , where is a field. The first parameter of is a random string of length (the security parameter). For simplicity, we often omit this parameter in our descriptions.

We look for efficient protocols, i.e., , at the server side (because many clients might want to perform password recovery simultaneously), but we do allow a certain time penalty at the client side.

2.3 Adversaries and Security Requirements

All our client-server protocols defend against an adversary impersonating a client. Such an adversary is computationally bounded by (but not by ) and is malicious [11], which means he can disobey the protocol’s routine. This adversary tries to break a server’s privacy, which can be informally defined as follows. After any number of unsuccessful PR runs, the impersonator can recover more information about the password than what follows from the fact that the PR invocations failed only with a negligible probability in . Notice, however, that this adversary can always perform an online brute force attack on the PR routine (even using the password’s distribution). But this is easily mitigated by adding timeouts or allowing only a fixed number of tries before blocking an account.
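The mitigation against online brute force can be sketched as a per-account counter of failed recovery attempts. This is only an illustration of the idea of allowing a fixed number of tries; the class name `AttemptLimiter` and the default limit of five are assumptions, not part of the protocols.

```python
class AttemptLimiter:
    """Block an account after a fixed number of failed PR attempts."""

    def __init__(self, max_tries: int = 5):
        self.max_tries = max_tries
        self.failures = {}  # login -> number of failed attempts so far

    def allowed(self, login: str) -> bool:
        """True while the account has not used up its tries."""
        return self.failures.get(login, 0) < self.max_tries

    def record_failure(self, login: str) -> None:
        self.failures[login] = self.failures.get(login, 0) + 1

    def reset(self, login: str) -> None:
        """Call after a successful recovery or an administrative unblock."""
        self.failures.pop(login, None)
```

A server would call `allowed` before running the PR routine and `record_failure` whenever the routine rejects a guess.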

We also consider an adversary accessing the server’s database in all our client-server protocols. We model this adversary differently than the one impersonating a client, because this adversary can perform an offline brute force attack using the PR routine. Therefore, we define the adversary to not know the password distribution and to be computationally bounded with respect to and the parameters , , (in a way that the problem from Assumption 4.1 is hard). The adversary tries to break a client’s privacy, which can be informally defined as follows. For every two passwords and , the corresponding two PR data instances are indistinguishable. An adversary accessing local PR (see Section 4) is defined in the same way.

Only the challenge-response protocol (Section 7) is resistant against a fully corrupted server. The adversary corrupting the server is computationally bounded by and tries to gain information about the client’s password guesses from the data received in PR runs. We assume that this adversary is malicious in the sense that he may perform any action to break the privacy of the guesses. However, there is no point for him to alter the client’s output: the client can easily verify correctness of the recovery by logging in. This approach is very similar to private computation from [14]. The privacy of the guesses can be defined as follows: from a PR run the adversary gains negligible knowledge about the client’s guess.

3 Problems with Using Robust Fuzzy Extractors and Secure Sketches for Client-Server PR

In this section we show the main problems of using secure sketches or fuzzy extractors for solving client-server PR in our strongly secure model. Secure sketches and fuzzy extractors (see [6]) can be used for turning noisy information into cryptographic keys and securely authenticating biometric data.

Now, let’s define secure sketches and fuzzy extractors. Let be a field, , and a Hamming distance function in . An -secure sketch is a pair of procedures, “sketch” () and “recover” (), with the following properties. Firstly, on input returns a bit string . Secondly, the procedure takes an element and a bit string . The correctness property guarantees that if , then equals . The security property guarantees that for any distribution over with min-entropy , the value of can be recovered by the adversary who observes , with probability no greater than .

An -fuzzy extractor is a pair of procedures, “generate” () and “reproduce” (), with the following properties. Firstly, the procedure on input outputs an extracted string and a helper string . Secondly, takes an element and a string as inputs. The correctness property guarantees that if and were generated by then . The security property guarantees that for any distribution over with min-entropy , the string is nearly uniform even for those who observe . A robust version of fuzzy extractor additionally detects whether the value got modified by an adversary (which is essential in the biometric authentication).

Secure sketches can be used to solve local PR (Section 4) and client-server PR from Section 6. Roughly speaking, the first case is close to the approach from [9]. Let’s consider the second case. The client produces of his password and sends it to the server, who stores . When the client invokes the PR routine by sending then the server runs and if then the server sends back . This solution is sound and secure, i.e., the server can guess with probability no greater than . However, we do not see a way to transform this solution to the challenge response model, because in this model the server is not allowed to see the password’s guesses. We leave finding the transformation of this solution to the challenge response model as future work.

It would appear that Robust Fuzzy Extractors (RFE) can be used to overcome this problem in, for example, the following way. First the client produces and (where is a symmetric encryption scheme, e.g., AES), and he sends and to the server, who stores them. When the client invokes the PR routine, then the server sends the relevant to the client. Now, the client can recover , and try to decrypt: . This solution is sound and seems secure. However, in our security model this protocol gives too much information to the adversary impersonating the client, because it allows an offline dictionary attack. Recall that the adversary is computationally bounded by but not . Therefore, the adversary can simply guess bits (notice, that practically always ), and break the protocol. Other solutions based on RFE seem to suffer from the same problem.

4 Local Password Recovery

As explained in the introduction, a client of local password recovery, similarly to [7, 9], keeps the recovery data on his machine (there is no server). The client generates the recovery data and later on, tries to recover the lost password from the password guess and the recovery data. In Figure 2 we present a solution for local PR. Its security is based on the following intractability assumption derived from [14], which is related to the polynomial list reconstruction problem.

The intractability assumption.

Let denote the probability distribution of sets generated in the following way:

  1. Pick a random polynomial over (denote ), of degree at most , such that .

  2. Generate random values subject to the constraint that all are distinct and different from .

  3. Choose a random subset of different indexes in , and set for all . For every set to be a random value in .

  4. Partition the pairs in random subsets subject to the following constraints. Firstly, the subsets are disjoint. Secondly, each subset contains exactly one pair whose index is in (hence ) and exactly pairs whose indexes are not in . We denote these subsets as . Output the resulting subsets.

The intractability assumption states that for any the two probability ensembles , are computationally indistinguishable depending on the parameters , , , and .
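The four sampling steps above can be sketched as follows. The parameter names are assumptions chosen for this illustration: `n` subsets of `m` pairs each, a polynomial of degree at most `k` with constant term `alpha`, and a prime field of size `f`. The sketch also assumes that the excluded x-value in step 2 is zero. Assigning one on-polynomial pair to each subset directly is one way to obtain the distribution described by steps 3 and 4.

```python
import secrets

_rng = secrets.SystemRandom()


def sample_instance(n: int, m: int, k: int, alpha: int, f: int):
    """Sample n disjoint subsets of m (x, y) pairs over the prime field F_f.

    Exactly one pair per subset lies on a hidden random polynomial P of
    degree at most k with P(0) = alpha; all other y values are random.
    Requires f > n * m so that enough distinct nonzero x values exist.
    """
    # Step 1: random polynomial with the prescribed constant term.
    coeffs = [alpha % f] + [secrets.randbelow(f) for _ in range(k)]

    def P(x: int) -> int:
        return sum(c * pow(x, e, f) for e, c in enumerate(coeffs)) % f

    # Step 2: n * m distinct nonzero x values, in random order.
    xs = set()
    while len(xs) < n * m:
        xs.add(secrets.randbelow(f - 1) + 1)
    xs = list(xs)
    _rng.shuffle(xs)

    # Steps 3 and 4: each subset gets one on-polynomial pair and m - 1
    # pairs with random y; the position inside the subset is shuffled.
    subsets = []
    for i in range(n):
        chunk = xs[i * m:(i + 1) * m]
        pairs = [(chunk[0], P(chunk[0]))]
        pairs += [(x, secrets.randbelow(f)) for x in chunk[1:]]
        _rng.shuffle(pairs)
        subsets.append(pairs)
    return subsets
```

The assumption then says that instances sampled for two different values of `alpha` cannot be told apart efficiently for suitable parameters.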

Assumption 4.1 (Assumption 2 from [14])

Let be a security parameter, and let , , , be at least linear polynomially bounded functions that define the parameters , , and . Let and be random variables that are chosen according to the distributions and , respectively. Then it holds that for every , the probability ensembles and are computationally indistinguishable.

PASSWORD REGISTRATION:
The input is , where , and . The client:
1) Generates , and values . Every is a MAC with implicit first parameter as described in Section 2.2.
2) Generates random values in such a way that points define a polynomial of degree , and =.
3) Returns: =; each is a similar MAC to .
PASSWORD RECOVERY:
The input is: , . The client:
1) Computes set .
2) Tries to reconstruct from any subset of elements of (that is checks).
3) Checks whether for any potentially recovered polynomial the following holds (let ): and defines a polynomial of degree .
4) If it holds then he outputs . If it does not hold for any then the client outputs .

Figure 2: Local Password Recovery

In our applications the assumption’s parameters are set as follows: and like in PR, and , where is a large prime. One may argue that , and are relatively small parameters (e.g., is the length of passwords) and that they might not deliver good security to the system. However, notice that in the personal entropy setting (i.e., the question-answer setting) the parameters can be significantly enlarged. Moreover, we are not aware of any algorithm solving the assumption problem (i.e., finding ) in our setting faster than by guessing proper points.

We are conscious that for similar problems there exist fast solutions. For example, if in the above problem all then the problem can be solved fast (see [3, 4]). However, these fast algorithms do not solve the problem from Assumption 4.1, as stated in [14].

The local PR solution.

Now we describe the protocol. In the first step the client prepares PR data: and , such that define a polynomial of degree , for which . Here, are hash functions (see Figure 2). Afterwards, the client forgets the password, and tries to recover it from . If then he obtains in at least proper points belonging to , and can derive the password . Otherwise, informally speaking, the client needs to solve the problem from Assumption 4.1.
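The registration and recovery steps can be sketched as below. This is an illustration under stated assumptions rather than the exact construction of Figure 2: the field is the prime field modulo `Q`, the keyed hashes are instantiated with HMAC-SHA-512 under the per-registration string `v`, passwords are ASCII strings of at most 65 characters encoded as integers, and the polynomial has degree `t - 1`. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from itertools import combinations

Q = 2**521 - 1  # a Mersenne prime, used here as the field size (assumption)


def mac(v: bytes, i: int, ch: str) -> int:
    """Keyed hash number i of one character, mapped into the field."""
    d = hmac.new(v, f"{i}:{ch}".encode(), hashlib.sha512).digest()
    return int.from_bytes(d, "big") % Q


def interpolate(points, x: int) -> int:
    """Evaluate at x the polynomial through the points (Lagrange, mod Q)."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num = den = 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % Q
                den = den * (xj - xm) % Q
        total = (total + yj * num * pow(den, -1, Q)) % Q
    return total


def encode(p: str) -> int:
    return int.from_bytes(p.encode("ascii"), "big")


def decode(value: int, n: int):
    try:
        return value.to_bytes(n, "big").decode("ascii")
    except (OverflowError, UnicodeDecodeError):
        return None  # not a valid n-character password


def register(p: str, t: int):
    """Create recovery data: blinded evaluations of P with P(0) = p."""
    n, v = len(p), secrets.token_bytes(16)
    coeffs = [encode(p)] + [secrets.randbelow(Q) for _ in range(t - 1)]

    def P(x):
        return sum(c * pow(x, e, Q) for e, c in enumerate(coeffs)) % Q

    blinded = [(P(mac(v, i, p[i])) - mac(v, n + i, p[i])) % Q for i in range(n)]
    return v, blinded


def recover(guess: str, t: int, pr):
    """Try every t-subset of unblinded points; verify each candidate."""
    v, blinded = pr
    n = len(blinded)
    if len(guess) != n:
        return None

    def points(word):
        return [(mac(v, i, word[i]), (blinded[i] + mac(v, n + i, word[i])) % Q)
                for i in range(n)]

    for subset in combinations(points(guess), t):
        cand = decode(interpolate(subset, 0), n)
        if cand is None or sum(a == b for a, b in zip(cand, guess)) < t:
            continue
        full = points(cand)
        if interpolate(full[:t], 0) == encode(cand) and all(
                interpolate(full[:t], x) == y for x, y in full[t:]):
            return cand
    return None
```

With a guess that agrees with the password in at least `t` positions, some subset consists only of correct points and the password is returned; otherwise every interpolated candidate fails the verification with overwhelming probability and the result is `None`.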

Theorem 4.2 (Local PR Security)

An adversary attacking PR from Figure 2 first produces two passwords , and sends them to an oracle. Then the oracle chooses , performs password registration for , and sends the result back. Finally, outputs his guess of .

succeeds with some probability . We denote his advantage as . Working in the ROM, no with non-negligible advantage exists under Assumption 4.1.

Proof (sketch)

Assume to the contrary that there exists an adversary , that attacks our local PR with non-negligible advantage. Using , we construct an adversary that breaks Assumption 4.1. Firstly, sends to . forwards them to an intractability oracle (corresponding to Assumption 4.1). This oracle chooses , and answers with subsets sampled from . Now sends to : , and random points in : . defines random oracles (representing and ) in the following way: for all and : . outputs the result of . Notice the importance of the implicit random parameter , which lets random oracles, for two different PR runs, have different outputs (even for the same password).

Because of working in ROM, the distribution of ’s input, created in such a way by for , is identical to the distribution of the client’s input created in password registration (from Figure 2) for . Therefore, ’s advantage is equal to ’s advantage, and Assumption 4.1 is broken. ∎

5 Equivocable Threshold Cryptosystem

In this section we define an equivocable threshold encryption (TE) scheme, and we present a slightly modified threshold ElGamal scheme (based on [17], and the “normal” ElGamal scheme [10]) that is equivocable. Subsequently, in Sections 6 and 7 we use this scheme to solve the PR problem.

In [8] a standard TE scheme consists of the following components. A key generation algorithm takes as input a security parameter , the number of decryption servers , the threshold parameter and randomness; it outputs a public key , a list of private keys, and a list of verification keys. An encryption algorithm takes as input the public key , randomness and a plaintext ; it outputs a ciphertext . A share decryption algorithm takes as input the public key , an index , the private key and a ciphertext ; it outputs a decryption share (called also partial decryption) and a proof of its validity . Finally, a combining algorithm takes as input the public key , a ciphertext , a list of decryption shares, a list of verification keys, and a list of validity proofs. It performs decryption using any subset of of size , for which the corresponding proofs are verified. If there is no such set then fails.

An equivocable TE scheme consists of the same components as above, but: does not produce verification keys, does not produce validity proofs, and validity proofs are not part of ’s input. Therefore, simply checks if a decryption is possible for any subset (that is checks).

A secure equivocable TE scheme should fulfill the standard TE security definition called threshold CPA [8]. Notice, that omitting validity proofs does not help a malicious combiner to decrypt, because he possesses less data than for standard TE. A secure equivocable TE scheme moreover has the following properties. After any number of invocations, a malicious combiner (which does not know any secret shares) gains no information about: (1) the plaintexts in unsuccessful runs (semantic security) and (2) the shares used in unsuccessful runs for producing partial decryptions. We formalize this intuition in Definition 1.

Definition 1 (Equivocable Security)

Define an oracle . Firstly, performs algorithm (for the parameters stated above). Then can be accessed by the following procedures:
; returns: an encryption of , and correct decryption shares .
, where and ; produces an encryption of , and , where if , and (where is a random value) otherwise; returns .
; returns ; every is a random value.

First game (corresponds to property 1):

  1. invokes , and sends a public key to a malicious combiner .

  2. sends a message to the oracle , which returns . This step is repeated as many times as the combiner wishes.

  3. chooses and sends them to the oracle.

  4. chooses , and sends them to , which chooses . Then sends back . This step is repeated as many times as the combiner wishes.

  5. repeats Step 2, and finally, outputs his guess of .

No polynomial time adversary guesses with a non-negligible advantage.

Second game (corresponds to property 2):

  1. invokes , and sends a public key to a malicious combiner .

  2. The same as Step 2 of .

  3. chooses and sends it to the oracle.

  4. chooses , and sends them to , which chooses . Then sends back if , and otherwise. This step is repeated as many times as the combiner wishes.

  5. repeats Step 2, and finally, outputs his guess of .

No polynomial time adversary guesses with a non-negligible advantage.

5.1 ElGamal Equivocable TE Scheme

In this section we introduce our version of the ElGamal scheme and prove that this version is securely equivocable.

Let g denote a finite cyclic (multiplicative) group of prime order for which the Decision Diffie-Hellman (DDH) problem is assumed to be infeasible: given , where either ( means that a value is chosen uniformly at random from a set) or , it is infeasible to decide whether . This implies that the computational Diffie-Hellman problem, which is to compute given , is infeasible as well. In turn, this implies that the Discrete Log problem, which is to compute given , is infeasible. We use the group defined as the subgroup of quadratic residues modulo a prime , where is also a large prime. This group is believed to have the above properties.

In the ElGamal scheme the public key consists of , a generator of , and , while the private key is . For this public key, a message is encrypted as a pair , with . Encryption is multiplicatively homomorphic: given encryptions , of messages , respectively, an encryption of is obtained as . Given the private key , decryption of is performed by calculating .
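The ElGamal operations and the multiplicative homomorphism can be sketched as follows. The group parameters are toy values chosen only so that the example runs quickly, and the names `P_MOD`, `Q_ORD`, `G` and the function names are assumptions of this illustration.

```python
import secrets

# Toy parameters (assumption): P_MOD = 2 * Q_ORD + 1 with both prime; G = 4
# generates the order-Q_ORD subgroup of quadratic residues modulo P_MOD.
P_MOD, Q_ORD, G = 2039, 1019, 4


def keygen():
    """Return (private key x, public key h = g^x)."""
    x = secrets.randbelow(Q_ORD - 1) + 1
    return x, pow(G, x, P_MOD)


def encrypt(h: int, m: int):
    """Encrypt a group element m as the pair (g^r, m * h^r)."""
    r = secrets.randbelow(Q_ORD - 1) + 1
    return pow(G, r, P_MOD), m * pow(h, r, P_MOD) % P_MOD


def decrypt(x: int, ct):
    """Recover m = b / a^x from the ciphertext (a, b)."""
    a, b = ct
    return b * pow(pow(a, x, P_MOD), -1, P_MOD) % P_MOD


def multiply(c1, c2):
    """Componentwise product: an encryption of the product of plaintexts."""
    return c1[0] * c2[0] % P_MOD, c1[1] * c2[1] % P_MOD
```

Multiplying two ciphertexts componentwise and decrypting the result yields the product of the two plaintexts, which is the homomorphic property used later.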

ElGamal semantic security can be defined using the following game. An oracle first sends to an adversary. Then the adversary sends plaintexts to the oracle, which answers, for , with . Finally, the adversary guesses . The scheme is semantically secure if the adversary’s advantage is negligible. The ElGamal scheme achieves semantic security under the DDH assumption.

In this paper we use a -threshold ElGamal cryptosystem based on [17], in which encryptions are computed using a public key , while decryptions are done using a joint protocol between parties. The th party holds a share of the secret key , where the corresponding can be made public. As long as at least parties take part, decryption succeeds, whereas fewer than parties are not able to decrypt.

We set the shares as follows: the dealer makes the polynomial , by picking (for ) and . In the original scheme, the th share is , while in our scheme , and each is made public. The scheme’s security is based on linear secret sharing [18]: points of a polynomial of degree are sufficient to recover the polynomial and fewer points give no knowledge about .

The reconstruction of plaintext can be performed in the following way. For some , it is required to have proper partial decryptions and , which can be combined to compute (for any ):

(1)

Hence, because can be computed, can be decrypted as follows: . Equation 1 describes a polynomial interpolation in the exponent.
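Interpolation in the exponent can be illustrated with the sketch below. It assumes Shamir-style shares taken at random public evaluation points, partial decryptions obtained by raising the first ciphertext component to a share, and Lagrange coefficients evaluated at zero; the toy group and all names are assumptions of this illustration. No validity proofs are produced, in line with the equivocable variant.

```python
import secrets

# Toy parameters (assumption): quadratic residues modulo the safe prime 2039.
P_MOD, Q_ORD, G = 2039, 1019, 4
_rng = secrets.SystemRandom()


def keygen(n: int, t: int):
    """Deal n shares of a random secret key; any t of them can decrypt."""
    coeffs = [secrets.randbelow(Q_ORD) for _ in range(t)]  # f(0) is the key
    xs = _rng.sample(range(1, Q_ORD), n)  # distinct nonzero public points
    shares = [(x, sum(c * pow(x, e, Q_ORD) for e, c in enumerate(coeffs)) % Q_ORD)
              for x in xs]
    return pow(G, coeffs[0], P_MOD), shares


def encrypt(h: int, m: int):
    r = secrets.randbelow(Q_ORD - 1) + 1
    return pow(G, r, P_MOD), m * pow(h, r, P_MOD) % P_MOD


def partial_decrypt(share, a: int):
    """Decryption share a^s for the share (x, s); no validity proof."""
    x, s = share
    return x, pow(a, s, P_MOD)


def combine(parts, ct):
    """Combine t decryption shares by Lagrange interpolation in the exponent."""
    a, b = ct
    a_key = 1  # accumulates a^f(0)
    for j, (xj, dj) in enumerate(parts):
        lam = 1  # Lagrange coefficient of point xj, evaluated at 0
        for m, (xm, _) in enumerate(parts):
            if m != j:
                lam = lam * (-xm) * pow((xj - xm) % Q_ORD, -1, Q_ORD) % Q_ORD
        a_key = a_key * pow(dj, lam, P_MOD) % P_MOD
    return b * pow(a_key, -1, P_MOD) % P_MOD
```

An equivocable combiner in this style would simply call `combine` on candidate subsets of the received shares and check whether a meaningful plaintext results.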

We now show that our TE scheme is equivocable with respect to Definition 1 under the DDH assumption. For simplicity, we assume that the combiner receives only the data from unsuccessful invocations. However, the successful ones can be handled in a similar way to the security proof of [17]. We prove some lemmas, and then based on them we show that our scheme is equivocable.

Lemma 1 (Run Independence)

We define the following game. Firstly, an adversary gets from an oracle a public key , and parameters , . Secondly, the oracle: chooses , prepares a list of shares with secret key , and sends to . Then, chooses two plaintexts and , and sends them to the oracle. Now, repeats as many times as he wishes the following step: chooses any and sends them to an oracle, which returns: , where is chosen by the oracle. Finally, outputs his guess of .

No polynomial adversary guesses with non-negligible advantage under the DDH assumption.

Proof (sketch)

Assume that asks the oracle for partial decryptions at most times (where is polynomial in ). For simplicity, we assume here that and . The proof for greater , , and can be made similarly.

Assume to the contrary that there exists an , that wins the game with a non-negligible advantage . Using we construct an adversary that breaks the ElGamal semantic security. Firstly, receives a public key from a “semantic security” oracle, and forwards it to . also generates and sends them to . Then chooses plaintexts , and sends them to . Subsequently, forwards them to the oracle, which answers with . Now, chooses and . computes, using Equation 1, such that points: define a polynomial of degree . Then chooses , and a random permutation .

Subsequently, asks for partial decryptions. When asks th time (st or nd time) and then answers: . If and then halts and outputs a random bit. Eventually, if then sends to (for ): . Finally, returns ’s output.

Notice that in the case , the probability that (and the attack stops with a random output) is . Assume that it does not happen. Note, that if then ’s input is well constructed and the probability that outputs is . Otherwise, because of the random permutation , ’s input is distributed independently of (even if the adversary asks less than times). Thus, the probability of guessing correctly is in this case. Therefore, the ’s advantage is . ∎

The proof for greater and is easy: can simply produce more data . In the case of , the proof is modified as follows. chooses randomly indexes and the corresponding shares. Then chooses , and constructs the answer to the th question of () as follows. If ( is a random permutation of set ) then, if knows , then answers with . If and does not have corresponding shares then finishes and outputs a random bit. Otherwise (), answers (using Equation 1) with:

Finally, ’s result is returned by .

This construction ensures that ’s input is either well constructed or, because of the permutation , is produced independently of . The probability of not returning a random bit (when ) is , and is non-negligible in . Details of this construction are quite straightforward, and we omit them here.

Lemma 2 (Run Indistinguishability)

We define the following game. Firstly, an adversary gets from an oracle a public key , and parameters , . Secondly, the oracle: chooses , prepares a list of shares with a secret key , and sends to . Now, repeats as many times as he wishes the following step. chooses a set (where each and ) and sends it to the oracle. If then the oracle chooses and answers with: . Otherwise the oracle chooses and answers with: . Finally, outputs his guess of .

No polynomial adversary guesses with non-negligible advantage under the DDH assumption.

The proof sketch of this lemma is in Appendix B.

Corollary 1

We define the following game. Firstly, an oracle: chooses , generates a public key , and a list of random elements (in ): . Secondly, the oracle sends , , and to an adversary . The following action is repeated as many times as wishes: if then the oracle chooses and sends to : . Otherwise the oracle chooses and sends: . Finally, outputs his guess of .

No polynomial adversary that guesses with non-negligible advantage exists under the DDH assumption.

Proof

Follows directly from Lemma 2 for parameters and . ∎

Now, based on Lemmas 1 and 2, we show that our TE scheme is equivocable.

Theorem 5.1 (ElGamal Equivocable TE Scheme)

The ElGamal TE scheme described above in Section 5.1 is equivocable with respect to Definition 1 under the DDH assumption.

Proof

Successful combining invocations can be handled as in the security proof from [17]. For unsuccessful invocations, the theorem follows directly from Lemma 1 for the first game, and from Lemma 2 for the second game. ∎

6 Password Recovery for the Hash based PA System

In this section we present solutions that work for the most widely used PA system. We first present a simple and secure PR scheme that has a functional drawback: the server’s time complexity is too high for many scenarios. We then show a solution that eliminates this drawback.

6.1 Simple PR System for the Hash based PA System

In the simple PR system the server performs all important security actions. During the registration the client sends to the server the login, and the password . The server generates the local PR data, as in Section 4. Later, if the client wants to recover , he sends a perturbed password to the server, who runs the local PR routine (Section 4). If the recovery is successful then is sent to the client; otherwise the request is rejected. The correctness and the security of this protocol follow directly from the corresponding local PR properties.

Notice that the client’s privacy is not protected during protocol runs (the server even learns the result of PR). Furthermore, there are two significant drawbacks: checks on the server side, and we do not foresee any way to transform this protocol to work in the more secure challenge-response model. These problems are solved in Section 6.2.

6.2 Improved PR System for the Hash based PA System

We improve the simple PR scheme by combining the equivocable TE scheme (Section 5) with local PR. In this solution, the client checks whether the password recovery is possible. As a result, the server’s time complexity is low. The improved PR system is presented in Figure 3.

During registration the client first produces a public key of the equivocable TE scheme, with the corresponding secret key and computes an encryption of the password . Subsequently, he generates the PR data: secret values (they have the same meaning as in local PR) and points . All the points together with define the polynomial of degree . This construction is very similar to the local PR registration. The client also produces the login and the hash of the password for the PA system. Then all these data are stored on the server. Intuitively, the server cannot recover more than in local PR, because he stores the local PR data and an encryption of the password under the secret of the local PR data.

If the client forgets the password then he invokes the PR routine by sending the login and a guess . Subsequently, the server produces, using the homomorphic property, a new encryption of . Afterwards, the potential partial decryptions are produced. Notice, that if then () is a proper partial decryption of . Later on, the server sends (so the client can compute ), , and . If , then the client can easily obtain , because he has at least proper decryptions. Otherwise, the client does not have enough correct decryptions to obtain . Moreover, because of the equivocable property of the TE scheme, the client cannot recognize which partial decryptions are correct from the data from many unsuccessful PR runs.
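The re-randomization step mentioned above can be illustrated with plain ElGamal. The following is a minimal sketch only: it assumes a toy group (`P`, `Q`, `G` are illustrative constants, far too small for real use) and uses standard multiplicative ElGamal re-randomization, which may differ in detail from the paper's exact operation. All function names are hypothetical.

```python
import secrets

P = 2039   # toy safe prime, P = 2*Q + 1 (illustrative only, not secure)
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # generator of that subgroup

def keygen():
    """Return (secret key, public key) for toy ElGamal."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def encrypt(pk, m):
    """Encrypt a group element m under pk."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(pk, r, P)) % P

def rerandomize(pk, ct):
    """Multiply by a fresh encryption of 1: same plaintext, new randomness."""
    c1, c2 = ct
    r = secrets.randbelow(Q - 1) + 1
    return (c1 * pow(G, r, P)) % P, (c2 * pow(pk, r, P)) % P

def decrypt(sk, ct):
    """Recover m; c1 has order Q, so c1^(Q - sk) equals c1^(-sk)."""
    c1, c2 = ct
    return (c2 * pow(c1, Q - sk, P)) % P
```

The re-randomized ciphertext decrypts to the same plaintext but is a different ciphertext pair from the stored one, which is the property the server relies on when it answers repeated recovery requests.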

and are implicit parameters for and , respectively, that are used to make different local PR data indistinguishable. is public (it is sent to the client before any authentication), while is not revealed to the client, so he cannot locally compute .

PASSWORD REGISTRATION: The client’s input is: and (); the server’s input is his database. The client chooses and generates a public key of the -TE scheme (Section 5): =. Then he generates shares: of the secret key , where . is a MAC (described in Section 2.2) with implicit parameter . The client computes an encryption of the password : , and produces =; is a MAC with implicit parameter . Then he sends ( is from the PA system). The server stores in his database.

LOG IN: The client sends his , and to the server, which accepts the client if is equal to the corresponding value from the database.

PASSWORD RECOVERY: The client’s input is: and (); the server’s input is his database.
1. The client sends to the server.
2. The server performs:
(a) finds = corresponding to in the database.
(b) re-randomizes , by .
(c) produces potential partial decryptions of : .
(d) sends , , , and the partial decryptions to the client.
3. Using , the client performs a invocation from Section 5. If a decryption matches then the client outputs .

Figure 3: Improved PR for UNIX-based Log In

Correctness and Security.

Correctness of the PR phase is straightforward: if then at least partial decryptions are correct and thus, the client can decrypt . Otherwise, the client does not have enough partial decryptions of .
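The correctness argument can be made concrete with standard threshold ElGamal. The sketch below is an illustration under stated assumptions, not the paper's equivocable scheme: it Shamir-shares the secret key over a toy group (`P`, `Q`, `G` are illustrative constants), and combines exactly `t` partial decryptions by Lagrange interpolation in the exponent. The function names are hypothetical.

```python
import secrets

P, Q, G = 2039, 1019, 4   # toy group: P = 2*Q + 1, G has order Q (not secure)

def share_secret(secret, t, n):
    """Shamir-share `secret` over Z_Q: any t of the n shares determine it."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return {x: sum(c * pow(x, k, Q) for k, c in enumerate(coeffs)) % Q
            for x in range(1, n + 1)}

def lagrange_at_zero(xs):
    """Lagrange coefficients for evaluating the sharing polynomial at 0."""
    lam = {}
    for xi in xs:
        num, den = 1, 1
        for xj in xs:
            if xj != xi:
                num = num * xj % Q
                den = den * (xj - xi) % Q
        lam[xi] = num * pow(den, -1, Q) % Q
    return lam

def partial_decrypt(c1, share):
    """One partial decryption: c1 raised to a single share."""
    return pow(c1, share, P)

def combine(ct, partials):
    """Combine exactly t partial decryptions {x: c1^share_x} into a plaintext."""
    c1, c2 = ct
    lam = lagrange_at_zero(list(partials))
    mask = 1
    for x, d in partials.items():
        mask = mask * pow(d, lam[x], P) % P   # with correct shares: c1^sk
    return c2 * pow(mask, -1, P) % P
```

With `t` correct partial decryptions, `combine` returns the plaintext. If even one of them was computed from a wrong share, the interpolated mask no longer equals `c1^sk` and the output is an unrelated group element, which mirrors the "not enough partial decryptions" case.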

Theorem 6.1 (The privacy of the client)

An adversary attacking the privacy of the client from Figure 3 produces two passwords , and sends them to an oracle. Then the oracle chooses , performs the registration for , and sends the result back. Finally, outputs his guess of .

Working in the ROM, no having non-negligible advantage exists under the DDH assumption and Assumption 4.1.

Proof (sketch)

Assuming that the DDH assumption holds (and thus, ElGamal is semantically secure), can break the scheme only by gaining the secret of the local PR data. Following Theorem 4.2, if the local PR security is broken then Assumption 4.1 does not hold.

Theorem 6.2 (The privacy of the server)

Define an ideal situation to be one, in which an adversary tries PR by sending his guess of the password to the server, who returns if , and the empty string otherwise. Now, define a simulator as an algorithm that works in the ideal situation, and acts as a server to an adversary attacking the privacy of the server.

In ROM and under the DDH assumption, there exists a simulator such that no adversary can distinguish between and the real server (from Figure 3) with non-negligible advantage.

The proof sketch of this theorem is in Appendix C.

Complexity.

During the registration the client sends a public key, two secret values (of length ), the login, the hash of the password, an encryption of the password, and perturbed shares. The complexity of this phase can be bounded by bits. In the PR phase the server sends the public key, an encryption of the password, and potential partial decryptions. This totals bits.

The registration is performed efficiently by the participants. In the PR phase the server’s performance is fast (main load is exponentiations), while the client’s time complexity involves polynomial interpolations (Step 3).

7 Password Recovery for the Challenge-Response System

In this section we present a PR solution for the challenge-response login system, where the password or the guess of the password is never sent to the server. We combine the protocol from Section 6.2 with oblivious transfer (see below). The challenge-response PR protocol is shown in Figure 4.

There are two participants in the OT protocol: Receiver, who wants to obtain some information from a remote database, and Sender, who owns the database. OT can be formalized as follows. During a -party -out-of- OT protocol for -bit strings (), Receiver fetches from the Sender’s database , , so that a computationally bounded Sender does not know which entry Receiver is learning. Moreover, we assume information-theoretic privacy of Sender (it means that Receiver obtains only desired and nothing more). Such a scheme is presented in [13]. This OT protocol has bit communication , low-degree polylogarithmic Receiver computation time, and linear Sender computation time. This is the fastest oblivious transfer protocol to the best of our knowledge.

This system is very similar to the one from Section 6.2. However, the log in routine is different (i.e., the challenge-response one is used), and the PR routine is a bit modified. The client does not send the guess directly to the server. Instead, he obtains partial decryptions corresponding to in an oblivious way, as follows. For each , the server prepares a potential partial decryption for all possible letters (Step 3). Then the client asks for partial decryptions for guess by performing oblivious transfer times: for every letter separately. In this way, the server does not gain information about , and the client cannot ask for more than one partial decryption per OT protocol. The protocol’s security follows from the security of OT and the security properties of the scheme from Section 6.2.
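The data flow of the per-letter tables can be sketched as follows. This is a rough illustration under loud assumptions: shares are assumed to be masked additively by a keyed hash of the letter (the paper's exact formula is not reproduced here), the alphabet and group constants are illustrative, and `mock_ot` is a plain lookup that stands in for the OT protocol of [13] and provides no privacy at all. All names are hypothetical.

```python
import hashlib
import hmac

P, Q, G = 2039, 1019, 4               # toy group (not secure)
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def letter_mask(key, position, letter):
    """Hypothetical keyed hash of (position, letter), reduced into Z_Q."""
    tag = hmac.new(key, f"{position}:{letter}".encode(), hashlib.sha256).digest()
    return int.from_bytes(tag, "big") % Q

def mask_shares(key, shares, password):
    """Registration: hide share i behind the i-th password letter (assumed form)."""
    return [(s + letter_mask(key, i, ch)) % Q
            for i, (s, ch) in enumerate(zip(shares, password))]

def candidate_table(key, c1, masked_share, position):
    """Server: one potential partial decryption per alphabet letter."""
    return [pow(c1, (masked_share - letter_mask(key, position, a)) % Q, P)
            for a in ALPHABET]

def mock_ot(table, index):
    """Stand-in for 1-out-of-n OT; a plain lookup like this hides nothing."""
    return table[index]
```

For every position where the guessed letter equals the registered one, the mask cancels and the selected table entry is `c1` raised to the true share. The server's cost is one exponentiation per letter per position, which is the linear Sender-side work discussed in the complexity analysis.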

PASSWORD REG.: as in Fig. 3, but instead of , values are sent.

LOGGING IN: as in the challenge-response PA system (Figure 1).

PASSWORD RECOVERY: The client’s input is: and ; ; the server’s input is the database.
1. The client sends to the server.
2. The server, using , finds = in the database. Then he re-randomizes : and sends , , .
3. For , the client and the server perform protocol, where = and is a partial decryption’s bit size. The server acts as Sender with the database: and the client acts as Receiver with index . The client’s output is .
4. The same as Step 3 in PR from Figure 3.

Figure 4: challenge-response PR

7.1 Correctness and Security

We give an informal intuition about the theorems and the proofs. The proofs of correctness and of the client’s privacy outside the protocol runs are the same as for the system from Figure 3. The proof of the privacy of the server is the same as the one for PR from Figure 3, assuming that the OT is secure. The privacy of the client during PR runs is maintained by using OT (the server cannot gain any information about the client’s guess ).

7.2 Complexity

Only the PR phase is significantly different from the system from Figure 3. The major payload comes from runs of protocols. This can be bounded by bits. The bit complexity of this PR, although greater than the one from Figure 3, is still efficient.

In the PR protocol the time complexity of the client is relatively high and follows from polynomial interpolations. The main drawback of this protocol is the time complexity of the server, who acts as Sender in OT, using operations. However, for the relatively small domain of letters , and due to the fact that PR is performed rarely, this solution is still quite feasible. This drawback might be of greater impact if we use this protocol in the personal entropy setting (i.e., the question-answer setting), where might be larger.

8 Conclusions

In this paper we have presented secure and efficient solutions for password recovery, where the recovery data is stored securely at the server side. Our solutions apply to all common types of password authentication systems, without significantly lowering their security. We have introduced a variant of threshold encryption, called equivocable, that serves as a building block to our solutions, and that may be of independent interest as well.

Further research could be aimed at alternative definitions of password similarity that also include reordering of password letters (which is a common mistake). Other issues that can be improved are the time complexity at the client side, and the server’s time complexity in the challenge-response protocol (Section 7).

Appendix A Simple Substring-Knowledge Password Recovery in the Challenge-Response Setting

In this appendix we present a simple and efficient substring-knowledge challenge-response PR scheme that uses an additively homomorphic encryption scheme. To recover a password, the client must prove to the server that he remembers a substring of the original password.

Let denote a homomorphic encryption function with a public key . The homomorphic cryptosystem supports the following two operations, which can be performed without knowledge of the private key. Firstly, given the encryptions