Usability of Humanly Computable Passwords

Abstract

Reusing passwords across multiple websites is a common practice that compromises security. Recently, Blum and Vempala have proposed password strategies to help people calculate, in their heads, passwords for different sites without dependence on third-party tools or external devices. Thus far, the security and efficiency of these “mental algorithms” has been analyzed only theoretically. But are such methods usable? We present the first usability study of humanly computable password strategies, involving a learning phase (to learn a password strategy), then a rehearsal phase (to login to a few websites), and multiple follow-up tests. In our user study, with training, participants were able to calculate a deterministic eight-character password for an arbitrary new website in under 20 seconds.

mindhash, password strategy, humanly computable
The Web Conference (WWW), April 2018

1 Introduction

For over fifty years, passwords have served as the most common method of human-computer authentication and are likely to do so for the foreseeable future Bonneau et al. (2015). Extensive research shows that many passwords in use can be easily guessed (Mazurek et al., 2013) and that people reuse passwords across different accounts (Bonneau, 2012). Password reuse, though rampant in practice Das et al. (2014), leaves accounts vulnerable to a single breach: for example, a malware attack on an unsecured website can lead to attacks on more important accounts if it enables an attacker to guess usernames and passwords. In order to generate secure passwords, users have to create and remember complex strings, which often results in their forgetting passwords after a certain period Weiss and De Luca (2008). What makes this process even more tedious is that users are often forced to change their passwords. Unfortunately, the number of unique and secure passwords that users can comfortably memorize is very limited Florencio and Herley (2007). To overcome this limitation, most users tend to choose simpler passwords, or to choose one strong password and reuse it across multiple websites. These approaches have resulted in many password breaches over the past few years (BBC, 2017; Forbes, 2014; Adobe, 2013; LinkedIn, 2012; Zappos, 2012).

In an attempt to ameliorate these difficulties, recent work has introduced mental password management schemas Blocki et al. (2015, 2013); Blum and Vempala (2015); Blocki et al. (2014) that enable users to systematically and securely generate and remember passwords for their different accounts. These schemas model passwords as mathematical functions from challenges (e.g., website names) to responses (character string passwords), and design such functions that can be computed by humans. Some of these schemas require paper or digital assistance Blocki et al. (2014), but we focus on those schemas that can be computed in one’s mind without any additional resources.

For brevity, we use the term mindhash to refer to any such password creation schema that enables a user to mentally compute a different password for each challenge without external memory or computational aid, i.e., without paper or a smartphone Blum and Vempala (2015). Mindhashes require learning, memorization of a secret key, and execution when logging in to an account. In return for this effort, users enjoy security in the form of provable resilience to a small number of breaches. Blum and Vempala (2015) introduced several simple mindhashes with varying complexity, memory, and execution requirements, accompanied by varying security guarantees. We evaluate one of these schemas, which requires memorizing only three words. The schemas are resilient to multiple breaches in the sense that even knowing multiple different challenge-password pairs, an adversary is unlikely to be able to guess one’s password to a different challenge. Moreover, these schemas are self-rehearsing Blocki et al. (2013) in the sense that the process of typing passwords on different websites naturally reinforces the user’s memory of the secret key.

Mindhashes may appear to be an appealing solution to the problem of remembering different passwords. Whether such methods are truly usable for most humans is an intriguing open question. Would human users (beyond mathematicians) find these methods pleasant? Would they be willing to adopt them? We address these questions through training and usability studies which are designed to teach mindhashes and measure the effectiveness of the proposed mindhashes. The amount of human computation required in executing these schemas was analyzed in precise models of human mental effort Blum and Vempala (2015), but these formal models have yet to be tested in human experiments.

Several factors are important in the usability of such a system, including the amount of time that is required for learning and practice; memorizing the secret key; and using the mindhash to generate a password. It was suggested in Blum and Vempala (2015) that a password strategy is humanly usable if “any initial long-term memorization should take at most 1 hour, preferably less than 20 minutes; future rehearsals should take at most a total of 1 hour over the user’s lifetime. Generation of a 10-character password should take at most 30 seconds, preferably less than 20 seconds.” In this work, we compare the usability of two mindhashes: a random-letter hash (a simplification of other schemas from Blum and Vempala (2015)) and a 3-word hash (called LP2 in Blum and Vempala (2015)). Following Blum and Vempala, we also define security of a password by: (i) given no prior information, how difficult it would be for an adversary to guess any generated password, and (ii) given that an Internet hacker has access to a few passwords that are generated using a specific password strategy, how difficult it would be to guess a new password generated with the same password strategy.

In our empirical user study, we taught participants how to use a mindhash using videos (less than 5 minutes) that explain the concept and the problem being solved and teach them how to compute the mindhash function in general. We then helped them to choose and memorize their secret key and had them practice using the mindhash on artificial website names. We later performed follow-up experiments simulating logins over the next month to evaluate how quickly and accurately participants could use their mindhashes on these and further artificial website names.

We find that for the random-letter hash (3-word hash), the teaching phase involved 5.2 (4) minutes of videos, a median of 8 (4.7) minutes to choose and memorize a secret key, and a median of 6 (7.8) minutes to practice the mindhash on 15 logins. On these and 24 other logins performed over the next month, the median time to enter a password was 2.9 (3.2) seconds per character, and the average success rate of typing the correct password within the first three tries was 98% (91%). Hence, there seems to be a tradeoff between learning and execution: the 3-word hash is faster to learn and its secret key is faster to memorize, while the random-letter hash is faster to execute and gives higher accuracy.

The target audience of mindhashes is, potentially, anyone who seeks a secure way to remember different passwords across many different accounts. The participants in our usability study were US-based crowd workers on Amazon’s Mechanical Turk crowdsourcing platform, which has been shown to source a diverse set of users Stewart et al. (2015); Buhrmester et al. (2011) and often produce results similar to those of more traditional approaches Bentley et al. (2017). Nonetheless, such users have a certain minimum age and demonstrated the ability to learn to perform tasks (we filtered for 98% task approval rating), which may differ from other groups of people using multiple accounts. More specifically, our participants reported being between 21 and 55 years old, with the gender distribution of 40% female and 60% male.

Hence, mindhashes may be a viable alternative to the common password management approaches of password reuse or writing passwords down. Another approach to the password memorization problem is to use third-party password management software. Password vaults have become popular over the past few years, as they require the user to remember only one master password and then automatically fill in login pages with strong (randomly generated) passwords. Unfortunately, this results in a single point of failure, which has caused security breaches (Gasti and Rasmussen, 2012). Popular password vaults have been vulnerable to security attacks in recent years LastPass (2015); OneLogin (2017). Moreover, the user must have the vault installed on every device that she uses, making it difficult to use on shared devices such as a library computer or a friend’s phone.

The rest of this paper is organized as follows. In Section 2, we define the random-letter and 3-word hashes. In Section 3, we describe the usability study in detail including the precise instructions given to the users. Then we present the results of the user study in Section 4. In Section 5, we recall the human computation model of Blum and Vempala (2015) and use it to analyze the usability and security of the random-letter and 3-word hashes. We discuss limitations of our study and in general mental password management schemas in Section 6 and present conclusions and future work in Section 7.

2 Mindhashes

Here we describe two mindhash functions and approaches to choose and memorize their secret keys. We will use these mindhashes in our empirical and theoretical analysis. Both mindhashes consist of a map from letters to letters. To generate a password, this character map is applied to the challenge (website name) left-to-right, and a special character string is appended that meets various password-composition policies. For example, if the website name is six characters and the special string is three characters, then the password will be nine characters, consisting of the application of the character map to each of the six characters of the website name followed by the three-character special string. Blum and Vempala give more sophisticated mindhashes that have stronger security guarantees, but for the purposes of this study we restrict our attention to this character map type that still offers significantly more security than reusing a small number of passwords. Note that for actual use, small modifications are necessary for special cases such as non-alphabetical characters or very short domain names, as discussed in Section 6.

2.1 3-word hash

For this mindhash, the secret key consists of three user-selected words that in total contain at least 15 different letters of the alphabet, a random letter (which we will refer to as a wild card), and a special character string consisting of an uppercase letter, a digit, and a non-alphanumeric character. The three words chosen by the user are concatenated into one single string, called the 3-word string. For example, one secret key might be:

3-word string             wild card   special string
adjust flight computer    x           B7!

In this example, the 3-word string contains the 17 distinct letters a, c, d, e, f, g, h, i, j, l, m, o, p, r, s, t, u.

The character map takes any letter l of the alphabet to the first consonant that appears after the first occurrence of l in the 3-word string. If the letter l is not present in the 3-word string, it is mapped to the wild card. If l is the last consonant of the 3-word string, the map wraps around to the first consonant of the string. Consonants are chosen because they offer greater entropy and hence greater security than vowels, which are more common and hence easier to guess.

Here is a specific example of how to apply the 3-word mindhash. Suppose that you want to login to amazon.com. The challenge is the word amazon.

  • Start with a (first letter of amazon) and find the first occurrence of a in adjust flight computer. Output the consonant that appears after a, which is d.

  • The next letter is m and it appears in computer. The consonant after it is p. Output p.

  • a is repeated, output d again.

  • The next letter is z and it does not appear in the word string. Output the wild card x.

  • The next letter is o and it appears in computer. Output m.

  • The last letter is n and it does not appear in the word string, so we output the wild card x.

  • Append the special string B7!

The following table gives a few examples of websites and their corresponding passwords.

challenge password
amazon dpdxmxB7!
facebook ldmrxmmxB7!
fidelity lgjrggfxB7!
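
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the letter y is treated as a consonant, and challenge characters that do not occur in the 3-word string (including non-letters) are simply sent to the wild card.

```python
VOWELS = set("aeiou")  # assumption: y counts as a consonant


def three_word_map(letter, word_string, wild_card):
    """Map one letter to the first consonant after its first occurrence."""
    pos = word_string.find(letter)
    if pos == -1:
        return wild_card  # letter is absent from the 3-word string
    n = len(word_string)
    # Scan forward from the position after the letter, wrapping around once.
    for step in range(1, n + 1):
        candidate = word_string[(pos + step) % n]
        if candidate not in VOWELS:
            return candidate
    return wild_card  # only reachable if the string contains no consonant


def three_word_hash(challenge, words, wild_card, special):
    """Apply the character map left to right, then append the special string."""
    word_string = "".join(words).lower()
    body = "".join(
        three_word_map(ch, word_string, wild_card) for ch in challenge.lower()
    )
    return body + special


if __name__ == "__main__":
    key = (["adjust", "flight", "computer"], "x", "B7!")
    for site in ("amazon", "facebook", "fidelity"):
        print(site, three_word_hash(site, *key))
```

Run with the example secret key, the sketch reproduces the three passwords in the table above.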

2.2 Random-letter hash

Like the 3-word strategy, the random-letter hash is defined by a letter-to-consonant map and a special character string. Concretely, the user is aided in picking a random letter-to-consonant map for the first 20 letters of the alphabet (since the last six letters uvwxyz are infrequent) and a special 3-character string to meet password-composition policy requirements. If the challenge contains a letter from uvwxyz, the user skips that letter without any output. Alternatively, one could map each of these letters to a wild card, but since these letters appear only rarely, this is not necessary and not considered here.

For example, consider the map

a b c d e f g h i j k l m n o p q r s t
q f h c g b s k l m n p j r d t n w x y

and the special string 8*A. Suppose that you want to login to netflix.com. The challenge is the string netflix.

  • Start with the first letter of the challenge n and find its mapping in the above table, r. Output r.

  • The next letter is e, output g.

  • The next letter is t, output y.

  • The next letter is f, output b.

  • The next letter is l, output p.

  • The next letter is i, output l.

  • The last letter is x. It is not present in the table, so we skip it.

  • Append the special string 8*A.

Similarly, we have the following passwords:

challenge password
netflix rgybpl8*A
facebook bqhgfddn8*A
fidelity blcgply8*A
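
The same computation can be sketched in Python. This is only an illustration: the names `LETTER_MAP` and `random_letter_hash` are hypothetical, and the map is the example map from this section rather than one any user should adopt.

```python
# Example map from this section: first 20 letters of the alphabet -> consonants.
LETTER_MAP = dict(zip("abcdefghijklmnopqrst", "qfhcgbsklmnpjrdtnwxy"))


def random_letter_hash(challenge, letter_map, special):
    """Map each letter of the challenge, skipping letters outside the map."""
    body = "".join(
        letter_map[ch] for ch in challenge.lower() if ch in letter_map
    )
    return body + special


if __name__ == "__main__":
    for site in ("netflix", "facebook", "fidelity"):
        print(site, random_letter_hash(site, LETTER_MAP, "8*A"))
```

Letters outside the map (here u, v, w, x, y, z) produce no output, so the password for netflix has six mapped characters followed by the special string.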

2.3 Memorizing mindhashes

3-word hash. The user memorizes the string of three words (order of words matters), a wild card and a special string.

Random-letter hash. The user memorizes the letter map using a method we call memorization with the help of words. The idea is the following. The user looks at each letter pair, e.g., (a, q), and types the first word that comes to her mind that starts with the first letter and has the target letter as its next consonant, e.g., aqua. She does the same for all letter pairs. For example:

a q aqua
b f beef
c h chef
d c duck
e g

Note that the mnemonics do not, in fact, have to be English words, but can be any memorable strings. Once the words are written for all pairs, the user only needs to memorize the (first letter, word) associations.

a aqua
b beef
c chef
d duck
e

This part should be rehearsed with repetition, i.e., rote memorization. Once the (letter, word) associations are memorized, the user can directly use them to recover the letter hash.
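
As a small illustration of the rule a mnemonic must satisfy, the following Python sketch checks whether a candidate string starts with the source letter and has the target letter as its next consonant. The helper names are hypothetical, and y is assumed to count as a consonant.

```python
VOWELS = set("aeiou")  # assumption: y counts as a consonant


def next_consonant(word):
    """Return the first consonant after the word's first letter, or None."""
    for ch in word.lower()[1:]:
        if ch.isalpha() and ch not in VOWELS:
            return ch
    return None


def mnemonic_fits(source, target, word):
    """True if word starts with source and its next consonant is target."""
    return word.lower().startswith(source) and next_consonant(word) == target


if __name__ == "__main__":
    pairs = [("a", "q", "aqua"), ("b", "f", "beef"),
             ("c", "h", "chef"), ("d", "c", "duck")]
    for source, target, word in pairs:
        print(source, target, word, mnemonic_fits(source, target, word))
```

Recovering the map from a memorized word is then just `next_consonant(word)`, which is the step the user performs mentally at login time.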

3 Human Usability Study Design

In this section, we describe the details of our usability study. Participants were randomly divided into two groups, with half of the participants assigned to the 3-word hash and the other half to the random-letter hash. The reader can access and try all our surveys at the following link: https://github.com/PasswordUsability/Surveys.

3.1 Qualification

Users were first filtered: they had to pass a qualification test to be able to participate in our study. The qualification first presented a paragraph about password security and a description of the study, followed by a few simple multiple-choice questions. The qualification tested that participants were paying attention and understood the need for having different passwords for different websites.

3.2 Learning the 3-word hash

We showed the users a short video teaching them how to generate a password with a 3-word mindhash. After the video, to make sure they understood the idea, we asked them to generate passwords for one website using the same secret key (word sequence, wild card, and special string) that was used in the video tutorial. Users were provided with the secret key, multiple attempts, and hints to aid in learning. By the end of this phase, participants had learned how to generate passwords using a 3-word mindhash. After this phase, we asked the users to choose their own three-word sequence, wild card letter, and special string. Users were allowed to proceed only if their word sequence contained at least 15 different letters and their special string contained an uppercase letter, a number, and a special character. As participants typed their three words, we showed an alphabet letter bar with the used letters crossed out, together with the number of used letters. This was to simplify the process of choosing words.
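
The acceptance rule for a user-chosen key can be sketched as follows. This is a hypothetical reconstruction of the check described above, not the code of the survey interface; the function names and the `min_letters` parameter are illustrative.

```python
import string


def distinct_letters(words):
    """Set of distinct alphabet letters used across the chosen words."""
    return {ch for ch in "".join(words).lower() if ch in string.ascii_lowercase}


def unused_letters(words):
    """Letters not yet used, as the letter bar would leave uncrossed."""
    return sorted(set(string.ascii_lowercase) - distinct_letters(words))


def valid_special(special):
    """Require an uppercase letter, a digit, and a non-alphanumeric character."""
    has_upper = any(ch.isupper() for ch in special)
    has_digit = any(ch.isdigit() for ch in special)
    has_symbol = any(not ch.isalnum() for ch in special)
    return has_upper and has_digit and has_symbol


def valid_three_word_key(words, special, min_letters=15):
    """True if the key satisfies the constraints stated in this section."""
    return (
        len(words) == 3
        and len(distinct_letters(words)) >= min_letters
        and valid_special(special)
    )


if __name__ == "__main__":
    words = ["adjust", "flight", "computer"]
    print(len(distinct_letters(words)), unused_letters(words))
    print(valid_three_word_key(words, "B7!"))
```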

3.3 Learning the random-letter hash

We showed the users a short video teaching them how to generate a password using a random-letter hash. After the video, we displayed the letter map and the special string used in the video and asked them to generate two passwords. At this point, the user did not need to memorize a character map or special string, but had to practice generating passwords using such a map. Next, we provided users with an interface that allowed them to choose a random letter map – a random consonant for each of the first 20 letters of the alphabet. In the next step, we showed them a simple illustrative video explaining our memorization technique, as described in Section 2.3.
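
For illustration, drawing such a map could look like the Python sketch below. It is an assumption-laden sketch rather than the study interface: consonants are drawn independently and uniformly (so two letters may share a consonant, as in the example map of Section 2.2), y is counted as a consonant, and the standard-library `secrets` module is used as the randomness source.

```python
import secrets
import string

VOWELS = "aeiou"
# Assumption: every non-vowel letter, including y, is an allowed output.
CONSONANTS = [c for c in string.ascii_lowercase if c not in VOWELS]
DOMAIN = string.ascii_lowercase[:20]  # the letters a through t


def random_letter_map():
    """Pick an independent random consonant for each of the first 20 letters."""
    return {letter: secrets.choice(CONSONANTS) for letter in DOMAIN}


if __name__ == "__main__":
    letter_map = random_letter_map()
    print(" ".join(DOMAIN))
    print(" ".join(letter_map[letter] for letter in DOMAIN))
```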

Then we asked the users to repeat the letter pairs and the corresponding words for themselves. Although such memorization might be done more quickly by speaking aloud, we asked the users to type the letter pairs and words to ensure compliance. To further solidify memorization of the character map, we gave three further exercises:

  • Showing the letter pairs and asking the users to type the words.

  • Showing only the left letter and asking the user to first type the word and then the random letter.

  • The same as the second exercise, but this time showing the first letters in a different order, e.g., “b, d, c, e, a” for the table in Section 2.3.

3.4 Practice

Immediately after the learning phase, participants were presented with 15 artificial website names to try to log in, one at a time. For each website, they were asked to type the password using the mindhash that they had learned. Two hint buttons were provided on this page. One showed text instructions on how to generate the password using the mindhash, and the other one displayed the user’s secret key (letter map, the special string, and, in case of 3-word hash, the wildcard). Participants had three tries to type each password. If they failed to type the password within three tries in the practice phase, they were presented with the correct response.

3.5 Feedback after learning

Users were asked to give us their feedback on different aspects of the study. We asked the users if the task was fun/boring and easy/hard on a seven point bipolar rating scale. We asked the users if password generation became easier toward the end using a five point scale. Finally, we asked the users whether they would like to participate in our follow-ups and if they have any other feedback. The details of the feedback form are shown in Figure 6.

3.6 Follow-up evaluations

After learning and practice (day 0), we performed six follow-up evaluations of the user’s ability to log in using their passwords, over a period of one month. The first follow-up was performed the next day (day 1), the second follow-up was again a day later (day 2), and the remaining four follow-ups were at day 4, day 8, day 16 and the final follow-up during days 32-35. The last follow-up was scheduled during a holiday period and thus we allowed the users to fill it out anytime during a 3 day interval. At each follow-up survey, the user was asked to generate passwords for 4 challenges. Participants had three attempts to type the password, and then the correct password was shown.

Studies show that users manage on average 25 password-protected accounts Florencio and Herley (2007). Some of these accounts are used frequently (e.g., a work account) and some are used occasionally. We consider 25 synthetic website names chosen as random common words: {kite, pillow, atlantic, bundle, reverse, family, quebec, cough, subject, mug, spike, fishing, jumper, knob, chord, quiz, fixed, world, campaign, warm, navy, banquet, hazy, chef, twist}. We assume that the first 15 names are frequent accounts and the last 10 are occasional or newly opened accounts. To reflect this, we asked the user to type the passwords for all the frequent accounts at the end of the learning phase. The challenges in the follow-up evaluations were drawn at random, with fixed probabilities, from the frequent accounts and from the infrequent accounts, to reflect the use of passwords for both logging in to frequent accounts and infrequent or one-time accounts.

The 1/2/4/8/16/32-day timing follows a doubling schedule Pimsleur (1967), which has been shown to be an effective repetition spacing in the practice of learning (Wozniak and Gorzelanczyk, 1994).
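
As a one-line illustration, the doubling schedule can be generated as follows; the function name and its parameters are hypothetical, and the sketch does not model the extended window allowed for the final follow-up.

```python
def doubling_schedule(first_day=1, count=6):
    """Follow-up days where each gap is twice the previous one."""
    return [first_day * 2 ** i for i in range(count)]


if __name__ == "__main__":
    print(doubling_schedule())
```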

In addition to these sequential follow-ups, we ran a quantitative follow-up survey on day 4 of the study. In this survey, users were asked to provide a self-recall of the secret key that they memorized.

3.7 Hints and writing down passwords

Since the study was performed online, one concern is that our results would be tainted by users writing down their secret keys and consulting this written record without our knowledge during the experiment. To avoid this, participants were given constant access, throughout the study, to two hint buttons, one reminding them of the instructions and the other reminding them of their secret key. Participants were told that there was no penalty or cost (other than that of pressing the button) for using these hints. Participants used the hint buttons liberally; indeed, some participants pressed a hint button for each login.

The hint buttons also captured the fact that some users might carry a “cheat sheet” (e.g., on a note card) to consult while they commit their secret keys to memory. The use of a written record may not constitute a serious security problem Cheswick (2013), and this argument is of course only stronger if the written record is only consulted during the first few days of learning the secret keys.

4 Results

In this section, we present the results of our user study. The participants in our usability study were US-based crowd workers on Amazon’s Mechanical Turk crowdsourcing platform with at least a 98% task approval rating. Our participants reported being between 21 and 55 years old, with a gender distribution of 40% female and 60% male. For the random-letter hash, 32 users participated in the training phase and 12 finished the last follow-up. For the 3-word hash, 34 users participated in the training phase and 14 finished the last follow-up. Table 1 shows the number of participants that completed each of the follow-ups.

mindhash / survey day    0    1    2    4    8    16   32-35
Random-letter            32   27   27   24   14   14   12
3-word                   34   28   25   25   18   16   14
Table 1: Number of people who participated in the original study (day 0) and follow-up surveys during the one-month study.

4.1 Learning phase

Time                       Random-letter hash   3-word hash
Learning + Memorization    8 + 13 min           11 + 0 min
Password Generation        19 sec               25 sec
Table 2: For the random-letter (3-word) mindhash, learning time includes watching a 2 (4) minute tutorial video, choosing a personal secret key, and practicing the mindhash on a few passwords. Memorization time includes watching a 3.5 (0) minute video describing the memorization technique and using it to memorize the secret key. Password Generation time is calculated for typing a password of length 8.

Learning times are reported in Table 2. The median times were 8 minutes to learn the random-letter hash and 11 minutes to learn the 3-word hash. The learning time for the 3-word hash was longer due to the longer training video (4-minute video versus 2-minute video). The memorization step for the 3-word hash was negligible. For the random-letter hash, the memorization time was 13 minutes, including a 3.5-minute video tutorial (see Section 3.3 for the details of memorization).

The median time that participants spent generating each character of the password decreased over time (Figure 1), with an average of 2.3 seconds per character for the random-letter hash and 3 seconds per character for the 3-word hash (averaged over the last 5 logins). This corresponds to a password generation time of 19 seconds for the random-letter hash and 25 seconds for the 3-word hash for a password of length 8 (Table 2).

Figure 1: Median time participants spent per character of the generated password, measured for the passwords that users typed during the practice phase.
Figure 2: Average number of tries to type the correct password during the practice phase.

For both groups, the accuracy of typing the correct password during the practice phase was high: for each login, at least 96% of the users typed correct passwords within three attempts for the random-letter hash. For the 3-word hash, the accuracy of typing the correct password within the first three attempts is 82%. This high accuracy is achieved while the average number of tries also decreases over time (Figure 2).

Figure 1 and Figure 2 show that, for both the 3-word and random-letter mindhashes, the speed of generating each character of the password and the accuracy of typing the password increase as the number of logins increases. This improvement is an indicator that these mindhashes are self-rehearsing.

4.2 Follow ups

Figure 3 shows the median time that participants spent generating each character of the password on each day. From prior work on memorization and self-rehearsing passwords, we hypothesized that the password generation time would decrease over time as the secret key and the password process become established in long-term memory. Indeed, for both mindhashes, although the gaps between follow-ups doubled each time, password generation time remained low (less than 3.5 seconds per character during the last follow-up). This is evidence that users could still type passwords reasonably quickly even for websites that are visited rarely. The error bars indicate the standard deviation of the medians, across users, for all logins during the day.

Figure 3: Median time (seconds per character) that participants spent generating each character of the password each day.

Figure 4 shows the average number of tries participants needed to log in successfully. Our results show that, although the gaps between logins double over time, the accuracy of typing the correct password remains high for both mindhashes (fewer than 1.3 attempts on average are needed to log in successfully). This is consistent with the self-rehearsing property of the password schemes.

Figure 4: Average number of tries for participants to login successfully for each day. Error bars represent one standard error.

As participants type their passwords over time (in the follow-ups), we expect the secret key to be self-rehearsed and therefore users to click on the secret key hint button less frequently. Figure 5 shows the fraction of users that did not use the secret key hint button at all and the fraction that used it for at most one login.

Figure 5: Fraction of participants that did not click on the hint button during logins and the fraction of participants that used the hint button for maximum one login.

Table 3 shows these results, specifically for the last follow-up. It can be seen that, for 3-word hash 42% of the participants are successfully typing passwords without the help of the hint buttons. For random-letter hash, the number is smaller, 25%, but still comprises a meaningful fraction of users. Note that users were told that there was no penalty for using the hints, hence the actual use of such aids in practice would be expected to be lower.

No. click(s)   Random-letter hash   3-word hash
0 click        25%                  42%
1 click        33%                  78%
Table 3: Fraction of participants that clicked on the hint button for the secret key during the last follow-up, at the end of the one-month study. The first row shows the fraction of participants that did not click on the hint button (0 click), and the second row shows the fraction of participants that used the hint button for at most one login.

At the 4th day quantitative follow-up, participants were asked to type a free recall of their secret key. For the 3-word mindhash, of the participants perfectly remembered the 3 words and of the users remembered at least 2 words. For random-letter mindhash of the users remembered at least 18 letters out of 20 (90% of what they memorized), and of the users remembered at least 12 out of 20 letters (60% of what they memorized).

At the end of the follow-up after a month, participants were asked if they had adopted the mindhash for generating passwords for managing their own personal passwords. For random-letter hash, 25% of the participants reported that they have adopted the mindhash in “real life.” For the 3-word hash, 42% of the participants reported that they used the mindhash for generating their own personal passwords. Although such statistics are known to be greatly inflated, the comparison between the two schemes may be of interest.

4.3 Feedback

At the end of the training phase, we asked the users to fill out the feedback form shown in Figure 6, and we further received free-text feedback throughout the one month study. We present the summary in this section.

For both mindhashes, participants reported that the effort of generating passwords decreased over time. Participants in both studies reported that they found the task of generating passwords using mindhashes neither easy nor hard, with the random-letter hash being slightly easier. Participants in the random-letter hash study found the task slightly fun. This was not quite the case for the 3-word hash, where users reported that the task was neither boring nor fun. For both studies, 6% of the users reported that they wrote down information.

Overall, participants found the random-letter hash study more fun and interesting, perhaps partly because they were surprised that they could memorize such a letter map: “This actually worked” or “Worked surprisingly well”. Over the one-month period, users became more comfortable generating passwords using our methods and reported that password generation felt more natural over time. Some typical anecdotal feedback that users provided during the follow-ups included:

  • 3-word hash: “It’s getting easier.” or “It’s definitely getting easier. I still have to open my word list but my brain is adapting and I’m starting to know what each letter should translate to without looking sometimes.”

  • Random-letter hash: “I have definitely warmed up to the program. It feels more natural now than the last time.” or “I think I’ve finally got a handle on this password combination! Well, minus the one mistake.”

5 Theoretical analysis

Usability of a mindhash has two main aspects: learning time and password generation time. In this section, we discuss the rigorous model from Blum and Vempala (2015) for the password generation time.

Password generation time is the time that the user spends on outputting her passwords. Password generation is done entirely in the human’s head with no paper, writing instrument, or computing device. It can be viewed as a restricted streaming computation. The working memory (Jonides et al., 2005) is very small, typically at most one or two pointers and two characters (which might typically be letters or digits). Each elementary operation (retrieve a sequence from long-term memory, follow a pointer, add two digits mod 10) has a cost, which is the total number of write operations to the working memory. For example, retrieving a pointer to a sequence in long-term memory has cost , following the sequence has cost , adding two digits has cost or depending on the number of digits created. A human algorithm can thus be assigned a total cost, by adding up the cost of each step. This is the human complexity of the algorithm (called Human Usability Measure or HUM in (Blum and Vempala, 2015)). It is meant as a complexity measure for human computation analogous to the standard runtime complexity analysis of Turing machines.

The HUM measures the human effort required to execute human algorithms. Just as machines running the same algorithm can take different times, humans also have variability in speed. For human computation, asymptotic complexity is too coarse, and the leading constants are important.

Password Generation Phase Given a challenge , start with the first letter . Output the random letter of . If , skip to the next letter. Shift the pointer to the next letter and do similarly till you reach the end of the challenge. Append the special string to the end of your password.

To illustrate this measure, we now compute the HUM for the two mindhashes that we have used in this paper.

5.1 HUM of 3-word hash

Let -- be the sequence of words with total length and the special string. Let be the letter-to-letter map defined by the 3-word hash. The cost of applying is initially higher (to scan the words and find the next consonant) but eventually becomes . We use this in Algorithm 1 below.

  • Input: Challenge

  • Retrieve challenge . Pointer . Cost =

  • While not end of :

    • Let be the current character.

    • Output Cost =

    • Shift pointer to next character. Cost =

  • Retrieve fixed string . Pointer . Cost =

  • While not end of :

    • Output current character. Cost =

    • Shift pointer to next character. Cost =

Algorithm 1 3-Word hash

The HUM is .
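As a rough illustration of how such a total is assembled, the following Python sketch tallies the cost of the two loops in Algorithm 1. The `StepCosts` class, the `hum` function, and the unit cost values are all assumptions made for this sketch; in particular, the unit costs are placeholders rather than the values assigned by Blum and Vempala (2015).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepCosts:
    """Per-step costs, counted as writes to working memory.

    All default values are placeholders chosen for illustration only.
    """
    retrieve: int = 1   # retrieve a pointer to a sequence in long-term memory
    apply_map: int = 1  # output the mapped letter for the current character
    output: int = 1     # output the current character of the special string
    shift: int = 1      # shift the pointer to the next character


def hum(challenge_len: int, special_len: int, costs: StepCosts = StepCosts()) -> int:
    """Add up the cost of every step of the two-loop procedure."""
    # First loop: one map application and one pointer shift per challenge character.
    first_loop = challenge_len * (costs.apply_map + costs.shift)
    # Second loop: one output and one pointer shift per special-string character.
    second_loop = special_len * (costs.output + costs.shift)
    # Each loop is preceded by one retrieval (the challenge, then the special string).
    return costs.retrieve + first_loop + costs.retrieve + second_loop


# Four-character challenge, three-character special string, placeholder unit costs.
print(hum(4, 3))  # 16
```

Because the total is linear in the challenge length and the special-string length, the leading constants (the per-step costs) dominate the estimate, which is why they are exposed as parameters here.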

5.2 HUM of random-letter hash

Let be a random map from the first 20 letters of the alphabet to consonants and a special string, both chosen by the user. Algorithm 2 describes the random-letter hash, from which we compute its HUM.

  • Input: Challenge

  • Retrieve challenge . Pointer . Cost =

  • While not end of :

    • Let be the current character

    • Output Cost =

    • Shift pointer to next character. Cost =

  • Retrieve fixed string . Pointer . Cost =

  • While not end of :

    • Output current character. Cost =

    • Shift pointer to next character. Cost =

Algorithm 2 random-letter hash

The HUM is .
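The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the names `make_letter_map` and `random_letter_hash` are hypothetical, the map is drawn over all 26 lowercase letters for simplicity, the 21 consonants are taken to include y, and characters without a mapping (digits, dots) are simply skipped.

```python
import secrets
import string

# Assumption: the 21 consonants of the English alphabet, counting y.
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def make_letter_map(letters: str = string.ascii_lowercase) -> dict[str, str]:
    """Draw one consonant uniformly at random for each letter (the secret key)."""
    return {ch: secrets.choice(CONSONANTS) for ch in letters}


def random_letter_hash(challenge: str, letter_map: dict[str, str], special: str) -> str:
    """Map each challenge character in order, then append the special string."""
    # Assumption: characters that have no entry in the map are skipped.
    mapped = [letter_map[ch] for ch in challenge.lower() if ch in letter_map]
    return "".join(mapped) + special


# Hypothetical usage: the special string "k7!" is an invented example.
secret_map = make_letter_map()
password = random_letter_hash("amaz", secret_map, "k7!")
print(len(password))  # 7
```

The same secret map always yields the same password for a given challenge, which is what makes the strategy deterministic for the user while the map itself stays secret.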

5.3 Security

Password strategies should be secure against a computationally all-powerful adversary observing (challenge, response) pairs and trying to impersonate the human.

We use the following two security parameters, identified in earlier work Blum and Vempala (2015).

  • It should be hard for the adversary to guess any password of the user. This is the intuition behind the definition of the security parameter . Given a password strategy and a positive integer , we say that if for any single challenge , the probability that an adversary can guess the correct response to is at most .

  • Assume that an internet hacker has found your password to a couple of insecure websites, and is trying to login to your bank account. She might not have the full information to precisely guess your bank account password, but she will have partial information that narrows her predictions to choices. As a result, if your bank website allows her to try multiple guesses, she can successfully login to your account. How many tries will she need? How many passwords will she need to see in the clear? This is the motivation behind the definition of the security parameter . Given a password strategy  and , is defined as the number of random (challenge, response) pairs that an adversary must observe in order to be able to respond correctly to the next challenge with probability greater than . The dictionary of challenges must be specified (e.g., English words, random strings, the top 400 most popular website names, etc.).

It is important to note that there is an inherent tradeoff between security and usability of any password strategy. Generating more secure passwords requires the user to memorize more information.

We discuss the security of the random-letter hash in detail.

. Given a challenge , the adversary can respond correctly to only if she can correctly guess the mappings for all the letters of and the special string . Each random letter has been chosen uniformly at random from the set of 21 consonants. The user’s special string consists of one letter, one number and one special string, all chosen uniformly at random as well (see footnote 2). Therefore, the probability that the adversary could guess the correct password is

Assuming that an average password has length 7 (a challenge of four characters; see footnote 3), this gives us .
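The arithmetic behind such an estimate can be sketched as below. This is only an illustration under explicit assumptions: the function name `guess_probability` is hypothetical, the three parts of the special string are treated as independent and uniform, the letter is drawn from 26 letters and the number from 10 digits, and the size of the special-character set (32 in the example) is an assumed value that the text does not specify.

```python
from fractions import Fraction


def guess_probability(challenge_len: int, n_special_chars: int,
                      n_consonants: int = 21, n_letters: int = 26,
                      n_digits: int = 10) -> Fraction:
    """Chance of guessing every mapped letter and the special string in one try."""
    # Each mapped letter is an independent uniform choice among the consonants.
    mapped_part = Fraction(1, n_consonants) ** challenge_len
    # Assumption: one letter, one digit, one special character, each uniform.
    special_part = Fraction(1, n_letters * n_digits * n_special_chars)
    return mapped_part * special_part


# Four mapped letters plus a three-character special string (password length 7).
# The value 32 for the special-character set size is an assumption.
p = guess_probability(challenge_len=4, n_special_chars=32)
print(p.denominator)  # 1618081920
```

Exact fractions are used so that the result is not affected by floating-point rounding; a different assumed special-character set only changes the last factor.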

. The adversary can respond correctly to a challenge only if she has seen all letters in the challenge in the previous challenges. If even one letter of the challenge is unseen, the chance of guessing the correct response to the challenge is . What is the expected number of (challenge, response) pairs that the adversary must see to have complete knowledge of the random letters of all letters of a random new challenge? This value depends on the dictionary of challenges. Based on the computations in Blum and Vempala (2015), for the top 500 domain names, this value is equal to . Therefore, for any ,

In practice, most secure websites block the user’s account after 3 to 5 incorrect password attempts. This is equivalent to , and thus the above security parameter value is meaningful. Also note that once the adversary sees one response, she already knows the special string . Therefore does not add to the value of .
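The dependence of this expectation on the dictionary of challenges can be explored with a small Monte Carlo sketch. It is not the computation from Blum and Vempala (2015): the function names are hypothetical, the example dictionary is invented, challenges are sampled uniformly with replacement, and the adversary is assumed to succeed only once every letter of the new challenge has appeared in an observed pair.

```python
import random


def pairs_until_covered(challenges: list[str], rng: random.Random) -> int:
    """Count observed pairs until all letters of a fresh random challenge are seen."""
    target = set(rng.choice(challenges))  # letters of the challenge to be attacked
    seen: set[str] = set()                # letters revealed by observed pairs
    count = 0
    while not target <= seen:
        # Each observed (challenge, response) pair reveals the mapping of its letters.
        seen |= set(rng.choice(challenges))
        count += 1
    return count


def estimate_pairs(challenges: list[str], trials: int = 10_000, seed: int = 0) -> float:
    """Average the count over many simulated adversaries."""
    rng = random.Random(seed)
    return sum(pairs_until_covered(challenges, rng) for _ in range(trials)) / trials


# Hypothetical dictionary of four-letter challenges, for illustration only.
example_dictionary = ["amaz", "goog", "face", "yaho", "twit", "link", "redd", "netf"]
print(estimate_pairs(example_dictionary))
```

With a larger or more varied dictionary the estimate grows, since a new challenge is more likely to contain a letter that no observed pair has revealed yet.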

The security of the 3-word hash is lower since the total entropy generated by choosing 3 random words is smaller. The parameter is estimated as between and in Blum and Vempala (2015).

6 Limitations

We have shown that mindhashes are secure and human-usable solutions for choosing passwords for many users. In this section, we discuss the limitations of mindhashes and of our study.

  1. Password policies. Many websites have policies with differing password requirements involving password length or special characters. In our study, users were instructed to append a fixed “special string” to all of their passwords in order to meet such requirements. A recent survey finds that it is often possible to choose a single such string that simultaneously satisfies the requirements of different websites Seitz et al. (2017). However, in some special cases, a website may have requirements that the special string does not meet. In general, this is considered a challenging problem for other password generation approaches as well Furnell (2007).

  2. Short or irregular challenges. Some website names may be very short or contain non-alphabetic characters, such as 53.com for the Fifth Third Bank. While not measured in our study, it would be natural for users to choose a memorable, sufficiently long challenge for these websites, such as the string fifththird for the Fifth Third Bank. Note that different users may choose different challenges, but this does not cause any problems as long as each user is consistent with using the same challenge. Further study is necessary to see how common this problem is and how easy it is for users to recall their challenges.

  3. Infrequently used accounts. Our study does evaluate the ability to correctly generate passwords for numerous new challenges, which is similar to generating a password for a rarely visited website. The 3-word hash has a natural self-rehearsing property so that using it frequently reinforces the memory of the entire secret key, and hence generating passwords for rarely used challenges is straightforward. However, for the random-letter hash, infrequently used letters pose a greater problem. For example, a user may forget her mapping for the letter q if it is never used.

  4. Password sharing. Sharing passwords across different accounts is a challenging problem that is not addressed by mindhashes. Although mindhashes do not offer any solution for sharing passwords across different accounts, if a user chooses to share a password, security is not entirely compromised.

  5. Changing passwords. Certain systems may require passwords to be changed periodically. This is a password management problem that is not studied in our work and is not directly addressed by mindhashes. A solution suggested in Blum and Vempala (2015) is to append a digit that indicates which letter of a challenge the user should start with when generating a password. The human usability of such approaches can be studied as part of future work.

  6. Entropy of passwords. Mindhashes assume that the secret key is chosen randomly. For example, in the random-letter strategy, we assume that the user memorizes a random letter-to-letter map. Although we suggest a simple interface for users that allows them to build such a random map, it could still be possible that in practice users may choose predictable secret keys (e.g., for the letter a some letters may be more commonly chosen, such as p for apple, or a person named Alice may be more likely to choose l). This would reduce the entropy and advantage an adversary that is attempting to guess one’s secret key.

  7. Multiple accounts on the same website. Some users may have multiple accounts on one website. In this case, they may use the same password across accounts.

  8. Dropouts and hints. Approximately 60% of participants in both conditions dropped out during the course of this study. Our statistics should be interpreted as representative of the 40% of participants who completed the study. While we could have provided additional incentives in the form of completion/milestone bonuses to increase completion rates, we felt that there was value in observing the natural completion rate at a static pay rate. As discussed, participants had the opportunity to press a hint button to see their secret keys without any discouragement or adverse effect on their payment. In the last follow-up, 25% (42%) of the participants using the random-letter (3-word) mindhash did not use hints even once. Taken together, if one considers mindhashes “usable” for such participants, this gives a lower bound of 9% (18%) on the usability rate. This is a lower bound because it is likely that some users did not complete the study for various personal reasons aside from usability, and that some users clicked the hint buttons even when they would have found the system usable without hints. As mentioned, we provided the hint button to dissuade users from secretly recording their secret keys in a way that we could not monitor.

7 Conclusion

This paper presents the first user study of two different mindhashes (i.e., password strategies): the 3-word hash and the random-letter hash. Participants in our user study spent a median of 11 minutes learning the 3-word hash and 8+13=21 minutes learning the random-letter hash.

After the learning phase, the user is ready to use these mindhashes, and it takes 19-25 seconds to generate a password. As predicted by the self-rehearsing property of mindhashes, the time to generate a password decreases over time. We showed that, although there are increasing gaps between rehearsals with no practice, users remembered their memorized codes/words and were able to successfully login to arbitrary websites. It was encouraging that some users seemed interested in adopting these methods to manage their own passwords. Therefore, a natural research question is to identify mindhashes with even better usability and security.

Acknowledgement. This work was supported in part by Microsoft Research and NSF awards CCF-1563838 and CCF-1717349. We would like to thank Manuel Blum, Vivek Sarkar, and Rosa Arriaga for helpful discussions on the topics of this paper.

Figure 6: Feedback form following learning the mindhash.

Footnotes

  1. Participants were paid to complete the qualification, () to complete the day 0 training for random-letter (3-word) mindhash, for sequential follow-ups, and for day 4 quantitative follow-up.
  2. This assumption is based on the distribution of special strings reported by 40 users.
  3. For longer challenges, the user can use only the first 4 characters of the challenge.

References

  1. Adobe. 2013. Important Customer Security Announcement. (2013). http://blogs.adobe.com/conversations/2013/10/important-customer-security-announcement.html
  2. BBC. 2017. Equifax to be investigated by FCA over data breach. (2017). http://www.bbc.com/news/technology-41737241
  3. F. Bentley, N. Daskalova, and B. White. 2017. Comparing the Reliability of Amazon Mechanical Turk and Survey Monkey to Traditional Market Research Surveys. In CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 1092–1099.
  4. J. Blocki, M. Blum, and A. Datta. 2013. Naturally Rehearsing Passwords. In Advances in Cryptology - ASIACRYPT - International Conference on the Theory and Application of Cryptology and Information Security. https://doi.org/10.1007/978-3-642-42045-0_19
  5. J. Blocki, M. Blum, A. Datta, and S. Vempala. 2014. Towards Human Computable Passwords. arXiv preprint arXiv:1404.0024 (2014).
  6. J. Blocki, S. Komanduri, L. Cranor, and A. Datta. 2015. Spaced Repetition and Mnemonics Enable Recall of Multiple Strong Passwords. In Annual Network and Distributed System Security Symposium, NDSS. http://www.internetsociety.org/doc/spaced-repetition-and-mnemonics-enable-recall-multiple-strong-passwords
  7. M. Blum and S. Vempala. 2015. Publishable Humanly Usable Secure Password Creation Schemas. In AAAI Conference on Human Computation and Crowdsourcing, HCOMP. 32–41. http://www.aaai.org/ocs/index.php/HCOMP/HCOMP15/paper/view/11587
  8. J. Bonneau. 2012. The Science of Guessing: Analyzing an Anonymized Corpus of 70 Million Passwords. In Security and Privacy (SP). IEEE, 538–552.
  9. J. Bonneau, C. Herley, P. Van Oorschot, and F. Stajano. 2015. Passwords and the Evolution of Imperfect Authentication. Commun. ACM 58, 7 (2015), 78–87.
  10. M. Buhrmester, T. Kwang, and S. Gosling. 2011. Amazon’s Mechanical Turk: A New Source of Inexpensive, yet High-Quality, Data? Perspectives on Psychological Science 6, 1 (2011), 3–5.
  11. W. Cheswick. 2013. Rethinking Passwords. Commun. ACM 56, 2 (2013), 40–44.
  12. A. Das, J. Bonneau, M. Caesar, N. Borisov, and X. Wang. 2014. The Tangled Web of Password Reuse. In NDSS, Vol. 14. 23–26.
  13. D. Florencio and C. Herley. 2007. A Large-Scale Study of Web Password Habits. In international conference on World Wide Web. ACM, 657–666.
  14. Forbes. 2014. Ebay Suffers Massive Security Breach, all Users Must change their passwords. (2014). http://www.forbes.com/sites/gordonkelly/2014/05/21/ebay-suffers-massive-security-breach-all-users-must-their-change-passwords/
  15. S. Furnell. 2007. An Assessment of Website Password Practices. Computers & Security 26, 7 (2007), 445–451.
  16. P. Gasti and K. Rasmussen. 2012. On the Security of Password Manager Database Formats. In European Symposium on Research in Computer Security. Springer, 770–787.
  17. J. Jonides, S. Lacey, and D. Nee. 2005. Processes of Working Memory in Mind and Brain. Current Directions in Psychological Science 14, 1 (2005), 2–5.
  18. LastPass. 2015. LastPass Security Notice. (2015). https://blog.lastpass.com/2015/06/lastpass-security-notice.html/
  19. LinkedIn. 2012. An Update on LinkedIn Member Passwords Compromised. (2012). http://blog.linkedin.com/2012/06/06/linkedin-member-passwords-compromised/
  20. M. Mazurek, S. Komanduri, T. Vidas, L. Bauer, N. Christin, L. Cranor, P. Kelley, R. Shay, and B. Ur. 2013. Measuring password guessability for an entire university. In ACM SIGSAC Conference on Computer & Communications Security. ACM, 173–186.
  21. OneLogin. 2017. Security Incident. (2017). https://www.onelogin.com/blog/may-31-2017-security-incident
  22. P. Pimsleur. 1967. A Memory Schedule. The Modern Language Journal 51, 2 (1967), 73–75.
  23. T. Seitz, M. Hartmann, J. Pfab, and S. Souque. 2017. Do Differences in Password Policies Prevent Password Reuse? In CHI Conference on Human Factors in Computing Systems, Extended Abstracts. 2056–2063. https://doi.org/10.1145/3027063.3053100
  24. N. Stewart, C. Ungemach, A. Harris, D. Bartels, B. Newell, G. Paolacci, and J. Chandler. 2015. The Average Laboratory Samples a Population of 7,300 Amazon Mechanical Turk Workers. Judgment and Decision making 10, 5 (2015), 479.
  25. R. Weiss and A. De Luca. 2008. PassShapes: Utilizing Stroke Based Authentication to Increase Password Memorability. In Nordic Conference on Human-Computer Interaction: Building Bridges. ACM, 383–392.
  26. P. A. Wozniak and E. J. Gorzelanczyk. 1994. Optimization of Repetition Spacing in the Practice of Learning. Acta Neurobiologiae Experimentalis 54 (1994), 59–62.
  27. Zappos. 2012. Zappos Customer Accounts Breached. (2012). http://www.usatoday.com/tech/news/story/2012-01-16/mark-smith-zappos-breach-tips/52593484/1