The Tangled Web of Password Reuse

《The Tangled Web of Password Reuse》论文阅读记录

摘要

首次研究了攻击者如何利用来自一个站点的已知密码，来更好的猜测该用户在其他站点的密码
数据集为来自11个网站的几十万个泄露的密码。对密码重用进行了用户调查
确定了一些用户通常用来在站点之间转换基本密码的简单技巧，攻击者可以使用这些技巧来大大简化密码猜测
开发了第一个跨站点密码猜测算法，该算法能够在100次尝试内猜出30%的转换密码，而标准密码猜测算法（JTR）在没有跨站点密码知识的情况下只能猜出14%

介绍

用户倾向于选择容易记忆的简单密码【20、33】
在线服务经常使用密码组合策略或密码表来帮助用户了解密码的强度
典型的互联网用户估计有25个不同的在线帐户【10、11、32】，传统的组合策略或密码表无法防止密码重用，因为这些工具只能看到单个站点的密码
用户通常使用简单的技巧来处理不同网站上的密码策略，例如对常用密码短语进行调整，这大大提高攻击者在其他站点猜测密码的能力
贡献：
- 基于迄今为止，为此目的收集的最大数据集，对同一用户在不同网站帐户的直接密码重用率进行了经验估计（43%）
- 广泛分析同一用户在不同在线账户中密码不同的相似性，揭示了用户如何修改密码以便跨不同站点重用
- 进行了一项调查，以了解用户在不同在线账户上构建密码的行为
- 提出跨站点猜测算法，并评估了算法在离线和在线攻击场景下的可行性

数据集的来源各不相同，但大多数是 SQL 注入攻击或网络钓鱼活动的结果
只使用同时带有邮件地址和密码的泄漏数据集，得到6077个用户，每个用户至少有两个从不同网站泄露的密码
使用 John the Ripper 工具包反转了尽可能多的散列，以更好地比较密码字符串
剩余的散列，通过搜索一个最大的在线数据库——有87亿个 MD5 散列预映像【7】——破解，最终只有12个散列不能破解

不同站点的密码组合策略

手动调查所有泄露密码网站的密码组成策略
下面描述了来自【3】的更广泛的网站密码组合策略，并从各个类别前25个站点随机选择一组受欢迎的网站，手动确定密码组成策略
- Char5：至少5个字符
- Char6：至少6个字符
- Char8：至少8个字符
- Char6LU：至少6个字符，包括一个小写字母和一个大写字母
- Char8DSU：至少8个字符，包括一个数字或一个符号或一个大写字母
- Char8LDS：至少8个字符，包含至少一个字母和至少一个数字或符号
- Char6D：至少6个字符，包括至少一个字母和数字
- Char6-12：6-12个字符
- Strong1：
  - 7-32个字符，包含至少一个字母和数字
  - 无特殊字符
  - 和用户ID不同
  - 不能和使用过的最后五个密码中的任何一个相同
- Strong2：
  - 8-20个字符，包含至少一个字母和数字
  - 无空格，区分大小写
  - 和用户ID不同
  - 无指定的特殊符号
- Strong3：
  - 8-20个字符，包含至少一个字母和数字
  - 可有指定的特殊符号
  - 和用户ID不同
  - 无空格，不区分大小写

密码相似性

分析密码数据集，测量不同密码之间的相似性
9个字符串的相似性度量：
- Distance-like functions：基于距离
  - Manhattan【36】, Cosine【27】
  - 每个字符视作是一个维度，其频率是维度值
  - 每个字符串映射到多维空间中的一个点
  - 计算点之间的距离
- Edit-distance like functions：基于编辑距离
  - Levenshtein【38】，Damerau Levenshtein【28】
  - 确定将一个字符串转换为另一个字符串所需的编辑操作（如插入、删除、替换、换位）的数量
- Token-based distance functions：基于标记
  - Dice【30】, Overlap【37】
  - 将字符串分割成更小的标记，之后计算相似性
- Alignment-like functions：基于对齐
  - Smith Waterman【45】，Neddleman Wunsch【40】，最大公共子序列（LCS）
  - 反映一对字符串之间最大对齐或子序列之间相似性分数
调查同一个网站不同用户密码的相似度：
- 随机抽取1000个密码对（下图 a），计算相似性得分 CDF——同一网站内的用户使用几乎明显不同的密码
- 从随机选择的用户中随机抽取了1000对密码（下图 b）——不同的人选择的密码有很大的不同

调查给定用户在不同网站上选择密码的相似度
- （下图 a）几乎40%的密码具有在[0.9，1.0]范围内的相似性分数
- （下图 b）密码不相同的相似性分数，几乎30%的非相同密码具有在[0.8，1.0]范围内的相似性分数——用户在不同网站上使用的密码之间存在不小的相似性
密码分作三组
- 相同：在不同的web帐户上使用相同的密码
- 子字符串：来自不同web帐户的密码是彼此的子字符串
- 其他：密码要么完全不同，要么遵循更复杂的转换规则
- 最频繁的插入/删除操作发生在字符串的末尾

调查

对不同大学的用户进行匿名调查，收到224份答复，结果如下图：
77%的参与者会修改或重用现有的密码
某些转换规则更为普遍。最常见的选择是在结尾或开头添加数字、字母大写和在结尾或开头添加符号
用户修改他们的密码，以满足不同网站实施的密码限制；倾向于对现有密码进行小的修改来满足这些限制
大多数参与者在密码字符串的末尾插入数字，在开头插入大写字母
65%的参与者维护的密码类型不超过四种。大多数参与者（61%）记住了他们的密码

猜测算法

背景：给定一个泄露的密码，攻击者试图猜测同一用户其他在线帐户的密码

主要的转换规则

在40%的泄漏密码中得到转换规则，并在剩余60%的数据上测试了猜测算法
不考虑额外的转换——添加一些随机的额外字符或在结尾添加表情符号
表中转换规则按概率递减的顺序排列

猜测算法设计

算法生成一组有序的候选密码
操作由几个阶段组成，每个阶段之后，检查是否已经猜到了所需的目标密码。在每一步之后，都会返回原始密码
- 字符序列特征分析：
  - 在输入密码中查找已知的模式序列，如键盘上的相邻键、字母顺序模式和顺序交替键（如输入密码时按住shift或control）——使用这种顺序模式的用户很可能在其他站点的密码中使用类似的模式
  - 将密码拆分为令牌，其中每个令牌表示属于某种模式的一系列字符，如“qwerasdf”将拆分为两个部分“qwer”和“asdf”
  - 尝试不同的令牌排列作为候选密码
  - 若不成功：
    - 扩展令牌以包含序列中的下一个字符（qwert）
    - 用属于同一类别的类似大小的令牌替换令牌（1234）
    - 分析RockYou密码列表，以提高正确替换的可能性。文章末尾附上了常见的替换
    - 考虑反向顺序
- 删除：姐妹密码长度大于6，因为大多数网站要求密码长度大于6
  - 尝试迭代删除属于以下集合{Digit，Symbol，Uppercase letter，Lowercase letter}的字符——人们通常会添加属于这些不同类别的字符，以实现不同的密码组合策略
  - 若不成功：尝试依次从字符串前面、后面、然后两者的组合中删除字符
- 插入：姐妹密码长度小于10，因为RockYou数据集中90％的密码长度小于或等于10
  - 前面或结尾插入数字或符号是最常见的选择
  - 尝试在前端插入数字和符号
  - 首先判断姐妹密码中字符类型，之后插入除这些类型之外的单个字符——从RockYou密码列表中计算出各个字符的插入概率
  - 从RockYou列表中计算前20个可能的二元插入和三元插入
- 大写：
  - 首先尝试将密码中的所有字母大写
  - 若失败
    - 尝试将字符串前面的字母大写
    - 尝试将字符串后面的字母大写
    - 两者组合
- Leet：
  - 尝试流行的Leet转换（o->0、a->@、s->$、i->1、e->3、t->7）
  - 尝试所有可能的Leet转换
- 子字符串移动：
  - 将输入密码拆分为子字符串，其中定界字符属于集合{Digit，Symbol，Uppercase letter}
  - 如“xyz@123”拆作xyz @ 123，则@为定界符，依次尝试子串的所有可能置换，则123@xyz作为结果
- 子词修改：
  - 根据常用英文单词拆分输入密码，然后将每个单词的第一个字母大写
  - 尝试按不同的顺序重新排列单词
对1000个随机非相同密码对进行采样，决定规则的排序
猜测算法本质上对输入密码应用固定顺序的转换，而不是尝试所有可能的转换

算法评估

根据破解目标密码所需的猜测次数来评估猜测算法的性能

对比算法
- RockYou guesser：
  - 先建立一个单词列表以及可能的前缀后缀插入
    - 对于给定的泄露密码，首先在词典【9】（Moby Words II）寻找这个单词，确定可能的前缀和后缀
    - 在其他密码中查看，是否给定密码为子串，确定前后缀
  - 由此，得到一个可能的前缀后缀插入的映射
- Edit Distance (ED) guesser：
  - 使用【49】的基于 ED 的猜测器
  - 以从 RockYou 列表中计算出的概率递减顺序插入字符，在特定位置应用编辑转换
  - 编辑距离最大为 2，否则转换数目过大
- JTR：
  - 将数据集中每对密码中的一个附加到一个列表中
  - 生成剩余密码的 MD5 哈希，对它们运行 JTR
  - 使用默认的57个规则集，对密码列表进行转换，查看匹配率
专注于评估不相同的密码对
- 重试次数限制在10次，猜测算法可以破解几乎30%的互为子串的密码对
- 猜测算法在100次尝试内破解80%的互为子串的密码对
- ED Guesser 需要数百万次尝试来破解最多65%的子串密码对
猜测算法更适合于在线攻击场景
研究无法破解的密码的相似性与破解了的密码的相似性：
- 破解：
  - 大多数密码对相似性得分都在[0.8，1]的范围内
- 未破解：
  - 大多数密码对的相似性得分在[0.1,0.5]范围内
猜测算法能够成功地预测相似的密码，并且随后的回报率会迅速下降

讨论

有效性

大多数调查参与者比普通人群年轻一些，受教育程度更高
所有的密码数据集都代表真实的密码数据
调查结果与泄露的密码数据集进行了比较，发现了类似的观察结果

局限性

研究仅涉及基于文本的密码
猜测算法相对简单，没有利用可能提高性能的更先进的模型——更复杂的方法高度依赖于培训数据
主要目标是表明，即使使用一组相当简单的规则也能显著提高攻击性能
没有足够的数据来分析相似的网站类别是否影响了密码重用的选择

对策

单点登录技术
双因素身份验证——密码与手机验证码
此密码猜测算法可作为测试工具，用户可以在不同的站点输入密码，并在本地评估跨站点猜测的难度
开发一个在线服务，以进行跨站点密码安全度量。密码构造页面可以将用户输入的密码与站点可合法获取的用户密码进行比较

结论

43%的用户直接在站点之间重复使用密码
许多用户跨站点对其密码进行小的修改，并且许多用户的修改过程相似
猜测算法能够在不到10次尝试中破解大约10%的非同一密码对，并且在少于100次尝试中破解大约30%的非同一密码对

参考文献

[1] 55K Twitter Passwords Leaked. http://www.newser.com/story/145750/55k-twitter-passwords-leaked.html.

[2] 6.46 million LinkedIn passwords leaked online. http://www.zdnet.com/blog/btl/6-46-million-linkedin-passwords-leaked-online/79290.

[3] Alexa. http://www.alexa.com.

[4] John the ripper password cracking toolkit. http://www.openwall.com/john/.

[5] Leet Transformations. http://qntm.org/l33t.

[6] LinkedIn confirms password leak, encourages users to update passwords. http://www.cbsnews.com/8301-501465-162-57448443-501465/linkedin-confirms-password-leak-encourages-users-to-update-passwords/.

[7] MD5 Decryptor. http://www.md5decrypter.co.uk/.

[8] Millions of LinkedIn passwords reportedly leaked online. http://news.cnet.com/8301-1009_3-57448079-83/millions-of-linkedin-passwords-reportedly-leaked-online/.

[9] Moby Words II. http://icon.shef.ac.uk/Moby/.

[10] No wonder hackers have it easy: Most of us now have 26 different online accounts - but only five passwords. http://www.dailymail.co.uk/sciencetech/article-2174274/No-wonder-hackers-easy-Most-26-different-online-accounts--passwords.html.

[11] Online Passwords: Keep it Complicated. http://www.guardian.co.uk/technology/2012/oct/05/online-security-passwords-tricks-hacking.

[12] Over 55,000 Twitter passwords exposed, posted online. http://news.yahoo.com/blogs/technology-blog/over-55-000-twitter-passwords-exposed-posted-online-165625320.html.

[13] RockYou leak. http://techcrunch.com/2009/12/14/rockyou-hack-security-myspace-facebook-passwords/.

[14] RockYou Password Analysis. http://thepasswordproject.com/rockyou_passpal_0.4_dump.

[15] Survey Questions. http://web.engr.illinois.edu/∼das17/passwordreuse.html.

[16] Thousands of Twitter passwords exposed. http://news.cnet.com/8301-1009_3-57430475-83/thousands-of-twitter-passwords-exposed/.

[17] Yahoo hacked, 450,000 passwords posted online. http://www.cnn.com/2012/07/12/tech/web/yahoo-users-hacked.

[18] Yahoo V oice hack leaks 450,000 passwords. http://www.guardian.co.uk/technology/2012/jul/12/yahoo-voice-hack-attack-passwords-stolen.

[19] Yahoo’s password leak. http://news.cnet.com/8301-1009_3-57471178-83/yahoos-password-leak-what-you-need-to-know-faq/.

[20] A. Adams and M. A. Sasse, “Users Are Not The Enemy,” Commun. ACM, vol. 42, no. 12, pp. 40–46, 1999.

[21] J. Bonneau, “The science of guessing: analyzing an anonymized corpus of 70 million passwords,” in Proceedings of the 33rd IEEE Symposium on Security and Privacy, ser. SP ’12, May 2012.

[22] J. Bonneau, C. Herley, P . C. v. Oorschot, and F. Stajano, “The Quest to Replace Passwords: A Framework for Comparative Evaluation of Web Authentication Schemes,” in Proceedings of the 2012 IEEE Symposium on Security and Privacy, ser. SP ’12, 2012, pp. 553–567.

[23] J. Bonneau, M. Just, and G. Matthews, “What’s in a Name? Evaluating Statistical Attacks on Personal Knowledge Questions,” in Proceedings of the 14th International conference on Financial Cryptography and Data Security, ser. FC’10, 2010, pp. 98–113.

[24] J. Bonneau and S. Preibusch, “The password thicket: technical and market failures in human authentication on the web,” in Proceedings of the 9th Workshop on the Economics of Information Security, June 2016 [Online]. Available: http://www.jbonneau.com/doc/BP10-WEIS-password_thicket.pdf

[25] W. E. Burr, D. F. Dodson, and W. T. Polk. (2006) Electronic Authentication Guideline. [Online]. Available: http://csrc.nist.gov/publications/nistpubs/800-63/SP800-63V1_0_2.pdf

[26] S. Chiasson, P . C. van Oorschot, and R. Biddle, “A Usability Study and Critique of Two Password Managers,” in Proceedings of the 15th conference on USENIX Security Symposium, ser. USENIX-SS’06, 2006.

[27] W. W. Cohen, P . Ravikumar, and S. E. Fienberg, “A Comparison of String Distance Metrics for Name-Matching Tasks,” in Proceedings of IJCAI-03 Workshop on Information Integration, 2003, pp. 73–78.

[28] F. J. Damerau, “A Technique for Computer Detection and Correction of Spelling Errors,” Commun. ACM, vol. 7, no. 3, pp. 171–176, Mar. 1964.

[29] M. Dell’Amico, P . Michiardi, and Y . Roudier, “Password Strength: An Empirical Analysis,” in Proceedings of the 29th conference onInformation Communications, ser. INFOCOM’10, 2010, pp. 983–991.

[30] L. R. Dice, “Measures of the Amount of Ecologic Association Between Species,” Ecology, vol. 26, no. 3, pp. 297–302, Jul. 1945.

[31] S. Egelman, J. Bonneau, S. Chiasson, D. Dittrich, and S. Schechter, “It’s Not Stealing If Y ou Need It: A Panel on the Ethics of Performing Research Using Public Data of Illicit origin,” in Proceedings of the 3rd Workshop on Ethics in Computer Security Research, ser. WECSR ’12, 2012, pp. 124–132.

[32] D. Florˆ encio and C. Herley, “A Large-Scale Study of Web Password Habits,” in Proceedings of the 16th International Conference on the World Wide Web, ser. WWW ’07, 2007, pp. 657–666.

[33] S. Gaw and E. W. Felten, “Password Management Strategies for Online Accounts,” in Proceedings of the 2nd Symposium on Usable Privacy and Security, ser. SOUPS ’06, 2006, pp. 44–55.

[34] P . G. Kelley, S. Komanduri, M. L. Mazurek, R. Shay, T. Vidas, L. Bauer, N. Christin, L. F. Cranor, and J. Lopez, “Guess Again (and Again and Again): Measuring Password Strength by Simulating Password-Cracking Algorithms,” in Proceedings of the 33rd IEEE Symposium on Security and Privacy, ser. SP ’12, 2012, pp. 523–537.

[35] S. Komanduri, R. Shay, P . G. Kelley, M. L. Mazurek, L. Bauer, N. Christin, L. F. Cranor, and S. Egelman, “Of Passwords and People: Measuring the Effect of Password-Composition Policies,” in Proceedings of the International Conference on Human Factors in Computing Systems, ser. CHI ’11, 2011, pp. 2595–2604.

[36] E. F. Krause, Taxicab Geometry: An Adventure in Non-Euclidean Geometry. Dover Publications, 1987.

[37] M. Levandowsky and D. Winter, “Distance between Sets,” Nature, vol. 234, no. 5, pp. 34–34, 1971.

[38] V . I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” Soviet Physics Doklady, vol. 10, no. 8, pp. 707–710, 1966.

[39] A. Narayanan and V . Shmatikov, “Fast Dictionary Attacks on Passwords Using Time-Space Tradeoff,” in Proceedings of the 12th ACM conference on Computer and communications security, ser. CCS ’05, 2005, pp. 364–372.

[40] S. B. Needleman and C. D. Wunsch, “A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins,” Journal of Molecular Biology, vol. 48, no. 3, pp. 443–453, 1970.

[41] D. Perito, C. Castelluccia, and M. Duermuth, “Adaptive Password-Strength Meters from Markov Models,” in Proceedings of the 19th Network and Distributed Systems Security Symposium, ser. NDSS ’12, 2012.

[42] S. Rane and W. Sun, “Privacy Preserving String Comparisons Based on Levenshtein Distance,” in Proceedings of IEEE International Workshop on Information F orensics and Security, ser. WIFS ’10, 2010.

[43] B. Ross, C. Jackson, N. Miyake, D. Boneh, and J. C. Mitchell, “Stronger Password Authentication Using Browser Extensions,” in Proceedings of the 14th conference on USENIX Security Symposium, ser. USENIX-SS ’05, 2005.

[44] R. Shay, S. Komanduri, P . G. Kelley, P . G. Leon, M. L. Mazurek, L. Bauer, N. Christin, and L. F. Cranor, “Encountering Stronger Password Requirements: User Attitudes and Behaviors,” in Proceedings of the 6th Symposium on Usable Privacy and Security, ser. SOUPS ’10, 2010, pp. 1–20.

[45] T. Smith and M. Waterman, “Identification of Common Molecular Subsequences,” Journal of Molecular Biology, vol. 147, no. 1, pp. 195–197, 1981.

[46] B. Ur, P . G. Kelley, S. Komanduri, J. Lee, M. Maass, M. L. Mazurek, T. Passaro, R. Shay, T. Vidas, L. Bauer, N. Christin, and L. F. Cranor, “How does your password measure up? The effect of strength meters on password creation,” in Proceedings of the 21st USENIX conference on Security symposium, ser. USENIX-SS ’12. USENIX Association, 2012, pp. 65–80.

[47] M. Weir, S. Aggarwal, M. Collins, and H. Stern, “Testing Metrics for Password Creation Policies by Attacking Large Sets of Revealed Passwords,” in Proceedings of the 17th ACM conference on Computer and Communications Security, ser. CCS ’10, 2010, pp. 162–175.

[48] M. Weir, S. Aggarwal, B. de Medeiros, and B. Glodek, “Password Cracking Using Probabilistic Context-Free Grammars,” in Proceedings of the 30th IEEE Symposium on Security and Privacy, ser. SP ’09, 2009, pp. 391–405.

[49] Y . Zhang, F. Monrose, and M. K. Reiter, “The Security of Modern Password Expiration: An Algorithmic Framework and Empirical Analysis,” in Proceedings of the 17th ACM conference on Computer and communications security, ser. CCS ’10, 2010, pp. 176–186.

More Than Code