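Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks

Reading notes on the paper "Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks"

Abstract

Existing approaches that assess password strength by simulating adversarial password guessing are either inaccurate or orders of magnitude too large and too slow for real-time, client-side password checking.
Neural networks often guess passwords more effectively than state-of-the-art approaches such as probabilistic context-free grammars (PCFGs) and Markov models.
The neural networks can be highly compressed, to as little as a few hundred kilobytes, without substantially worsening guessing effectiveness.
The authors implemented the first principled client-side model of password guessing in JavaScript. It analyzes a password's resistance to a guessing attack of arbitrary duration with sub-second latency.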

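Introduction

A common way to evaluate password strength is to run or simulate password-guessing techniques.
A well-configured suite of guessing techniques combines probabilistic methods with off-the-shelf password-recovery tools.
Such a suite is unsuitable for real-time evaluation of password strength, and sometimes for any practically useful evaluation of password strength.
The paper proposes using artificial neural networks to guess passwords.
It first comprehensively tests how varying the model size, model architecture, training data, and training technique affects the network's ability to guess different types of passwords.
It then compares against state-of-the-art password-guessing models, including Markov models [65], probabilistic context-free grammars [59, 93], and software tools [74, 83].
Neural networks guess passwords more successfully than the other approaches, especially beyond $10^{10}$ guesses and on non-traditional password policies. This matters because password-guessing attacks often go well beyond $10^{10}$ guesses [44, 46].
A recently proposed Monte Carlo method is used to evaluate the performance of probabilistic models at large numbers of guesses.
The neural networks can be highly compressed with minimal loss of guessing effectiveness: small enough to be included in a mobile app, bundled with encryption software, or used in a web-page password meter.
A neural-network password checker was implemented and tested in JavaScript, suitable for mobile apps, browser extensions, and web-page password meters. It gives real-time feedback on password strength in under a second and measures resistance to guessing more accurately than existing client-side methods.
Contributions:
- Propose a neural-network model and comprehensively evaluate how variations in training, parameters, and compression affect guessing effectiveness.
- Build a compact and effective password-guessing model for client-side password checking.
- Implement this checker in JavaScript and test it.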

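Background and related work

Password-guessing attacks

Without rate limiting, an attacker can guess at large scale.
If a database of hashed passwords is stolen (a database dump), an offline attack becomes possible [20, 23, 27, 45, 46, 67, 73, 75, 87].
Passwords that are used to derive or protect encryption keys are likewise exposed to large-scale guessing.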

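Measuring password strength

Models of password strength usually take one of two conceptual forms:
- Purely statistical methods, such as Shannon entropy or more advanced statistical approaches [21, 22]; these require sample sizes that are too large.
- Simulating adversarial password guessing [34, 65, 89].
Academic research on password guessing has focused on probabilistic methods that take large password sets as input and output guesses in descending order of probability, whereas password-cracking tools rely on efficient heuristics to model common password characteristics.
Probabilistic context-free grammars [93]:
- A password is built from a template structure (e.g., six letters followed by two digits) and terminals that fit the structure.
- The probability of a password is the probability of its structure multiplied by the probabilities of its terminals.
- Assigning probability to unseen terminals through smoothing also helps [60].
- Using natural-language dictionaries to instantiate terminals can improve guessing [91].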
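
The structure-times-terminals rule can be illustrated with a toy grammar. This is a minimal sketch under simplifying assumptions, not the grammar of [93]: the helper names (`segment`, `train`, `probability`) are hypothetical, character classes are reduced to letters (L), digits (D), and symbols (S), and no smoothing is applied, so unseen terminals get probability zero.

```python
from collections import Counter
from itertools import groupby


def char_class(ch):
    """Map a character to a coarse class: L(etter), D(igit), or S(ymbol)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def segment(password):
    """Split a password into (class, text) runs, e.g. 'pass12' -> [('L', 'pass'), ('D', '12')]."""
    return [(cls, "".join(run)) for cls, run in groupby(password, key=char_class)]


def train(passwords):
    """Count structures such as ('L4', 'D2') and the terminals seen in each slot."""
    structures = Counter()
    terminals = {}  # slot name such as 'L4' -> Counter of strings seen in that slot
    for pw in passwords:
        segs = segment(pw)
        slots = tuple(f"{cls}{len(text)}" for cls, text in segs)
        structures[slots] += 1
        for slot, (_, text) in zip(slots, segs):
            terminals.setdefault(slot, Counter())[text] += 1
    return structures, terminals


def probability(password, structures, terminals):
    """P(password) = P(structure) * product of P(terminal | slot); no smoothing."""
    segs = segment(password)
    slots = tuple(f"{cls}{len(text)}" for cls, text in segs)
    p = structures[slots] / sum(structures.values())
    for slot, (_, text) in zip(slots, segs):
        counts = terminals.get(slot)
        if counts is None:
            return 0.0
        p *= counts[text] / sum(counts.values())
    return p
```

Because structures and terminals are counted separately, a password such as `word34` receives non-zero probability even if it never appears in the training list, as long as its structure and each of its terminals were seen.
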
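Markov models [70]:
- Predict the probability of the next character in a password from the preceding character or characters of context.
- They may overfit; smoothing and backoff methods compensate for this.
- A 6-gram Markov model with additive smoothing is often the best choice for modeling English-language passwords.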
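
A character-level Markov model with additive smoothing can be sketched as follows. The names (`train`, `next_char_prob`, `password_prob`), the lowercase-plus-digits alphabet, the end marker, and the smoothing constant `delta` are all assumptions made for illustration. The sketch defaults to two characters of context (a 3-gram model) to stay small; a 6-gram model would use `order=5`.

```python
import string
from collections import Counter, defaultdict

END = "\n"  # hypothetical end-of-password marker
ALPHABET = list(string.ascii_lowercase + string.digits) + [END]


def train(passwords, order=2):
    """Count which symbol follows each context of up to `order` characters."""
    counts = defaultdict(Counter)
    for pw in passwords:
        padded = pw + END
        for i, ch in enumerate(padded):
            context = padded[max(0, i - order):i]
            counts[context][ch] += 1
    return counts


def next_char_prob(counts, context, ch, order=2, delta=0.01):
    """Additive smoothing: add `delta` to every count so unseen symbols keep some mass."""
    context = context[-order:] if order else ""
    seen = counts.get(context, Counter())
    return (seen[ch] + delta) / (sum(seen.values()) + delta * len(ALPHABET))


def password_prob(counts, pw, order=2, delta=0.01):
    """Multiply the conditional probabilities of each character and of the end marker."""
    p = 1.0
    padded = pw + END
    for i, ch in enumerate(padded):
        p *= next_char_prob(counts, padded[:i], ch, order, delta)
    return p
```

With smoothing, every string over the alphabet has non-zero probability, which is what lets the model assign a (very large) guess number to passwords it has never seen.
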
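Mangled wordlist methods:
- Software tools commonly use this approach to generate password guesses.
- Two popular tools of this kind are Hashcat and John the Ripper.
- They transform a wordlist (passwords and dictionary entries) using hand-written rules, or with transformations that mimic how people commonly compose passwords.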
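
The idea of mangling a wordlist can be shown with a few hand-written rules. These rules and the names `mangle` and `guess_number` are invented for illustration; they are not Hashcat or John the Ripper rule syntax, and real rule sets are far larger.

```python
def mangle(word):
    """Yield guesses derived from one wordlist entry using a few illustrative rules."""
    leet = str.maketrans({"a": "@", "e": "3", "o": "0", "s": "$"})
    bases = [word, word.capitalize(), word.translate(leet)]
    suffixes = ["", "1", "123", "!"]
    seen = set()
    for base in bases:
        for suffix in suffixes:
            guess = base + suffix
            if guess not in seen:  # skip duplicates produced by different rules
                seen.add(guess)
                yield guess


def guess_number(target, wordlist):
    """Return how many guesses are made before `target` is produced, or None if never."""
    n = 0
    for word in wordlist:
        for guess in mangle(word):
            n += 1
            if guess == target:
                return n
    return None
```

The guess number here depends on both the order of the wordlist and the order of the rules, which is why such tools need careful configuration to approximate a skilled attacker.
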

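Proactive password checking

The guessing models above can accurately model human-created passwords, but they take hours or days to run and megabytes or gigabytes of disk space.
Current real-time password checkers can be categorized by whether they run entirely client-side.
Checkers with a server-side component raise serious security concerns: sending a password to a server for checking can destroy all security guarantees, for example when the password protects an encryption key and should never leave the user's device.
Client-side checkers, such as those that run in the browser, tend to score passwords by their length or by the character classes they contain [33, 88].
Only zxcvbn [94, 95], which uses dozens of more advanced heuristics, gives reasonably accurate strength estimates. Even so, such meters do not directly model adversarial guessing, because guessing models could not be encoded succinctly or evaluated in real time.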

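Neural networks

Neural networks are used to approximate high-dimensional functions; they are good at fuzzy classification problems and at generating novel sequences.
They can generate the probability of the next element in a string based on the preceding elements [49, 84].
The generated sequences can be inexact and novel [49], which suggests that neural networks may suit password guessing.
Neural networks can model natural language in far less space than Markov models.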

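System design

Measuring password strength

The neural network is trained to produce the next character of a password given the preceding characters.
It relies on a special end-of-password symbol to model the probability that the password ends after a given sequence of characters.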

 For example, to calculate the probability of the entire password ‘bad’, 
 we would start with an empty password, and query the network for the probability of seeing a ‘b’, 
 then seeing an ‘a’ after ‘b’, and then of seeing a ‘d’ after ‘ba’, then of seeing a complete password after ‘bad’
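
The 'bad' example is an application of the chain rule: multiply the conditional probability of each character, then the probability of the end symbol. In the sketch below, `TOY_MODEL` is a hypothetical lookup table standing in for the trained network, and its probabilities are made up.

```python
import math

END = "<END>"  # hypothetical end-of-password symbol

# Hypothetical stand-in for the network: prefix -> distribution over the next symbol.
TOY_MODEL = {
    "": {"b": 0.5, "a": 0.5},
    "b": {"a": 0.8, END: 0.2},
    "ba": {"d": 0.5, END: 0.5},
    "bad": {END: 1.0},
}


def next_symbol_probs(prefix):
    """Return the next-symbol distribution; unknown prefixes simply end the password."""
    return TOY_MODEL.get(prefix, {END: 1.0})


def password_log_prob(password, model=next_symbol_probs):
    """Sum log P(symbol | prefix) over every character plus the end symbol."""
    log_p = 0.0
    for i, symbol in enumerate(list(password) + [END]):
        p = model(password[:i]).get(symbol, 0.0)
        if p == 0.0:
            return float("-inf")
        log_p += math.log(p)
    return log_p
```

With these toy numbers, 'bad' has probability 0.5 × 0.8 × 0.5 × 1.0 = 0.2. Working in log space avoids underflow when passwords are long and probabilities are small.
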

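A modified beam search [64] (a hybrid of depth-first and breadth-first search) enumerates all passwords whose probability exceeds a given threshold; the results are then filtered and sorted.
This can be implemented efficiently on a GPU.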
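
Threshold enumeration can be sketched with a plain depth-first search that prunes any prefix whose probability has already dropped below the threshold. This is a simplification, not the modified beam search the paper uses, and `toy_model` with its probabilities is a made-up stand-in for the network.

```python
END = "<END>"  # hypothetical end-of-password symbol


def toy_model(prefix):
    """Hypothetical next-symbol distributions standing in for the neural network."""
    table = {
        "": {"b": 0.5, "a": 0.5},
        "b": {"a": 0.8, END: 0.2},
        "ba": {"d": 0.5, END: 0.5},
    }
    return table.get(prefix, {END: 1.0})


def enumerate_above(model, threshold):
    """Return all (password, probability) pairs with probability >= threshold, most likely first."""
    results = []
    stack = [("", 1.0)]  # (prefix, probability of the prefix so far)
    while stack:
        prefix, p = stack.pop()
        for symbol, q in model(prefix).items():
            joint = p * q
            if joint < threshold:
                continue  # extending a prefix can only lower its probability, so prune
            if symbol == END:
                results.append((prefix, joint))
            else:
                stack.append((prefix + symbol, joint))
    return sorted(results, key=lambda item: -item[1])
```

A password's guess number is then its position in the sorted output, which is exact but only feasible for passwords above the enumeration threshold.
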
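Computing guess numbers by enumeration in the traditional way is computationally expensive. As an alternative to enumeration, Monte Carlo simulation can estimate guess numbers accurately and efficiently [34].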
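
The Monte Carlo estimator of [34] can be sketched as follows: draw n passwords from the model, record each sample's probability, and estimate the number of passwords more likely than a target with probability p as the sum of 1/(n · p_i) over the samples whose probability p_i exceeds p. The function name `build_estimator` is hypothetical, and the caller is assumed to supply the probabilities of passwords sampled from the model.

```python
import bisect


def build_estimator(sample_probs):
    """Precompute a lookup from a password probability to an estimated guess number.

    `sample_probs` holds the model probabilities of n passwords sampled from the model.
    """
    n = len(sample_probs)
    probs_desc = sorted(sample_probs, reverse=True)
    cumulative = []  # cumulative[i] = sum of 1 / (n * p) over the i + 1 most probable samples
    total = 0.0
    for p in probs_desc:
        total += 1.0 / (n * p)
        cumulative.append(total)
    negated = [-p for p in probs_desc]  # ascending order, so bisect can be used

    def estimate(p):
        """Estimate how many passwords are more probable than one with probability p."""
        k = bisect.bisect_left(negated, -p)  # number of samples with probability > p
        return cumulative[k - 1] if k else 0.0

    return estimate
```

After the one-time sorting step, each lookup is a binary search, so converting a probability to a guess number costs O(log n) regardless of how large the guess number is.
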

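Approach

Model architecture
- A recurrent neural network (RNN) is used.
- Its connections let it process elements in sequence and use an internal memory to retain information about earlier elements of the sequence.
- Two different recurrent architectures were implemented.
Alphabet size
- The focus is on character-level models rather than the more common word-level models.
- Besides characters, the networks may also model sub-word units such as syllables or tokens.
- Following [68], 2,000 different tokens are modeled and represented the same way as characters, so the network can output the probability that the next element is the character 'a' or the token 'pass'.
- Modeling every character is unnecessary overhead, and some characters, such as uppercase letters and rare symbols, are better modeled outside the neural network: when the network predicts 'a', the occurrence counts of 'a' and 'A' in the training data are used to decide how likely the uppercase form is.
Password context
- Two settings were tried: using all preceding characters of the password as context, and using only the previous ten characters.
- When fewer than ten context characters are available, the input is padded with zeros.
- Context characters are fed in reverse order (for example, predicting 'd' from 'rowssap' rather than from 'passwor'), which has sometimes been shown to improve performance [48].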
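
The context handling described above (last ten characters, reversed, zero-padded) might look like the sketch below. The alphabet, the integer ids, and the choice to append the padding after the reversed characters are assumptions; the notes do not say how the real implementation lays out its input.

```python
CONTEXT_LEN = 10
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # assumed alphabet for illustration
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # id 0 is reserved for padding


def encode_context(prefix, context_len=CONTEXT_LEN):
    """Encode the last `context_len` characters of `prefix`, most recent first, zero-padded."""
    recent = prefix[-context_len:]
    ids = [CHAR_TO_ID[ch] for ch in reversed(recent)]
    return ids + [0] * (context_len - len(ids))
```

For the prefix 'passwor', the encoded context starts with the id of 'r' and ends with three padding zeros; for a prefix longer than ten characters, only the ten most recent characters are kept.
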
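Model size
- To evaluate how model size affects guessing success, a large network with 15,700,675 parameters and a small network with 682,851 parameters were tested.
Transfer learning
- Transfer learning is used to train the networks; it exploits the fact that different parts of a neural network learn to recognize different phenomena during training [97].
- It addresses a key problem for non-traditional password policies: very little suitable training data exists (training samples are sparse).
- The model is first trained on all passwords in the training set. The lower layers of the model are then frozen. Finally, the model is retrained only on the passwords in the training set that fit the policy.
- Lower layers of the model learn low-level features of the data, while higher layers learn higher-level features.
Training data
- Different training data sets were tried.
- Two sets are used in the rest of the evaluation.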

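Client-side model

The goal is to reduce the latency the user experiences and to transfer as little data as possible over the network.
Optimizing model size:
- Techniques from graphics for encoding 3D models for browser-based games and visualizations are reused.
- The encoding has four steps: weight quantization, fixed-point encoding, ZigZag encoding, and lossless compression.
- Weight quantization:
  - The network's weights are quantized so that fewer digits are needed to represent them.
  - Only the most significant digits are sent, rather than all digits of each 32-bit floating-point weight.
  - This shrinks the model but introduces error; experiments showed that quantizing weights to three decimal digits led to minimal error.
- Fixed-point encoding:
  - Quantized values are described more compactly by putting unsigned integers rather than floating-point numbers on the wire.
  - For example, quantized weights that internally lie between -5.0 and 5.0 with a minimum precision of 0.005 are equivalent to integers between -1000 and 1000 with a precision of 1.
- ZigZag encoding:
  - Avoids sending negative numbers.
  - Signed values are encoded using the last bit as the sign bit: 0 is encoded as 0, -1 as 1, 1 as 2, and -2 as 3.
- Lossless compression:
  - Regular gzip or deflate encoding is the final compression stage.
  - Other compression tools were not considered because browsers do not support them natively as widely.
- Bloom-filter word list:
  - A word list of frequently guessed passwords is stored.
  - The two million most frequently occurring passwords are stored in a series of compressed Bloom filters.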
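
The four encoding steps can be sketched end to end. This is an illustration under stated assumptions rather than the paper's serializer: the function names are hypothetical, each ZigZag code is packed as a fixed 32-bit unsigned integer (a real encoder would more likely use a variable-length integer format), and `zlib` stands in for gzip/deflate.

```python
import struct
import zlib

PRECISION = 0.001  # three decimal digits, as in the notes


def quantize_to_fixed_point(weights, precision=PRECISION):
    """Steps 1 and 2: round each weight to the precision and express it as an integer."""
    return [round(w / precision) for w in weights]


def zigzag_encode(n):
    """Step 3: map signed to unsigned integers: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ..."""
    return 2 * n if n >= 0 else -2 * n - 1


def zigzag_decode(z):
    """Invert zigzag_encode."""
    return z // 2 if z % 2 == 0 else -((z + 1) // 2)


def pack(weights, precision=PRECISION):
    """Run all four steps and return the compressed bytes."""
    codes = [zigzag_encode(q) for q in quantize_to_fixed_point(weights, precision)]
    raw = struct.pack(f"<{len(codes)}I", *codes)  # assumption: fixed 32-bit little-endian
    return zlib.compress(raw, 9)  # step 4: deflate-based lossless compression


def unpack(blob, precision=PRECISION):
    """Decompress and decode back to (quantized) floating-point weights."""
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 4}I", raw)
    return [zigzag_decode(z) * precision for z in codes]
```

Round-tripping a weight through `pack` and `unpack` changes it by at most half the precision, which is the quantization error the notes refer to. ZigZag keeps small-magnitude weights as small unsigned integers, which compress well.
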
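Optimizing latency
- The target latency is close to 100 ms.
- Precomputation:
  - Cuts the latency of converting a password probability into a guess number to a fast table lookup on the client.
  - The probability-to-guess-number mapping is quantized, so guess numbers become imprecise. This is acceptable because results are usually shown to users in a more coarsely quantized form anyway; for example, a user may be told that a password is "weak" or "strong".
- Caching intermediate results:
  - Computing the probability of a 10-character password requires 11 full evaluations of the neural network (one per character plus one for the end symbol).
  - The probability of each substring (prefix) is cached.
- Multithreading:
  - The neural network runs in a thread separate from the user interface.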
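
Caching intermediate results can be sketched as memoizing the network's output per prefix, so that typing one more character costs one new evaluation rather than a full recomputation. `CachedScorer` and `uniform_model` are hypothetical names; the uniform model is only a stand-in for the real network.

```python
import math

END = "<END>"  # hypothetical end-of-password symbol


def uniform_model(prefix):
    """Stand-in for the network: every next symbol is equally likely."""
    symbols = list("abcdefghijklmnopqrstuvwxyz") + [END]
    return {s: 1.0 / len(symbols) for s in symbols}


class CachedScorer:
    """Score passwords while reusing next-symbol distributions of prefixes seen before."""

    def __init__(self, model):
        self.model = model      # callable: prefix -> {symbol: probability}
        self.dist_cache = {}    # prefix -> cached next-symbol distribution
        self.evaluations = 0    # number of real model evaluations performed

    def _dist(self, prefix):
        if prefix not in self.dist_cache:
            self.evaluations += 1
            self.dist_cache[prefix] = self.model(prefix)
        return self.dist_cache[prefix]

    def log_prob(self, password):
        """Return log P(password), including the end symbol."""
        total = 0.0
        for i, symbol in enumerate(list(password) + [END]):
            p = self._dist(password[:i]).get(symbol, 0.0)
            if p == 0.0:
                return float("-inf")
            total += math.log(p)
        return total
```

Scoring 'passwor' takes eight evaluations; scoring 'password' afterwards takes only one more, because every shorter prefix is already cached.
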
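Implementation
- The server side is built on the Keras library, and the client side on neocortex, a browser implementation of neural networks.
- Keras is used with its Theano back end, which allows the networks to be trained on GPUs.
- Models typically use three long short-term memory (LSTM) recurrent layers and two densely connected layers, for five layers in total.
- To err on the side of underestimating guess numbers in the client-side implementation, letter case is ignored when computing guess numbers.
- After the guess number is computed, a constant scaling factor is applied as a security parameter to make the model more conservative.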

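Testing methodology

The approach is compared with several other password-cracking methods: PCFGs, Markov models, John the Ripper (JtR), and Hashcat.
The primary metric is the guessability of test sets of human-created passwords; the guessability of an individual password is measured by the number of guesses a guesser needs to crack it.
For the probabilistic methods (PCFG, Markov models, and neural networks), guess numbers are computed with the Monte Carlo method [34]:
- At least one million random passwords are generated and their probabilities computed, to give accurate estimates.
- The 95% confidence interval was observed to be less than 10% of the estimated guess number; passwords whose error exceeded 10% tended to be guessed only after more than $10^{18}$ guesses.
- All Monte Carlo simulations model up to $10^{25}$ guesses.
For JtR and Hashcat, password guessability is measured by enumerating every guess the tool makes. Across the different test sets, they make between roughly $10^{13}$ and $10^{15}$ guesses.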

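Training data

A mix of leaked and cracked password sets is used.
The first set is the Password Guessability Service (PGS) training set [89], which contains the Rockyou [90] and Yahoo [43] leaked password sets. For guessing methods that use natural language, it also includes the web2 list [11], the Google web corpus [47], and an inflection dictionary [78]. In total it has 33 million passwords and 5.9 million natural-language words.
The second set (the PGS++ training set) augments the PGS training set with additional leaked and cracked password sets [1, 2, 3, 6, 7, 9, 12, 13, 14, 15, 16, 20, 23, 25, 42, 43, 55, 56, 57, 62, 63, 67, 75, 77, 85, 90]. Methods that use natural language get the same word lists as above. In total it has 105 million passwords and 5.9 million natural-language words.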

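Testing data

Five test data sets are used:
1class8: 3,062 passwords longer than eight characters, collected for a research study [59].
1class16: 2,054 passwords longer than 16 characters [59].
3class12: 990 passwords that must contain at least three character classes (uppercase, lowercase, symbols, digits) and at least 12 characters [80].
4class8: 2,997 passwords that must contain all four character classes and at least eight characters [66].
webhost: 30,000 passwords sampled at random from the passwords in the 000webhost leak that contain at least eight characters [40].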
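
The composition policies behind the 3class12 and 4class8 sets can be expressed as a small check. The names `character_classes`, `meets_policy`, and `POLICIES` are hypothetical, and treating every character that is not an ASCII letter or digit as a symbol is a simplifying assumption.

```python
import string


def character_classes(password):
    """Return the set of character classes (upper, lower, digit, symbol) used in a password."""
    classes = set()
    for ch in password:
        if ch in string.ascii_uppercase:
            classes.add("upper")
        elif ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.digits:
            classes.add("digit")
        else:
            classes.add("symbol")  # assumption: anything else counts as a symbol
    return classes


def meets_policy(password, min_classes, min_length):
    """Check a password against a 'NclassM' style policy."""
    return len(password) >= min_length and len(character_classes(password)) >= min_classes


# (minimum character classes, minimum length) for the two multi-class policies in the notes
POLICIES = {"3class12": (3, 12), "4class8": (4, 8)}
```

A check like this is also what the transfer-learning step needs in order to select the training passwords that fit a target policy.
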

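Guessing configuration

PCFG
- A PCFG with terminal smoothing and hybrid structures [60].
- Natural-language dictionaries are included in the training data, with each word weighted at one-tenth of a password.
- Training for structures is separated from training for terminals.
Markov models
- 4-, 5-, and 6-gram models were trained.
- The 6-gram model with additive smoothing is used in the tests [65].
Mangled wordlist methods
- Guess numbers are computed with the popular cracking tools Hashcat and John the Ripper.
- Hashcat uses the best64 and gen2 rule sets that ship with it [83]; JtR uses the SpiderLabs rules [86]. Prior work found these effective at guessing general-purpose passwords [89].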

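Evaluation

Training neural networks

Transfer learning
- Transfer learning improves password guessing, mostly at higher guess numbers (see panel (a) of the figure above).
Including natural-language dictionaries
- Natural-language dictionaries are added to the neural network's training data.
- This is tested on 1class16, which is especially likely to benefit from natural-language dictionaries [91].
- Natural-language training made the neural network guess less effectively (see panel (b) of the figure above).
- The network did not benefit because training did not distinguish between passwords and natural-language dictionary entries.
Tutored networks
- Random passwords generated by the large network were used to try to improve the small model's effectiveness at guessing long passwords (see panel (c) of the figure above).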

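Model size
- For at least some password sets, neural network models can be orders of magnitude smaller than other models (see the figure above).
- Long passwords more closely resemble English phrases, and modeling them may need more parameters, and therefore a larger network, than modeling short passwords.

Backward versus forward training
- In some applications of neural networks, processing input backward rather than forward can be more effective (here, reading passwords in reverse rather than in order).
- Guessing passwords backward, forward, and with a hybrid approach was tried; in the hybrid, half of the network examines passwords forward and the other half examines them backward.
- Only marginal differences were observed overall.
- The hybrid approach increases training time for only a small gain in accuracy.
Password tokenization
- A hybrid, sub-word-level password model does not significantly improve guessing performance at low guess numbers (see panel (b) of the figure above).
- A hybrid model can represent the same word in several different ways, whereas Monte Carlo estimation assumes that each password has a unique representation.
- Guess numbers are therefore computed by enumerating the $10^{7}$ most probable guesses.
- For long passwords, there may be an early benefit.
Recurrent architectures
- Two kinds of recurrent architecture were tried: long short-term memory (LSTM) models and a refinement of the LSTM model.
- The choice had little effect on the network's overall output, with the refined LSTM model being slightly more accurate.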

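Guessing effectiveness

Neural networks are better at guessing passwords at higher numbers of guesses and against more complex or longer password policies, such as 4class8, 1class16, and 3class12.
Although neural networks usually outperform the other models, combining multiple guessing methods should still beat any single method for accurate strength estimation.
Larger neural network models fit low-quality data more tightly and therefore generalize less well.
Passwords that the network guesses late but other methods guess easily tend to resemble natural-language words; conversely, passwords that the network guesses easily but other methods do not tend to be novel and unlike the training set.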

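Browser implementation

The compressed model is compared with all server-side models (the large neural network, PCFG, Markov models, JtR, and Hashcat) on guessing effectiveness.
Model encoding
Weight and probability-curve quantization: the mapping from password probability to guess number is precomputed and sent to the client, with the probability-to-guess-number curve quantized.
Evaluating feedback speed
- Two tests measure how fast guess numbers are computed:
  - A semi-cached test measures the time to compute a guess number when a character is appended to the end of a password.
  - A full-password test measures the total time to compute the guess number for each password.
- A random subset of 500 passwords from the 1class8 training set is used for these tests.
- In the semi-cached test, the mean time to compute a guess number is 17 ms (stdev: 4 ms).
- In the full-password test, the mean time is 124 ms (stdev: 48 ms).
Comparison with other password meters
- The best existing estimator is Dropbox's zxcvbn meter, which relies on hand-crafted heuristics, statistical methods, and plaintext dictionaries as training data to estimate guess numbers.
- Results are compared with zxcvbn and with the Yahoo meter, which estimates password strength using far less sophisticated heuristics. The Yahoo meter does not produce guess numbers; it sorts passwords into five strength bins, from weakest to strongest.
- The client-side neural-network approach is more accurate than the other approaches tested, with up to two times fewer unsafe errors and a comparable number of safe errors.
- For the 10,000 most likely passwords, zxcvbn makes 84 unsafe errors, while the neural network makes only 11.
- The neural network can easily be retargeted to other policies by simple retraining.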

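Conclusion

The networks outperform state-of-the-art password-guessing methods in both efficiency and effectiveness.
The neural-network password models are compressed so that they can be downloaded as part of a web page.
One remaining challenge is giving users explainable advice for improving a password while they are creating it.

References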

[1] CSDN password leak. http://thepasswordproject.com/leaked password lists and dictionaries.

[2] Faith writer leak. https://wiki.skullsecurity.org/Passwords#Leaked passwords.

[3] Hak5 leak. https://wiki.skullsecurity.org/Passwords#Leaked passwords.

[4] Msgpack: It’s like JSON but fast and small. http://msgpack.org/index.html.

[5] Neocortex Github repository. https://github.com/scienceai/neocortex.

[6] Perl monks password leak. http://news.softpedia.com/news/PerlMonks-ZF0-Hack-Has-Wider-Implications-118225.shtml.

[7] Phpbb password leak. https://wiki.skullsecurity.org/Passwords.

[8] Protocol buffer encoding. https://developers.google.com/protocol-buffers/docs/encoding.

[9] Stratfor leak. http://thepasswordproject.com/leaked password lists and dictionaries.

[10] Using Web workers. https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers. Accessed: Feb 2016.

[11] The “web2” file of English words. http://www.bee-man.us/computer/grep/grep.htm#web2, 2004.

[12] Password leaks: Elitehacker. https://wiki.skullsecurity.org/Passwords, 2009.

[13] Password leaks: Alypaa. https://wiki.skullsecurity.org/Passwords, 2010.

[14] Specialforces.com password leak. http://www.databreaches.net/update-specialforces-com-hackers-acquired-8000-credit-card-numbers/, 2011.

[15] YouPorn password leak, 2012. http://thepasswordproject.com/leaked password lists and dictionaries.

[16] WOM Vegas password leak, 2013. https://www.hackread.com/wom-vegas-breached-10000-user-accounts-leaked-by-darkweb-goons/.

[17] BASTIEN, F., LAMBLIN, P., PASCANU, R., BERGSTRA, J., GOODFELLOW, I. J., BERGERON, A., BOUCHARD, N., AND BENGIO, Y. Theano: New features and speed improvements. In Proc. NIPS 2012 Deep Learning workshop (2012).

[18] BERGSTRA, J., BREULEUX, O., BASTIEN, F., LAMBLIN, P., PASCANU, R., DESJARDINS, G., TURIAN, J., WARDE-FARLEY, D., AND BENGIO, Y. Theano: A CPU and GPU math expression compiler. In Proc. SciPy (2010).

[19] BISHOP, M., AND KLEIN, D. V. Improving system security via proactive password checking. Computers & Security 14, 3 (1995), 233–249.

[20] BONNEAU, J. The Gawker hack: How a million passwords were lost. Light Blue Touchpaper Blog, December 2010. http://www.lightbluetouchpaper.org/2010/12/15/the-gawker-hack-how-a-million-passwords-were-lost/.

[21] BONNEAU, J. The science of guessing: Analyzing an anonymized corpus of 70 million passwords. In Proc. IEEE Symp. Security & Privacy (2012).

[22] BONNEAU, J. Statistical metrics for individual password strength. In Proc. WPS (2012).

[23] BRODKIN, J. 10 (or so) of the worst passwords exposed by the LinkedIn hack. Ars Technica, June 6, 2012. http://arstechnica.com/security/2012/06/10-or-so-of-the-worst-passwords-exposed-by-the-linkedin-hack/.

[24] BUCILUĂ, C., CARUANA, R., AND NICULESCU-MIZIL, A. Model compression. In Proc. KDD (2006).

[25] BURNETT, M. Xato password set. https://xato.net/.

[26] CASTELLUCCIA, C., DÜRMUTH, M., AND PERITO, D. Adaptive password-strength meters from Markov models. In Proc. NDSS (2012).

[27] CHANG, J. M. Passwords and email addresses leaked in Kickstarter hack attack. ABC News, Feb 17, 2014. http://abcnews.go.com/Technology/passwords-email-addresses-leaked-kickstarter-hack/story?id=22553952.

[28] CHOLLET, F. Keras Github repository. https://github.com/fchollet/keras.

[29] CHUN, W. WebGL Models: End-to-End. In OpenGL Insights. 2012.

[30] CIARAMELLA, A., D’ARCO, P., DE SANTIS, A., GALDI, C., AND TAGLIAFERRI, R. Neural network techniques for proactive password checking. IEEE TDSC 3, 4 (2006), 327–339.

[31] CLERCQ, J. D. Resetting the password of the KRBTGT active directory account, 2014. http://windowsitpro.com/security/resetting-password-krbtgt-active-directory-account.

[32] DAS, A., BONNEAU, J., CAESAR, M., BORISOV, N., AND WANG, X. The tangled web of password reuse. In Proc. NDSS (2014).

[33] DE CARNÉ DE CARNAVALET, X., AND MANNAN, M. From very weak to very strong: Analyzing password-strength meters. In Proc. NDSS (2014).

[34] DELL’AMICO, M., AND FILIPPONE, M. Monte Carlo strength evaluation: Fast and reliable password checking. In Proc. CCS (2015).

[35] DELL’AMICO, M., MICHIARDI, P., AND ROUDIER, Y. Password strength: An empirical analysis. In Proc. INFOCOM (2010).

[36] DUCKETT, C. Login duplication allows 20m Alibaba accounts to be attacked. ZDNet, February 5, 2016. http://www.zdnet.com/article/login-duplication-allows-20m-alibaba-accounts-to-be-attacked/.

[37] DÜRMUTH, M., ANGELSTORF, F., CASTELLUCCIA, C., PERITO, D., AND CHAABANE, A. OMEN: Faster password guessing using an ordered markov enumerator. In Proc. ESSoS (2015).

[38] FAHL, S., HARBACH, M., ACAR, Y., AND SMITH, M. On the ecological validity of a password study. In Proc. SOUPS (2013).

[39] FLORÊNCIO, D., HERLEY, C., AND VAN OORSCHOT, P. C. An administrator’s guide to internet password research. In Proc. USENIX LISA (2014).

[40] FOX-BREWSTER, T. 13 million passwords appear to have leaked from this free web host. Forbes, October 28, http://www.forbes.com/sites/thomasbrewster/2015/10/28/000webhost-database-leak/.

[41] GAILLY, J.-L. gzip. http://www.gzip.org/.

[42] GOODIN, D. 10,000 Hotmail passwords mysteriously leaked to web. The Register, October 5, http://www.theregister.co.uk/2009/10/05/hotmail passwords leaked/.

[43] GOODIN, D. Hackers expose 453,000 credentials allegedly taken from Yahoo service. Ars Technica, July 12, 2012. http://arstechnica.com/security/2012/07/yahoo-service-hacked/.

[44] GOODIN, D. Anatomy of a hack: How crackers ransack passwords like “qeadzcwrsfxv1331”. Ars Technica, May 27, 2013. http://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/.

[45] GOODIN, D. Why LivingSocial’s 50-million password breach is graver than you may think. Ars Technica, April 27, 2013. http://arstechnica.com/security/2013/04/why-livingsocials-50-million-password-breach-is-graver-than-you-may-think/.

[46] GOODIN, D. Once seen as bulletproof, 11 million+ Ashley Madison passwords already cracked. Ars Technica, September 10, 2015. http://arstechnica.com/security/2015/09/once-seen-as-bulletproof-11-million-ashley-madison-passwords-already-cracked/.

[47] GOOGLE. Web 1T 5-gram version 1, http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13.

[48] GRAVES, A. Supervised Sequence Labelling with Recurrent Neural Networks. Springer, 2012.

[49] GRAVES, A. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850, 2013.

[50] GREENBERG, A. The police tool that pervs use to steal nude pics from Apple’s iCloud. Wired, September 2, 2014. https://www.wired.com/2014/09/eppb-icloud/.

[51] HAN, S., MAO, H., AND DALLY, W. J. A deep neural network compression pipeline: Pruning, quantization, Huffman encoding. arXiv preprint arXiv:1510.00149, 2015.

[52] HENRY, A. Five best password managers. LifeHacker, January 11, 2015. http://lifehacker.com/5529133/.

[53] HERLEY, C., AND VAN OORSCHOT, P. A research agenda acknowledging the persistence of passwords. IEEE Security & Privacy Magazine 10, 1 (Jan. 2012), 28–36.

[54] HOCHREITER, S., AND SCHMIDHUBER, J. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.

[55] HUNT, T. A brief Sony password analysis. http://www.troyhunt.com/2011/06/brief-sony-password-analysis.html, 2011.

[56] HUYNH, T. ABC Australia hacked nearly 50,000 user credentials posted online, half cracked in 45 secs. Techgeek, February 27, 2013. http://techgeek.com.au/2013/02/27/abc-australia-hacked-nearly-50000-user-credentials-posted-online/.

[57] JOHNSTONE, L. 9,885 user accounts leaked from Intercessors for America by Anonymous. http://www.cyberwarnews.info/2013/07/24/9885-user-accounts-leaked-from-intercessors-for-america-by-anonymous/, 2013.

[58] JOZEFOWICZ, R., ZAREMBA, W., AND SUTSKEVER, I. An empirical exploration of recurrent network architectures. In Proc. ICML (2015).

[59] KELLEY, P. G., KOMANDURI, S., MAZUREK, M. L., SHAY, R., VIDAS, T., BAUER, L., CHRISTIN, N., CRANOR, L. F., AND LOPEZ, J. Guess again (and again and again): Measuring password strength by simulating password-cracking algorithms. In Proc. IEEE Symp. Security & Privacy (2012).

[60] KOMANDURI, S. Modeling the adversary to evaluate password strength with limited samples. PhD thesis, Carnegie Mellon University, 2016.

[61] KOMANDURI, S., SHAY, R., CRANOR, L. F., HERLEY, C., AND SCHECHTER, S. Telepathwords: Preventing weak passwords by reading users’ minds. In Proc. USENIX Security (2014).

[62] KREBS, B. Fraud bazaar carders.cc hacked. http://krebsonsecurity.com/2010/05/fraud-bazaar-carders-cc-hacked/.

[63] LEE, M. Hackers have released what they claim are the details of over 21,000 user accounts belonging to Billabong customers. ZDNet, July 13, 2012. http://www.zdnet.com/article/over-21000-plain-text-passwords-stolen-from-billabong/.

[64] LOWERRE, B. T. The HARPY speech recognition system. PhD thesis, Carnegie Mellon University, 1976.

[65] MA, J., YANG, W., LUO, M., AND LI, N. A study of probabilistic password models. In Proc. IEEE Symp. Security & Privacy (2014).

[66] MAZUREK, M. L., KOMANDURI, S., VIDAS, T., BAUER, L., CHRISTIN, N., CRANOR, L. F., KELLEY, P. G., SHAY, R., AND UR, B. Measuring password guessability for an entire university. In Proc. CCS (2013).

[67] MCALLISTER, N. Twitter breach leaks emails, passwords of 250,000 users. The Register, Feb 2, 2013.

[68] MIKOLOV, T., SUTSKEVER, I., DEORAS, A., LE, H.-S., KOMBRINK, S., AND CERNOCKY, J. Subword language modeling with neural networks. Preprint (http://www.fit.vutbr.cz/~imikolov/rnnlm/char.pdf), 2012.

[69] MITZENMACHER, M. Compressed Bloom filters. IEEE/ACM Transactions on Networking (TON) 10, 5 (2002), 604–612.

[70] NARAYANAN, A., AND SHMATIKOV, V. Fast dictionary attacks on passwords using time-space tradeoff. In Proc. CCS (2005).

[71] NEEF, S. Using neural networks for password cracking. Blog post. https://0day.work/using-neural-networks-for-password-cracking/, 2016.

[72] NIELSEN, J., AND HACKOS, J. T. Usability engineering, vol. 125184069. Academic press Boston, 1993.

[73] PERLROTH, N. Adobe hacking attack was bigger than previously thought. The New York Times Bits Blog, October 29, 2013. http://bits.blogs.nytimes.com/2013/10/29/adobe-online-attack-was-bigger-than-previously-thought/.

[74] PESLYAK, A. John the Ripper. http://www.openwall.com/john/, 1996-.

[75] PROTALINSKI, E. 8.24 million Gamigo passwords leaked after hack. ZDNet, July 23, 2012. http://www.zdnet.com/article/8-24-million-gamigo-passwords-leaked-after-hack/.

[76] RAGAN, S. Mozilla’s bug tracking portal compromised, reused passwords to blame. CSO, September 4, 2015. http://www.csoonline.com/article/2980758/.

[77] SCHNEIER, B. Myspace passwords aren’t so dumb. http://www.wired.com/politics/security/commentary/securitymatters/2006/12/72300, 2006.

[78] SCOWL. Spell checker oriented word lists. http://wordlist.sourceforge.net, 2015.

[79] SHAY, R., BAUER, L., CHRISTIN, N., CRANOR, L. F., FORGET, A., KOMANDURI, S., MAZUREK, M. L., MELICHER, W., SEGRETI, S. M., AND UR, B. A spoonful of sugar? The impact of guidance and feedback on password-creation behavior. In Proc. CHI (2015).

[80] SHAY, R., KOMANDURI, S., DURITY, A. L., HUH, P. S., MAZUREK, M. L., SEGRETI, S. M., UR, B., BAUER, L., CHRISTIN, N., AND CRANOR, L. F. Can long passwords be secure and usable? In Proc. CHI (2014).

[81] SONG, D. X., WAGNER, D., AND TIAN, X. Timing analysis of keystrokes and timing attacks on SSH. In Proc. USENIX Security Symposium (2001).

[82] SPIDEROAK. Zero knowledge cloud solutions. https://spideroak.com/, 2016.

[83] STEUBE, J. Hashcat. https://hashcat.net/oclhashcat/, 2009-.

[84] SUTSKEVER, I., MARTENS, J., AND HINTON, G. E. Generating text with recurrent neural networks. In Proc. ICML (2011).

[85] TRUSTWAVE. eHarmony password dump analysis, June http://blog.spiderlabs.com/2012/06/eharmony-password-dump-analysis.html.

[86] TRUSTWAVE SPIDERLABS. SpiderLabs/KoreLogic-Rules. https://github.com/SpiderLabs/KoreLogic-Rules, 2012.

[87] TSUKAYAMA, H. Evernote hacked; millions must change passwords. Washington Post, March 4, 2013. https://www.washingtonpost.com/8279306c-84c7-11e2-98a3-b3db6b9ac586 story.html.

[88] UR, B., KELLEY, P. G., KOMANDURI, S., LEE, J., MAASS, M., MAZUREK, M., PASSARO, T., SHAY, R., VIDAS, T., BAUER, L., CHRISTIN, N., AND CRANOR, L. F. How does your password measure up? The effect of strength meters on password creation. In Proc. USENIX Security (2012).

[89] UR, B., SEGRETI, S. M., BAUER, L., CHRISTIN, N., CRANOR, L. F., KOMANDURI, S., KURILOVA, D., MAZUREK, M. L., MELICHER, W., AND SHAY, R. Measuring real-world accuracies and biases in modeling password guessability. In Proc. USENIX Security (2015).

[90] VANCE, A. If your password is 123456, just make it hackme. New York Times, January 20, 2010. http://www.nytimes.com/2010/01/21/technology/21password.html.

[91] VERAS, R., COLLINS, C., AND THORPE, J. On the semantic patterns of passwords and their security impact. In Proc. NDSS (2014).

[92] WEIR, M., AGGARWAL, S., COLLINS, M., AND STERN, H. Testing metrics for password creation policies by attacking large sets of revealed passwords. In Proc. CCS (2010).

[93] WEIR, M., AGGARWAL, S., MEDEIROS, B. D., AND GLODEK, B. Password cracking using probabilistic context-free grammars. In Proc. IEEE Symp. Security & Privacy (2009).

[94] WHEELER, D. zxcvbn: Realistic password strength estimation. https://blogs.dropbox.com/tech/2012/04/zxcvbn-realistic-password-strength-estimation/, 2012.

[95] WHEELER, D. L. zxcvbn: Low-budget password strength estimation. In Proc. USENIX Security (2016).

[96] XUE, J., LI, J., YU, D., SELTZER, M., AND GONG, Y. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network. In Proc. ICASSP (2014).

[97] YOSINSKI, J., CLUNE, J., BENGIO, Y., AND LIPSON, H. How transferable are features in deep neural networks? In Proc. NIPS (2014).