August 5th: Gartner Webinar - How PKI and Machines make the Digital World Go Round

Register Now
Close
New research identifies that 1 in every 172 active RSA certificates are vulnerable to compromise or attack.

Researchers JD Kilgallin and Ross Vasko from Keyfactor analyzed more than 75 million active RSA keys across the Internet, discovering that 1 in every 172 digital certificates using these keys are vulnerable to an attack known as ‘factoring.’

How does it work?

RSA public keys are the product of two large, randomly generated prime factors. If you collect public keys from the Internet and find any two that share the same factor, you can crack the corresponding private key. In this case, researchers were able to break nearly 250,000 keys corresponding to about 435,000 digital certificates.

Most of these vulnerable certificates were found on network appliances such as routers and firewalls, and emerging IoT devices. As more and more of these devices are connected to your network, the risk of poorly generated RSA keys increases the likelihood of exposure.

Introduction

As devices are joined to the Internet and other networks, the potential for network vulnerabilities increases. The recent growth of the Internet of Things (IoT) is driving an expansion in both scope and quantity of network-connected devices. These devices are also now seen in increasingly sensitive settings, such as in operating rooms and automobiles. Consumers are also transmitting increasingly sensitive information over the Internet, including financial and personal health data. The necessity for rigorous security standards is especially important given these elements of the current landscape.

Modern security best practice is to use Public-Key Cryptography to securely transmit data to a remote source – any party can encrypt data with the public portion of the remote source’s key, which can then only be decrypted by the remote source’s private key. The RSA algorithm has long served as one of the most popular encryption techniques for this encryption measure. The security of RSA relies on the inability of another party to determine two randomly-chosen prime numbers from which the RSA public key is derived. If these prime factors are discovered, the RSA private key can be re-derived, and an attacker can impersonate the remote source or decrypt stored communications that rely on the confidentiality of the private key.

In an attack that has already received a significant amount of attention, researchers are able to efficiently compute these prime factors for certain RSA public keys that have been generated with poor randomness, as shown in [1], [2], and [3]. Large numbers of RSA public keys can be acquired from many sources, and these can be mined for common factors between the keys. Each key that is found to have a common factor with another key in the dataset is compromised.

We reexamine this attack and its implications in the current landscape of ubiquitous and sensitive communications between users and devices. In 2019, with a large number of devices on the Internet and in other data sets like Certificate Transparency (CT) logs, this attack presents a serious threat if proper precautions are not in place. As the number of keys grows, it is more likely that weakly generated factors in RSA public keys will be discovered. Coupled with the availability of cheap computing resources and sensitivity of communications, the attack is as potent as ever.

As part of an ongoing effort to continuously monitor and improve Internet security, we outline an efficient, highly scalable approach to testing the vulnerability of public RSA keys in use. We use the SSL/TLS certificate discovery capability of the Keyfactor platform to build a database of 60 million RSA keys used on the Internet. We then augment this dataset with 100 million certificates available through Certificate Transparency logs. This data is analyzed on a single virtual machine in the Microsoft Azure cloud, using our scalable GCD algorithm for shared factors adapted from [1]. The analysis reveals that at least 435,000 weak certificates – 1 in 172 of the certificates we found on the Internet – are vulnerable to this attack.

435,000 weak certificates – 1 in 172 of the certificate we found on the Internet – are vulnerable to attack.

JD Kilgallin

Background

The Attack

RSA is used in in the process of encrypting data to send across a network. The server transmits its RSA public key to the client as a part of an SSL or TLS handshake. Part of the RSA public key contains the modulus n = p * q, where p and q are two randomly chosen primes of similar size. Primes p and q are kept secret, as knowing these values allows the private key to be calculated.

Ensuring that p and q are selected with sufficient randomness is a crucial component of keeping the public key secure. Factoring a large modulus n to obtain p and q is not feasible under normal circumstances. However, if keys are generated with poor randomness, then it becomes a concern that two public keys will share a factor once enough keys are generated. If two RSA moduli n1 and n2 share precisely one prime factor p, then computing the Greatest Common Divisor (GCD) of n1 and n2 will reveal the value of p. The GCD computation is significantly easier than straightforward factoring, and can easily be performed in practice. The other factors of n1 and n2 can then trivially be found by the simple calculations n1 ⁄ p and n2 ⁄ p, respectively, compromising both keys. This GCD computation can be scaled to analyze all pairs of keys in sub-quadratic time in the number of keys [1].

Selecting the prime factors of appropriate size with uniform randomness should prevent two moduli from ever sharing a factor in practice. However, if there is a flaw in the random number generation when choosing primes, a collision is likely in a sufficiently large dataset. Attackers can use this knowledge to collect a large number of RSA public keys and then look for GCDs between their moduli to search for factors shared by any pair.

Previous research teams have performed large scans on the Internet to demonstrate and investigate the implications of this attack. The attack received significant attention in 2012 and researchers were able to break tens of thousands of keys. There was a follow up on the attack in 2016 on a significantly larger data set, with a focus on trends in occurrences of weak keys from various vendors. We focused on three previous publications [1], [2], and [3] in developing our analysis.

Lenstra 2012

The authors of [2] collected approximately 6.2 million digital certificates across the Internet and found that approximately 4.3% of these certificates fully share their RSA modulus with another. The authors clustered the certificates together by modulus to further examine this duplication. Notably, single clusters of sizes 16489, 8366, 6351, and 5055 were found, and 58913 of the clusters have precisely two certificates. Additionally, they found that 12934 of the distinct RSA moduli were able to be factored. The authors found 1200 factors that are shared by more than one modulus. The most widely used individual factors were shared by up to 4627 certificates.

Heninger 2012

In [1], researchers collected 5.8 million unique TLS certificates and 6.2 million unique SSH host keys. Between these two collections, they found 11 million distinct RSA moduli and were able to factor 16,717 distinct public keys. This led to breaking 23,576 (.4%) of their TLS certificates and 1,013 (.02%) of the RSA SSH host keys.

The authors investigated the causes of moduli sharing factors by examining random number generation implementations. A major issue in the flawed random number generation was traced back to a low amount of entropy in the random number pool when the keys were generated. This issue most commonly arises on lightweight devices that do not have much input from which to gain entropy.

This analysis found that only two moduli on publicly trusted certificates could be factored, and both certificates had expired. The large majority of the broken keys came from network devices such as routers, modems, or firewalls.

Hastings 2016

The 2012 results of [1] were reexamined in 2016 in [3] with an emphasis on examining how vendors and end-users responded to the vulnerabilities. The authors examine 81 million distinct RSA keys and were able to factor 313,000 keys (.37%). They also examine trends in the number of weak keys from various companies. Since 2012, a significant number of new devices from companies including Huawei, D-Link, and ADTRAN were found to have vulnerable RSA keys.

Additionally, it was exceedingly rare for end-users to patch their devices to fix vulnerabilities found during the 2012 study. The authors note the “distressing implications” this has for the IoT era, as the number of devices on a network is rapidly increasing.

The Rise of IoT and Cloud Computing

Despite the large number of keys broken by this attack previously, it is unlikely that a key that has been properly generated with a sufficient amount of entropy could be broken with this technique. However, the requirement of sufficient entropy may not always be met when generating keys. In particular, some IoT devices may not have enough entropy available to generate keys without an external source. Major websites can prove vulnerable to this flaw as well, as the public and private keys used with their SSL/TLS certificates are generated by the website owner or administrator, and not by the CA that signs the certificate to make it publicly trusted. This means that the process the site operators used to generate their keys is opaque to the CA, and in general they cannot analyze the submitted public key for poor generation practices.

As growth of the Internet and the IoT has continued, network sizes have grown significantly and Internet-connected devices are appearing in new places. Security is a major concern in this landscape because of the increasing amount of data handled by these devices and the introduction of connected devices in sensitive settings such as operating rooms and vehicles. Compromising any device on these critical networks could results in catastrophic failure if proper precautions are not taken. Similarly, consumers are increasingly relying on publicly available web services to handle sensitive data and perform high-impact operations, such as financial transactions, transmission and storage of valuable intellectual property, and applications for credit or other services. The potential for an attacker to obtain this sensitive information or perform fraudulent operations can significantly impact consumers.

The rise of IoT technologies and cloud-computing resources since the initial publication of this technique makes it more dangerous in today’s landscape for several reasons:

The IoT is Comprised Mostly of Lightweight Devices

There is nothing inherently insecure about how such a device would generate an RSA key, but [1] found that lightweight devices are primarily at risk of this attack due to their low entropy states. Entropy in a device is required to prevent the random number generation from being predictable. Researchers were able to find deterministic “random” output when removing entropy. Lightweight IoT devices are particularly prone to being in low entropy states due to the lack of input data they might receive, as well as the challenge of incorporating hardware-based random number generation economically. Keys generated by lightweight IoT devices are therefore at risk of not being sufficiently random, increasing the chance that two keys share a factor and allow the key to be broken. The authors of [1] found that most of the keys that broken were from “low-resource” devices. Only two keys on two certificates were publicly trusted, and both of these certificates had expired.

The Attack Becomes More Successful When More Prime Factors Are Analyzed

As the number of discovered certificates grows, the number of certificate pairs to analyze for shared factors increases quadratically. The IoT landscape has drastically increased the number of devices on networks and this trend will only continue. Estimates and projections vary considerably, but Gartner predicts 20 billion devices will be deployed on the Internet within the next year [4].

Devices are in New and More Critical Environments

Compromising an RSA key has much more potential to be catastrophic in 2019. An RSA key being compromised now means more than personal or enterprise data being compromised. Critical real-time environments such as operating rooms, automobiles, industrial control devices, and home security systems now operate using RSA keys. Physical property and lives are therefore now at risk with RSA keys being compromised.

IoT Devices are More Difficult to Patch

Patching many IoT devices across a network can be tedious for a user. There can be many independent devices for a user to consider and manage, often without centralized automation across the devices. In some cases – where original device design did not support patching, or where the device is inaccessible – patching may be impossible. Additionally, due to the long life span of some IoT devices, it is possible that vulnerabilities can be found on live devices that are no longer being actively supported by the manufacturer. This means that a vulnerability to key compromise can be exploitable for much longer, and more difficult to remediate.

Computing and Scanning Resources are More Readily Accessible

Cloud services are readily available for rent to allow an adversary to both acquire and analyze the data efficiently. The only obstacle that an adversary truly faces in this attack is the implementation details. The attack itself has been well studied and understood, and the resources needed to execute it are now readily available at modest cost as well. To illustrate, development and computation resources spent for this study, excluding data acquisition, totaled less than $3000. Furthermore, pre-collected datasets of certificates and keys are available for purchase or even for free download, which can reduce the time and cost for an attacker to perform this computation.

For the above reasons, we feel it is necessary to reevaluate the implications of this attack in a modern landscape. As in previous works, the goal of this study is to provide an awareness of this attack and improve Internet security overall. Given the nature of the research, it is necessary to use real-world data for our analysis. However, as the results produce compromised keys used to secure real-world communications, we do not provide details on broken keys beyond high-level statistical summaries and we do not retain the broken keys past their use for this project.

Results

When analyzing the 45 million moduli from our 2015-2017 Internet scans, along with the first million low primes, we were able to break 192,709 keys. These 192,709 broken keys corresponded to 344,055 distinct certificates in the original dataset, or 0.56%. Twelve certificates failed signature validation when correlating the broken key with the original certificate. These appear to represent minor variations of legitimate certificates that occur in our dataset but were not compromised. The variants all produced an invalid public key, in that it is not a product of two primes, which causes the signature check to fail. These certificates were all CA certificates obtained by multiple ICI scans, so we attribute this to file corruption in the certificate chain on the hosting server.

When incorporating the results of the 2019 ICI scan and the 100 million certificates from CT logs, we were able to compromise 249,553 distinct keys corresponding to 435,694 certificates. Of these results, only five certificates are publicly-trusted certificates found in CT logs, including one domain with two compromised certificates, issued by different CAs. As of this writing, none of these certificates remain in use online. The remainder of the certificates are self-signed, privately-rooted, or device certificates. Excluding the CT logs, these numbers indicate that around 0.58% of certificates online during the scanning period are vulnerable to this attack. Statistical summaries are given in Table 1.

RSA Weak Certificates

It is interesting to note that some of the broken moduli are used by considerably more than one certificate, with the average number of certificates per key being 1.75. The most commonly occurring modulus value was used by 1737 certificates, and 19 moduli were used by more than 300 certificates. The usage statistics for individual factors are even more concentrated, and Table 2 gives the top ten most used factors. These ten factors together, while less than .01% of the factors across all keys, account for more than 6% of the broken certificates. To reduce the odds of causing harm by helping others to re-derive real private keys, we do not provide any association between these factors and keys or certificates that used them. These factors have also been redacted to prevent abuse while giving manufacturers the opportunity to recognize if they’ve generated a common weak factor; the first and last 8 bytes are provided to facilitate this.

RSA Weak Certificates 2

Of these top factors, eight appear to represent factors of an appropriate length for a 1024-bit RSA key and two appear to represent factors of a 512-bit RSA key. Across all reused factors in a typical range for each common key size, the distribution is given in Table 3. In addition, one 2204-bit factor was shared by two keys, along with 16 factors of 13 bits or less found by including the low primes.Of these top factors, eight appear to represent factors of an appropriate length for a 1024-bit RSA key and two appear to represent factors of a 512-bit RSA key. Across all reused factors in a typical range for each common key size, the distribution is given in Table 3. In addition, one 2204-bit factor was shared by two keys, along with 16 factors of 13 bits or less found by including the low primes.

RSA Weak Certificates 3

As with previous results in this field, we are able to determine some information about the usage of the broken certificate by looking at its subject. Of these certificates, for example, 217,988 – almost exactly 50% of compromised certificates – contain the name of a large network equipment manufacturer previously notified as a result of [1] and [3]. This includes 66,939 certificates discovered in the 2019 ICI scan, indicating that the manufacturer and/or consumers of their products continue to have issues with security in the field. In fact, many of these devices used the same key. Unsurprisingly for reasons already discussed, at least 12 of the 20 most common certificate subjects in our dataset appear to represent consumer IoT devices. In some cases, we were able to determine the exact model of device represented. Other common subjects include very simple IP addresses like “CN=192.168.1.1” (25,075 certificates) and subjects that appear to have been taken from examples without modification, such as “[email protected], CN=NOHOSTNAME.NODOMAINNAME, O=ANY COMPANY, L=ANY LOCALITY, ST=ANY STATE, C=US” (242 certificates). As we have not been able to identify or notify all manufacturers, we do not provide further information about specific subjects.

The fact that a quarter million keys from the past five years are vulnerable to this attack, even for an attacker with limited resources, is concerning. A party with a re-derived private key for an SSL/TLS server certificate can impersonate that entity, and network clients attempting to connect to that endpoint cannot distinguish the attacker from the legitimate holder of the certificate. This leads to a risk that an unsuspecting client will transmit sensitive information, such as login credentials or financial information, to the attacker, who can then use this information to cause a variety of adverse effects. In today’s landscape where these clients may represent automobiles, medical implants, or other critical devices connecting to a backend system, the impersonated service may cause the device to malfunction and cause physical harm. Furthermore, in some cases, the private key can be used to decrypt stored communications to a server long after those communications have taken place. This risk can be eliminated by encrypting a session with a cryptographic standard that employs forward secrecy, a property that ensures prior communications are not disclosed simply by key compromise. Although the value these communications may provide to the attacker tends to decrease with age of the communications, a large amount of information transmitted without forward secrecy may be exploited by a party with the resources to obtain and store the encrypted messages.

One Internet security company that monitors the cryptographic features offered by servers finds that only about 50% of servers during the time period of the ICI scans offered forward secrecy, leaving the other half vulnerable to compromised keys [8]. While the situation has improved since – in particular, TLS 1.3 requires forward secrecy – this older traffic is all vulnerable to disclosure. It is worth noting that even RSA keys that were well-generated and not currently compromised will likely be broken in the future with the advent of practical quantum computers, for which integer factoring is an easy problem. Other cryptographic systems such as Eliptic-Curve Cryptography (ECC) are also vulnerable to compromise by quantum computers. It is also worth noting that, while the particular attack described here is specific to RSA keys, ECC does still rely on secure random number generation and other attacks could still potentially exploit a lack of entropy in key generation to compromise an ECC key.

Conclusion

With modest resources, we were able to obtain hundreds of millions of RSA keys used to protect real-world traffic on the Internet. Using a single cloud-hosted virtual machine and a well-studied algorithm, over 1 in 200 certificates using these keys can be compromised in a matter of days. These weak keys expose users to a wide variety of harm, from disclosure of confidential information to failures of safety-critical systems. This vulnerability is only one of many potential threats that can cause a key that appears secure – or that was once secure – to be unexpectedly compromised. These concerning findings highlight the need for device manufacturers, website and network administrators, and the public at large to consider security, and especially secure random number generation, as a paramount requirement of any connected system. Systems should attempt to use security best practices from inception, and in any case must have the ability to securely update both software and cryptography to protect against risks like this.

References

[1] N. Heninger, Z. Durumeric, E. Wustrow, and J. A. Halderman, “Mining your Ps and Qs: detection of widespread weak keys in network device,” Proceedings of the 21st USENIX Security Symposium, August 2012.

[2] A. K. Lenstra, J. P. Hughes, M. Augier, J. W. Bos, T. Kleinjung, and C. Wachter, “Ron was wrong, Whit is
right,” 32nd International Cryptology Conference, CRYPTO ’12, August 2012.

[3] M. Hastings, J. Fried, and N. Heninger, “Weak keys remain widespread in network devices,” Proceedings of the 2016 Internet Measurement Conference (IMC ’16), ACM, New York, NY, USA, 49-63.

[4] Gartner, “Gartner says 8.4 billion connected “Things” will be in use in 2017, up 31 percent from 2016”. https://www.gartner.com/en/newsroom/press-releases/2017-02-07-gartner-says-8-billion-connected-things-will-be-in-use-in-2017-up-31-percent-from-2016.

[5] Certificate Transparency, https://www.certificate-transparency.org/.

[6] RSA Laboratories, “RSA Numbers”, https://www.ontko.com/pub/rayo/primes/rsa_fact.html.

[7] B. Cloostermans, “Quasi-linear GCD computation and factoring RSA moduli”, August 2012. https://pdfs.semanticscholar.org/4124/edde77caeecbbee378ccdd12ffbaa2e9f322.pdf.

[8] Qualys, “SSL Pulse,” https://www.ssllabs.com/ssl-pulse/.

Find out how the Keyfactor platform can modernize your PKI, prevent
certificate outages, accelerate DevOps security, and more.