Several collections of login credentials reveal one of the largest data breaches in history, totaling a humongous 16 billion exposed login credentials. The data most likely originates from various infostealers.

This story, based on unique Cybernews findings and originally published on the website on June 18, is constantly being updated with clarifications and additional information in response to public discourse. The most recent version of the article features comments from Cybernews researcher Aras Nazarovas and Bob Diachenko who unveiled this recent data leak. We’ve also added a few screenshots as proof of the leak.

Key takeaways:

The largest data breach in history involves 16 billion login credentials
The records are scattered across 30 different databases, and some records are or might be overlapping
The data most likely comes from various infostealers
The data is recent, not merely recycled from old breaches
Cybercriminals now have unprecedented access to personal credentials and could exploit them for account takeovers, identity theft, and targeted phishing attacks

Unnecessarily compiling sensitive information can be as damaging as actively trying to steal it. For example, the Cybernews research team discovered a plethora of supermassive datasets, housing billions upon billions of login credentials. From social media and corporate platforms to VPNs and developer portals, no stone was left unturned.

Our team has been closely monitoring the web since the beginning of the year. So far, they’ve discovered 30 exposed datasets containing from tens of millions to over 3.5 billion records each. In total, the researchers uncovered an unimaginable 16 billion records.

None of the exposed datasets were reported previously, bar one: in late May, Wired magazine reported a security researcher discovering a “mysterious database” with 184 million records. It barely scratches the top 20 of what the team discovered. Most worryingly, researchers claim new massive datasets emerge every few weeks, signaling how prevalent infostealer malware truly is.

“This is not just a leak – it’s a blueprint for mass exploitation. With over 16 billion login records exposed, cybercriminals now have unprecedented access to personal credentials that can be used for account takeover, identity theft, and highly targeted phishing. What’s especially concerning is the structure and recency of these datasets – these aren’t just old breaches being recycled. This is fresh, weaponizable intelligence at scale,” researchers said.

The only silver lining here is that all of the datasets were exposed only briefly: long enough for researchers to uncover them, but not long enough to find who was controlling vast amounts of data. Most of the datasets were temporarily accessible through unsecured Elasticsearch or object storage instances.

What do the billions of exposed records contain?

Researchers claim that most of the data in the leaked datasets is a mix of details from stealer malware, credential stuffing sets, and repackaged leaks.

There was no way to effectively compare the data between different datasets, but it’s safe to say overlapping records are definitely present. In other words, it’s impossible to tell how many people or accounts were actually exposed.

However, the information that the team managed to gather revealed that most of the information followed a clear structure: URL, followed by login details and a password. Most modern infostealers – malicious software stealing sensitive information – collect data in exactly this way.

Information in the leaked datasets opens the doors to pretty much any online service imaginable, from Apple, Facebook, and Google, to GitHub, Telegram, and various government services. It’s hard to miss something when 16 billion records are on the table.

According to the researchers, credential leaks at this scale are fuel for phishing campaigns, account takeovers, ransomware intrusions, and business email compromise (BEC) attacks.

“The inclusion of both old and recent infostealer logs – often with tokens, cookies, and metadata – makes this data particularly dangerous for organizations lacking multi-factor authentication or credential hygiene practices,” the team said.

What dataset exposed billions of credentials?

The datasets that the team uncovered differ widely. For example, the smallest, named after malicious software, had over 16 million records. Meanwhile, the largest one, most likely related to the Portuguese-speaking population, had over 3.5 billion records. On average, one dataset with exposed credentials had 550 million records.

Some of the datasets were named generically, such as “logins,” “credentials,” and similar terms, preventing the team from getting a better understanding of what’s inside. Others, however, hinted at the services they’re related to.

For example, one dataset with over 455 million records was named to indicate its origins in the Russian Federation. Another dataset, with over 60 million records, was named after Telegram, a cloud-based instant messaging platform.

“The inclusion of both old and recent infostealer logs – often with tokens, cookies, and metadata – makes this data particularly dangerous for organizations lacking multi-factor authentication or credential hygiene practices,” the team said.

While naming is not the best way to deduce where the data comes from, it seems some of the information relates to cloud services, business-oriented data, and even locked files. Some dataset names likely point to a form of malware that was used to collect the data.

It is unclear who owns the leaked data. While it could be security researchers that compile data to check and monitor data leaks, it’s virtually guaranteed that some of the leaked datasets were owned by cybercriminals. Cybercriminals love massive datasets as aggregated collections allow them to scale up various types of attacks, such as identity theft, phishing schemes, and unauthorized access.

A success rate of less than a percent can open doors to millions of individuals, who can be tricked into revealing more sensitive details, such as financial accounts. Worryingly, since it’s unclear who owns the exposed datasets, there’s little impact users can do to protect themselves.

However, basic cyber hygiene is essential. Using a password manager to generate strong, unique passwords, and updating them regularly, can be the difference between a safe account and stolen details. Users should also review their systems for infostealers, to avoid losing their data to attackers.

No, Facebook, Google, and Apple passwords weren’t leaked. Or were they?

With a dataset containing 16 billion passwords, that’s equivalent to two leaked accounts for every person on the planet.

We don’t really know how many duplicate records there are, as the leak comes from multiple datasets. However, some reporting by other media outlets can be quite misleading. Some claim that Facebook, Google, and Apple credentials were leaked. While we can’t completely dismiss such claims, we feel this is somewhat inaccurate.

Bob Diachenko, a Cybernews contributor, cybersecurity researcher, and owner of SecurityDiscovery.com, is behind this recent major discovery.

CYBERNEWS

16 billion passwords exposed in record-breaking data breach

What do the billions of exposed records contain?

What dataset exposed billions of credentials?

No, Facebook, Google, and Apple passwords weren’t leaked. Or were they?