Early Warning Detection for Credential Theft: Why Behavioral Analysis Fails

November 18, 2025 · 10 min read · Threat Research

External Discovery Remains the Norm

Mandiant's M-Trends 2025 report identifies a persistent detection gap: 57% of organizations discovered their 2024 breaches through external notifications. Fourteen percent learned of compromise when threat actors sent ransom notes, while 43% received notification from external entities such as law enforcement or cybersecurity companies.

These figures remain consistent with 2023 data. Despite continued investment in endpoint detection and response (EDR), network detection and response (NDR), and security information and event management (SIEM) platforms, the detection gap has not closed.

The resurgence of infostealer malware drives this pattern. Mandiant observed stolen credentials in 16% of intrusions in 2024, up from 10% in 2023. Underground credential markets trade stolen credentials from infostealer infections dating back years.

The Snowflake Breach: Four Years from Theft to Detection

The UNC5537 campaign against Snowflake customers demonstrates the timeline problem. Beginning in April 2024, this financially motivated threat actor accessed Snowflake instances at multiple organizations using credentials stolen from contractor and employee systems infected with infostealer malware.

Contractors used personal devices for both work and personal activities, including gaming and downloading pirated software. These systems became infected with VIDAR, RISEPRO, REDLINE, RACCOON STEALER, LUMMA, and METASTEALER. The infostealers harvested Snowflake credentials stored in browsers and password managers.

The oldest credential used traced back to an infostealer infection from November 2020. A credential stolen nearly four years earlier remained valid and undetected until the threat actor validated it in 2024.

Why Underground Markets Make Credential Theft Effective

Infostealers collect wide swaths of credentials from a single infected host: browser passwords, SSH keys, cloud credentials, VPN access, database passwords, API tokens, and session cookies.

Underground markets allow searching these massive collections for specific credential types based on operational goals. Actors targeting cloud infrastructure search for AWS access keys and Azure service principals. Those focused on data theft search for database credentials and file server access. Cryptocurrency mining operations search for cloud compute credentials.

The UNC5537 campaign demonstrates this targeting. Threat actors searched massive collections of infostealer logs specifically for Snowflake credentials. Among millions of harvested credentials from gaming sites and personal accounts, they identified and extracted the credentials that enabled their operation.

Contractors and employees use personal devices for work. Browser password synchronization spreads corporate credentials to personal systems. A single infected contractor device accessing systems across multiple client organizations exposes credentials for dozens of enterprises.

Why Behavioral Detection Struggles With Credential-Based Intrusions

Modern breaches follow two distinct paths with different detection requirements.

Malware-Based Intrusions

Threat actors deploy malicious payloads to enterprise endpoints. EDR detects process behavior, network connections, and file modifications. Security teams investigate and respond based on behavioral alerts. This detection model works when attacks occur on monitored endpoints within your perimeter.

Credential-Based Intrusions

The attack path differs:

  1. Infostealer harvests credentials outside enterprise perimeter
  2. Credentials circulate through underground markets
  3. Threat actor validates stolen credentials through legitimate authentication
  4. Post-authentication behavior mimics legitimate user activity

Behavioral detection faces challenges at two critical points.

Jurisdictional Limitations

Credential theft occurs on personal devices, contractor laptops, or third-party systems outside your EDR deployment. When credentials are harvested from a contractor's personal laptop infected while gaming or downloading pirated software, your EDR has no visibility. The infection happens outside your monitoring scope.

Behavioral Ambiguity

When threat actors use those credentials, they authenticate directly to cloud services, SaaS applications, or other systems through legitimate channels. The fundamental problem: these attacks look like normal user behavior.

The threat actor authenticates to Snowflake using valid credentials, queries databases, and downloads data, all through legitimate API calls and web interfaces. EDR on enterprise endpoints often never sees the activity.

If the threat actor accesses enterprise endpoints using stolen VPN credentials, EDR observes what appears to be normal user behavior: reading documentation, accessing repositories, searching collaboration platforms. These activities generate no anomalous signals when performed with valid credentials.

Supply chain compromise demonstrates both challenges. A malicious npm package executes on a developer's workstation and scans for credentials in configuration files and AWS credential stores, then exfiltrates to external services.

EDR observes processes reading files from home directories and making HTTPS requests to external services. Developers constantly read credential files. They upload content to remote services. The exfiltration payload measures in kilobytes. The destination might be a webhook developers use for notifications.

Distinguishing malicious credential harvesting from legitimate development work when the behaviors are identical creates significant detection challenges. Infostealers often establish no persistence mechanisms. They exfiltrate credentials and exit.

The 57% external discovery rate reflects how these detection challenges affect organizations broadly.

Give Adversaries Options

Underground markets give threat actors the advantage of choice. They search massive credential collections for specific targets. They select which credentials to validate. They choose their operational approach based on available access.

Early warning honey tokens shift this dynamic by giving adversaries options where all paths lead to detection.

You plant monitored credentials throughout your environment. When threat actors enumerate infostealer logs, they find your tokens alongside legitimate stolen credentials. AWS access keys in repositories. Database credentials in documentation. SSH keys in configuration files. API tokens in CI/CD pipelines.

They cannot distinguish your tokens from legitimate stolen credentials during enumeration. Both appear as valid targets in their infostealer logs. When they choose which credentials to validate before operational use, they select from options you planted.

Validating one of your tokens triggers an immediate alert, so detection happens before lateral movement.

The detection trigger is binary: this credential should never be used, yet someone used it. No behavioral analysis required. No tuning thresholds. This dramatically reduces false positives compared to behavioral detection, though tokens should be placed strategically in locations where legitimate discovery is unlikely.
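
The binary trigger can be sketched as a set-membership check. This is a minimal illustration under stated assumptions, not a product implementation: the `AuthEvent` fields, the `HONEY_TOKEN_IDS` set, and the token identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical identifiers of planted credentials. No legitimate user or
# service should ever authenticate with these.
HONEY_TOKEN_IDS = frozenset({"decoy-aws-key-001", "decoy-db-reporting"})


@dataclass(frozen=True)
class AuthEvent:
    """One authentication attempt (assumed log schema)."""
    key_id: str
    source_ip: str
    timestamp: str


def honey_token_alerts(events):
    """Return every event that used a planted credential.

    There is no scoring, baseline, or threshold: membership in the
    honey token set is the entire detection logic.
    """
    return [e for e in events if e.key_id in HONEY_TOKEN_IDS]


events = [
    AuthEvent("real-aws-key-417", "198.51.100.7", "2024-04-14T09:12:00Z"),
    AuthEvent("decoy-aws-key-001", "203.0.113.50", "2024-04-14T09:13:30Z"),
]
alerts = honey_token_alerts(events)
for alert in alerts:
    print(f"ALERT: {alert.key_id} used from {alert.source_ip} at {alert.timestamp}")
```

Because the check ignores what the caller does after authenticating, it does not matter whether the subsequent activity looks normal.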

Honey tokens address both detection challenges. They detect regardless of where credential theft occurs because detection happens at validation, not at theft. They detect regardless of whether behavior appears normal because the detection mechanism is credential-specific, not behavior-based.

Credential Diversity Matches Threat Actor Targeting

Threat actors search infostealer logs for specific credential types. AWS access keys for cloud targeting. Database credentials for data theft. SSH keys for lateral movement. API tokens for service access.

Honey token deployment must match this diversity. When threat actors search for AWS keys, they find your tokens mixed with legitimate stolen keys. When searching for database credentials, your tokens appear alongside real credentials.

Deploy only one credential type at scale and pattern recognition becomes possible. Deploy diverse credential types across realistic locations and individual credentials blend into normal credential sprawl during reconnaissance.
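
One way to reason about diversity is to plan placements so that each location gets a credential type that plausibly belongs there, while no single type dominates. The sketch below is an assumption-laden illustration: the `PLAUSIBLE_TYPES` mapping, the location names, and the least-used-first selection rule are hypothetical choices, not a prescribed method.

```python
from collections import Counter

# Which credential types plausibly appear in each kind of location.
# The pairings echo the examples in this article; the mapping is illustrative.
PLAUSIBLE_TYPES = {
    "repository": ["aws_access_key", "api_token"],
    "wiki": ["database_password"],
    "runbook": ["ssh_private_key"],
    "ci_pipeline": ["api_token", "aws_access_key"],
}


def plan_deployment(locations):
    """Pick, for each (name, kind) location, the plausible credential
    type that has been used least so far (ties go to the first listed)."""
    counts = Counter()
    plan = {}
    for name, kind in locations:
        choice = min(PLAUSIBLE_TYPES[kind], key=lambda t: counts[t])
        counts[choice] += 1
        plan[name] = choice
    return plan


locations = [
    ("billing-service repo", "repository"),
    ("data-platform wiki", "wiki"),
    ("oncall runbook", "runbook"),
    ("deploy pipeline", "ci_pipeline"),
    ("analytics repo", "repository"),
    ("release pipeline", "ci_pipeline"),
]
plan = plan_deployment(locations)
type_counts = Counter(plan.values())
```

With these six hypothetical locations, all four credential types appear and none is used more than twice.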

The Snowflake credentials stolen in 2020 and used in 2024 existed as long-lived access credentials. AWS access keys, database passwords, and SSH private keys planted in repositories, documentation, and configuration management systems follow the same pattern. These credentials persist until explicitly revoked.

Real-time monitoring begins at token issuance. The UNC5537 campaign used credentials from a 2020 infection validated in 2024. Organizations that deployed honey tokens in contractor-accessible documentation or configuration files in 2020 could have received alerts in 2024, provided those tokens were harvested alongside the real credentials and the threat actor validated them. The four-year gap between theft and usage becomes irrelevant. The credential exists. Someone validated it. Alert triggers.

When validation occurs, immediate correlation provides deployment context. Which repository contained the credential. Which documentation page. Which configuration file. Organizations achieve different detection precision based on deployment density. Sparse deployment across infrastructure provides organization-level alerts. Dense deployment across individual repositories and documentation sources provides asset-level precision showing exactly which resource was compromised.
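
The effect of deployment density on precision can be sketched with a placement registry. This assumes one reading of the paragraph above: in a sparse deployment a single token is shared across several locations, while in a dense deployment each location gets its own token. The `Placement` record, token IDs, and location names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Records that a given token was planted at a given location."""
    token_id: str
    location: str


def locate(token_id, placements):
    """Return every location where the validated token was planted."""
    return sorted(p.location for p in placements if p.token_id == token_id)


# Sparse: one token reused everywhere, so an alert narrows to the organization.
sparse = [
    Placement("tok-org", "repo-a"),
    Placement("tok-org", "repo-b"),
    Placement("tok-org", "wiki/db-setup"),
]

# Dense: one token per location, so an alert names the exact resource.
dense = [
    Placement("tok-001", "repo-a"),
    Placement("tok-002", "repo-b"),
    Placement("tok-003", "wiki/db-setup"),
]

sparse_hit = locate("tok-org", sparse)  # three candidate sources
dense_hit = locate("tok-002", dense)    # a single source
```

In the sparse case the alert confirms a compromise but leaves three candidate sources to investigate; in the dense case it points at one.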

Detection Before Lateral Movement

Threat actors validate stolen credentials before using them. They test the credential to confirm it still works before attempting lateral movement or data access. This validation step is the detection opportunity.

Honey tokens deploy into repositories, documentation, and configuration files where credentials naturally exist. AWS keys in config files. Database passwords in wikis. SSH keys in runbooks. When threat actors enumerate these locations, either from infostealer logs or by searching after initial access, they find tokens mixed with legitimate credentials.

From the adversary's perspective, a honey token is indistinguishable from a real credential. Same format, same location, same context. When they validate it to check if it works, the alert fires immediately.

The detection happens at credential validation, before any operational impact. Before lateral movement. Before data access. Before persistence. This timing advantage persists regardless of how long the credential sat in underground markets: four years or four minutes makes no difference.
