GitHub Cuts Secret Scanning False Positives by 75.76% Using LLM Context

github.blog · @threat_watch · 3 hours ago · Cybersecurity

GitHub's collaboration with Microsoft Security & AI reduced false positive alerts by over 75%, exceeding their 65% target, by adding LLM-based contextual verification to secret scanning.

Tags: github, secret scanning, microsoft security, llm, ai security, devsecops

GitHub's collaboration with Microsoft Security & AI's Agents Offense team cut false positives by 75.76% on hundreds of customer-confirmed alerts. The gain came from adding LLM-based contextual reasoning to secret scanning verification.

Why a file-level snippet beats a full codebase

Secret scanning already scans billions of pushes with pattern-based and AI-powered detection. But pattern matching can't tell whether a value is actually used as a credential or merely looks like one. GitHub's fix: extract high-signal usage context from the file where the candidate secret appears (variable assignments, API call parameters, authentication headers, database client initializations) and feed only that to the verification LLM.
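The extraction step can be illustrated with a minimal sketch. GitHub has not published its extraction rules, so everything here is an assumption for illustration: the function name `extract_usage_context`, the regular expressions, the labels, and the sample file with its dummy token. The sketch keeps only the lines that hold the candidate value or reference the variable it is assigned to, and tags each with a rough usage kind.

```python
import re

# Hypothetical heuristics; the real extraction rules are not public.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=")
USAGE_LABELS = {
    "auth_header": re.compile(r"authorization|bearer|x-api-key", re.IGNORECASE),
    "client_init": re.compile(r"\b\w*(connect|client)\w*\s*\(", re.IGNORECASE),
}


def extract_usage_context(source: str, candidate: str) -> list[dict]:
    """Return only the lines that hold the candidate or use its variable."""
    lines = source.splitlines()

    # Collect the names of variables the candidate value is assigned to.
    names = set()
    for line in lines:
        match = ASSIGNMENT.match(line)
        if match and candidate in line:
            names.add(match.group(1))

    context = []
    for number, line in enumerate(lines, start=1):
        holds_value = candidate in line
        uses_name = any(
            re.search(rf"\b{re.escape(name)}\b", line) for name in names
        )
        if not (holds_value or uses_name):
            continue  # unrelated line: leave it out of the LLM input
        labels = [
            label for label, pattern in USAGE_LABELS.items()
            if pattern.search(line)
        ]
        if holds_value and ASSIGNMENT.match(line):
            labels.insert(0, "assignment")
        context.append({"line": number, "text": line.strip(), "labels": labels})
    return context


# Dummy value and file, made up for this example.
CANDIDATE = "tok_example_not_a_real_secret"
SAMPLE = '''import requests

API_TOKEN = "tok_example_not_a_real_secret"
DEBUG = True

def fetch():
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    return requests.get("https://api.example.com/items", headers=headers)
'''

for entry in extract_usage_context(SAMPLE, CANDIDATE):
    print(entry)
```

Of the eight lines in the sample file, only the assignment and the authentication header reach the verifier, which is the kind of focused input the article describes.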

More data doesn't help: sending entire files or repos adds noise and cost. Focused context does help. With it, the system filters out placeholders, test data, and unused config without deeper analysis, which keeps latency low and accuracy high at GitHub's scale.

Results that make developers stop ignoring alerts

The target was a 65% false-positive reduction. The actual result was 75.76%, exceeding the goal by 10.76 percentage points. Every percentage point matters when you're processing pushes across millions of repos and tens of millions of developers. Fewer irrelevant alerts mean a higher signal-to-noise ratio, faster triage, and faster remediation of real exposures.
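The arithmetic behind these figures is simple enough to check in a few lines. Only the 65% target and the 75.76% result come from the source. The alert counts below are hypothetical, chosen purely because they reproduce the reported percentage; the article says only that the evaluation covered "hundreds" of alerts.

```python
def false_positive_reduction(before: int, after: int) -> float:
    """Percentage reduction in false positives, from raw alert counts."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


TARGET = 65.0     # from the article
ACHIEVED = 75.76  # from the article

# Gap between result and target, in percentage points.
margin_points = round(ACHIEVED - TARGET, 2)

# Hypothetical counts (NOT from the source): 330 false positives
# before verification, 80 remaining afterwards.
illustrative = round(false_positive_reduction(330, 80), 2)

print(f"margin over target: {margin_points} percentage points")
print(f"illustrative reduction: {illustrative}%")
```

Note the distinction between percentage points and percent: the result beats the target by 10.76 points, which is a larger figure when expressed relative to the 65% target.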

GitHub's approach builds directly on the existing detection pipeline: detection generates candidates, and verification evaluates them using the extracted usage context. No changes to upstream detection logic, no loss of coverage.
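A rough sketch of that two-stage shape, under stated assumptions: the token pattern, the `Candidate` type, and the keyword-based `placeholder_verifier` are all hypothetical. In particular, the verifier is a simple stand-in for the LLM call, not GitHub's implementation. The point is structural: detection runs unchanged, and verification only decides which candidates become alerts.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    line: int
    value: str
    context: str  # the line the value appears on


# Hypothetical detector pattern, made up for this example.
TOKEN_PATTERN = re.compile(r"\b(?:tok|key)_[A-Za-z0-9_]{12,}\b")


def detect(source: str) -> list[Candidate]:
    """Stage 1: pattern-based detection. Untouched by the verification stage."""
    return [
        Candidate(number, match.group(0), line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        for match in TOKEN_PATTERN.finditer(line)
    ]


def placeholder_verifier(candidate: Candidate) -> bool:
    """Stage 2 stand-in: a keyword check where the LLM verdict would go."""
    lowered = candidate.context.lower()
    markers = ("example", "placeholder", "dummy", "test")
    return not any(marker in lowered for marker in markers)


def run_pipeline(
    source: str,
    verify: Callable[[Candidate], bool] = placeholder_verifier,
) -> tuple[list[Candidate], list[Candidate]]:
    """Return (all candidates, candidates that survive verification)."""
    candidates = detect(source)
    alerts = [c for c in candidates if verify(c)]
    return candidates, alerts


# Made-up input with one documentation placeholder and one live-looking value.
SOURCE = '''# docs: use a value like tok_example_value_0000 here
API_KEY = "key_9f8a7b6c5d4e3f2a"
client = connect(api_key=API_KEY)
'''

candidates, alerts = run_pipeline(SOURCE)
print(len(candidates), "candidates,", len(alerts), "alert(s)")
```

Because `verify` is a parameter, the detector never needs to know how verification works, and swapping in a different verifier leaves the candidate list untouched.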

What's next for the verification pipeline

The team is now evaluating this approach on larger datasets and live traffic, and refining how context extraction works. The goal is straightforward: fewer distractions, clearer signals, faster action on real risks. If this scale of false-positive reduction holds in production, it could change how seriously teams treat secret scanning alerts.


Source: Making secret scanning more trustworthy: Reducing false positives at scale
Domain: github.blog
