Over 100,000 GitHub Repos Have Leaked API or Cryptographic Keys (zdnet.com)

Posted by msmash on Friday March 22, 2019 @02:42AM from the security-woes dept.

A scan of billions of files from 13 percent of all GitHub public repositories over a period of six months has revealed that over 100,000 repos have leaked API tokens and cryptographic keys, with thousands of new repositories leaking new secrets on a daily basis. From a report: The scan was the object of academic research carried out by a team from the North Carolina State University (NCSU), and the study's results have been shared with GitHub, which acted on the findings to accelerate its work on a new security feature called Token Scanning, currently in beta. The NCSU study is the most comprehensive and in-depth GitHub scan to date and exceeds any previous research of its kind. NCSU academics scanned GitHub accounts for a period of nearly six months, between October 31, 2017, and April 20, 2018, and looked for text strings formatted like API tokens and cryptographic keys.

2 of 52 comments (clear)

Min score:

Reason:

Sort:

Re:A "scan and ban" function? by tepples · 2019-03-22 02:51 · Score: 3, Insightful

I'm interested in the algorithm that you propose that GitHub use to determine whether a 32-character alphanumeric string embedded in the source code is an API key or something else.
but by alessi_brand · 2019-03-22 03:17 · Score: 3, Insightful

How do they differentiate bogus keys from real keys? In my projects I deliberately include keys that are valid, but won't get you into anything but 'local' applications running with no sensitive data. There are plenty of valid reasons (integration tests, clone-and-run dev applications, etc) to have 'valid' but practically useless keys in github.