https://github.com/zeinlol/repo-scraper
Repo Scraper is a tool designed to analyze projects for possible password leaks or other sensitive data exposure. It scans files and repositories to detect potentially dangerous elements such as passwords, IP addresses, and other confidential information. The tool also supports Git repositories by analyzing changes between commits.
The tool provides the following commands for operation:
check_dir (Directory Check)
This command traverses all files in the specified folder and its subdirectories. It applies regular expressions to search for passwords, IP addresses, and other sensitive information. To optimize the process and avoid scanning irrelevant files, several filters are applied:
File size: Files larger than 1 MB are ignored, but a warning is displayed.
File extension: Files with unsupported extensions are ignored, and a warning is displayed. (See the "Notes" section for details on why extensions are used instead of MIME types.)
Base64 data: If a file contains Base64-encoded data, that data is stripped from the file's contents before the regular expressions are applied, to avoid unnecessary processing.
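The filtering steps above can be sketched roughly as follows. This is an illustrative sketch, not the tool's actual implementation: the extension list, the Base64 length threshold, the two regular expressions, and the function names scan_text and check_dir (as a Python function) are all assumptions made for the example. Only the 1 MB limit comes from the description above.

```python
import os
import re

MAX_SIZE = 1024 * 1024  # 1 MB limit, as described above
# Hypothetical extension whitelist; the real tool's list may differ.
ALLOWED_EXTENSIONS = {".py", ".txt", ".json", ".yml", ".cfg"}
# Assumed heuristic: a run of 40+ Base64 characters counts as encoded data.
BASE64_RE = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")
# Simplified example patterns, not the tool's real rule set.
PATTERNS = {
    "password": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan_text(text):
    """Strip Base64-looking runs, then return (label, match) pairs."""
    cleaned = BASE64_RE.sub("", text)
    return [
        (label, m.group(0))
        for label, rx in PATTERNS.items()
        for m in rx.finditer(cleaned)
    ]


def check_dir(root):
    """Walk root recursively; return (findings, warnings)."""
    findings, warnings = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Filter 1: skip large files, but record a warning.
            if os.path.getsize(path) > MAX_SIZE:
                warnings.append(f"{path}: larger than 1 MB, skipped")
                continue
            # Filter 2: skip unsupported extensions, but record a warning.
            if os.path.splitext(name)[1].lower() not in ALLOWED_EXTENSIONS:
                warnings.append(f"{path}: unsupported extension, skipped")
                continue
            # Filter 3 (Base64 removal) happens inside scan_text.
            with open(path, encoding="utf-8", errors="replace") as fh:
                for label, match in scan_text(fh.read()):
                    findings.append((path, label, match))
    return findings, warnings
```

In this sketch, calling check_dir("path/to/project") returns the list of matches together with the warnings for skipped files, so the caller decides how to display them.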
check_repo (Git Repository Check)
The check_repo command works like check_dir but analyzes the changes recorded in a Git repository's commit history. Instead of scanning the full contents of every commit (which would be slow), the tool uses git diff to examine only the changes between commits.
Filters similar to those in check_dir are applied.
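The diff-based approach can be sketched as below. This is a hedged illustration under several assumptions, not the tool's actual code: the helper names (git, added_lines, scan_lines) are hypothetical, the patterns are simplified examples, only added lines are scanned, and commits are paired in the order printed by git rev-list, which does not cover the root commit or treat merges specially. The size, extension, and Base64 filters are omitted for brevity.

```python
import re
import subprocess

# Simplified example patterns, not the tool's real rule set.
PATTERNS = {
    "password": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def git(repo, *args):
    """Run a git command inside repo and return its standard output."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


def added_lines(diff_text):
    """Return lines added by a unified diff, without the leading '+'."""
    return [
        line[1:]
        for line in diff_text.splitlines()
        # '+++' marks the file header, not added content.
        if line.startswith("+") and not line.startswith("+++")
    ]


def scan_lines(lines):
    """Return (label, match) pairs found in the given lines."""
    return [
        (label, m.group(0))
        for line in lines
        for label, rx in PATTERNS.items()
        for m in rx.finditer(line)
    ]


def check_repo(repo):
    """Scan the diff between each pair of consecutive commits."""
    commits = git(repo, "rev-list", "--reverse", "HEAD").split()
    findings = []
    for older, newer in zip(commits, commits[1:]):
        diff = git(repo, "diff", older, newer)
        for label, match in scan_lines(added_lines(diff)):
            findings.append((newer, label, match))
    return findings
```

Scanning only the diff means each change is examined once, at the commit that introduced it, instead of being re-read in every later snapshot.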
Repo Scraper is distributed under the MIT License: https://github.com/zeinlol/repo-scraper/blob/master/LICENSE