May 14, 2024

Introducing Meta-Detector

Written by Joe Sullivan
Penetration Testing Research

In this blog post, I’m going to discuss a new Open-Source Intelligence (OSINT) tool I created to assist with collecting information about target organizations during penetration testing engagements. I call it Meta-Detector.

Lately, I've noticed that a number of automated tools for the OSINT gathering process, specifically Google Dorking, have been less than efficient, or not yielding any results at all. This is what led me to the creation of Meta-Detector.

Google Dorking

Before I get into discussing Meta-Detector, let me explain what Google Dorking is, just in case you are not familiar with the term.

Google Dorking refers to the use of search techniques and specific search queries (known as "Google Dorks") to locate hidden or sensitive information that may not be easily accessible through the usual search methods. It involves utilizing Google's search capabilities to find specific types of content or data that may be inadvertently exposed or publicly available on the Internet. 

Google Dorking relies on operators and search modifiers that allow users to refine and narrow down search results based on specific criteria. Some common Google Dorking techniques include:

  • File Type Searches: Using the "filetype" operator to search for specific file types, such as PDFs, spreadsheets, or configuration files. For example, "filetype:pdf site:example.com" searches for PDF files on the example.com domain.
  • Site Searches: Using the "site" operator to search within a specific website or domain. For example, "site:example.com" restricts the search results to the example.com domain.
  • Intitle and inurl Searches: Using the "intitle" and "inurl" operators to search for specific words or phrases in the title or URL of webpages. For example, "intitle:index of /" searches for webpages with "index of" in the title, often revealing directory listings.
  • Authentication Bypass: Using specific search queries to identify websites or web applications with known vulnerabilities or misconfigurations that may allow unauthorized access. For example, "inurl:/admin/login" searches for login pages within website URLs.
  • Sensitive Information Disclosure: Using search queries to identify sensitive information such as usernames, passwords, credit card numbers, or confidential documents that may have been inadvertently exposed online. For example, "intext:password filetype:txt" searches for text files containing the word "password."
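To make the operators above concrete, here is a minimal Go sketch that turns a few dork queries into Google search URLs. The `dorkURL` helper and the choice to append `site:` to every query are my own illustration, not code from any particular tool; the only thing it relies on is Google's standard `search?q=` URL and Go's `net/url` package for encoding.

```go
package main

import (
	"fmt"
	"net/url"
)

// dorkURL turns a raw dork query into a Google search URL.
// url.QueryEscape percent-encodes reserved characters such as ':' and '/'
// and encodes spaces as '+'.
func dorkURL(query string) string {
	return "https://www.google.com/search?q=" + url.QueryEscape(query)
}

func main() {
	domain := "example.com" // placeholder target domain

	// Example dorks taken from the list above, scoped to one domain.
	dorks := []string{
		"filetype:pdf site:" + domain,
		"intitle:index of / site:" + domain,
		"inurl:/admin/login site:" + domain,
		"intext:password filetype:txt site:" + domain,
	}

	for _, d := range dorks {
		fmt.Println(dorkURL(d))
	}
}
```

Pasting any of the printed URLs into a browser runs the corresponding search.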

Google Dorking can be used for various purposes, including cybersecurity research, penetration testing, and investigations. However, it's important to use Google Dorking responsibly and ethically, respecting privacy and legal considerations, and avoiding any unauthorized or intrusive activities. Additionally, organizations should regularly review and secure their online assets to prevent inadvertent exposure of sensitive information through Google Dorking techniques. 

Real-World Use Cases

If you are new to this type of research, here are some ideas to get you thinking about how this could be beneficial.

Identifying Exposed Credentials:

Penetration testers can use Google Dorking to search for exposed credentials such as usernames and passwords. By searching for specific strings or patterns that indicate login credentials (e.g., "admin," "password", "login", etc.), they can discover instances where sensitive information has been inadvertently exposed on the internet.

Discovering Misconfigured Servers and Services:

Google Dorking can help identify servers and services that are misconfigured or left in default states, which may pose security risks. Penetration testers can use search operators to find instances of default login pages, open ports, or exposed configuration files, allowing them to assess the security posture of these systems.

Locating Vulnerable Web Applications:

Penetration testers often use Google Dorking to find vulnerable web applications that may be susceptible to common security flaws such as cross-site scripting (XSS), SQL injection (SQLi), or file inclusion vulnerabilities. By searching for specific keywords or error messages associated with these vulnerabilities, testers can identify potential targets for further assessment.

Mapping Internal Network Infrastructure:

In some cases, Google Dorking can be used to gather information about an organization's internal network infrastructure. By searching for publicly accessible documents, network diagrams, or configuration files that reference internal IP addresses or domain names, penetration testers can gain insights into the layout and connectivity of the target network.

Assessing Third-Party Vendor Security:

Organizations often rely on third-party vendors for various services and solutions. Penetration testers can use Google Dorking to assess the security of these vendors by searching for information such as vendor-specific documentation, customer portals, or data breaches involving the vendor's infrastructure. This helps organizations evaluate the security risks associated with their vendor relationships.

Meta-Detector

Meta-Detector was built on Go v1.22.1 and can be compiled on any system where Go is installed (Go v1.22.1 is recommended for best results). The tool takes a domain name as an argument, then creates an HTML page consisting of a number of preconfigured Google Dork searches as hyperlinks to assist with finding information of interest for the target organization in Google’s search results.

Figure 1 - Example of Meta-Detector Results

This simplifies access to Google search results by providing custom links for each search operator and domain name. These preconfigured Google Dork links eliminate the need to remember complex syntax and reduce the chance of overlooking useful searches.

Each link will take you directly to the Google search results tailored to the specified search operator and domain name. It's important to note that the results displayed may vary based on the organization's exposure of the particular type of data or file you are searching for.

Additionally, each link is designed to open in a new tab automatically, ensuring a seamless browsing experience. If you opt to open all links simultaneously using the button at the bottom, please ensure that any pop-up blockers are disabled to avoid any interruptions.

Please be aware that Google might throttle searches if it detects dorking activity, which could occasionally result in a "Prove you are not a robot" captcha. Simply complete the captcha to continue browsing without any interruptions.

You might be wondering, "Where do we find the Google Search Operators?" Well, I'm glad you asked. Meta-Detector relies on a configuration file named "search.config", which contains a range of useful search operators tailored for your OSINT endeavors. This file comes bundled with Meta-Detector, ensuring you have access to a comprehensive set of preconfigured search operators. You can edit the search.config file with a text editor and add your own Google Search Operators to fit your particular use case as well. 

Figure 2 - Example of search.config contents

Obtaining and Using Meta-Detector

In this section, I will walk through obtaining and using Meta-Detector, and go deeper into the concepts introduced earlier in this blog post.

Create Directory: Begin by creating a directory on the system where you intend to install Meta-Detector.

Figure 3 - Example of Creating a Directory

Clone Repository: From the command line within that directory, use the command git clone https://github.com/stolenusername/Meta-Detector/.

Figure 4 - Cloning the Repository

Build Meta-Detector: Run go build meta-detector.go from within the cloned repository directory. Note: This step requires Go to be installed on your system.

Figure 5 - Building Meta-Detector

Meta-Detector operates in three steps: it reads search parameters and descriptions from the search.config file, generates Google search URLs based on those parameters, and embeds the URLs into an HTML page, which is then saved as a file.

The search.config file comprises search parameters and descriptions, following the format "<search parameter>: <description>". Parameters can include operators such as site:, filetype:, inurl:, intitle:, and intext:, as well as logical operators like OR.

Figure 6 - search.config Contents

To Update search.config: Open the file in a text editor. Add or modify parameters following the specified format. Save the changes. You can also download the latest version of search.config with the following command: ./meta-detector --download

Usage: Ensure search.config is configured with desired parameters.

Run Meta-Detector with the domain as an argument (e.g., ./meta-detector domain.com).

Figure 7 - Running Meta-Detector

Meta-Detector generates an HTML file named "<domain>_search_results.html" containing links to the Google Dork searches. Each link in the HTML file opens its search results in a new tab, and a button at the bottom opens all links at once.

Figure 8 - Example Links in HTML File

Example Output: For instance, if the domain is thisdomaindoesntexist.com, the tool generates thisdomaindoesntexist.com_search_results.html with search results for that domain.
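The naming convention is simple enough to express in a couple of lines of Go. This is only a sketch of the convention described above; `outputFileName` is a name I chose for illustration, and the fallback domain is just the example from this post.

```go
package main

import (
	"fmt"
	"os"
)

// outputFileName derives the results file name from the target domain,
// following the "<domain>_search_results.html" convention.
func outputFileName(domain string) string {
	return domain + "_search_results.html"
}

func main() {
	// Use the first command-line argument as the domain when one is given;
	// otherwise fall back to the example domain from this post.
	domain := "thisdomaindoesntexist.com"
	if len(os.Args) > 1 {
		domain = os.Args[1]
	}
	fmt.Println(outputFileName(domain))
}
```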

Figure 9 - Example HTML Output File

Troubleshooting:

Pop-up Blocker: Disable or allow pop-ups to ensure links open successfully.

"Prove You Are Not a Robot" Prompt: Complete CAPTCHA verification if prompted due to Google's throttling.

Adjusting Delay: Modify the delay in the code (var delay = 1000;) if throttling issues persist. The value is in milliseconds, so 1000 equals one second. You will need to rebuild the application after any adjustment. Alternatively, you can edit the value in the HTML file after it’s generated. I have found that a longer delay works best: it helps avoid throttling, lets you address captchas as they are presented, and helps prevent multiple consecutive captchas.

Figure 10 - Delay Variable in Code

More Features Coming

Meta-Detector will have a companion application called Meta-Spider. This companion application spiders a domain, using a spider.config file similar to Meta-Detector's search.config, to locate items of interest within a site. Look for Meta-Spider to be released in the coming weeks. Follow me on X or Mastodon to keep up with the latest updates:

X: https://twitter.com/replicanthacker

Mastodon: https://infosec.exchange/@ReplicantHacker