You have a target to test. Before you write a single line of code or launch a scan, you need to know everything about it. Name, emails, domains, suppliers, technologies, people. Without this phase, your penetration test is a shot in the dark. At Meteora Web, we have seen companies with sensitive data exposed on public boards, expired SSL certificates, and forgotten subdomains. OSINT reconnaissance is the Swiss army knife of any ethical hacker. In this guide we go beyond simple Google dorks: we will cover tools, techniques, and automation to gather information systematically, legally, and operationally.
OSINT is not just Google: the full spectrum of public sources
OSINT (Open Source Intelligence) means exploiting publicly accessible data. But be careful: not everything public is easy to find. You need to look at domains, search engines, social networks, leak databases, SSL certificates, WHOIS records, public APIs. We always start with one question: what could an attacker know about you without ever touching a firewall?
The digital perimeter of the target
First step: list all domains and subdomains. Not just the main site. Tools like theHarvester and Sublist3r give you an initial list. But the real value comes when you cross-reference data with SSL certificates (Certificate Transparency logs).
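# Install theHarvester (if the pip package is unavailable on your platform, install from the project's GitHub repository)
pip install theHarvester
# Basic domain search (supported -b sources vary by version; list them with theHarvester -h)
theHarvester -d example.com -b google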
Action now: take your domain (or a test one) and run the command above. How many subdomains appear? Compare with known ones. If you find something your client didn't know about, you've already won the first battle.
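The comparison with the known inventory can be scripted. The following is a minimal sketch, assuming you have saved the discovered hosts and the client's known hosts as plain lists; the function names `normalize` and `unknown_hosts` are illustrative and are not part of theHarvester.

```python
def normalize(host: str) -> str:
    """Lowercase a hostname and strip whitespace, a trailing dot, and a wildcard prefix."""
    host = host.strip().lower().rstrip(".")
    return host[2:] if host.startswith("*.") else host


def unknown_hosts(discovered, known, domain):
    """Return discovered hosts under `domain` that are missing from the known inventory."""
    known_set = {normalize(h) for h in known}
    found = {normalize(h) for h in discovered if h.strip()}
    # Keep only hosts that belong to the target domain (the authorized scope).
    in_scope = {h for h in found if h == domain or h.endswith("." + domain)}
    return sorted(in_scope - known_set)


# Illustrative data only: replace with real tool output and the client's inventory.
discovered = ["www.example.com", "VPN.example.com.", "old-staging.example.com", "cdn.othersite.net"]
known = ["www.example.com", "vpn.example.com"]
print(unknown_hosts(discovered, known, "example.com"))
```

Anything this returns is a candidate for the "something your client didn't know about" list, and should be confirmed as in scope before any further testing.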
WHOIS and registries: not just expiration dates
WHOIS can reveal the registrant's name and contacts, DNS servers, and creation dates. An attacker can use this for social engineering. We have seen companies with personal phone numbers in WHOIS records. Use whois from the terminal or online services like WhoisXMLAPI. Caution: many registrars offer privacy protection, but domain owners often forget to enable it on secondary domains.
whois example.com
Action now: check at least 3 domains in your perimeter (including .com, .net, .org) and see if any personal data is exposed.
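Checking several domains by eye gets tedious, so the review can be partly scripted. This is a rough sketch, assuming the WHOIS output uses "Key: Value" lines; field names differ between registries, so the key pattern and the redaction markers below are assumptions to adapt, not a standard.

```python
import re

# Assumed markers that indicate a privacy service or redacted value.
REDACTION_MARKERS = ("redacted", "privacy", "whoisguard", "proxy", "not disclosed")
# Assumed pattern for contact fields worth reviewing.
SENSITIVE_KEY = re.compile(r"(registrant|admin|tech).*(name|email|phone)", re.IGNORECASE)


def exposed_contacts(whois_text: str) -> list[tuple[str, str]]:
    """Return (field, value) pairs for contact fields that do not look redacted."""
    exposed = []
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value or not SENSITIVE_KEY.search(key):
            continue
        if any(marker in value.lower() for marker in REDACTION_MARKERS):
            continue
        exposed.append((key, value))
    return exposed


# Illustrative WHOIS excerpt (fabricated for the example, not real data).
sample = """Domain Name: EXAMPLE.COM
Registrant Name: REDACTED FOR PRIVACY
Registrant Email: jane.doe@example.com
Registrant Phone: +1.5555550100
Name Server: NS1.EXAMPLE.COM"""
for field, value in exposed_contacts(sample):
    print(f"{field}: {value}")
```

In practice you would feed it the text printed by whois for each domain in the perimeter and review the hits manually, since a match is only a hint that personal data may be exposed.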
OSINT on people: emails, social, and work profiles
Gathering information on individuals is sensitive and must respect privacy laws. But during an authorized penetration test, knowing employee email addresses allows you to test password spraying, targeted phishing, and more. We always start with LinkedIn and public sources.
Email harvesting with theHarvester and Hunter.io
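In addition to Google, theHarvester supports sources like LinkedIn, Yahoo, Bing, and PGP key servers (the exact list depends on the version installed). Hunter.io provides an API to find emails associated with a domain. Beware of false positives: harvested addresses may be outdated or guessed. Use holehe to check whether an address is registered on known online services; a hit suggests the address is real and in use.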
# Install holehe
pip install holehe
# Verify email
holehe email@example.com
Action now: choose an authorized target domain, extract emails with theHarvester -b linkedin, then verify the most likely one with holehe. Record which services the email is present on (important for credential stuffing attacks).
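Before verifying anything, it helps to reduce raw tool output to a clean, deduplicated list of addresses on the target domain. The sketch below assumes you have captured the tool output as text; the regular expression is a pragmatic approximation of an email address, not a full RFC 5322 parser, and `emails_for_domain` is an illustrative name.

```python
import re

# Pragmatic email pattern; deliberately simple and not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(raw_output: str, domain: str) -> list[str]:
    """Extract unique, lowercased addresses whose domain is `domain` or a subdomain of it."""
    domain = domain.lower()
    found = {match.lower() for match in EMAIL_RE.findall(raw_output)}
    in_scope = []
    for address in found:
        host = address.split("@", 1)[1]
        if host == domain or host.endswith("." + domain):
            in_scope.append(address)
    return sorted(in_scope)


# Illustrative output only: not a real theHarvester result.
raw = """[*] Emails found: 5
J.Smith@example.com
info@example.com
j.smith@example.com
sales@partner.org
noreply@mail.example.com"""
print(emails_for_domain(raw, "example.com"))
```

The resulting list is what you would pass, one address at a time, to holehe within the authorized scope.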
Advanced Google Dorking for people
Dorking isn't just for vulnerable files. You can search for public profiles with site:linkedin.com "Company" "Role" or PDF documents with filetype:pdf "example.com". Powerful combos: intext:"password" site:example.com (if you find clear-text passwords, you have a critical flaw).
Minimum dorks to try:
- site:example.com ext:log – exposed log files
- site:example.com intitle:"index of" – directory listing
- "example.com" "confidential" filetype:pdf – sensitive documents
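The dorks above follow a fixed pattern, so they can be generated per target instead of retyped. This is a small sketch that fills in templates taken from this section and builds a search URL for manual review in a browser; `build_dorks` and `search_url` are illustrative names, and the URL format is the ordinary Google search query string.

```python
from urllib.parse import quote_plus

# Templates taken from the dorks discussed in this section.
DORK_TEMPLATES = {
    "exposed log files": "site:{domain} ext:log",
    "directory listing": 'site:{domain} intitle:"index of"',
    "sensitive documents": '"{domain}" "confidential" filetype:pdf',
    "password mentions": 'intext:"password" site:{domain}',
}


def build_dorks(domain: str) -> dict[str, str]:
    """Fill each template with the target domain."""
    return {label: tpl.format(domain=domain) for label, tpl in DORK_TEMPLATES.items()}


def search_url(query: str) -> str:
    """Return a URL-encoded search link to open manually in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for label, query in build_dorks("example.com").items():
    print(f"{label}: {query}")
    print(f"  {search_url(query)}")
```

The links are meant to be opened by hand; automated querying of a search engine is likely to be rate-limited and may conflict with its terms of service.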
Infrastructure and technologies: Shodan, Censys, Certificate Transparency
You don't need server access to know what software is running. Shodan indexes banners of exposed services. Censys does the same for hosts. SSL certificate logs (CT) reveal subdomains and issue dates. An attacker looks for outdated Apache, Nginx, OpenSSH versions. At Meteora Web, we use this data to anticipate vulnerabilities.
Shodan: find your target on the network
# Shodan filter by domain
shodan search hostname:example.com
# Search by specific port
shodan search "port:443 hostname:example.com"
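Action now: initialize the Shodan CLI with your API key (shodan init followed by the key), then search for your target. Note that filtered queries such as hostname: may not be available on a free API key and can require a paid membership. How many services are exposed? Any open databases (MongoDB, Elasticsearch)? Immediately report the most critical ones.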
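Triage of Shodan results can be sketched in a few lines. The example below assumes the official `shodan` Python package (installed separately with pip) and its `Shodan(api_key).search(query)` call, which returns a dictionary with a `matches` list; the port-to-service table is a heuristic based on default ports, so a hit suggests a database but does not prove one.

```python
# Default ports of common databases; a match is a hint, not a confirmation.
DATABASE_PORTS = {
    27017: "MongoDB",
    9200: "Elasticsearch",
    6379: "Redis",
    3306: "MySQL",
    5432: "PostgreSQL",
}


def fetch_matches(api_key: str, domain: str) -> list[dict]:
    """Query Shodan for hosts under `domain` (needs network access and a valid key)."""
    import shodan  # third-party package, imported here so triage works without it

    return shodan.Shodan(api_key).search(f"hostname:{domain}")["matches"]


def exposed_databases(matches: list[dict]) -> list[tuple[str, int, str]]:
    """Return (ip, port, likely service) for matches on default database ports."""
    findings = []
    for match in matches:
        port = match.get("port")
        if port in DATABASE_PORTS:
            findings.append((match.get("ip_str", "?"), port, DATABASE_PORTS[port]))
    return sorted(findings)


# Illustrative matches using documentation-only IP addresses (192.0.2.0/24).
sample_matches = [
    {"ip_str": "192.0.2.10", "port": 443},
    {"ip_str": "192.0.2.11", "port": 9200},
    {"ip_str": "192.0.2.10", "port": 27017},
]
for ip, port, service in exposed_databases(sample_matches):
    print(f"{ip}:{port} looks like {service}")
```

With a real key, `exposed_databases(fetch_matches(key, "example.com"))` would produce the shortlist of services to report first.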
Certificate Transparency: the subdomain goldmine
crt.sh is a search interface over Certificate Transparency logs, which record publicly trusted certificates as they are issued. A single query can reveal subdomains you did not know about.
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sort -u
Action now: run that curl against your target domain. How many subdomains do you get? Compare with Sublist3r. Hosts that appear only in the Certificate Transparency results are the “hidden” subdomains.
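The same extraction can be done in Python when you want to post-process the results. This sketch assumes the crt.sh JSON output is a list of objects with a `name_value` field that may hold several names separated by newlines, including wildcard entries; `hosts_from_crtsh` is an illustrative name and the sample data is made up.

```python
import json


def hosts_from_crtsh(json_text: str, domain: str) -> list[str]:
    """Return unique hostnames under `domain` from crt.sh JSON output."""
    hosts = set()
    for entry in json.loads(json_text):
        # One certificate can cover several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent name
            if name == domain or name.endswith("." + domain):
                hosts.add(name)
    return sorted(hosts)


# Illustrative response body; real output has more fields per entry.
sample_json = json.dumps([
    {"name_value": "www.example.com\n*.dev.example.com"},
    {"name_value": "WWW.example.com"},
    {"name_value": "mail.example.com"},
])
ct_hosts = hosts_from_crtsh(sample_json, "example.com")
sublist3r_hosts = {"www.example.com"}  # assumed output of a separate Sublist3r run
print(sorted(set(ct_hosts) - sublist3r_hosts))
```

The final line is the set difference described above: names seen in certificates but not returned by Sublist3r.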
Automation with OSINT frameworks: Recon-ng and Maltego
Doing everything by hand is inefficient. A serious penetration test requires automation and correlation. Recon-ng is a modular OSINT framework, with modules ranging from WHOIS lookups to Shodan and Have I Been Pwned. Maltego is visual: it creates graphs of relationships between domains, emails, and people. We prefer Recon-ng because it's scriptable and integrates with our workflow.
Recon-ng workspace example
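# Launch Recon-ng
recon-ng
# Create workspace
workspaces create pentest_target
# Install the module from the marketplace (Recon-ng v5 ships without modules)
marketplace install recon/domains-hosts/google_site_web
# Load module
modules load recon/domains-hosts/google_site_web
# Set source (Recon-ng v5 syntax; older versions used "set SOURCE")
options set SOURCE example.com
# Run
run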
Action now: install Recon-ng, create a workspace for a test target, and run at least 5 collection modules (hosts, contacts, emails). Export the report in HTML.
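Running several modules by hand repeats the same commands, so one option is to generate a Recon-ng resource file and replay it with `recon-ng -r`. The sketch below assumes Recon-ng v5 command syntax and that each module takes a `SOURCE` option; the second module name is an assumption about what the marketplace offers, so check it with `marketplace search` before use.

```python
def build_resource_script(workspace: str, domain: str, modules: list[str]) -> str:
    """Build the text of a Recon-ng resource file that runs each module against `domain`."""
    lines = [f"workspaces create {workspace}"]
    for module in modules:
        lines += [
            f"marketplace install {module}",
            f"modules load {module}",
            f"options set SOURCE {domain}",
            "run",
            "back",  # leave the module context before loading the next one
        ]
    lines.append("exit")
    return "\n".join(lines) + "\n"


# Module names: the first comes from the example above; the second is an assumption.
modules = [
    "recon/domains-hosts/google_site_web",
    "recon/domains-hosts/hackertarget",
]
script = build_resource_script("pentest_target", "example.com", modules)
print(script)
```

Saving the output to a file such as `recon.rc` (a hypothetical name) and running `recon-ng -r recon.rc` would replay the commands in order, which also gives you a reproducible record of the collection phase.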
Operational OSINT: how not to burn the reconnaissance phase
Anecdote: once, a client had an exposed Jenkins server with default credentials. We found it in 10 minutes with Shodan. An attacker with no authorization could have found it just as quickly and logged in immediately. Golden rules:
- Never test on a target without written authorization.
- Use a dedicated VM or isolated container so that engagement traffic and artifacts stay separate from your everyday environment.
- Do not download sensitive files (if you find a database dump, stop and report).
- Document every step: for the final report you need reproducible evidence.
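The documentation rule is easier to follow when evidence is captured automatically. Here is a minimal sketch, assuming a JSON Lines log file is an acceptable evidence format for your report; `run_and_record` is an illustrative helper that stores the command, a UTC timestamp, the exit code, and a SHA-256 hash of the output.

```python
import hashlib
import json
import os
import subprocess
import sys
import tempfile
from datetime import datetime, timezone


def run_and_record(command: list[str], log_path: str) -> dict:
    """Run a command, append an evidence record to a JSON Lines file, and return the record."""
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    record = {
        "started_utc": started,
        "command": command,
        "exit_code": result.returncode,
        # The hash lets you show later that the stored output was not altered.
        "stdout_sha256": hashlib.sha256(result.stdout.encode("utf-8")).hexdigest(),
        "stdout": result.stdout,
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


# Demonstration with a harmless local command instead of a real recon tool.
log_file = os.path.join(tempfile.mkdtemp(), "evidence.jsonl")
record = run_and_record([sys.executable, "-c", "print('recon step')"], log_file)
print(record["started_utc"], record["exit_code"], record["stdout_sha256"][:12])
```

In an engagement you would pass the actual command, for example `["whois", "example.com"]`, and keep the log file alongside your screenshots.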
Tools you should always have at hand
| Tool | Use |
|---|---|
| theHarvester | Emails and subdomains |
| Sublist3r | Subdomains |
| Shodan | Exposed services |
| crt.sh | SSL certificates |
| Recon-ng | Modular framework |
| holehe | Email presence on services |
| Google dork queries | Indexed data |
What to do now – Operational checklist
- Define the perimeter: main domains, subdomains, known IPs.
- Collect WHOIS and certificates: use crt.sh and whois for every domain.
- Extract emails and users: theHarvester + holehe.
- Scan public services: Shodan with hostname filter.
- Automate: create a bash script that unifies all commands into a single report.
- Check Google dorks: at least 5 specific queries for the target sector.
- Document everything: screenshots, commands, JSON output. Don't trust your memory.
OSINT reconnaissance is what separates an amateur penetration test from a professional one. At Meteora Web, we place it at the heart of every assessment. If you want to dive deeper into the full ethical hacking cycle, read our definitive pillar guide. Remember: an ethical hacker's skill is measured by how much information they can gather without ever touching the target.