📡Passive (OSINT)

OSINT (short for Open-Source Intelligence Gathering) is a way of learning about your target without any sort of direct contact or leaving any evidence of the recon.

In OSINT you should always ask questions like: how, who, when, where and why. Also try to collect and sort everything you find and build a structured map of the gathered intel using a mind-mapping tool like XMind or MindMaster.

The OSINT Process

OSINT reconnaissance can be further broken down into the following 5 sub-phases:

Source Identification: as the starting point, in this initial phase the attacker identifies potential sources from which information may be gathered. Sources are documented internally throughout the process in detailed notes, to come back to later if necessary.

Data Harvesting: in this phase, the attacker collects and harvests information from the selected sources and other sources that are discovered throughout this phase.

Data Processing and Integration: during this phase, the attacker processes the harvested information for actionable intelligence by searching for information that may assist in enumeration.

Data Analysis: in this phase, the attacker performs data analysis of the processed information using OSINT analysis tools.

Results Delivery: in the final phase, OSINT analysis is complete and the findings are presented/reported to other members of the Red Team.

Workflow

Domain

Website

Email

Location

Username / Real Name

Phone

OSINT Framework

The OSINT Framework is a great collection of OSINT resources that you should definitely check out.

Maltego

Maltego is an open-source intelligence (OSINT) and graphical link-analysis tool for gathering and connecting information for investigative tasks. It comes preinstalled on Kali Linux, or you can download it from here

DNS Harvesting

Email Harvesting

Google Hacking

Also known as Google dorking, this is a hacking technique that uses Google Search and other Google applications to find security holes in web applications: loot, exposed databases, login pages, or even backup files and directories.

Google Dorks

There is an expanding database of these search queries, called the Google Hacking Database (GHDB), maintained by Offensive Security. You can use the site search to find dorks for specific types of targets.

Google Dork Collections

Automated Dork Tools

GoogD0rker

./googD0rker-txt.py -d example.com

Goohak

./goohak domain.com

Pagodo

# first run the scraper to get the dorks and store them
python3 ghdb_scraper.py -j -s

# then run the tool to use gathered dorks
# -d option can be used to target a domain
python3 pagodo.py -d example.com -g dorks.txt -l 50 -s -e 35.0 -j 1.1

Advanced Search Keywords

Besides Google dorks, which are more advanced, there are some Google search tricks (keywords) that will make your life easier. These are the keywords used in advanced Google searches:

cache:

The query [cache:] will show the version of the web page that Google has in its cache. For instance, [cache:www.google.com] will show Google’s cache of the Google homepage. If you include other words in the query, Google will highlight those words within the cached document. For instance, [cache:www.google.com web] will show the cached content with the word “web” highlighted. This functionality is also accessible by clicking on the “Cached” link on Google’s main results page. Note there can be no space between the “cache:” and the web page url.

link:

The query [link:] will list webpages that have links to the specified webpage. For instance, [link:www.google.com] will list webpages that have links pointing to the Google homepage. Note there can be no space between the “link:” and the web page url.

related:

The query [related:] will list web pages that are “similar” to a specified web page. For instance, [related:www.google.com] will list web pages that are similar to the Google homepage. Note there can be no space between the “related:” and the web page url.

info:

The query [info:] will present some information that Google has about that web page. For instance, [info:www.google.com] will show information about the Google homepage. Note there can be no space between the “info:” and the web page url.

define:

The query [define:] will provide a definition of the words you enter after it, gathered from various online sources. The definition will be for the entire phrase entered (i.e., it will include all the words in the exact order you typed them).

stocks:

If you begin a query with the [stocks:] operator, Google will treat the rest of the query terms as stock ticker symbols, and will link to a page showing stock information for those symbols. For instance, [stocks: intc yhoo] will show information about Intel and Yahoo. (Note you must type the ticker symbols, not the company name.)

site:

If you include [site:] in your query, Google will restrict the results to those websites in the given domain. For instance, [help site:www.google.com] will find pages about help within www.google.com. [help site:com] will find pages about help within .com urls. Note there can be no space between the “site:” and the domain.

allintitle:

If you start a query with [allintitle:], Google will restrict the results to those with all of the query words in the title. For instance, [allintitle: google search] will return only documents that have both “google” and “search” in the title.

intitle:

If you include [intitle:] in your query, Google will restrict the results to documents containing that word in the title. For instance, [intitle:google search] will return documents that mention the word “google” in their title, and mention the word “search” anywhere in the document (title or no). Note there can be no space between the “intitle:” and the following word. Putting [intitle:] in front of every word in your query is equivalent to putting [allintitle:] at the front of your query: [intitle:google intitle:search] is the same as [allintitle: google search].

inurl:

If you include [inurl:] in your query, Google will restrict the results to documents containing that word in the url. For instance, [inurl:google search] will return documents that mention the word “google” in their url, and mention the word “search” anywhere in the document (url or no). Note there can be no space between the “inurl:” and the following word. Putting “inurl:” in front of every word in your query is equivalent to putting “allinurl:” at the front of your query: [inurl:google inurl:search] is the same as [allinurl: google search].

And these are some simple rules for combining queries and dorks:

OR: ( | )

AND: ( & )

NOT: ( - )
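As a quick sketch, these operators can also be combined programmatically when you need to generate many dorks at once. The helper below is my own illustration (not part of any tool), using only the Python standard library to build Google query URLs:

```python
from urllib.parse import quote_plus

def build_dork(domain, *operators):
    """Join advanced-search operators into one Google query URL,
    restricted to the target domain with site:."""
    query = " ".join(operators) + f" site:{domain}"
    return "https://www.google.com/search?q=" + quote_plus(query)

# login pages OR admin paths, restricted to one target domain
print(build_dork("example.com", 'intitle:"login"', "|", "inurl:admin"))
# SQL files, excluding documentation pages (NOT via the - operator)
print(build_dork("example.com", "filetype:sql", "-intitle:documentation"))
```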

Github Dorks

We can use GitHub advanced search keywords and dorks to find sensitive data in repositories.

GitHub dorks work with filenames, extensions and languages:

filename:bashrc
extension:pem
language:bash

Some examples of GitHub search keywords:

extension:pem private # Private SSH Keys
extension:sql mysql dump # MySQL dumps
extension:sql mysql dump password # MySQL dumps with passwords
filename:wp-config.php # Wordpress config file
filename:.htpasswd # .htpasswd
filename:.git-credentials # Git stored credentials
filename:.bashrc password # .bashrc files containing passwords
filename:.bash_profile aws # AWS keys in .bash_profiles
extension:json mongolab.com # Keys/Credentials for mongolab
HEROKU_API_KEY language:json # Heroku API Keys
filename:filezilla.xml Pass # FTP credentials
filename:recentservers.xml Pass # FTP credentials
filename:config.php dbpasswd # PHP Applications databases credentials
shodan_api_key language:python # Shodan API Keys (try others languages)
filename:logins.json # Firefox saved password collection (key3.db usually in same repo)
filename:settings.py SECRET_KEY # Django secret keys (usually allows for session hijacking, RCE, etc)
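Each of these dorks can be turned into a clickable GitHub code-search link. A minimal sketch (the helper is my own; it just URL-encodes the dork against GitHub's search endpoint):

```python
from urllib.parse import quote_plus

def github_dork_url(dork):
    """Turn a GitHub dork into a code-search URL to open in a browser."""
    return "https://github.com/search?type=code&q=" + quote_plus(dork)

# print search links for a few of the dorks above
for dork in ["extension:pem private", "filename:wp-config.php", "filename:.htpasswd"]:
    print(github_dork_url(dork))
```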

Open Job Requisitions

Job requisitions can reveal information about the technology products used in a target organization, such as:

  • Web server type

  • Web application dev environment

  • Firewall type

  • Routers

Google searches to find job reqs

  • site:[companydomain] careers

  • site:[companydomain] jobs

  • site:[companydomain] openings

  • Also, search job-related sites (e.g. monster.com)

PGP Public Key Servers

Organizations maintain servers that provide public PGP keys to clients. You can query these to reveal user email addresses and details.
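Most public key servers speak the HKP protocol, so a lookup is just an HTTP GET against /pks/lookup. Here is a sketch of my own that builds such a query URL (keyserver.ubuntu.com is one well-known server; fetch the URL with a browser or curl):

```python
from urllib.parse import urlencode

def hkp_lookup_url(search, keyserver="https://keyserver.ubuntu.com"):
    """Build an HKP index query; results list key uids, which usually
    contain user email addresses for the searched domain."""
    return f"{keyserver}/pks/lookup?" + urlencode({"op": "index", "search": search})

# list public keys (and the emails on them) matching a target domain
print(hkp_lookup_url("example.com"))
```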

Cloudflare / Tor IP Detection

Some tips to find real IP addresses hiding behind Cloudflare and Tor.

Identify Host Sharing

See if a single server or IP is hosting multiple websites/domains:

# Bing dorks to identify host sharing
ip:xxx.xxx.xxx.xxx

Shodan

Search engine for the Internet of everything. Shodan is the world's first search engine for Internet-connected devices: computers, servers, CCTV cameras, SCADA systems and everything else that is connected to the internet, intentionally or not. Shodan can be used both as a source for gathering info about random targets for mass attacks and as a tool for finding weak spots in a large network of systems, picking off the low-hanging fruit. Shodan has free and commercial memberships and is accessible at shodan.io. The search syntax is somewhat special and is documented in the help section of the website. With Shodan you can search for specific systems, ports, services, regions and countries, or even specific vulnerable versions of software or OS services running on systems, like SMBv1, and much more.

Here are the keywords most commonly used in Shodan search queries:

Shodan Queries

title: Search the content scraped from the HTML tag
html: Search the full HTML content of the returned page
product: Search the name of the software or product identified in the banner
net: Search a given netblock (example: 204.51.94.79/18)
version: Search the version of the product
port: Search for a specific port or ports
os: Search for a specific operating system name
country: Search for results in a given country (2-letter code)
city: Search for results in a given city
! : NOT
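To illustrate how these filters compose, here's a small helper of my own that assembles free-text terms and keyword:value pairs (with the ! negation) into one query string:

```python
def shodan_query(*terms, **filters):
    """Assemble free-text terms and keyword filters into a Shodan query.
    Prefix a filter value with '!' to negate it, e.g. port='!139,445'."""
    parts = list(terms)
    for key, value in filters.items():
        value = str(value)
        if value.startswith("!"):
            parts.append(f"!{key}:{value[1:]}")
        else:
            parts.append(f"{key}:{value}")
    return " ".join(parts)

print(shodan_query('title:"smb"', port="!139,445"))   # title:"smb" !port:139,445
print(shodan_query("unauthorized", net="100.10.23.0/24"))
```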

For example, here are some queries you can run with these keywords:

hostname:megacorpone.com
title:"smb" !port:139,445
product:IIS 8.5
Microsoft-IIS/5.0 title:"outlook web"
net:100.10.23.0/24 unauthorized
html:"eBay Inc. All Rights Reserved"
"Authentication: disabled" port:445
shodan count microsoft iis 6.0
shodan host 189.201.128.250
shodan myip
shodan parse --fields ip_str,port,org --separator , microsoft-data.json.gz 
shodan search --fields ip_str,port,org,hostnames microsoft iis 6.0

Shodan CLI & nmap

There are several other ways to use the search engine besides the website, for example with the nmap NSE script shodan-api:

nmap -sn -Pn -n --script=shodan-api --script-args shodan-api.apikey=[api key] [target ip]

There is also a Shodan CLI written in Python which you can use or integrate into your own scripts and tools. To install and set up the CLI tool:

pip install shodan
shodan init <api key>
shodan -h

You can use the CLI tool by simply specifying a single host/IP:

shodan host [target ip]

Shodan will return ports, services and even some possible CVEs (which are not very reliable).

With free API keys you can't use the same method to scan a whole netblock, but with some bash voodoo you can still use the free API instead of a paid one to sweep a whole /24 netblock host by host and see if any systems on it are exposed:

for host in {1..254}; do shodan host 192.168.1.$host 2>&1 | grep -v "Error" ; done

And if you want to aggressively sweep a /16 netblock:

for netblock in {1..254}; do for host in {1..254}; do shodan host 192.168.$netblock.$host 2>&1 | grep -v "Error"; done; done
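The same target list can be generated without brace expansion; here's a sketch using Python's stdlib ipaddress module to enumerate the hosts you would then feed to `shodan host` (or to the official shodan Python library):

```python
import ipaddress

# every usable host in the /24, matching the bash loop above
hosts = [str(ip) for ip in ipaddress.ip_network("192.168.1.0/24").hosts()]
print(len(hosts), hosts[0], hosts[-1])  # 254 192.168.1.1 192.168.1.254
```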

and here are some other useful resources about shodan:

https://www.sans.org/blog/getting-the-most-out-of-shodan-searches
https://thor-sec.com/cheatsheet/shodan/shodan_cheat_sheet
https://github.com/jakejarvis/awesome-shodan-queries

Credential Leaks

Tools and resources for credential leaks available online:

Social Media Investigation

For more in-depth social search check the social platforms page.

Other OSINT Websites

I have put together a list of the most used OSINT sources, which will usually cover about 90% of your needs in a regular pentest. Remember, there are endless ways to find intel about your target; the OSINT process is limited only by your imagination.

Top sources (most used)

Dark web engines

pubpeer.com

scholar.google.com

arxiv.org

guides.library.harvard.edu

deepdotweb.com

Core.onion

OnionScan

Tor Scan

ahmia.fi

not evil

Monitoring and alerting

Google Alerts

HaveIBeenPwned.com

Dehashed.com

Spycloud.com

Ghostproject.fr

weleakinfo.com/

Tools and Frameworks

There are countless tools out there designed for active/passive recon. You won't need to know every single one of them, because most use the same techniques to gather this information. In this section I will briefly introduce the best-known tools that I usually use:

theHarvester

A well-known tool among pentesters and OSINT investigators, mostly good for collecting subdomains and email addresses.

theHarvester comes preinstalled on pentesting OSs like Kali and Parrot; on others you can install it from apt, or clone it from GitHub and run it in Docker:

apt install theharvester
theHarvester --help

# example run against a target with many data sources
theHarvester -d target.com -b google,bing,baidu,bufferoverun,crtsh,dnsdumpster,duckduckgo,github-code,hackertarget,netcraft,rapiddns,rocketreach,sublist3r,trello,urlscan -n -r -v -s --screenshot target/harvester -g

# or build and run it with docker
git clone https://github.com/laramies/theHarvester
cd theHarvester
docker build -t theharvester .
docker run theharvester -h

h8mail

An email OSINT and breach hunting tool using different breach and reconnaissance services, or local breaches such as Troy Hunt's "Collection1" and the infamous "Breach Compilation" torrent.

python3 -m pip install h8mail

h8mail -t target@example.com

gitrob

A tool to help find potentially sensitive files pushed to public repositories on GitHub. Gitrob will clone repositories belonging to a user or organization down to a configurable depth, iterate through the commit history, and flag files that match signatures for potentially sensitive files. The findings are presented through a web interface for easy browsing and analysis.

inspy

A LinkedIn enumeration tool, preinstalled on Kali.

inspy --empspy /usr/share/inspy/wordlists/title-list-large.txt --emailformat flast@google.com 'Google'

The --emailformat option sets the pattern used to build employee email addresses (e.g. flast@google.com means first initial + last name at the target domain).

amass

In-depth attack surface mapping and asset discovery, preinstalled on Kali. The OWASP Amass Project performs network mapping of attack surfaces and external asset discovery using open-source information gathering and active reconnaissance techniques.

Run the help to see the options:

amass --help

spiderfoot

An open-source intelligence (OSINT) automation tool. It claims to integrate with just about every data source available and utilises a range of methods for data analysis, making that data easy to navigate.

# run the web GUI on localhost, then connect with a browser
spiderfoot -l 127.0.0.1:5001

recon-ng

By far the best recon framework out there, with both active and passive modules. It is designed like the Metasploit Framework: each type of recon has its own specific module and options, and modules are installed from the "marketplace", along with a bunch of reporting modules for different output formats. recon-ng comes preinstalled on Kali Linux and Parrot OS.

You can see the full documentation in the wiki:

Here are some useful commands for a quick start:

help >>> show the help menu
marketplace install/search/info [module name] >>> add or search for a module
modules load [module] >>> load a module
info >>> show module options
options set [option] [value] >>> set module options
run >>> run the module

To install all recon modules at once:

marketplace install recon/

Some modules need API keys; manage them with the keys command:

keys list >>> list available API keys
keys add [key name] [key value] >>> add a key (e.g. keys add shodan_api <API key>)

A list of modules I usually use:

recon/domains-domains/brute_suffix 
recon/domains-hosts/bing_domain_web  
recon/domains-hosts/brute_hosts
recon/domains-hosts/netcraft
recon/domains-hosts/ssl_san
recon/hosts-hosts/bing_ip
recon/domains-hosts/hackertarget
recon/netblocks-hosts/reverse_resolve       → find hosts in a netblock
recon/hosts-hosts/reverse_resolve
discovery/info_disclosure/cache_snoop       → useful for finding AVs in use
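These steps can also be scripted: recon-ng can replay a resource file of commands (recon-ng -r file.rc), so a passive sweep with a couple of the modules above might look like this sketch (the workspace name and domain are placeholders):

```
workspaces create example
modules load recon/domains-hosts/hackertarget
options set SOURCE example.com
run
modules load recon/domains-hosts/ssl_san
run
exit
```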

Sherlock

Find user accounts on social media by username.

git clone https://github.com/sherlock-project/sherlock.git

cd sherlock
python3 -m pip install -r requirements.txt
python3 sherlock --help
python3 sherlock user1 user2 user3

TWINT

An advanced Twitter scraping tool written in Python that allows scraping tweets from Twitter profiles without using Twitter's API.

social-analyzer

For analyzing and finding a person's profile across 800+ social media websites.

python3 -m pip install social-analyzer

social-analyzer --username "johndoe" --metadata --extract --mode fast
