Information Gathering With theHarvester
‘theHarvester’ is a tool designed to be used in the early stages (Information Gathering Phase) of a penetration test.
As the name suggests, ‘theHarvester’ harvests information that can help determine a company’s external threat landscape on the internet. It is not limited to companies: it can also surface information about individual users that is publicly available online. ‘theHarvester’ relies largely on public sources, and the information it can gather includes:
⦁ Emails
⦁ Names
⦁ Subdomains
⦁ IPs
⦁ URLs
⦁ VirtualHosts
⦁ Open ports (it can even run a port scan)
‘theHarvester’ Public Sources
One of the interesting things about ‘theHarvester’ is that it is not tied to a single public source. It appears to support more than 20 of them, from Baidu to Yahoo. Some of these sources require an API key, including:
⦁ Bing (bingapi)
⦁ GitHub
⦁ Hunter
⦁ Intelx
⦁ SecurityTrails
⦁ Shodan
⦁ Spyse
If you don’t have an API key, you can still use the other public sources.
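The split between key-gated and open sources can be sketched in a few lines of Python. This is only an illustration: the names in `KEY_REQUIRED` are the labels from the list above (not necessarily the exact source names your installed version accepts), and `usable_sources` is a hypothetical helper, not part of ‘theHarvester’.

```python
# Labels taken from the list above; they are not guaranteed to match the
# exact source names your installed version accepts.
KEY_REQUIRED = {"bingapi", "github", "hunter", "intelx",
                "securitytrails", "shodan", "spyse"}


def usable_sources(candidates, keys_on_hand):
    """Return the candidates that can run with the API keys you hold.

    A source is usable if it needs no key, or if its key is available.
    """
    keys = {name.lower() for name in keys_on_hand}
    usable = []
    for source in candidates:
        name = source.lower()
        if name not in KEY_REQUIRED or name in keys:
            usable.append(source)
    return usable


if __name__ == "__main__":
    wanted = ["baidu", "yahoo", "hunter", "shodan"]
    # Hypothetical situation: only a Hunter key is configured.
    print(usable_sources(wanted, keys_on_hand={"hunter"}))
```

In this hypothetical setup Baidu, Yahoo and Hunter remain usable, while Shodan is dropped because no key is on hand.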
Getting started with ‘theHarvester’ on Ubuntu 18.04
Getting started with the tool is easy. You only need a few dependencies on the system, in particular Python 3.7+. The main requirements and setup commands are:
⦁ Python 3.7+
⦁ python3 -m pip install pipenv
⦁ pipenv install (run inside the cloned repository)
virtualenv -p python3 theharvester
git clone https://github.com/laramies/theHarvester.git
source theharvester/bin/activate
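With the environment active, a first run can be assembled as below. This is a minimal sketch, assuming the tool is reachable as `theHarvester` on your PATH (from a cloned checkout it may instead be `python3 theHarvester.py`). `example.com` is a placeholder domain and `build_command` is a hypothetical helper. The `-d` (domain), `-b` (source) and `-l` (result limit) options are the ones the tool normally documents, but confirm them with `theHarvester -h` for your version.

```python
import shlex


def build_command(domain, source, limit=100, program="theHarvester"):
    """Build the argument list for one theHarvester run (nothing is executed)."""
    if not domain or not source:
        raise ValueError("domain and source are both required")
    return [program, "-d", domain, "-b", source, "-l", str(limit)]


if __name__ == "__main__":
    argv = build_command("example.com", "bing", limit=200)
    # Print a copy-pasteable shell line; pass argv to subprocess.run to execute.
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids shell-quoting problems if it is later handed to `subprocess.run`.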
Most Effective sources of ‘theHarvester’
I have used ‘theHarvester’, and each source the tool supports harvests its own kind of information. In my experience, the sources that are most effective at gathering information are:
⦁ Google (although Google blocks queries often, so at times it returns no results)
⦁ Censys
⦁ Shodan
⦁ Hunter
⦁ Bing
Note that each engine can scrape particular data that the others can’t. Also, Google blocks queries if the tool is used very often, probably because it sees them as coming from bots. One way around this is perhaps to query Google through an API.
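Another option, besides an API, is simply to space the runs out. The sketch below is an assumption-laden illustration: `run_spaced` is a hypothetical helper, the 30-second default pause is a guess rather than a known safe value, and the source names are assumed to be valid `-b` values in your version. Spacing queries may reduce blocking, but it is not guaranteed to prevent it.

```python
import subprocess
import time


def run_spaced(commands, pause_seconds=30.0,
               runner=subprocess.run, sleeper=time.sleep):
    """Run each command in order, sleeping between runs (not after the last)."""
    results = []
    for index, argv in enumerate(commands):
        if index > 0:
            sleeper(pause_seconds)
        results.append(runner(argv))
    return results


if __name__ == "__main__":
    domain = "example.com"  # placeholder target
    sources = ["google", "bing"]  # assumed to be valid -b values
    commands = [["theHarvester", "-d", domain, "-b", s] for s in sources]
    # Dry run: print each command instead of executing it, and skip the sleep.
    run_spaced(commands, runner=print, sleeper=lambda seconds: None)
```

The `runner` and `sleeper` parameters exist so the loop can be dry-run or tested without launching the tool. Leave them at their defaults to execute the commands for real.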
Resources
GitHub repository – https://github.com/laramies/theHarvester
An online integration of ‘theHarvester’ – https://www.nmmapper.com/kalitools/theharvester/email-harvester-tool/online/