Reliable Username Enumeration: A step-by-step guide

Overview

Collecting and validating an organization’s employee base is critical for any successful offensive information security operation. With this information, we’re able to:

Conduct social engineering campaigns
Brute force and password spray endpoints

The flowchart below demonstrates the process of fully enumerating an organization’s employee base and how to collect a list of publicly available authentication endpoints used by those employees.

User/Employee Enumeration Process

Initial Recon

We need to do two things before proceeding:

Analyze MX (mail) DNS records for our target domain
Uncover the organization’s username format

*Note that all examples in this blog use the fake domain acme.com.

DNS Record Analysis

DNS records give us most of what we need to build a picture of the authentication endpoints and mail services an organization uses. Most of this information can be collected with common command-line tooling such as dig and host. When we enumerate these records at Sprocket, we use the script detailed below. To get started:

Create a directory to store output.
Navigate into that directory.

Make sure you have these tools installed:

mailspoof (pip)
dnsutils (cli)
checkdmarc (pip)

Download and make the following bash script executable:

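#!/bin/bash
# Collect mail-related DNS data for a target apex domain.
# Usage: ./mailspoof.sh acme.com

domain="$1"
if [ -z "$domain" ]; then
  echo "Usage: $0 <domain>" >&2
  exit 1
fi

# SPF/DMARC checks, written to JSON files
mailspoof -d "$domain" -o "mailspoof.$domain.json"
checkdmarc "$domain" -o "checkdmarc.$domain.json"

# Raw MX and TXT records
dig +short "$domain" MX | tee "digmx.$domain.txt"
dig +short "$domain" TXT | tee "digtxt.$domain.txt"

# Probe common Exchange/OWA hostnames
for i in owa post mail mymail exchange autodiscover exch exch01 outlook
do
  host "$i.$domain" | tee -a "exchange-enum.$domain.txt"
done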

Mailspoof can be difficult, so you might have to run it manually. In addition, if the last few lines of the script above produce results, dance around the room because your job probably got 100% easier.

Run the script with your target apex domain appended.

./mailspoof.sh acme.com

Looking through that output should give you more than enough to report on and enumerate the mail services in use. This is an important step. Don’t skip this. Without this data you can’t validate addresses at a later point.
Take note of some things in the output of the script you just ran. Based on your output:
- If you see "microsoftonline" or O365 referenced, you CAN validate users collected but NOT quickly or reliably.
- If you see Google or anything similar referenced, you CAN validate users collected, but NOT quickly or reliably.
- If you uncover an Outlook web app instance, you CAN validate users and CAN do it en masse.
- If you don’t find anything, the domain is most likely unused for corporate mail services.
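The checks above can be roughed out in a few lines of Python once the script has written its output files. This is a minimal sketch, not part of our tooling: the marker strings in PROVIDER_MARKERS and both function names are assumptions chosen for illustration, and the sample records are made up for acme.com.

```python
# Hypothetical marker strings; extend these for other providers.
PROVIDER_MARKERS = {
    "Microsoft 365": ("microsoftonline", "outlook.com", "onmicrosoft.com"),
    "Google": ("google.com", "googlemail.com"),
}


def classify_mail_provider(record_lines):
    """Return the sorted provider names hinted at by MX/TXT record lines."""
    hits = set()
    for line in record_lines:
        lowered = line.lower()
        for provider, markers in PROVIDER_MARKERS.items():
            if any(marker in lowered for marker in markers):
                hits.add(provider)
    return sorted(hits)


def has_exchange_host(host_output_lines):
    """True if any line of `host` output shows a name that resolved."""
    resolved = ("has address", "has IPv6 address", "is an alias for")
    return any(token in line for line in host_output_lines for token in resolved)


# Made-up sample data in the shape of `dig +short` and `host` output.
mx_and_txt = [
    "0 acme-com.mail.protection.outlook.com.",
    '"v=spf1 include:spf.protection.outlook.com -all"',
]
host_lines = [
    "Host owa.acme.com not found: 3(NXDOMAIN)",
    "mail.acme.com has address 192.0.2.10",
]

print(classify_mail_provider(mx_and_txt))
print(has_exchange_host(host_lines))
```

Treat a hit as a hint only. A resolving hostname still needs a manual look to confirm it is really an Outlook web app instance.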

Username Format Discovery

What do we mean when we say, "username format"?

A username/email format denotes how a company structures employee account names.
For example, John Smith may have the username jsmith@acme.com. As you can see, this aligns with the format of first initial + last name.

When we write the format out, we describe it using something similar to the syntax shown below:

If an employee username is jsmith@acme.com, and we have high confidence this is true for all employees, we say the format is {f}{last}@acme.com
If an employee username is john.smith@acme.com, and we have high confidence this is true for all employees, we say the format is {first}.{last}@acme.com
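The notation maps neatly onto Python's str.format, which makes it easy to turn a list of names into usernames once the format is known. The sketch below is illustrative only: the function name is ours, and the {l} placeholder (last initial) is an assumption that does not appear in the notation above.

```python
def format_username(first, last, template):
    """Fill a template such as '{f}{last}@acme.com' for one employee."""
    first = first.strip().lower()
    last = last.strip().lower()
    return template.format(
        f=first[:1],   # first initial
        l=last[:1],    # last initial (assumed placeholder)
        first=first,
        last=last,
    )


employees = [("John", "Smith"), ("Jane", "Doe")]
for first, last in employees:
    print(format_username(first, last, "{f}{last}@acme.com"))
```

Swapping the template for {first}.{last}@acme.com yields john.smith@acme.com and jane.doe@acme.com from the same list.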

How to find a username format

hunter.io

Using hunter.io is a fairly easy way to discover an organization's username formats. It can be inaccurate, however, depending on the number of email addresses the site has collected. Navigate to the link below and enter your domain.

https://hunter.io/
Commonly, hunter.io will provide you with the following, and it’s how you should denote the format once discovered.

The hard way

If hunter.io has never heard of your domain, one of two possibilities applies:
- The domain has either never been used or is an alias for another domain. How to validate whether that’s the case is not part of this article.
- The domain you’re targeting is obscure, or there isn’t a standard format for the company (the lack of a standard format is highly unlikely).

The best way to start searching for a username format is to use some Google dorks and possibly theHarvester.

These Google dorks may be relevant:

inurl:rocketreach.co acme.com
intext:"@acme.com"

You may also use sites like dehashed.com to search for company email addresses that have shown up in a breach.

Additional notes on username format discovery

The external username format may differ from what employees use to access internal resources. By this, we mean that if a user has the email address john.smith@acme.com, they may actually use the username jsmith to log in to Active Directory. In this instance, the email address is considered an alias.
Take note of other formats you find. They may be internal username formats that leaked out to the internet.
You can also attempt to send an email to one of the addresses you find to see if the company’s mail server bounces it back.
For example, if you found the employee John Smith and you have high confidence he works there, attempt to send him an email. Use varying formats to see what's accepted. This allows you to enumerate additional internal formats in some instances.
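To build the "varying formats" for a single known employee, a short list of patterns goes a long way. The sketch below is a starting point under stated assumptions: COMMON_TEMPLATES is our own guess at frequently seen patterns, not an exhaustive or authoritative list, and the function name is hypothetical.

```python
# Assumed list of common patterns; not exhaustive.
COMMON_TEMPLATES = [
    "{first}.{last}",
    "{f}{last}",
    "{first}{l}",
    "{first}",
    "{last}{f}",
    "{first}_{last}",
]


def candidate_addresses(first, last, domain):
    """Return one candidate address per template for a single employee."""
    first, last = first.strip().lower(), last.strip().lower()
    parts = {"first": first, "last": last, "f": first[:1], "l": last[:1]}
    return [f"{template.format(**parts)}@{domain}" for template in COMMON_TEMPLATES]


for address in candidate_addresses("John", "Smith", "acme.com"):
    print(address)
```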

Collecting the information

Now, we need to go through the process of collecting employee names. The easiest and most straightforward way of doing this is to:
Collect publicly available employee names from social networks.
Scrape external sources for email addresses and employee names.

Social Network Collection

Social Network Scraping

When we say we’re going to scrape a social network, we generally are referring to LinkedIn. This network provides us with the most reliable and up-to-date data.
You can scrape LinkedIn using several methods. Fair warning, LinkedIn tries to actively combat and prevent these methods from being used.

What you’ll need to scrape LinkedIn:

- A burner LinkedIn account. These are harder to make, so you’ll need a burner number and a burner email, both of which are trusted. How to create and gain access to these is outside the scope of this article.
- A pre-configured system with all the prerequisite tooling installed.
- Optionally, a tool like our proxycannon-ng to keep yourself from being banned while accessing services.

Tooling to scrape social networks and search engines

Below is some of the tooling I actively use. I’ve also included a reliability score for each.

linkedin2username

Scrapes: LinkedIn
Reliability: High

Benefits:

Does not require LinkedIn API keys
Attempts to prevent rate-limiting

Cons:

Can get banned from LinkedIn
Doesn’t let you set a specific username format

Find it here:

https://github.com/initstring/linkedin2username

About: linkedin2username is a solid tool that scrapes LinkedIn directly. It can be hit with search limit restrictions after 1,000 searches but attempts to thwart the restrictions using several methods. This requires you to have a burner LinkedIn account. Read the directions closely to review what the company name is supposed to look like when running the tool.

BridgeKeeper

Scrapes: Search engines
Reliability: Medium

Benefits:

Rate-limiting prevention
Pre-formatted output support
Handling for odd names (e.g., names with hyphens or apostrophes)

Cons:

Less accurate than other scraping methods
Could contain inactive employees
Can cause you to get blacklisted by search-engine providers

Find it here:

https://github.com/0xZDH/BridgeKeeper

About: BridgeKeeper does not require a LinkedIn account. It instead scrapes search engine results for employee data. This is less accurate as recent hires may not have been indexed by search engines yet.

EmailGen

Scrapes: Search engines
Reliability: Low

Benefits:

Incredibly fast
Grabs anything and everything

Cons:

We have found it to be overzealous and often inaccurate
Written in Ruby ;)

Find it here:

https://github.com/navisecdelta/EmailGen

About: Scrapes Bing because it only wants traffic and doesn’t care where it comes from. Super easy to set up and run compared to other tooling. This tool is only worth using if you’re having a lot of trouble or have a method for validating tool results.

Additional sources and information

Some additional places you can look for employee information:

- The primary company homepage
- Internet archive repositories
- Hidden content on company sites
- Sites like hunter.io and other email collection sites
- Breach data associated with the organization

Why would you want to use anything besides LinkedIn?

- Other services may lead you to the names of service accounts and shared mailboxes, such as info@acme.com or support@acme.com.
- You may get more coverage or stumble across other gems, such as full company directories hidden away and ready to scrape.
- You can collect additional employee data like phone numbers and job descriptions.

Primary company homepage

Some companies are goofy enough to list all their email addresses and names on their site.
These may or may not be up to date. Some organizations are religious in their practice of updating company directories, however.
This is often true of government agencies, nonprofits and law firms because they often have to make things available to the public.
Here’s a great example:

Scraping these pages requires a little know-how. You may need to write a Python script if the emails aren’t embedded in the site’s HTML. If they are embedded in the page’s HTML, and you only need a quick list, you can use wget and grep to grab the emails. An example of this is shown below:

wget -q -O - https://www.acme.com/379/Staff-Directory \
| grep -E -o "\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b" | sort -u

This method won’t pull names, titles and phone numbers, so it’s best you write something in a higher-level language to pull all information and format it.
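As a starting point for that higher-level script, here is a minimal sketch using only the Python standard library. It assumes the directory marks up each contact as a mailto: link whose link text is the employee's name, which will not hold for every site. The class name, function name and sample HTML are all made up for illustration.

```python
import re
from html.parser import HTMLParser

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class MailtoParser(HTMLParser):
    """Collect (link text, email) pairs from <a href="mailto:..."> tags."""

    def __init__(self):
        super().__init__()
        self.contacts = []   # list of (name, email) tuples
        self._email = None   # email of the mailto link currently open
        self._text = []      # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            match = EMAIL_RE.search(href)
            if match:
                self._email = match.group(0).lower()
                self._text = []

    def handle_data(self, data):
        if self._email is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._email is not None:
            # Collapse runs of whitespace in the link text.
            name = " ".join("".join(self._text).split())
            self.contacts.append((name, self._email))
            self._email = None


def extract_contacts(html):
    parser = MailtoParser()
    parser.feed(html)
    return parser.contacts


# Made-up sample standing in for a downloaded staff directory page.
SAMPLE_HTML = """
<ul>
  <li><a href="mailto:John.Smith@acme.com">John Smith</a>, Clerk</li>
  <li><a href="mailto:jane.doe@acme.com?subject=Hello">Jane  Doe</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""

for name, email in extract_contacts(SAMPLE_HTML):
    print(f"{name},{email}")
```

Titles and phone numbers usually sit in neighboring elements, so pulling those means adapting the parser to the specific page layout.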

Internet archives

I only use this if I’m desperate. The email addresses you pull are often old.
With that said, old emails aren’t always a bad thing. If you’re brute-forcing, and the company doesn’t deactivate inactive accounts regularly, the users you grab may be gone but still have live accounts created with a weak and old password policy.
For example, it’s possible previous versions of the company homepage included an employee directory that they’ve since removed.
Navigate to archive.org to search your site and to look for a company directory page.
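If you would rather script the search than click through it, the Wayback Machine exposes a CDX query endpoint. The sketch below only builds the query URL and parses a response, and it makes no network calls. The "directory" keyword, both function names and the sample response are assumptions for illustration, so check the parameters against the current CDX documentation before relying on them.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, keyword="directory", limit=50):
    """Build a CDX query for archived URLs under a domain that contain a keyword."""
    params = {
        "url": f"{domain}/*",                    # everything under the domain
        "output": "json",
        "fl": "timestamp,original",              # fields to return
        "filter": f"original:.*{keyword}.*",     # regex on the original URL
        "collapse": "urlkey",                    # one row per unique URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_urls(rows):
    """Turn CDX JSON rows (header row first) into replayable snapshot URLs."""
    if not rows:
        return []
    header, *records = rows
    ts, orig = header.index("timestamp"), header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in records]


# Made-up response in the shape the JSON output uses.
sample = json.loads(
    '[["timestamp","original"],'
    '["20150101000000","http://www.acme.com/staff-directory"]]'
)

print(build_cdx_url("acme.com"))
for url in snapshot_urls(sample):
    print(url)
```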

Hidden content

This is rare and has only happened once in my experience.
On a company site, I found a SQL dump file in a subdirectory containing a complete list of users.
It’s cool but rare. Still, it’s worth keeping an eye out for during other testing efforts.
I’ve also found IIS error pages with service account names in the past that were useful.

Third-party sites

Hunter.io does more, of course, than tell you the username format. You can also use it to find employee email addresses. An API key is required to export the listed email addresses programmatically, and there are limits on the number of API requests you can make monthly before having to cough up some money.
Also search through these sites for additional data:
https://buckets.grayhatwarfare.com
https://rocketreach.co
https://findemailaddress.co
https://snov.io
https://finder.app
https://intelx.io

Another recent addition to my toolbox is ZIPuller.py.

ZIPuller

Scrapes: ZoomInfo
Reliability: High

Benefits:

Mega fast
Grabs anything and everything
Very reliable output

Cons:

Doesn’t format usernames for you
ZoomInfo does heavy rate-limiting, so you will need a tool like ProxyCannon-NG

About: Scrapes companies listed on the site ZoomInfo for you. The tool outputs raw employee names, so you will need to do a little command-line magic to get everything looking the way you want.
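If you prefer a script to command-line magic, the cleanup can be sketched in Python. This is an assumption-heavy example and not ZIPuller's own output handling: the input is assumed to be one "First Last" name per entry, hyphens and apostrophes are dropped rather than kept, and anything after a comma is treated as a credential and discarded. Adjust those choices to the target's real format.

```python
import re
import unicodedata


def normalize_name(raw):
    """Return (first, last) in lowercase ASCII, or None if unusable."""
    # Drop trailing credentials such as ", CISSP".
    raw = raw.split(",")[0]
    # Strip accents: "José" becomes "Jose".
    ascii_name = (
        unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode("ascii")
    )
    # Drop apostrophes and hyphens inside names: "O'Neil" becomes "ONeil".
    ascii_name = re.sub(r"['\-]", "", ascii_name)
    tokens = re.findall(r"[a-z]+", ascii_name.lower())
    if len(tokens) < 2:
        return None
    # Middle names and initials are ignored.
    return tokens[0], tokens[-1]


def to_usernames(raw_names, template="{f}{last}"):
    """Format names with a template, de-duplicating while keeping order."""
    seen, usernames = set(), []
    for raw in raw_names:
        parsed = normalize_name(raw)
        if parsed is None:
            continue
        first, last = parsed
        username = template.format(f=first[0], first=first, last=last)
        if username not in seen:
            seen.add(username)
            usernames.append(username)
    return usernames


raw = ["John Smith, CISSP", "José García", "Mary-Jane O'Neil", "john smith", "Cher"]
print(to_usernames(raw))
```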

Breach data

Breach data isn’t always reliable, but it’s worth checking out, especially if you have it readily available.
You either need to download it or use a service like the one listed below. Dehashed costs money but is valuable for more than username enumeration:
https://dehashed.com

Wrap Up

Well, that's all folks for our username enumeration methodology. In a follow-up article, we'll explore username validation and the abuse of authentication endpoints.

Nicholas Anastasi
