Overview

Collecting and validating an organization’s employee base is critical for any successful offensive information security operation. With this information, we’re able to:

  • Conduct social engineering campaigns
  • Brute force and password spray endpoints

The flowchart below demonstrates the process of fully enumerating an organization’s employee base and how to collect a list of publicly available authentication endpoints used by those employees.

User/Employee Enumeration Process

Initial Recon

We need to do two things before proceeding:

  1. Analyze MX (mail) DNS records for our target domain

  2. Uncover the organization’s username format

*Note that all examples in this blog use the fake domain acme.com.

DNS Record Analysis

DNS records store everything we need to get a picture of the authentication endpoints and mail services that an organization uses. This information is easy to collect with common command-line tooling such as dig and host. At Sprocket, we enumerate these records with the script detailed below. To get started:

  1. Create a directory to store output.

  2. Navigate into that directory.

  3. Make sure you have these tools installed:

    mailspoof (pip)
    dnsutils (cli)
    checkdmarc (pip)
  4. Download and make the following bash script executable:

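    #!/bin/bash
    # Usage: ./mailspoof.sh <apex-domain>

    domain="$1"
    if [ -z "$domain" ]; then
        echo "Usage: $0 <domain>" >&2
        exit 1
    fi

    # SPF/DMARC checks (JSON output)
    mailspoof -d "$domain" -o "mailspoof.$domain.json"
    checkdmarc "$domain" -o "checkdmarc.$domain.json"

    # Raw MX and TXT records
    dig +short "$domain" MX | tee "digmx.$domain.txt"
    dig +short "$domain" TXT | tee "digtxt.$domain.txt"

    # Probe common Exchange / Outlook web app hostnames
    for i in owa post mail mymail exchange autodiscover exch exch01 outlook
    do
        host "$i.$domain" | tee -a "exchange-enum.$domain.txt"
    done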


    Mailspoof can be difficult, so you might have to run it manually. In addition, if the hostname loop at the end of the script above produces results, dance around the room because your job probably got 100% easier.


  5. Run the script (saved here as mailspoof.sh) with your target apex domain as its argument.

    ./mailspoof.sh acme.com


  6. Looking through that output should give you more than enough to report on and enumerate the mail services in use. This is an important step. Don’t skip this. Without this data you can’t validate addresses at a later point.

  7. Take note of some things in the output of the script you just ran. Based on your output:

    • If you see "microsoftonline" or O365 referenced, you CAN validate users collected but NOT quickly or reliably.

    • If you see Google or anything similar referenced, you CAN validate users collected, but NOT quickly or reliably.

    • If you uncover an Outlook web app instance, you CAN validate users and CAN do it en masse.

    • If you don’t find anything, the domain is most likely unused for corporate mail services.
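The triage in step 7 can be partly scripted. The sketch below is not part of the script above: classify_mx is a hypothetical helper name, and the hostname patterns are assumptions based on the MX records that Microsoft 365 (mail.protection.outlook.com) and Google Workspace (google.com or googlemail.com) typically publish. It reads the output of dig +short on standard input and prints a best guess.

```bash
# classify_mx (hypothetical helper): read `dig +short <domain> MX` output on
# stdin and print a best-guess mail provider: o365, google, other or none.
# The patterns are assumptions about typical MX hostnames, not a complete list.
classify_mx() {
    # Lowercase everything so matching is case-insensitive.
    mx_records=$(tr '[:upper:]' '[:lower:]')
    case $mx_records in
        '')                              echo "none" ;;
        *protection.outlook.com*)        echo "o365" ;;
        *.google.com*|*googlemail.com*)  echo "google" ;;
        *)                               echo "other" ;;
    esac
}

# Example, using the file written by the script above:
#   classify_mx < digmx.acme.com.txt
```

A result of other can mean self-hosted mail or a third-party mail gateway, so check exchange-enum.acme.com.txt for an Outlook web app host before drawing conclusions.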

Username Format Discovery

What do we mean when we say, "username format"?

  • A username/email format denotes how a company structures employee account names.

  • For example, John Smith may have the username jsmith@acme.com. As you can see, this aligns with the format of first initial + last name.

When we write the format out, we describe it using something similar to the syntax shown below:

  • If an employee username is jsmith@acme.com, and we have high confidence this is true for all employees, we say the format is {f}{last}@acme.com

  • If an employee username is john.smith@acme.com, and we have high confidence this is true for all employees, we say the format is {first}.{last}@acme.com
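To make the notation concrete, here is a minimal sketch of how a format string could be expanded for a single employee. format_username is a hypothetical helper, not something from our tooling, and it only understands the three placeholders used in this post: {f}, {first} and {last}.

```bash
# format_username FORMAT FIRST LAST (hypothetical helper)
# Expands {first}, {last} and {f} (first initial) in FORMAT, lowercased.
# Limitation: names containing "/" or "&" would confuse the sed replacement.
format_username() {
    fmt=$1
    first=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    last=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')
    initial=$(printf '%s' "$first" | cut -c1)
    printf '%s\n' "$fmt" | sed \
        -e "s/{first}/$first/g" \
        -e "s/{last}/$last/g" \
        -e "s/{f}/$initial/g"
}

# Example:
#   format_username '{f}{last}@acme.com' John Smith
#   format_username '{first}.{last}@acme.com' John Smith
```

Keeping the format as a plain string means the same employee list can be re-rendered quickly if the first guess at the format turns out to be wrong.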

How to find a username format

hunter.io
  1. Using hunter.io is a fairly easy way to discover an organization's username format. It can be inaccurate, however, depending on the number of email addresses the site has collected for the domain. Navigate to the link below and enter your domain.

    https://hunter.io/
  2. Commonly, hunter.io will report the most common address pattern it has seen for the domain. Denote the format you discover using the placeholder syntax described above.


The hard way
  1. If hunter.io has never heard of your domain, one of two possibilities applies:
    1. That domain has either never been used or is an alias for another domain. How to validate if that’s the case is not part of this article.

    2. The domain you’re targeting is obscure, or there isn’t a standard format for the company (the lack of a standard format is highly unlikely).

The best way to start searching for a username format is to use some Google dorks and possibly theHarvester.

These Google dorks may be relevant:

inurl:rocketreach.co acme.com
intext:"@acme.com"


You may also use sites like dehashed.com to search for company email addresses that have shown up in a breach.

Additional notes on username format discovery
  • The external username format may differ from what employees use to access internal resources. For example, a user with the email address john.smith@acme.com may actually use the username jsmith to log in to Active Directory. In this instance, the email address is considered an alias.

    • Take note of other formats you find. They may be internal username formats that leaked out to the internet.

  • You can also attempt to send an email to one of the addresses you find to see if the company’s mail server bounces back.

    • For example, if you found the employee John Smith and you have high confidence he works there, attempt to send him an email. Use varying formats to see what's accepted. This allows you to enumerate additional internal formats in some instances.

Collecting the information

  • Now, we need to go through the process of collecting employee names. The easiest and most straightforward way of doing this is to:

    • Collect publicly available employee names from social networks.

    • Scrape external sources for email addresses and employee names.

Social Network Collection

Social Network Scraping

  • When we say we’re going to scrape a social network, we generally are referring to LinkedIn. This network provides us with the most reliable and up-to-date data.

  • You can scrape LinkedIn using several methods. Fair warning, LinkedIn tries to actively combat and prevent these methods from being used.

Potential methods for scraping LinkedIn include:

  • A burner LinkedIn account

    • These are harder to make, so you’ll need to have a burner number and burner email, both of which are trusted. How to create and gain access to these is outside the scope of this article.

  • A pre-setup system with all the prerequisite tooling we need installed.

  • A tool like our proxycannon-ng, which can help keep you from being banned while accessing services.

Tooling to scrape social networks and search engines

Below is some of the tooling I actively use. I’ve also included a reliability score for each.

linkedin2username

Scrapes: LinkedIn
Reliability: High
Benefits:

  • Does not require LinkedIn API keys
  • Attempts to prevent rate-limiting

Cons:

  • Can get banned from LinkedIn
  • Doesn’t let you set a specific username format

Find it here:

https://github.com/initstring/linkedin2username

About: linkedin2username is a solid tool that scrapes LinkedIn directly. It can be hit with search limit restrictions after 1,000 searches but attempts to thwart the restrictions using several methods. This requires you to have a burner LinkedIn account. Read the directions closely to review what the company name is supposed to look like when running the tool.

BridgeKeeper

Scrapes: Search Engines
Reliability: Medium
Benefits:

  • Rate-limiting prevention
  • Pre-formatted output support
  • Handling for odd names (e.g., names with hyphens or apostrophes)

Cons:

  • Less accurate than other scraping methods
  • Could contain inactive employees
  • Can cause you to get blacklisted by search-engine providers

Find it here:

https://github.com/0xZDH/BridgeKeeper

About: BridgeKeeper does not require a LinkedIn account. It instead scrapes search engine results for employee data. This is less accurate as recent hires may not have been indexed by search engines yet.

EmailGen

Scrapes: Search engines
Reliability: Low
Benefits:

  • Incredibly fast
  • Grabs anything and everything

Cons:

  • We have found it to be overzealous and often inaccurate
  • Written in Ruby ;)

Find it here:

https://github.com/navisecdelta/EmailGen

About: Scrapes Bing because it only wants traffic and doesn’t care where it comes from. Super easy to set up and run compared to other tooling. This tool is only worth using if you’re having a lot of trouble or have a method for validating tool results.

Additional sources and information

  • Some additional places you can look for employee information:

    • The primary company homepage
    • Internet archive repositories
    • Hidden content on company sites
    • Sites like hunter.io and other email collection sites
    • Breach data associated with the organization

  • Why do you want to use anything besides LinkedIn?

    • Other services may lead you to the names of service accounts and shared mailboxes

      • Things like info@acme.com or support@acme.com

    • You may get more coverage or stumble across other gems, such as full company directories hidden away and ready to scrape.

    • You can collect additional employee data like phone numbers and job descriptions

Primary company homepage

  • Some companies are goofy enough to list all their email addresses and names on their site.

  • These may or may not be up to date. Some organizations are religious in their practice of updating company directories, however.

  • This is often true of government agencies, nonprofits and law firms because they often have to make things available to the public

  • Here’s a great example:

    A library’s employee directory

  • Scraping these pages requires a little know-how. You may need to write a Python script if the emails aren’t embedded in the site’s HTML. If they are embedded in the page's HTML, and you only need a quick list, you can use wget and grep to grab the emails. An example of this is shown below:

    wget -q -O - https://www.acme.com/379/Staff-Directory \
    | grep -E -o "\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b" \
    | sort -u


This method won’t pull names, titles and phone numbers, so it’s best you write something in a higher-level language to pull all information and format it.

Internet archives

  • I only use this if I’m desperate. The email addresses you pull are often old.

  • With that said, old emails aren’t always a bad thing. If you’re brute-forcing, and the company doesn’t deactivate inactive accounts regularly, the users you grab may be gone but still have live accounts created with a weak and old password policy.

  • For example, it’s possible previous versions of the company homepage included an employee directory that they’ve since removed.

  • Navigate to archive.org to search your site and to look for a company directory page.
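If clicking through archive.org by hand gets tedious, the Wayback Machine also exposes a CDX query endpoint that lists archived URLs for a site. The sketch below only builds the query URL. wayback_cdx_url is a hypothetical helper name, and the keywords in the usage comment are guesses at what a directory page might be called.

```bash
# wayback_cdx_url DOMAIN (hypothetical helper)
# Prints a Wayback Machine CDX query URL that lists archived URLs under
# DOMAIN: fl=original returns only the original URL column, and
# collapse=urlkey folds repeated captures of the same page into one row.
wayback_cdx_url() {
    printf 'https://web.archive.org/cdx/search/cdx?url=%s/*&fl=original&collapse=urlkey\n' "$1"
}

# Example (makes a network request; keywords are only a starting point):
#   curl -s "$(wayback_cdx_url acme.com)" | grep -Ei 'staff|directory|team|people'
```

Any promising hit can then be opened in the archive.org interface to find the snapshot that still contains the directory.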

Hidden content

  • This is rare and has only happened once in my experience.

  • On a company site, I found a SQL dump file in a subdirectory containing a complete list of users.

  • It’s cool but rare. Still, it’s worth keeping an eye out for during other testing efforts

  • I’ve also found IIS error pages with service account names in the past that were useful.

Third-party sites

Another recent addition to my toolbox is ZIPuller.py

ZIPuller

Scrapes: ZoomInfo
Reliability: High
Benefits:

  • Mega fast
  • Grabs anything and everything
  • Very reliable output

Cons:

  • Doesn’t format usernames for you
  • ZoomInfo does heavy rate-limiting, so you will need a tool like ProxyCannon-NG

Find it here:

waffl3ss/ZIPuller
ZIPuller - Pull company employee names from ZoomInfo

About: Scrapes companies listed on the site ZoomInfo for you. The tool outputs raw employee names, so you will need to do a little command-line magic to get everything looking the way you want.
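As one possible take on that command-line magic, the sketch below turns a list of raw names into {f}{last} usernames. It assumes one "First Last" name per line on standard input (ZIPuller's exact output layout is not shown here, so treat that as an assumption), and names_to_usernames is a hypothetical function name. Dropping apostrophes and hyphens is a judgment call, since some organizations keep them.

```bash
# names_to_usernames (hypothetical helper): read "First [Middle] Last" lines
# on stdin and print de-duplicated {f}{last} usernames, one per line.
names_to_usernames() {
    tr '[:upper:]' '[:lower:]' \
    | tr -d "'-" \
    | awk 'NF >= 2 { print substr($1, 1, 1) $NF }' \
    | sort -u
}

# Example (names.txt is a placeholder file name):
#   names_to_usernames < names.txt | sed 's/$/@acme.com/' > users.txt
```

The awk step uses the first and last fields only, so middle names and initials are ignored and single-word lines are skipped.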

Breach data

  • Breach data isn’t always reliable, but it’s worth checking out, especially if you have it readily available.

  • You either need to download it or use a service such as Dehashed (dehashed.com). Dehashed costs money, but its value goes well beyond username enumeration.

Wrap Up

Well, that's all folks for our username enumeration methodology. In a follow-up article, we'll explore username validation and the abuse of authentication endpoints.