Every week, Sprocket Security CEO Casey Cammilleri interviews an expert leading the charge on empowering security experts and practitioners with the knowledge and insights needed to excel in the future of cybersecurity.
He recently spoke with Alex Ronquillo, Vice President at WhoisXML API, at RSA 2025. Here are the top takeaways from the interview.
#1: Combine Multiple Intelligence Sources When Cloudflare Masks Your Target
“I think especially these days, everyone's building everything in the cloud and on top of other services. I think the majority of companies, unless they're really big, generally don't operate their own name servers. They're using Cloudflare or something from their hosting company, which has a lot of benefits.
“But again, from the pentesting side, that gives us a lot more work now, because you can't just reverse a Cloudflare name server; you're going to get millions of domains that have nothing to do with your target organization. So that's where a lot of the other open source intelligence, from SSL certificates, DNS infrastructure, IP addresses and their hosting, starts to give you a more cohesive picture. But it's definitely still part art, part science.
“Sometimes you can pivot on just an org name with no email. Other times you're going to find that criminals can register something from Malaysia, claim the organization is Wells Fargo, leave no email, and then it's like, ‘Oh, is it or not?’”
Actionable Takeaway: Cloud adoption has made reconnaissance more challenging, as most companies no longer operate their own nameservers. When facing Cloudflare or similar services, you'll need to combine multiple data points, like SSL certificates, DNS infrastructure, and IP hosting details, to build an accurate picture. This intelligence fusion creates a coherent view where single data points might mislead you.
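To make that fusion concrete, here's a minimal Python sketch of one common pivot: pull candidate hostnames for a target out of certificate transparency logs via crt.sh's public JSON endpoint, then resolve each one so you can separate hosts parked on shared CDN space from infrastructure the organization actually runs. The example.com target is a placeholder, and the final CDN-range check is left as a comment because it depends on which IP ranges you treat as shared.

```python
# Sketch: combine CT-log data with DNS resolution to pivot past a masked nameserver.
# "example.com" is a placeholder target; error handling is deliberately thin.
import json
import socket
import urllib.request

def ct_log_hostnames(domain: str) -> set[str]:
    """Query crt.sh for certificates matching *.domain and collect the names."""
    url = f"https://crt.sh/?q=%25.{domain}&output=json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        records = json.load(resp)
    names = set()
    for record in records:
        # name_value can hold several newline-separated SANs per certificate
        for name in record.get("name_value", "").splitlines():
            names.add(name.strip().lstrip("*."))
    return names

def resolve(host: str) -> str | None:
    """Resolve a hostname to an IPv4 address, or None if it doesn't resolve."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return None

if __name__ == "__main__":
    for host in sorted(ct_log_hostnames("example.com")):
        ip = resolve(host)
        if ip:
            # Next pivot (not shown): check whether ip sits in CDN ranges
            # or in netblocks the organization actually owns.
            print(f"{host} -> {ip}")
```

In practice you'd feed the resolved IPs into ASN or netblock lookups, which is exactly the "IP addresses and their hosting" pivot Ronquillo describes.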
#2: Hunt for Genuine Subdomains Beyond Certificate Transparency Logs
“We've been dipping our toes into passive DNS for at least 10 years now and have really shored up those efforts, because a lot of the vulnerability scanning tools rely on us to find domains. It's been natural for us to go to the left of that and try to find subdomains. Well, there's no great repository of all the subdomains that exist. There are certificate transparency logs. Maybe you have some kind of network telemetry of your own that your customers are sharing with you, and you can find new names there. But even still, there's not a single real repository of all the subdomains that exist.
“You're probably not going to get access to your customers’ zone files in most cases, for instance. So then it becomes: out of this whole community of people trying to protect others, who can scan all the domains, try to find their subdomains, and see whether they're really there, or whether they're maybe a wildcard configuration that would have resolved anything? So again, a lot of art comes into that.”
Actionable Takeaway: Discovering true subdomains requires a multi-faceted approach because no complete repository exists. Although certificate logs provide some visibility, they're incomplete. Instead, combine network telemetry, specialized scanning, and passive DNS to verify whether subdomains are legitimate or merely wildcard configurations. Success demands both technical precision and creative problem-solving skills.
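The wildcard check in particular is easy to automate. The sketch below, using only the Python standard library and the system resolver, resolves a few random labels under the candidate's zone: if gibberish names resolve, the zone almost certainly has a wildcard record, and a "hit" on a guessed subdomain proves nothing on its own.

```python
# Sketch: verify a discovered subdomain by ruling out a wildcard DNS record.
import socket
import uuid

def resolve(host: str) -> str | None:
    """Resolve a hostname to an IPv4 address, or None on failure."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return None

def has_wildcard(zone: str, probes: int = 3) -> bool:
    """Return True if random, almost-certainly-unregistered labels resolve."""
    return all(
        resolve(f"{uuid.uuid4().hex}.{zone}") is not None
        for _ in range(probes)
    )

def verify_subdomain(name: str, zone: str) -> bool:
    """A subdomain only 'counts' if it resolves and the zone isn't a wildcard."""
    return resolve(name) is not None and not has_wildcard(zone)

if __name__ == "__main__":
    # Hypothetical candidate pulled from CT logs or passive DNS:
    print(verify_subdomain("dev.example.com", "example.com"))
```

A more careful version would also compare the candidate's IP against the wildcard's answer, since a name that resolves to a different address than the wildcard may still be a real, distinct host.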
#3: Download WHOIS Data Locally for Unlimited Machine Learning Analysis
“Imagine all the records you get in a normal WHOIS query, structured and parsed, so you see when we collected the record, the official information, all that stuff. But rather than having to go and do hundreds of millions of WHOIS queries yourself, you can just pull it in, or, if you will, almost download Google Maps and have that locally. If you have that on your own servers, you can do as many lookups as you need to. And we've gotten a lot of very positive encouragement from our buddies in the registrar communities, because by doing that, I think we've offloaded a lot of the pressure on the natural WHOIS servers for some of the big organizations, who in many cases are not even interested in the contact data.
“They just need to figure out the domain age of everything, but at billions of times a day, that's not something you can approach with any API. So if they want to build sophisticated machine learning to find domains that actor groups have spun up that are otherwise hard to connect, or, on the protector side, spin up machine learning to figure out at scale, ‘Hey, maybe we can automatically tell when my customers make new domains or when their domains drop,’ rather than having to go and do a tremendous amount of WHOIS queries and burden the systems of these registrars, we provide data licenses. Not everyone can get them; they have to be approved use cases, so all the partners the data comes from are happy and everything is above board. But at this point, it's in maybe 700 different cybersecurity products.”
Actionable Takeaway: Traditional API-based WHOIS queries can't scale to billions of lookups. By downloading structured WHOIS records locally, you gain unlimited access for machine learning applications, whether detecting threat actor infrastructure or monitoring customer domains. This approach drastically reduces the burden on registrar servers while enabling sophisticated analysis at scale across 700+ cybersecurity products.
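What this looks like in practice depends on the feed format, but the workflow is simple: iterate over a local dump instead of calling an API per domain. The sketch below assumes a hypothetical CSV schema with domain_name and created_date columns (not WhoisXML API's actual feed layout) and flags newly registered domains by age, the "domain age at billions of times a day" use case from the quote.

```python
# Sketch: scan a local WHOIS data dump for newly registered domains,
# with zero per-domain API calls. The CSV schema here is hypothetical.
import csv
from datetime import datetime, timezone

def newly_registered(path: str, max_age_days: int = 30) -> list[str]:
    """Return domains from the dump created within the last max_age_days."""
    now = datetime.now(timezone.utc)
    hits = []
    with open(path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            try:
                created = datetime.fromisoformat(row["created_date"])
            except (KeyError, ValueError):
                continue  # skip records with missing or unparseable dates
            if created.tzinfo is None:
                created = created.replace(tzinfo=timezone.utc)
            if (now - created).days <= max_age_days:
                hits.append(row["domain_name"])
    return hits

if __name__ == "__main__":
    # "whois_dump.csv" is a placeholder path for a locally licensed dataset.
    for domain in newly_registered("whois_dump.csv"):
        print(domain)
```

The same loop generalizes to any bulk analysis, from feeding features into a classifier to diffing consecutive dumps to catch drops, since the data is local and lookups are free.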