Website Investigations Flow

Website Investigations: Process & Techniques

Website investigations follow a basic, logical flow. The launching point tends to be a domain or an IP address. The endpoint is usually identifying the operator or finding enforcement options. What happens between these points varies by investigator and website.

This post outlines a standardized process that guides new hires through website investigations with the assistance of a website investigation tool. The tool is a set of bookmarks organized to lead users through the process via search forms. Each form offers exposure to different techniques and different results.

Website Investigation Flow

There are advantages to standardizing this process. Each investigation is unique and cases may resolve differently, but the process of probing and then reporting the connections tends to be the same.

Process is especially important in team environments because investigators and attorneys are more effective on common ground. By defining the investigation in steps, team members can envision and discuss websites in logical compartments. Each compartment represents a different type of investigation and a different type of enforcement.

Compartments also add organization to investigative notes. It’s far easier to read and understand another investigator’s notes when the links and observations are organized under the same titled sections that you use, rather than in the order the information was found. Organized notes are also easier to revisit and transfer into a report.

For this process, the website investigation flows through five compartments.

Domain / IP Addresses > Web Assets > Related Sites > Social Media > Operator ID

Website Investigation Flow Chart

The process is simple. Look for data to connect the first and last compartments. Sometimes that data is found in the domain registrant contact information. In many cases, it’s found elsewhere, such as a custom name server, a unique script shared between two websites or an image hosting account. Social media bridges connections between these findings, the website and the persons involved. Operator ID confirms these connections against real-world information.
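
One way to keep notes in compartment order, regardless of the order in which observations were found, is sketched below. It is only an illustration: the `CaseNotes` class and its methods are hypothetical, and only the five compartment names come from the flow above.

```python
from dataclasses import dataclass, field

# Compartment names come from the flow above; everything else is hypothetical.
COMPARTMENTS = [
    "Domain / IP Addresses",
    "Web Assets",
    "Related Sites",
    "Social Media",
    "Operator ID",
]


@dataclass
class CaseNotes:
    """Investigative notes grouped by compartment rather than discovery order."""
    target: str
    sections: dict = field(
        default_factory=lambda: {name: [] for name in COMPARTMENTS}
    )

    def add(self, compartment: str, observation: str) -> None:
        if compartment not in self.sections:
            raise ValueError(f"Unknown compartment: {compartment}")
        self.sections[compartment].append(observation)

    def render(self) -> str:
        """Return the notes as text, always in compartment order."""
        lines = [f"Target: {self.target}"]
        for name in COMPARTMENTS:
            lines.append("")
            lines.append(name)
            for obs in self.sections[name] or ["(no findings yet)"]:
                lines.append(f"  - {obs}")
        return "\n".join(lines)


notes = CaseNotes("example.com")
# Observations are added in the order they were found...
notes.add("Social Media", "Screen name 'exampleuser' found on a forum profile")
notes.add("Domain / IP Addresses", "Registrant email: admin@example.com")
# ...but the rendered notes follow the five compartments.
print(notes.render())
```

Because every investigator's notes render under the same five headings, a teammate can go straight to the compartment they need.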

The website investigation tool outlines this process. Each compartment has a dedicated tab with search forms. The search options advance in complexity from the top of the page to the bottom. The options you select depend on the circumstances at hand. Additional resources can also be found on NetBootCamp’s OSINT Tools page.

Let’s look at this process by compartment.

Domain and IP Address

This section is about operator contact information and server control. Domain registrations and IP addresses are the usual launching point for a website investigation.

Domain and IP Address searches include, but are not limited to:

Domain Whois: Name, Address, Phone, Email, Organization
Whois History: Review includes changes in name servers & domain registrars
IP address Whois: ID the ISP, network, routing and data center location
Reverse IP address: ID connected websites in order to review more domain registrations
IP reassignments: In some cases, the website operator is also the ISP
Hosting history: ID past ISPs and name servers. Look for current connections
Troubleshooting: Traceroute, Ping and Network/ASN profiling
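
As an illustration of the Whois history review, the sketch below compares dated registration snapshots and reports changes in registrar and name servers. The snapshot layout (`date`, `registrar`, `name_servers`) is an assumption made for this example; real Whois history services return their own formats, and the records shown are invented placeholders.

```python
def whois_changes(snapshots):
    """Return (date, field, old, new) tuples for registrar / name server changes.

    `snapshots` is a list of dicts with hypothetical keys "date", "registrar"
    and "name_servers"; real Whois history services use their own formats.
    """
    changes = []
    ordered = sorted(snapshots, key=lambda s: s["date"])  # ISO dates sort as text
    for prev, curr in zip(ordered, ordered[1:]):
        if prev["registrar"] != curr["registrar"]:
            changes.append((curr["date"], "registrar",
                            prev["registrar"], curr["registrar"]))
        # Name servers are compared case-insensitively and order-independently.
        old_ns = sorted(ns.lower() for ns in prev["name_servers"])
        new_ns = sorted(ns.lower() for ns in curr["name_servers"])
        if old_ns != new_ns:
            changes.append((curr["date"], "name_servers", old_ns, new_ns))
    return changes


# Placeholder records, deliberately out of date order.
history = [
    {"date": "2016-03-01", "registrar": "Registrar B",
     "name_servers": ["ns1.custom-dns.example", "ns2.custom-dns.example"]},
    {"date": "2015-01-10", "registrar": "Registrar A",
     "name_servers": ["NS1.HOST.EXAMPLE", "NS2.HOST.EXAMPLE"]},
    {"date": "2015-06-20", "registrar": "Registrar A",
     "name_servers": ["ns1.host.example", "ns2.host.example"]},
]

for change in whois_changes(history):
    print(change)
```

A move to custom name servers, as in the last placeholder record, is the kind of change worth carrying into the next compartment.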

 

Website Assets

A website is more than a domain and an IP address. It relies on name servers to be found and mail servers to send correspondence. Pages are rendered with the assistance of images, JavaScript, stylesheets and other assets. These assets can be hosted on other services. These services and the contents of these assets sometimes contain clues like an author’s screen name.

This section identifies assets used to operate websites. Observations in this section often connect to social profiles and screen names that are examined later.
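
To make the idea of externally hosted assets concrete, here is a minimal sketch that lists the outside hosts a page pulls scripts, images and linked files from. It uses only Python's standard library; the `AssetCollector` class, the `external_hosts` helper and the sample HTML are hypothetical and not part of the tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class AssetCollector(HTMLParser):
    """Collect URLs of scripts, images and <link> files referenced by a page."""

    # tag -> attribute that holds the asset URL
    ASSET_ATTRS = {"script": "src", "img": "src", "link": "href"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ASSET_ATTRS.get(tag)
        if wanted is None:
            return
        value = dict(attrs).get(wanted)
        if value:
            # Resolve relative paths against the page address.
            self.assets.append(urljoin(self.page_url, value))


def external_hosts(page_url, html):
    """Return the sorted asset hosts that differ from the page's own host."""
    collector = AssetCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    hosts = {urlparse(url).hostname for url in collector.assets}
    return sorted(h for h in hosts if h and h != own_host)


sample = """
<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="//cdn.example.net/widgets/share.js"></script>
</head><body>
<img src="https://images.example.org/u/exampleuser/logo.png">
</body></html>
"""
print(external_hosts("http://www.example.com/", sample))
```

In the sample, the image path also carries a screen name (`exampleuser`), which is the sort of clue to note for the Social Media compartment.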

Website Assets searches include, but are not limited to:

Custom name servers: Domain registration investigations
Reverse name servers: Other websites using these custom servers
Name server history: Custom name servers used by these websites at an earlier date
SOA (Start of Authority): Email address of the person managing the name servers
DNS propagation: The IP address for a large website can change by region
MX / Reverse MX: Other websites managing or sharing email with a domain
HTML: Unique codes, e.g. image hosting, scripts, analytics, ads, widgets
Robots.txt: Files and folders the operator does not want search engines to crawl
Sitemap.xml: Pages, publishing dates, author screen names
SSL Certificates: Other / alternative domains and websites sharing an SSL certificate
Subdomains: More potential hosting locations / IP addresses to investigate
User Profiles: Blogger/WordPress author and forum user screen names
Archives: Review contacts and HTML of early website pages via services like Archive.org and Oldweb.today
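
The SOA record stores the administrator's mailbox in domain-name form, so it has to be converted before it reads as an email address. The sketch below shows that conversion; the function name is hypothetical, and the input is assumed to be the RNAME field exactly as a DNS lookup tool displays it.

```python
def soa_rname_to_email(rname: str) -> str:
    """Convert an SOA RNAME (mailbox written as a domain name) to an email.

    The first unescaped dot separates the local part from the mail domain;
    a dot inside the local part is written as a backslash-escaped dot.
    """
    name = rname.rstrip(".")  # drop the trailing root dot, if present
    local = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "\\" and i + 1 < len(name):
            local.append(name[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == ".":
            return "".join(local) + "@" + name[i + 1:]
        else:
            local.append(ch)
            i += 1
    raise ValueError(f"No mail domain found in RNAME: {rname!r}")


print(soa_rname_to_email("hostmaster.example.com."))
print(soa_rname_to_email(r"john\.doe.example.com."))
```

An address recovered this way can then be run through the email searches in the Social Media and Operator ID compartments.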

 

Related Websites

Entrepreneurs often operate more than one website. These websites can host advertisements, media, and JavaScript used by the target website. They can house similar operations, but sometimes the content and purpose of these websites are personal to the operator. These related websites provide additional investigative opportunities, from domain registrations and servers to HTML scripts.

HTML code is commonly used to connect websites. These codes include visitor statistics accounts like Google Analytics, advertising publisher accounts like AdSense, as well as the publisher codes found on social media sharing widgets. Many operators avoid or discontinue using these accounts for this reason, though the codes are sometimes found in earlier versions of their websites through Archive.org.
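
A rough sketch of this technique follows: pull account codes out of each site's HTML, then keep only the codes that appear on more than one site. The two patterns are assumptions about common formats (the older `UA-` style Google Analytics ID and the `ca-pub-` AdSense publisher ID); other formats exist and should be checked against the pages under review. The function names and sample pages are hypothetical.

```python
import re
from collections import defaultdict

# Assumed identifier formats; confirm them against the pages under review.
ID_PATTERNS = {
    "analytics": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    "adsense": re.compile(r"\bca-pub-\d{16}\b"),
}


def extract_ids(html):
    """Return the set of (kind, identifier) pairs found in a page's HTML."""
    found = set()
    for kind, pattern in ID_PATTERNS.items():
        for match in pattern.findall(html):
            found.add((kind, match))
    return found


def shared_ids(pages):
    """Map each identifier to the sites using it, keeping only shared ones.

    `pages` maps a site name to its HTML, whether live or archived.
    """
    sites_by_id = defaultdict(set)
    for site, html in pages.items():
        for identifier in extract_ids(html):
            sites_by_id[identifier].add(site)
    return {k: sorted(v) for k, v in sites_by_id.items() if len(v) > 1}


# Placeholder pages with invented identifiers.
pages = {
    "site-a.example": '<script>ga("create", "UA-1234567-1");</script>'
                      '<ins data-ad-client="ca-pub-1234567890123456"></ins>',
    "site-b.example": '<ins data-ad-client="ca-pub-1234567890123456"></ins>',
    "site-c.example": '<script>ga("create", "UA-7654321-1");</script>',
}
print(shared_ids(pages))
```

The same functions can be fed archived copies of a page, which matters when an operator has removed the codes from the live site.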

Related Website searches include, but are not limited to:

Reverse domain search: Shared domain registrant email, phone or address
HTML Reverse search: Shared HTML codes like analytics, some with screen names
Similar web designs: Shared style sheets, scripts and custom templates
Shared content: Unique text or images shared by websites
Nearby websites: Websites or files hosted on nearby IP address(es) that could also be operated by the same person
Backlinks & referral links: Linked assets, partner websites, ad networks, widgets

 

Social Media

Screen names and registrant contacts are often tied to social profiles. Profiles can also be found when these connections are not apparent. This section probes for posts and profiles that bridge connections between websites and potential operators.

Social Media searches include, but are not limited to:

Email search: Accounts listed in social profiles of Facebook & other social networks
Screen names: Networks, blogs, forum users, image hosting and more
Name search: Social network search filtered by city or other suspected details
Website search: Hashtags, Likers, First followers, Profiles posting keywords, links, etc.

 

Operator ID

Operator ID verification confirms that the investigative findings are grounded: that they’re associated with an identifiable person, business or physical location. An email tied to a Facebook profile might also be found in a person’s credit report. The phone number used to initially register a website might be found in an online resume. This section finds relationships between investigative observations and identities.

Operator ID searches include, but are not limited to:

Email search: Searching for online resumes, credit backgrounds, etc.
Name/Address/Phone: LinkedIn (business / website history), resumes, phone listings, company data, incorporation data, property ownership, building permits, political donations, freelancers, newspapers, classmates, unclaimed property, business listings, etc.
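
The cross-checking described in this section can be sketched as a simple comparison: normalize each email or phone number, then flag the ones that turn up in more than one source. The `(source, kind, value)` layout, the function names and the sample findings are all hypothetical, and the phone handling assumes 10-digit national numbers.

```python
import re
from collections import defaultdict


def normalize(kind, value):
    """Reduce an email or phone number to a comparable form."""
    if kind == "email":
        return value.strip().lower()
    if kind == "phone":
        digits = re.sub(r"\D", "", value)
        return digits[-10:]  # assumption: 10-digit national numbers
    return value.strip()


def corroborated(findings):
    """Return identifiers that appear in more than one source.

    `findings` is a list of (source, kind, value) tuples; the layout is
    hypothetical and stands in for whatever the investigator's notes hold.
    """
    sources = defaultdict(set)
    for source, kind, value in findings:
        sources[(kind, normalize(kind, value))].add(source)
    return {key: sorted(srcs) for key, srcs in sources.items() if len(srcs) > 1}


# Placeholder findings gathered in the earlier compartments.
findings = [
    ("Domain Whois", "email", "Admin@Example.com"),
    ("Facebook profile", "email", "admin@example.com "),
    ("Whois History", "phone", "+1 (555) 010-0199"),
    ("Online resume", "phone", "555-010-0199"),
    ("Forum post", "email", "other@example.org"),
]
for identifier, where in corroborated(findings).items():
    print(identifier, where)
```

A match across sources is a lead to verify, not proof of identity on its own.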

 

Summary

Process organizes how we collect and report data. It can also help troubleshoot the findings. You can develop your process with the assistance of the website investigation tool. It’s a training device intended to inspire problem solving and organization among new hires and junior investigators.

Resource
NetBootCamp Website Investigation Tool