Our measurement and fingerprinting pipeline is separated into the following steps:
- Domain resolution
- Input preparation
- TLS scanning
- Post-processing
Every domain from the input sets needs to be resolved first. For this, we used a local Unbound server and massdns. The following command can generate the appropriate input file for the TLS scanner.
This input file is then joined with the IP addresses we collect from the blocklists. These are added directly because the blocklists do not contain any domain names we could resolve.
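The join described above can be sketched roughly as follows. This is only an illustration, not the pipeline's actual tooling: the function name `join_inputs` and the `ip,domain` line format are assumptions, and the real input format expected by the scanner may differ.

```python
def join_inputs(resolved, blocklist_ips):
    """Merge resolved (ip, domain) pairs with bare blocklist IPs.

    Assumption: one scanner target per line, written as "ip,domain" for
    resolved domains and as "ip" alone for blocklist entries.
    """
    seen = set()
    lines = []
    for ip, domain in resolved:
        # DNS tools often emit fully qualified names with a trailing dot.
        entry = (ip, domain.rstrip("."))
        if entry not in seen:
            seen.add(entry)
            lines.append(f"{entry[0]},{entry[1]}")
    for ip in blocklist_ips:
        # Blocklist entries carry no domain name.
        entry = (ip, "")
        if entry not in seen:
            seen.add(entry)
            lines.append(ip)
    return lines
```

Duplicates are dropped while the original order is kept, so the same target is not scanned twice when it appears in several input sets.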
To scan the targets, we additionally need to provide the Client Hellos to send. The final input for the scanner is the cross product of the input from the previous step and a set of Client Hello names. We use the goscanner to calculate this cross product. Assume the directory client-hellos contains the Client Hellos for the scan. These could be, e.g., the general-purpose Client Hellos designed for the paper (downloadable here).
We shuffle the input to distribute the load among the scanned servers.
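To make the cross product and the shuffling concrete, here is a minimal sketch in Python. In the pipeline itself this step is done by the goscanner; the function name `build_scan_input`, the comma-separated line format, and the `seed` parameter are assumptions made only for this illustration.

```python
import itertools
import random


def build_scan_input(targets, client_hello_names, seed=None):
    """Pair every target with every Client Hello name, then shuffle.

    Assumption: each output line is "<target>,<client hello name>".
    """
    combos = [
        f"{target},{name}"
        for target, name in itertools.product(targets, client_hello_names)
    ]
    # Shuffling spreads the requests to one server across the whole scan
    # instead of sending them back to back.
    random.Random(seed).shuffle(combos)
    return combos
```

With `n` targets and `m` Client Hellos the result has `n * m` lines. Passing a fixed `seed` makes the order reproducible, which can help when a scan needs to be repeated.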
To scan the targets, we run the TLS scanner with the following configuration:
The raw output of the scanner does not contain the final fingerprints of the servers, because each fingerprint is a combination of the results from multiple requests. To generate the fingerprints for each target, we added this functionality to the goscanner.
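The idea of combining several per-request results into one fingerprint per target can be illustrated as below. This is a sketch under stated assumptions and does not reproduce the goscanner's actual output or fingerprint format: the row layout `(target, client_hello_name, partial_result)`, the `|` separator, and the function name `combine_fingerprints` are all hypothetical.

```python
from collections import defaultdict


def combine_fingerprints(rows, client_hello_order):
    """Build one fingerprint per target from per-request results.

    Assumptions: each row is (target, client_hello_name, partial_result),
    and a fingerprint is the partial results joined by "|" in a fixed
    Client Hello order. A missing response yields an empty field.
    """
    per_target = defaultdict(dict)
    for target, client_hello, partial in rows:
        per_target[target][client_hello] = partial
    return {
        target: "|".join(results.get(name, "") for name in client_hello_order)
        for target, results in per_target.items()
    }
```

Using a fixed Client Hello order keeps fingerprints comparable across targets, even though the scan itself runs in shuffled order.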