David Fifield, Jiang Jian, Paul Pearce
Code and data download:
git clone https://www.bamsoftware.com/git/sniproxy.git
An investigation into SNI proxies, special servers that forward TLS streams to the host specified in the Server Name Indication field (SNI) of the TLS handshake. SNI proxies could be useful for Internet censorship circumvention or censorship measurement. We manually investigated and characterized a small number of SNI proxies that had been publicly reported, and did a targeted scan of the IPv4 space to find how many there are. Our scan of port 443 found 2,500 SNI proxies on the Internet.
An SNI proxy is a TLS server that proxies traffic to any destination given in the Server Name Indication field (SNI). An SNI proxy receives connections on port 443, peeks at the SNI, then forwards the connection to the host given in the SNI. It's not a MITM; the proxy forwards the client's ClientHello and all other traffic unmodified, and the client has a true end-to-end TLS connection to the server, through the SNI proxy. An SNI proxy is like an open proxy, with the restriction that it only works for port 443 and only for TLS.
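The "peeks at the SNI" step can be sketched as follows. This is a minimal illustration in Python, not code taken from any of the proxies discussed here. The function name `extract_sni` is hypothetical, and the sketch assumes the whole ClientHello arrives in a single TLS record.

```python
def extract_sni(data: bytes):
    """Return the SNI host name from a TLS ClientHello, or None if absent."""
    # TLS record header: content type (1 byte, 22 = handshake),
    # protocol version (2 bytes), length (2 bytes).
    if len(data) < 5 or data[0] != 22:
        return None
    record = data[5:5 + int.from_bytes(data[3:5], "big")]
    # Handshake header: message type (1 byte, 1 = ClientHello), length (3 bytes).
    if len(record) < 4 or record[0] != 1:
        return None
    body = record[4:4 + int.from_bytes(record[1:4], "big")]
    try:
        pos = 2 + 32                                         # version + random
        pos += 1 + body[pos]                                 # session_id
        pos += 2 + int.from_bytes(body[pos:pos + 2], "big")  # cipher_suites
        pos += 1 + body[pos]                                 # compression_methods
        end = pos + 2 + int.from_bytes(body[pos:pos + 2], "big")
        pos += 2
        # Walk the extension list looking for server_name (type 0).
        while pos + 4 <= min(end, len(body)):
            ext_type = int.from_bytes(body[pos:pos + 2], "big")
            ext_len = int.from_bytes(body[pos + 2:pos + 4], "big")
            ext = body[pos + 4:pos + 4 + ext_len]
            pos += 4 + ext_len
            if ext_type == 0:
                # server_name: list length (2), name type (1, 0 = host_name),
                # name length (2), then the name itself.
                if len(ext) < 5 or ext[2] != 0:
                    return None
                name_len = int.from_bytes(ext[3:5], "big")
                return ext[5:5 + name_len].decode("ascii")
    except (IndexError, UnicodeDecodeError):
        return None
    return None
```

A proxy built this way would read the first record, call `extract_sni` on it, open a TCP connection to the returned name on port 443, replay the bytes it already read, and then copy traffic in both directions without touching it.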
We first heard of SNI proxies from people in China who were using the proxies to circumvent Internet censorship. (See for example https://github.com/phuslu/goproxy/issues/853.) We speculated that the SNI proxies might be related to ad injection by ISPs on port 80. The ISP falsifies DNS responses to point to its own ad-injection proxy server, which injects ads into HTTP on port 80. But the DNS-based redirection cannot redirect plain HTTP without also redirecting HTTPS. The proxy cannot tamper with the HTTPS, but it also must not cause HTTPS to break; therefore it just proxies the HTTPS unmodified.
SNI proxying is a standard feature of various firewalls and proxies. HAProxy has it:
http://blog.haproxy.com/2012/04/13/enhanced-ssl-load-balancing-with-server-name-indication-sni-tls-extension/
https://trick77.com/haproxy-and-sni-based-ssl-offloading-with-intermediate-ca/
There are a few projects on GitHub doing it:
https://github.com/dlundquist/sniproxy
https://github.com/gpjt/stupid-proxy
https://github.com/yrutschle/sslh
We found some servers in the wild that do it, with brand names like Sophos and Blue Coat.
The obvious application of SNI proxies is the circumvention of Internet censorship, against a censor that blocks IP addresses and DNS requests but does not examine the SNI field of TLS connections. You can access a blocked HTTPS server by connecting to it indirectly through an SNI proxy located beyond the censor's control. It doesn't even require any special client-side software, just modification of the hosts file. For example, if https://example.com/ is blocked by the censor, and 192.0.2.100 is an SNI proxy, add this line to the hosts file:
192.0.2.100 example.com
You could use SNI proxies to measure censorship. An SNI proxy inside the network controlled by the censor serves as a vantage point to test the reachability of destinations outside the censor's network (port 443 only).
It might be possible to DoS an SNI proxy by asking it to connect to 127.0.0.1, or explore an internal network by giving it a private IP address. Technically, "literal IPv4 and IPv6 addresses are not permitted" in the SNI field (RFC 4366). Even if the SNI proxy does not support an IP address in SNI, it would be easy to set up a DNS server that gives every IP address a domain name (e.g. 127.0.0.1.example.com → 127.0.0.1). There are already some public names that map to internal addresses, like fuf.me → 127.0.0.1.
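The name-to-address trick described above can be illustrated with a short sketch. Both function names are hypothetical, and `example.com` stands in for whatever zone such a DNS server would be authoritative for. The second function shows the kind of check that would reveal whether a resolved address points at a loopback or internal network.

```python
import ipaddress

def embedded_ipv4(name: str, zone: str):
    """Return the IPv4 address spelled out in the leading labels of name.

    For example, "127.0.0.1.example.com" in zone "example.com" maps to
    127.0.0.1. Returns None if name is not of that form.
    """
    suffix = "." + zone.strip(".").lower()
    name = name.rstrip(".").lower()
    if not name.endswith(suffix):
        return None
    try:
        return ipaddress.IPv4Address(name[:-len(suffix)])
    except ipaddress.AddressValueError:
        return None

def is_internal(addr) -> bool:
    """True if addr is a loopback, private, or link-local address."""
    ip = ipaddress.ip_address(str(addr))
    return ip.is_loopback or ip.is_private or ip.is_link_local
```

For instance, `is_internal(embedded_ipv4("127.0.0.1.example.com", "example.com"))` evaluates to `True`.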
We started by looking at 26 purported SNI proxy IP addresses from a ticket of GoProxy, a circumvention system:
https://github.com/phuslu/goproxy/issues/853
We probed them all on ports 80 and 443 (with and without SNI), with the goal of characterizing in detail some specific SNI proxy implementations. Of the 26 IP addresses, 10 were no longer running, and among the remaining ones we found three distinct fingerprints.
On port 80 (HTTP), we saw three distinct responses, described in the next subsection. On port 443 (HTTPS), we saw two distinct behaviors, described in the following subsection.
This table summarizes the results of our manual check of the 26 IP addresses from the GoProxy ticket. "—" means the TCP connection failed.
host | HTTP | HTTPS | HTTPS+SNI | reverse DNS |
---|---|---|---|---|
45.127.92.217 | — | — | — | |
103.15.187.54 | dns.auth.fail | alert | proxies | 103-15-187-54.op-net.com |
110.4.12.173 | — | — | — | |
110.4.12.175 | — | — | — | |
110.4.12.176 | — | — | — | |
110.4.12.178 | — | — | — | |
110.4.24.170 | ChinaCache | silence | proxies | |
110.4.24.175 | ChinaCache | silence | proxies | |
110.4.24.176 | HAProxy | silence | proxies | |
110.4.24.178 | HAProxy | silence | proxies | |
182.239.95.136 | — | — | — | 182.239.95.136.hk.chinamobile.com |
182.239.95.137 | — | — | — | 182.239.95.137.hk.chinamobile.com |
182.239.127.136 | — | — | — | 182.239.127.136.hk.chinamobile.com |
182.239.127.137 | — | — | — | 182.239.127.137.hk.chinamobile.com |
203.78.36.234 | — | — | — | m203-78-36-234.smartone.com |
218.254.1.13 | ChinaCache | silence | proxies | cm218-254-1-13.hkcable.com.hk |
218.254.1.15 | ChinaCache | silence | proxies | cm218-254-1-15.hkcable.com.hk |
219.76.4.3 | HAProxy | silence | proxies | tswc5b003.netvigator.com |
219.76.4.4 | HAProxy | silence | proxies | tswc5b004.netvigator.com |
219.76.4.9 | ChinaCache | silence | proxies | tswc5b009.netvigator.com |
219.76.4.11 | ChinaCache | silence | proxies | tswc5b011.netvigator.com |
219.76.4.14 | ChinaCache | silence | proxies | tswc5b014.netvigator.com |
219.76.4.69 | HAProxy | silence | proxies | tswc5b069.netvigator.com |
219.76.4.70 | HAProxy | silence | proxies | tswc5b070.netvigator.com |
219.76.4.75 | ChinaCache | silence | proxies | tswc5b075.netvigator.com |
219.76.4.76 | ChinaCache | silence | proxies | tswc5b076.netvigator.com |
On port 80, we found 3 distinct HTTP fingerprints, not counting no-responses:
count | HTTP fingerprint |
---|---|
9 | ChinaCache |
6 | HAProxy |
1 | dns.auth.fail |
10 | no response |
HTTP/1.0 404 Not Found
Server: FC
Date: Fri, 16 Sep 2016 17:53:42 GMT
Content-Type: text/html
Content-Length: 432
Powered-By-ChinaCache: MISS from PCW-HK-3-3X5.8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=gb2312">
<TITLE>错误:您所请求的网址(URL)无法获取</TITLE>
<STYLE type="text/css"><!--BODY{background-color:#ffffff;font-family:verdana,sans-serif}PRE{font-family:sans-serif}--></STYLE>
</HEAD><BODY>
<H1>错误</H1>
<H2>您所请求的网址(URL)无法获取</H2>
The Chinese text, originally encoded in gb2312, means "Error: The requested URL (URL) could not be retrieved." The Powered-By-ChinaCache header varies, apparently reflecting the server name.
HTTP/1.0 403 Forbidden
Cache-Control: no-cache
Connection: close
Content-Type: text/html

<html><body><h1>403 Forbidden</h1>
Request forbidden by administrative rules.
</body></html>
Nmap identifies this response as HAProxy.
HTTP/1.0 302 Found
Location: http://dns.auth.fail/
Content-Length: 0
Connection: close
Date: Fri, 16 Sep 2016 17:53:57 GMT
Server: lighttpd/1.4.35
This is a weird one, covered in a separate document.
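The three port-80 responses above differ in ways that are easy to test mechanically. The following is a rough sketch of such a classifier. It is not the matching logic of any tool in the repository; the function name `classify_http` and the choice of distinguishing features (the Powered-By-ChinaCache header, the Location value, and the 403 body text) are assumptions drawn only from the responses shown here.

```python
def classify_http(response: bytes):
    """Return "ChinaCache", "HAProxy", "dns.auth.fail", or None."""
    head, _, body = response.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    status = lines[0]
    # Collect header names (lowercased) and values.
    headers = {}
    for line in lines[1:]:
        key, sep, value = line.partition(b":")
        if sep:
            headers[key.strip().lower()] = value.strip()
    if b"powered-by-chinacache" in headers:
        return "ChinaCache"
    if headers.get(b"location") == b"http://dns.auth.fail/":
        return "dns.auth.fail"
    if b" 403 " in status and b"Request forbidden by administrative rules." in body:
        return "HAProxy"
    return None
```

As the later comparison against a larger data set shows, matching one of these fingerprints does not by itself mean that a host is an SNI proxy.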
On port 443, we found 2 distinct fingerprints, not counting no-responses. The 10 no-responses are the same 10 as for HTTP.
count | HTTPS fingerprint |
---|---|
15 | with SNI, proxies; without SNI, sends nothing |
1 | with SNI, proxies; without SNI, sends TLS "handshake failure" alert |
10 | no response |
When provided with an SNI, all the responsive servers were indistinguishable because they just proxied the connection. Without an SNI, most of the servers simply sent nothing in return; the only exception was the single "dns.auth.fail" server, which sent a TLS alert. It makes sense that an SNI proxy would not be able to do anything useful when not supplied with an SNI.
This is what we did to connect without SNI:
timeout 10 openssl s_client -ign_eof -connect $host:443
And with SNI:
timeout 10 openssl s_client -ign_eof -connect $host:443 -servername www.example.com
To check for proxying, we looked for the proper server certificate:
subject=/C=US/ST=California/L=Los Angeles/O=Internet Corporation for Assigned Names and Numbers/OU=Technology/CN=www.example.org
issuer=/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA
Providing SNI causes the server to proxy to the desired destination.
Omitting SNI causes the server to send nothing after receiving the ClientHello, and terminate the connection after a few seconds.
This fingerprint was only seen on one host, the dns.auth.fail one.
Providing SNI causes the server to proxy to the desired destination.
Omitting SNI causes the server to send a TLS "handshake failure" alert, viz.:
15 03 01 00 02 02 28
Incidentally, you get the same alert if you send a plaintext HTTP request to port 443.
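The seven alert bytes can be decoded by hand: a one-byte record type, a two-byte version, a two-byte length, then the alert level and description. The sketch below does the same in Python. The function name `parse_tls_alert` is hypothetical, and the description table lists only a handful of the alert codes defined by the TLS specifications.

```python
ALERT_LEVELS = {1: "warning", 2: "fatal"}
# A small subset of the alert descriptions defined for TLS.
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    40: "handshake_failure",
    70: "protocol_version",
    112: "unrecognized_name",
}

def parse_tls_alert(record: bytes) -> dict:
    """Decode an unencrypted 7-byte TLS alert record."""
    # Content type 21 is "alert"; an alert payload is always 2 bytes long.
    if len(record) != 7 or record[0] != 21 or record[3:5] != b"\x00\x02":
        raise ValueError("not a plaintext TLS alert record")
    level, description = record[5], record[6]
    return {
        "version": (record[1], record[2]),
        "level": ALERT_LEVELS.get(level, level),
        "description": ALERT_DESCRIPTIONS.get(description, description),
    }

alert = parse_tls_alert(bytes.fromhex("15 03 01 00 02 02 28"))
```

Here `alert` is a fatal `handshake_failure` alert with record version (3, 1), which corresponds to TLS 1.0.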
We did a scan of the IPv4 space on port 443 and found 2,500 SNI proxies. We used ZMap and a custom program, scan-sniproxy. scan-sniproxy simply connects to an IP address, does a TLS handshake with a given SNI, does certificate validation, and saves a hash of the leaf certificate (as proof that the IP address actually proxied to the destination given in the SNI).
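The per-address check that scan-sniproxy performs can be approximated in a few lines of Python. This is only a sketch of the steps just listed, not the scan-sniproxy program itself. The function names are hypothetical, and the use of SHA-256 is an assumption, since the text only says "a hash".

```python
import hashlib
import socket
import ssl

def cert_fingerprint(der: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate (hash choice assumed)."""
    return hashlib.sha256(der).hexdigest()

def probe(ip: str, sni: str, port: int = 443, timeout: float = 10.0):
    """Connect to ip, handshake with the given SNI, and validate the certificate.

    Returns the leaf certificate fingerprint if the handshake succeeds and the
    certificate is valid for sni, otherwise None.
    """
    ctx = ssl.create_default_context()  # verifies the chain and the host name
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=sni) as tls:
                return cert_fingerprint(tls.getpeercert(binary_form=True))
    except OSError:  # includes timeouts, refusals, and ssl.SSLError
        return None
```

If an arbitrary IP address returns a certificate that validates for the name in the SNI, the address must have relayed the handshake to the real server, because it cannot hold the server's private key.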
For this experiment, we set up a dedicated HTTPS server at the domain sni-scan-for-research-study.bamsoftware.com. The experiment doesn't require a dedicated server (you could use any HTTPS server), but we wanted to be able to host a web page with contact information in case anyone objected to the scan. (A copy of the web page.) We also wanted to capture and observe the traffic arriving at the server specified in the SNI. If you want to look at the traffic information, see 443.pcap.xz, access.log, error.log, and ssl.log in the scan-sniproxy-20161024.0 subdirectory of the source code.
This is the command line we used:
zmap -p 443 --output-fields=* -r 150000 -u zmap_updates.log -l zmap_log.log -v5 -q | \
ztee -u ztee_updates.csv zmap_output.csv | \
scan-sniproxy -input /dev/stdin -maxprocs 4 -maxthreads 10000 -timeout 10s > scan-sniproxy.csv
The scan found 2,500 SNI proxies. You can download a CSV file: scan-sniproxy-20161024.0.proxiesonly.csv. Here they are along with reverse DNS:
There are concentrations of SNI proxies in certain networks:
40.6% | are *.rdns.cloudradium.com |
6.4% | are 101.95.144.68-71, 101.96.9.138-165, 101.96.10.19-74, 101.96.11.19-74, 101.110.119.19-75 |
5.2% | are *.protectedgroup.com (23.19.41.90-94, 23.19.93.2-126) |
5.0% | are 213.184.119.2-126 |
Clearly, cloudradium.com is a heavy hitter when it comes to SNI proxies. We did not find out much information about cloudradium.com, whatever it may be. Anthony Joseph found for us a fraud report for "CloudRadium LLC" from 2014 complaining about the accumulation of IP addresses:
Fri Oct 10 01:33:25 EDT 2014
Fraud Report on one of your current member
Some basic info about the company who has deceived its IPv4 application.
Name of the Comnapy: CloudRadium LLC.
Registry address: 1603 Capitol Ave St3 310, Cheyenne, WY.
Owner’s name: Li Xuan (李轩) & Deng Xiu Ping (邓秀平)
As the IPv4 has been depleted in the AP regions, some so-called clever guys from China registered a LLC named CloudRadium LLC in WY in 2012-10-01, meanwhile they registered a domain named http://www.cloudradium.com on 2012-09-26. From then, they’v started their journey of deceiving the Cherish IPv4 resources which should belong to the Arin Internet Community from ARIN. They have carefully studied and researched all Arin’s Policy on IPv4 management and has abused the policy to apply for more IPv4 steps by steps by using the fraud information. See details below.
...
5.In a word, their main purpose is to have as many IPv4 as they can. Their company is not founded for the purpose of doing business in USA, but for accumulating IPv4 addresses
During the scan, we expected to see TLS connections to port 443 on the dedicated web server at sni-scan-for-research-study.bamsoftware.com. Since there were 2,500 SNI proxies, we should have gotten 2,500 connections from them alone, ignoring Internet background traffic. In fact, we received 6,627 connections (6,474 with the correct SNI), and 3,696 came from a single IP address, 192.107.156.196.
Here are the 10 most common source IP addresses taken from ssl.log during the 8 hours of the scan:
connections | source IP | reverse DNS |
---|---|---|
8 | 211.138.60.14 | |
9 | 219.76.4.73 | tswc5b073.netvigator.com |
12 | 207.249.174.253 | na-207-249-174-253.static.avantel.net.mx |
12 | 208.80.194.26 | static-208-80-194-26.as13448.com (Websense) |
13 | 110.4.24.175 | |
13 | 219.76.4.9 | tswc5b009.netvigator.com |
16 | 120.198.243.35 | |
16 | 65.202.124.7 | |
113 | 66.133.109.36 | outbound1.letsencrypt.org |
3696 | 192.107.156.196 | |
66.133.109.36 (outbound1.letsencrypt.org) is easy to explain. sni-scan-for-research-study.bamsoftware.com had a certificate from Let's Encrypt and these were connections related to the ACME protocol. 208.80.194.26 (static-208-80-194-26.as13448.com) is interesting: AS13448 belongs to Websense, a web filtering company. They must do centralized, automatic scans of SNIs seen by their firewall installations.
The clear outlier is 192.107.156.196. During the 8 hours of the scan, this IP address made TLS connections with the correct SNI to our dedicated HTTPS server at a roughly constant rate. It stopped making connections after our scan stopped. Our scan had not touched anything in 192.107.156.0/24 when the first connections from 192.107.156.196 began to arrive.
To be clear: we measured these connections at the dedicated web server running at sni-scan-for-research-study.bamsoftware.com, the host specified in our SNI, not at the host running ZMap. In other words, it was not a matter of a host back-scanning the source of a detected scan. Instead, while we were scanning random HTTPS servers, an unrelated host—192.107.156.196—was scanning the same host that we specified in our SNI.
One might suppose that 192.107.156.196 is merely a shared exit point for many other SNI proxies. But this can't be the case. For one thing, 192.107.156.196 made more connections to the HTTPS server than we made in total. And for another, the TLS client handshake differed from the ones that our scanner program produced—for example, it used session resumption. If it were merely tunneling our own connections, it would not have been able to modify the TLS handshake and still have validation succeed. Instead, this host was independently sending its own TLS probes while we were doing our scan.
whois says the IP address belongs to Harris Corporation.
NetRange: 192.107.156.0 - 192.107.156.255
CIDR: 192.107.156.0/24
NetName: HARRISNET5
NetHandle: NET-192-107-156-0-1
Parent: NET192 (NET-192-0-0-0-0)
NetType: Direct Assignment
OriginAS:
Organization: Harris Corporation (HARRIS-6)
RegDate: 1991-05-20
Updated: 2015-05-26
Ref: https://whois.arin.net/rest/net/NET-192-107-156-0-1

OrgName: Harris Corporation
OrgId: HARRIS-6
Address: Mail Stop 57
Address: 1024 West Nasa Blvd
City: Melbourne
StateProv: FL
PostalCode: 32919
Country: US
RegDate: 1989-05-17
Updated: 2011-09-24
Ref: https://whois.arin.net/rest/org/HARRIS-6
We looked for the three bootstrap HTTP fingerprints in the Project Sonar HTTP data set dated 20160830. "Project Sonar includes a regular HTTP GET request for all IPv4 hosts with an open 80/TCP." The idea was to use an existing scan of port 80 rather than do our own. The input contains 61,683,005 hosts.
The grepsonar subdirectory in the sniproxy.git repo contains a program that searches the sonar.http file for the 3 HTTP fingerprints found in the GoProxy list. It found:
hosts | HTTP fingerprint |
---|---|
514 | ChinaCache |
2511 | HAProxy |
2460 | dns.auth.fail |
Of these only a small number are actually SNI proxies, despite having the same fingerprint:
HTTP fingerprint | SNI proxies | hosts with fingerprint |
---|---|---|
ChinaCache | 22 | 514 |
HAProxy | 37 | 2511 |
dns.auth.fail | 37 (fell to 30 a week later) | 2460 |
ChinaCache SNI proxies (22):
110.4.24.170 110.4.24.175 111.11.184.10 111.11.184.117 111.11.184.119 111.12.251.162 111.12.251.206 116.246.6.39 116.246.6.46 117.156.13.7 120.198.243.2 183.207.229.136 203.210.6.39 211.138.60.14 218.254.1.13 218.254.1.15 219.76.4.9 219.76.4.11 219.76.4.14 219.76.4.73 219.76.4.75 219.76.4.76
HAProxy SNI proxies (37):
101.95.144.68 101.95.144.69 101.95.144.70 101.95.144.71 110.4.24.176 110.4.24.178 111.11.153.19 111.11.153.36 111.11.153.37 111.11.153.38 111.11.153.40 111.11.153.42 111.11.153.43 111.11.184.5 111.11.184.7 111.11.184.9 111.11.184.13 111.11.184.116 111.12.251.172 111.12.251.173 111.12.251.175 116.246.6.35 116.246.6.36 116.246.6.38 117.156.13.5 211.138.60.6 211.138.60.7 211.138.60.8 211.138.60.9 211.138.60.10 211.138.60.11 211.138.60.18 211.138.60.19 219.76.4.3 219.76.4.4 219.76.4.69 219.76.4.70
dns.auth.fail SNI proxies (37); * denotes hosts that disappeared in the second scan:
5.187.20.130 31.220.5.61 37.235.49.150 46.37.190.246 46.108.39.132 69.12.80.155* 69.162.73.203 82.103.128.85 83.170.70.124 83.170.70.125 91.210.104.52 95.154.233.196 103.15.187.54 103.19.16.13 104.250.97.19 104.250.98.55 104.250.98.56 104.250.98.67* 104.250.98.168 104.250.98.169 109.123.113.195 113.20.28.16 158.255.211.127 172.97.82.35 173.44.17.3 185.18.79.87 185.57.116.51 185.70.11.133* 185.70.11.135* 185.70.11.137* 185.70.11.139* 185.125.168.110 193.182.144.202 194.54.80.146 202.155.223.171 204.152.217.140* 211.22.145.212
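As a worked check on how weakly the port-80 fingerprints predict SNI proxying, the counts reported above can be turned into hit rates. The variable and function names in this sketch are hypothetical; the numbers are the ones given in the text.

```python
# (SNI proxies found, hosts matching the HTTP fingerprint) from the Sonar search
matches = {
    "ChinaCache": (22, 514),
    "HAProxy": (37, 2511),
    "dns.auth.fail": (37, 2460),
}

def hit_rate(proxies: int, hosts: int) -> float:
    """Percentage of fingerprint-matching hosts that are SNI proxies."""
    return 100.0 * proxies / hosts

for name, (proxies, hosts) in matches.items():
    print(f"{name}: {proxies}/{hosts} = {hit_rate(proxies, hosts):.1f}%")
```

The rates come to roughly 4.3% for ChinaCache and 1.5% for each of HAProxy and dns.auth.fail, so the HTTP fingerprint alone is a poor indicator.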
We scanned all the hosts in the Project Sonar HTTPS data set dated 20160906. "Project Sonar includes an HTTPS GET request for all IPv4 hosts with an open 443/TCP." The idea was to use an existing scan of port 443 rather than do our own. In retrospect, this was a bad idea, because the Project Sonar scan omits servers that don't return a certificate when contacted without SNI—none of the servers in the GoProxy list returned a certificate in this case. Our scan of these hosts turned up only a few oddball SNI proxies. The input contains 32,327,703 hosts.
First we extracted the IP addresses from the sonar.https-20160906 data set:
cd sonar-ips
./sonar-ips ../20160906-http.gz > ips-20160906-https.txt
Then we scanned all of them for SNI proxy support:
cd scan-sniproxy
./scan-sniproxy -input ../sonar-ips/ips-20160906-https.txt > scan-sniproxy-full-20160906-https.csv
The scan took about 8 hours. The output CSV file is 5.9 GB, but there are only 38 SNI proxies:
grep -a ,T, scan-sniproxy-full-20160906-https.csv > scan-sniproxy-20160906-https.csv
IP address | reverse DNS |
---|---|
1.32.63.161 | |
20.138.2.35 | |
40.129.146.22 | h22.146.129.40.static.ip.windstream.net |
65.202.124.19 | intranet.mcwane.com |
65.202.124.20 | francais.clowcanada.com |
65.202.124.23 | helpdesk.mcwane.com |
65.202.124.28 | office.mcwane.com |
65.202.124.47 | autodiscover.mantank.com |
65.202.124.48 | |
65.202.124.55 | |
65.202.124.56 | ci.synapse-wireless.com |
65.202.124.60 | |
65.202.124.85 | aca.mcwane.com |
65.202.124.103 | apps.mcwane.com |
65.202.124.231 | |
68.232.35.34 | |
68.232.44.115 | |
72.22.21.20 | |
72.22.21.30 | prostream.net |
72.22.21.33 | |
72.22.21.49 | |
72.22.21.58 | |
78.133.124.150 | |
94.124.157.198 | |
119.18.234.60 | |
129.143.4.2 | wwwproxy.belwue.de |
133.43.191.119 | |
159.12.192.12 | |
159.12.192.13 | |
159.12.192.14 | |
162.243.47.142 | proxy2.threatworking.com |
183.207.227.132 | cache.IDC.js.chinamobile.com |
208.250.32.193 | |
209.195.175.8 | mail.komenpittsburgh.org |
209.195.175.13 | hosted.prostream.net |
212.200.253.210 | |
212.200.253.211 | |
218.57.200.13 | |
We investigated a few of them manually.
https://40.129.146.22/ says:
40.129.146.22 uses an invalid security certificate.
The certificate is not trusted because the issuer certificate is unknown.
The server might not be sending the appropriate intermediate certificates. An additional root certificate may need to be imported.
The certificate is only valid for 10.170.10.116
Clicking through gives:
Could not connect to server
Overview: Could not connect to 40.129.146.22 .
Details: Peer suddenly disconnected found
Options: Pressing the button allows you to go to the previous page. You can try to reload the page or check if the URL is correct.
[Websense Logo]
https://65.202.124.231/ says:
65.202.124.231 uses an invalid security certificate.
The certificate is only valid for citrix.mcwane.net
The certificate expired on 01/30/2013 12:00 PM. The current time is 09/24/2016 04:07 PM.
Clicking through times out after redirecting to http://65.202.124.231/Citrix/XenApp/. https://citrix.mcwane.net/, after many redirects, lands at https://citrix.mcwane.net/Citrix/XenApp/clientDetection/downloadNative.aspx.
Citrix XenApp
Your Windows desktops and apps on demand - from any PC, Mac, smartphone or tablet.
https://72.22.21.30/ raises the expected certificate error:
The certificate is only valid for the following names: www.prostream.net, prostream.net, hosted.prostream.net, mail.prostream.net, send.prostream.net, spam.prostream.net.
Clicking through gives:
Blocked request
Dear ,
This is a message from the IT Department.
The web site you are trying to access:
https://72.22.21.30/
is listed as a site within the category IPAddress
Current Internet Access Configuration for you does not allow visiting sites within this category at this time.
If the website has been erroneously blocked, please submit it for re-evaluation.
You are currently browsing as an unauthenticated user.
Click here to log in
Sophos Firewall
This server allows the SNI to be an IP address. Here it is proxying to a Tor relay on port 443:
$ torsocks -i openssl s_client -ign_eof -connect 72.22.21.30:443 -servername 82.146.47.17
subject=/CN=www.mn7ntdslcv.net
issuer=/CN=www.x4aigwinbg6m.com