These are the visual aids I used to deliver a talk on IPv6 OS fingerprinting on October 16, 2015 at AISec 2015.
For the full paper see: https://www.bamsoftware.com/papers/ipv6-os.pdf.
Audio recording
(Authors are in alphabetical order.)
David Fifield UC Berkeley
Alexandru Geana TU Eindhoven / Fox-IT, Delft
Luis MartinGarcia ETSIT, Polytechnic University of Madrid
Mathias Morbitzer Fox-IT, Delft
J. D. Tygar UC Berkeley
This talk is about the IPv6-based OS fingerprinting engine in Nmap, a widely used network security scanner.

    # nmap -6 -O ipv6.google.com
    Starting Nmap 6.49SVN ( https://nmap.org ) at 2015-09-28 11:36 MDT
    Nmap scan report for ipv6.google.com (2607:f8b0:4009:804::1002)
    Host is up (0.022s latency).
    rDNS record for 2607:f8b0:4009:804::1002: ord08s10-in-x02.1e100.net
    Not shown: 998 filtered ports
    PORT    STATE SERVICE
    80/tcp  open  http
    443/tcp open  https
    Device type: general purpose
    Running: Linux 3.X
    OS CPE: cpe:/o:linux:linux_kernel:3
    OS details: Linux 3.12 - 3.18
    OS detection performed. Please report any incorrect results at https://nmap.org/submit/ .
    Nmap done: 1 IP address (1 host up) scanned in 8.54 seconds

OS fingerprinting is relevant for network inventory, vulnerability scanning, and exploit tailoring.
Design goals, based on extensive experience with IPv4:
If there is no good match, the system displays a raw fingerprint and asks the user to submit it.
We use LIBLINEAR in its L2-regularized logistic regression mode.
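For reference, the objective that LIBLINEAR documents for L2-regularized logistic regression on a two-class problem is reproduced below. The symbols are LIBLINEAR's, not the talk's: $w$ is the weight vector, $(x_i, y_i)$ with $y_i \in \{-1, +1\}$ are the $l$ training samples, and $C > 0$ is the penalty parameter. LIBLINEAR handles more than two classes with a one-vs-rest strategy. The value of $C$ used for Nmap is not stated here.

$$
\min_{w} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \log\left(1 + e^{-y_i w^{T} x_i}\right)
$$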
Different OSes speak different “dialects” of TCP/IP.
Feature | Linux 3.12 | Windows 7 | MFC-9440CN printer |
---|---|---|---|
S1.PLEN= | 40 | 40 | 44 |
S1.HLIM= | 64 | 128 | 64 |
S1.TCP_MSS= | 1440 | 1440 | 1420 |
S1.TCP_WSCALE= | 5 | 8 | 0 |
S1.TCP_WINDOW= | 28560 | 8192 | 8448 |
S2.TCP_WINDOW= | 28560 | 8192 | 8328 |
S3.TCP_WINDOW= | 28560 | 8192 | 8792 |
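
To make the "dialect" idea concrete, here is a minimal sketch that stores the three columns of the table as dictionaries and lists which features tell two systems apart. The values are copied from the table; the dictionary layout and the helper name `differing_features` are illustrative assumptions, not Nmap's internal representation.

```python
# Feature values copied from the table above. The data layout is an
# illustrative assumption, not how Nmap stores fingerprints.
FINGERPRINTS = {
    "Linux 3.12": {
        "S1.PLEN": 40, "S1.HLIM": 64, "S1.TCP_MSS": 1440, "S1.TCP_WSCALE": 5,
        "S1.TCP_WINDOW": 28560, "S2.TCP_WINDOW": 28560, "S3.TCP_WINDOW": 28560,
    },
    "Windows 7": {
        "S1.PLEN": 40, "S1.HLIM": 128, "S1.TCP_MSS": 1440, "S1.TCP_WSCALE": 8,
        "S1.TCP_WINDOW": 8192, "S2.TCP_WINDOW": 8192, "S3.TCP_WINDOW": 8192,
    },
    "MFC-9440CN printer": {
        "S1.PLEN": 44, "S1.HLIM": 64, "S1.TCP_MSS": 1420, "S1.TCP_WSCALE": 0,
        "S1.TCP_WINDOW": 8448, "S2.TCP_WINDOW": 8328, "S3.TCP_WINDOW": 8792,
    },
}


def differing_features(os_a, os_b):
    """Return the sorted feature names whose values differ between two OSes."""
    fa, fb = FINGERPRINTS[os_a], FINGERPRINTS[os_b]
    return sorted(name for name in fa if fa[name] != fb[name])


print(differing_features("Linux 3.12", "Windows 7"))
```

In this small sample, Linux 3.12 and Windows 7 agree on payload length and MSS but differ in hop limit, window scale, and window size, while the printer differs from Linux in every listed feature except hop limit.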
We looked for “SHOULD”s and “MAY”s in IPv6 standards.
Some IPv6 standards documents |
---|
RFC 2460 (IPv6) |
RFC 2463 (ICMP for IPv6) |
RFC 2473 (Generic Packet Tunneling) |
RFC 2675 (Jumbograms) |
RFC 3122 (Inverse Discovery) |
RFC 3775 (Mobility) |
RFC 3971 (Secure Neighbor Discovery) |
RFC 4620 (Node Information Queries) |
RFC 4782 (Quick-Start) |
RFC 4861 (Neighbor Discovery) |
RFC 5570 (CALIPSO) |
We built a test program with 154 candidate OS probes.
Volunteers tested all 154 probes against a “seed” set of OSes.
We selected 18 probes that offer good efficiency: 13 TCP, 4 ICMPv6, 1 UDP.
IPv6 | TCP | ICMPv6 |
---|---|---|
PLEN, TC, HLIM | TCP_ISR, TCP_WINDOW, TCP_FLAG_F, TCP_FLAG_S, TCP_FLAG_R, TCP_FLAG_P, TCP_FLAG_A, TCP_FLAG_U, TCP_FLAG_E, TCP_FLAG_C, TCP_FLAG_RES8, TCP_FLAG_RES9, TCP_FLAG_RES10, TCP_FLAG_RES11, TCP_OPT_0 … TCP_OPT_15, TCP_OPTLEN_0 … TCP_OPTLEN_15, TCP_MSS, TCP_SACKOK, TCP_WSCALE, TCP_CORR_WINDOW_MSS | ICMPV6_TYPE, ICMPV6_CODE |
A major challenge is identifying when the classifier doesn’t have a good answer (e.g. a never-before-seen type of network printer).
Novelty is the distance of a feature vector from the mean of a class, where each dimension is scaled by the inverse of its variance. (One-sample classes have their variance set to a small constant.)
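The novelty measure described above can be sketched as follows. This is a sketch under stated assumptions rather than Nmap's code: the constant `MIN_VARIANCE`, the use of population variance, the square root, and the function names are all choices made here for illustration.

```python
import math

# Hypothetical stand-in for the "small constant"; the real value is not given here.
MIN_VARIANCE = 1e-3


def class_stats(samples):
    """Return per-dimension (means, variances) for one class's feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = []
    for d in range(dims):
        var = sum((s[d] - means[d]) ** 2 for s in samples) / n
        # A one-sample class has zero variance, so it gets the small constant.
        # Flooring every dimension the same way is an assumption made here to
        # avoid dividing by zero when a feature never varies within a class.
        variances.append(max(var, MIN_VARIANCE))
    return means, variances


def novelty(vector, means, variances):
    """Distance from the class mean, each dimension scaled by 1/variance."""
    return math.sqrt(
        sum((x - m) ** 2 / v for x, m, v in zip(vector, means, variances))
    )
```

A caller might compute the novelty of an observed vector against the best-scoring class and, if it exceeds some cutoff, treat the result as "no good match". The cutoff itself is not specified here.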
We rely on user submissions to grow the database.
But IPv6 adoption is still not as high as we would like :(
In the time it took to get 4,700 IPv4 submissions, we got only 97 IPv6 submissions.
Low ratio of training samples to classes. 16% of training samples are the only member of their class; 10% are in a two-sample class.
20–30% of classes are unknown, embedded OSes (identified only by hardware model number).
Network-corrupted and missing features.
Lack of ground truth. Very few training samples compared to IPv4.
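The sample-to-class statistics quoted above (share of samples in one-sample and two-sample classes) are simple to compute from a list of class labels. The sketch below shows one way to do it. The function name and the toy labels are hypothetical and do not come from the Nmap database.

```python
from collections import Counter


def class_size_shares(labels):
    """Return the fractions of samples in one-sample and two-sample classes."""
    sizes = Counter(labels)
    total = len(labels)
    singles = sum(1 for label in labels if sizes[label] == 1)
    pairs = sum(1 for label in labels if sizes[label] == 2)
    return singles / total, pairs / total


# Toy labels for illustration only.
toy_labels = ["a", "b", "b", "c", "c", "c", "d", "e", "e", "f"]
single_share, pair_share = class_size_shares(toy_labels)
print(single_share, pair_share)
```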
10-fold cross-validation on our training set of 290 samples yields an accuracy of 69%.
If we allow near misses (e.g., one Linux 3.x class confused for another Linux 3.x class), accuracy rises to 80%.
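The difference between strict and near-miss accuracy can be sketched as below. The grouping rule `major_version_family`, which treats labels sharing a name and major version number as the same family, is an assumption for illustration. The exact near-miss criterion behind the 80% figure is not given here, and the example pairs are made up.

```python
def major_version_family(label):
    """Hypothetical grouping: 'Linux 3.12' and 'Linux 3.2' both map to 'Linux 3'."""
    name, _, version = label.rpartition(" ")
    return f"{name} {version.split('.')[0]}"


def accuracy(pairs, family=None):
    """Fraction of (true, predicted) pairs that count as correct.

    With `family` given, a prediction counts if it falls in the same family
    as the true class; otherwise the labels must match exactly.
    """
    key = family if family is not None else (lambda label: label)
    hits = sum(1 for true, pred in pairs if key(true) == key(pred))
    return hits / len(pairs)


# Made-up cross-validation outcomes, for illustration only.
results = [
    ("Linux 3.12", "Linux 3.12"),
    ("Linux 3.12", "Linux 3.2"),
    ("Windows 7", "Linux 3.12"),
    ("Windows 7", "Windows 7"),
]
print(accuracy(results))
print(accuracy(results, family=major_version_family))
```

On these toy pairs the second prediction is wrong under strict scoring but counts as a near miss, so the score rises when families are allowed.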
Happy scanning!