The many faces of an IP address
Most tech-savvy people know that 127.0.0.1 is the Internet Protocol (IP) address of localhost, and many tech-savvy people know ::1 is the Internet Protocol version 6 (IPv6) address of localhost.
Are 127.0.0.1 and ::1 the only ways of representing the IP address of localhost?
No, there are infinitely many valid ways to represent it 💥
One IP address, many ways of representing it#
Here are the nine broad ways an IP address can be represented. We will use 127.0.0.1 as the subject of our example. You might want to run some server listening on 127.0.0.1 to confirm my claims.
1. Dotted-decimal notation#
We are most familiar with the dotted-decimal notation, which has the format N.N.N.N, where N can range from 0 to 255.
In this notation, localhost is the familiar 127.0.0.1.
2. 0-optimized dotted-decimal notation#
Written with every segment padded to three digits, 127.0.0.1 is 127.000.000.001. 127.0.0.1 is the zero-suppressed form.
Zero compression is the exclusion of segments whose value is zero.
Using zero compression, the 0-value segments of an IP address can be omitted. This works because parsers that follow the classic inet_aton rules let the last part fill all the remaining bytes. So 127.0.0.1 becomes:
127.1
Try pinging it for proof:
$ ping 127.1
PING 127.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.051 ms
Likewise, 192.168.0.1 becomes 192.168.1.
Because of zero suppression and zero compression, the following is also interpreted as 127.0.0.1:
127.0.00000000000000000000000000000000001
Go ahead, ping it.
Wondering if 000127.0.1 will also resolve to 127.0.0.1?
Nah, 000127 will be read as octal 0127, which equals 87 in decimal. So 000127.0.1 will resolve to 87.0.0.1.
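The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the classic inet_aton-style rules (a leading 0 means octal, a leading 0x means hex, and the last part fills whatever bytes remain); the function names `parse_part` and `parse_ipv4` are made up for this example and are not part of any library.

```python
def parse_part(text: str) -> int:
    """Parse one dot-separated part using C-style base prefixes."""
    if text.lower().startswith("0x"):
        return int(text[2:], 16)          # 0x prefix: hexadecimal
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)               # leading zero: octal
    return int(text, 10)                  # otherwise: decimal


def parse_ipv4(text: str) -> str:
    """Return the dotted-decimal form of a 1- to 4-part IPv4 string."""
    parts = [parse_part(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected 1 to 4 parts")
    *head, tail = parts
    if any(not 0 <= p <= 0xFF for p in head):
        raise ValueError("leading parts must fit in one byte")
    tail_bytes = 4 - len(head)            # the last part fills the rest
    if not 0 <= tail < 1 << (8 * tail_bytes):
        raise ValueError("last part is too large")
    value = 0
    for p in head:
        value = (value << 8) | p
    value = (value << (8 * tail_bytes)) | tail
    return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))


for candidate in ("127.1", "192.168.1", "000127.0.1"):
    print(candidate, "->", parse_ipv4(candidate))
```

Real resolvers differ in the details (some reject short forms or leading zeros outright), so treat this as a model of the lenient behaviour rather than a description of any particular library.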
3. Octal notation#
Each number of the dotted-decimal IP address can be represented in the octal format too. So in octal notation 127.0.0.1 is:
0177.0.0.01
It's important to note that the leading 0 before each number is required to mark it as octal. And you can put as many 0s before it as you like. So, the following also refer to 127.0.0.1:
00000000177.000.0.00000001
0177.0.0.0000001
000177.0000.00000.01
0000177.000000000000000000.00000000000.00000000001
00000000000000000000000000000000000000000000000000177.0.0.01
Don't believe me?
Try pinging them:
$ ping 00000000000000000000000000000000000000000000000000177.0.0.01
PING 00000000000000000000000000000000000000000000000000177.0.0.01 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.060 ms
The underlying networking library is converting the octal IP address to the decimal format (and then to binary).
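To see why extra leading zeros do not change the value, here is a small Python sketch that rewrites a dotted-decimal address in zero-prefixed octal and reads it back. The helper name `octal_dotted` and its `pad` parameter are assumptions made for illustration only.

```python
def octal_dotted(address: str, pad: int = 0) -> str:
    """Rewrite each decimal octet in octal with at least one leading zero."""
    parts = []
    for octet in address.split("."):
        digits = format(int(octet), "o")          # e.g. 127 -> "177"
        parts.append("0" * (pad + 1) + digits)    # extra zeros are harmless
    return ".".join(parts)


padded = octal_dotted("127.0.0.1", pad=5)
# Reading each part back in base 8 recovers the original octets.
recovered = [int(part, 8) for part in padded.split(".")]
print(padded, recovered)
```

The point of the sketch is only that a number's value in base 8 is unaffected by how many zeros precede it.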
4. Hexadecimal notation#
"If something can be represented in octal, it probably can be represented in hexadecimal too."
Yes, you are correct. The numbers of the dotted-decimal IP address can be represented in the hexadecimal format too. So in hexadecimal notation 127.0.0.1 is:
0x7f.0x0.0x0.0x1
The dots are optional if you precede the concatenated hex values with a 0x:
0x7f000001
Try pinging that:
$ ping 0x7f000001
PING 0x7f000001 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.072 ms
And, on parsers that keep only the low 32 bits of an oversized value, you can left-pad the hex value with any number of arbitrary hex digits. So, the following also refer to 127.0.0.1:
0xDEADBEEF7f000001
0xBADF00D7f000001
0xDEADC0DE7f000001
0xBADC0DE7f000001
0xBAAAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa7f000001
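One plausible explanation for the padded forms is that the parser keeps only the low 32 bits of an oversized value. The Python sketch below models that assumption; it is not a claim about how any specific client is implemented, and `low_32_bits` is a hypothetical helper name.

```python
def low_32_bits(text: str) -> str:
    """Keep only the low 32 bits of a hex literal and print it as dotted decimal."""
    value = int(text, 16) & 0xFFFFFFFF    # assumed behaviour: overflow is discarded
    return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))


for candidate in ("0x7f000001", "0xDEADBEEF7f000001", "0xBADF00D7f000001"):
    print(candidate, "->", low_32_bits(candidate))
```

Parsers that check for overflow will reject the padded forms instead, so this trick depends on the client.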
5. Decimal notation a.k.a. dword notation#
Dword is the non-dotted decimal representation of an IP address. In dword notation 127.0.0.1 is:
2130706433
Try pinging it:
$ ping 2130706433
PING 2130706433 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.031 ms
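The dword value is simply the four octets read as one 32-bit integer. A short sketch using Python's standard `ipaddress` module shows the conversion in both directions; the wrapper names `to_dword` and `from_dword` are chosen for this example.

```python
import ipaddress


def to_dword(address: str) -> int:
    """Convert a dotted-decimal IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(address))


def from_dword(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return str(ipaddress.IPv4Address(value))


print(to_dword("127.0.0.1"))
print(from_dword(2130706433))
```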
6. Binary notation#
You must have been wondering whether IP addresses can be represented in binary notation, since they can be represented in decimal, octal, and hexadecimal.
You are right, IP addresses can be represented in binary too. 127.0.0.1 in binary notation is:
01111111000000000000000000000001
Note, however, that not all HTTP clients support IP addresses in binary format.
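For reference, the 32-digit binary string can be derived from the dotted form with a few lines of Python. This is only a sketch of the arithmetic, and `to_binary` is a name invented for the example.

```python
def to_binary(address: str) -> str:
    """Return the 32-bit binary string for a dotted-decimal IPv4 address."""
    value = 0
    for octet in address.split("."):
        value = (value << 8) | int(octet)   # append each octet as 8 bits
    return format(value, "032b")            # zero-padded to 32 digits


print(to_binary("127.0.0.1"))
```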
7. Mixed notation#
How about mixing what we have learnt so far? A different notation for one or more segments of the address? Just keep the left-most segment intact. Here are some examples:
00000000000000000000000000000000000000000000000000177.1
0x7f.1
127.0x1
With three of the four segments represented the same in dec, oct, and hex, 127.0.0.1 doesn't give us much room to play. Let's see how 172.217.166.174 (google.com) may be represented in the decimal-octal-hexadecimal-dword mixed notation:
172.14263982
0254.0xd9a6ae
0xac.000000000000000000331.0246.174
0254.14263982
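The numbers in these mixed forms can be checked by hand or with a quick Python sketch. It packs the trailing octets 217.166.174 into a single value and prints the octal and hex spellings of each octet; `pack_tail` is a hypothetical helper, not a library function.

```python
def pack_tail(octets: list[int]) -> int:
    """Pack trailing octets into the single number that replaces them."""
    value = 0
    for octet in octets:
        value = (value << 8) | octet
    return value


tail = pack_tail([217, 166, 174])
print(tail, hex(tail))                  # decimal and hex spelling of the tail
for octet in (172, 217, 166, 174):
    print(octet, oct(octet), hex(octet))
```

Python prints octal with a `0o` prefix, whereas the address notation uses a bare leading `0`.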
8. IPv6 format#
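Then we have IPv6. All of the following resolve to ::1, although strict parsers may reject the first one because some of its groups are longer than four hex digits: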
0000000000000:0000:0000:0000:0000:00000000000000:0000:1
0000:0000:0000:0000:0000:0000:0000:0001
0:0:0:0:0:0:0:1
0:0:0:0::0:0:1
Remember zero compression and zero suppression?
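Python's standard `ipaddress` module can confirm that the well-formed variants are the same address. The sketch below compares two of the spellings above against ::1 and prints the fully expanded and fully compressed forms; note that this module is a strict parser, so it is not a model of more lenient clients.

```python
import ipaddress

loopback = ipaddress.IPv6Address("::1")

# Two of the spellings listed above, both with at most four digits per group.
forms = [
    "0000:0000:0000:0000:0000:0000:0000:0001",
    "0:0:0:0:0:0:0:1",
]
same = all(ipaddress.IPv6Address(form) == loopback for form in forms)

print(same)
print(loopback.exploded)     # every group written out in full
print(loopback.compressed)   # shortest standard spelling
```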
9. URL-encoded IP address#
URL-encoded IP addresses are accepted as valid IP addresses in most browsers and HTTP clients. So, the following refers to http://127.0.0.1:
http://%31%32%37%2E%30%2E%30%2E%31
And the following refers to http://[::1]:
http://[%3A%3A%31]
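Decoding the percent-escapes shows what a client sees after URL decoding. A minimal sketch with the standard library's `urllib.parse.unquote`:

```python
from urllib.parse import unquote

encoded_v4 = "%31%32%37%2E%30%2E%30%2E%31"
encoded_v6 = "%3A%3A%31"

decoded_v4 = unquote(encoded_v4)   # each %XX becomes one ASCII character
decoded_v6 = unquote(encoded_v6)

print(decoded_v4)
print(decoded_v6)
```

Whether a given client decodes the host portion before resolving it is up to that client; the sketch only shows the decoding step itself.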
Explanation#
So why does an IP address have so many forms?
The fact is, an IP address is actually a 32-bit number in IPv4 and a 128-bit number in IPv6. The binary notation is the correct representation of an IP address. Every other notation is simply a convenience (to varying degrees) for humans interacting with machine standards, and all of them are eventually converted to the binary notation.
The various number systems and the various optimization conventions are what make the weird phenomenon of having unlimited different formats of IP addresses possible.
Summary#
Valid IP addresses need not necessarily look "valid". IP addresses can be represented in indefinitely many ways, so pattern matching alone cannot reliably determine whether a value is an IP address. If you have a regex for detecting IP addresses, it is broken by default. If this regex is used for access control, you have a vulnerability in your system.
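One defensive approach, offered here as a sketch rather than a complete fix, is to parse the value with a strict parser and accept it only if it is already in canonical form. The example uses Python's standard `ipaddress` module; the function name `is_canonical_ip` is an assumption made for illustration.

```python
import ipaddress


def is_canonical_ip(host: str) -> bool:
    """Accept a host only if it is an IP address written in canonical form."""
    try:
        # A strict parse followed by a round-trip comparison rejects
        # short forms, alternative bases, and padded spellings.
        return str(ipaddress.ip_address(host)) == host
    except ValueError:
        return False


for candidate in ("127.0.0.1", "::1", "127.1", "0x7f000001", "2130706433"):
    print(candidate, is_canonical_ip(candidate))
```

This only helps if the component that finally connects to the address sees the same canonical string that was checked; a lenient client further down the line can still reinterpret anything that slips through.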
How's your IP address detection algorithm feeling today?