An attempt to explain IPv6 and IP Routing to the layperson

Keywords: #IPv4 #IPv6 #routing

Me again, yup. Been a while eh? Well, I’ve been busy. Rebuilding a pretty big site essentially from scratch. Trust me, I have plenty of things to rant about! This post though I hope to be another informative, less ranting, post about IPv6.

I keep seeing a LOT of well-meaning but misinformed or misunderstood claims about IPv6, even in technical circles. What I am going to try to address here, though, is the everyday person’s point of view: what it is, why we need it, what it fixes, why it’s hard to deploy/make available, and what it (may) mean for an individual user.

The article here was sparked by IO9’s Article.

What Is IPv6?

Well, simply put, it is Internet 2.0 or Web 2.0, despite what you may have heard from the media. IPv6 is short for Internet Protocol Version 6. We currently use IPv4. IPv6 has a truly massive number of addresses (really, the number doesn’t translate into simple terms). IPv4 has around 4.3 billion addresses, of which roughly 3.7 billion are usable. IPv6, though, is big enough to give every person on the earth, every device, every item, its own group of, say, a million addresses, and still have many trillions left over.

Why is IPv6 the Real Web 2.0?

AKA Why is it so hard to get IPv6 out there?

Because it requires touching and replacing or modifying every router, every piece of software, every device, in order to support it. Your web browser, your operating system (Windows, Linux, OS X), your Internet router/gateway (which a LOT of people confuse with Ethernet switches), your wireless access points, your ISP’s equipment, your TiVo, your smart phone, everything. This is also why it’s so very hard to get out there.

Now the tech heads and geniuses out there responsible for this have developed a number of ways to assist this migration to IPv6, to allow IPv4 and IPv6 to sort of talk to each other. They can easily exist together, but talking to each other is another matter entirely. These methods are not perfect; they suck, actually. From the IPv4 side, it’s like sending a letter addressed to a city rather than a person. For IPv6 it’s easier. In fact, there’s a block of IPv6 addresses (these blocks of addresses are called a prefix, like an area code, so I’ll use the term prefix from here on out) that is set aside to map directly to the old IPv4 addresses. That’s how big the address space in IPv6 is! What’s the number? OK, you REALLY sure you want to know? Fine. 2^128, two to the power of 128. In scientific notation that’s about 3.4 × 10^38, or a 34 followed by 37 zeroes (rounded). How big is that? Every single dollar bill of the American national debt could be individually numbered, and we’d still have a LOT of space left over. Heck, we could give out a trillion addresses to every person, device, or object on the planet, and still be likely to have leftovers. The TCP/IP Guide has a section on IPv6 address space size.
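If you want to check the arithmetic yourself, here is a small Python sketch. It uses only the standard library. The prefix shown, ::ffff:0:0/96 (the "IPv4-mapped" block), is one real example of IPv6 space set aside for IPv4 addresses; that it is the exact block meant above is my assumption, and 4.2.2.1 is just a sample address.

```python
import ipaddress

# Total number of addresses in each protocol.
IPV4_TOTAL = 2 ** 32
IPV6_TOTAL = 2 ** 128

print(f"IPv4 addresses: {IPV4_TOTAL:,}")
print(f"IPv6 addresses: {IPV6_TOTAL:,}")
print(f"IPv6, scientific notation: {IPV6_TOTAL:.1e}")

# Assumption: the IPv4-mapped prefix stands in for "the block set aside
# for old IPv4 addresses". It is a /96, so it leaves 32 bits: exactly
# enough room for every possible IPv4 address.
mapped_prefix = ipaddress.IPv6Network("::ffff:0:0/96")
mapped = ipaddress.IPv6Address("::ffff:4.2.2.1")

print(mapped in mapped_prefix)      # is the sample inside the prefix?
print(mapped.ipv4_mapped)           # the IPv4 address hiding inside it
print(mapped_prefix.num_addresses == IPV4_TOTAL)
```

The whole of IPv4 fits inside one tiny corner of IPv6, which is the point of the paragraph above.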

IPv4 addresses are everywhere. Dotted quads, we call them: 4.2.2.1, 127.0.0.1, etc. Largely people are ignorant of them, and they damn well should be. Numbers are for computers. Humans name things, computers number them, and computers are REALLY good at translating and mapping between the two. DNS is the protocol that does this. And in that it’s been so successful that the vast majority of Internet users have no clue whatsoever that IP addresses (v4 or v6 or otherwise) even exist! DNS itself needs to be revamped as a protocol in order to support IPv6 (and it largely has been), and then redeployed too, globally. This is taking place bit by bit.
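To see what "numbers are for computers" means in practice, here is a rough Python sketch showing that a dotted quad is just a friendlier way of writing one 32-bit number. The helper name dotted_quad_to_int is mine, made up for illustration; the ipaddress module is part of the standard library.

```python
import ipaddress

def dotted_quad_to_int(text):
    """Pack the four 0-255 parts of a dotted quad into one 32-bit number."""
    a, b, c, d = (int(part) for part in text.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

for text in ("4.2.2.1", "127.0.0.1"):
    by_hand = dotted_quad_to_int(text)
    by_library = int(ipaddress.IPv4Address(text))
    print(text, "->", by_hand, "(library agrees:", by_hand == by_library, ")")
```

An IPv6 address works the same way, only the number is 128 bits long instead of 32.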

E-Mail. Every mail server has an IP address (or more than one in many cases). It receives connections on that address from other mail servers and mail clients asking it to receive mail for, or send mail to, a given email address (user at domain). Spam filtering software. Anti-virus software.

All of this stuff is on the list of things that need to be modified, or replaced for IPv6 support. The list is huge.

Why Do We Need IPv6?

We’re running out of IPv4 addresses. No one in the beginning could possibly have imagined that there would be such a huge number of devices connected to the Internet. Now almost every phone, game console, and electronic device has some form of Internet connectivity. That doesn’t necessarily mean each of these devices needs a globally unique address, but it makes things easier, faster, more reliable, and cheaper if each device has one. The reason is that if you use NAT (many, many homes do this) your private address has to be mapped to a public one at some point. The device doing the NAT has to keep track of each and every connection from each and every device that it’s performing this mapping for. Worse, some protocols put IP addresses inside of their data, and so the NAT has to know about these protocols, identify them, and modify the information inside the packets for these protocols! (FTP is one such protocol, HTTP is not.)
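To give a feel for the bookkeeping a NAT device does, here is a toy Python sketch. Everything in it (the class name TinyNat, the starting port 40000, the sample addresses) is made up for illustration. A real NAT also tracks the remote end, the protocol, and timeouts, none of which are modeled here.

```python
import itertools

class TinyNat:
    """Toy NAT: maps (private address, private port) to a port on one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite an outgoing connection, remembering the mapping."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            public_port = next(self._next_port)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Find who a reply belongs to; None means nobody asked for it."""
        return self.inbound.get(public_port)

nat = TinyNat("203.0.113.7")
laptop = nat.translate_out("192.168.1.10", 51000)
phone = nat.translate_out("192.168.1.11", 51000)
print(laptop, phone)
print(nat.translate_in(phone[1]))
```

Every single connection adds an entry to those two tables, and that is the state the NAT box has to carry around for everything behind it.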

Well why not reuse all those “Web 1.0” addresses?

IPv4 is “Web 1.0.” The media gave us all that term, and most people have no idea what it means. Web 2.0 (go ahead and look, we’ll wait here) really only describes a bunch of web browser, JavaScript, and HTML technologies. It says nothing about the actual core guts of the internet: IP, DNS, BGP (the ISP-to-ISP route sharing protocol; every ISP “core” router HAS to speak this to other ISPs), OSPF (one of a number of ISP-internal route sharing protocols), and MPLS. Nor does it say anything about a lot of other core internet protocols like HTTP, SMTP, IMAP, etc.

So wow I will get my own unique addresses?!

No, not likely. This is because of the way that “core routers” (there’s no such thing, by the way, which I will try to address in a moment) have to keep track of each unique destination. Right now, and for the foreseeable future with both IPv6 and IPv4, the way this works is that an ISP gets a BIG block of addresses (BIG being relative in the terms of IPv4 or IPv6; with IPv6 they get a LOT more space, enough in fact to have an IPv6 address within their own network for each IPv4 address and still have a billion left). They then tell the other ISPs they’re connected to about that one big block, not about individual customers or devices. They say to their neighbor, “I can deliver packets to addresses beginning with 127.0, pass them along.” Another ISP might have 127.1, another might have 127.2.0-15, etc. IPv6 does the same thing. IPv6 addresses are just so much longer that I’m not using them in this example. The neighbors only know about and remember the big block of addresses, not the individual addresses or smaller blocks given to individual customers.
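Here is a rough Python sketch of what a neighbor does with those announcements. The ISP names are invented, and the blocks simply mirror the illustrative numbers above (on the real internet 127.x is reserved for loopback, so treat them purely as an example). "127.2.0-15" is written in the usual prefix notation as 127.2.0.0/20.

```python
import ipaddress

# What the neighbors have announced: one big block each, nothing smaller.
announcements = {
    "ISP A": ipaddress.ip_network("127.0.0.0/16"),
    "ISP B": ipaddress.ip_network("127.1.0.0/16"),
    "ISP C": ipaddress.ip_network("127.2.0.0/20"),
}

def next_hop(destination):
    """Return the ISP whose block contains the address (most specific wins)."""
    addr = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, name)
               for name, net in announcements.items() if addr in net]
    if not matches:
        return None  # nobody announced a block covering this address
    return max(matches)[1]

for destination in ("127.0.42.9", "127.1.0.1", "127.2.15.200", "127.2.16.1"):
    print(destination, "->", next_hop(destination))
```

Three entries are enough to cover well over a hundred thousand addresses here. A real router does this lookup with specialised data structures and hardware rather than a simple loop.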

Now within an ISP they keep track of many more much smaller blocks of addresses, maybe even down to individual addresses. Inside an ISP similar trading of information on what addresses are served by which of their routers happens (no this does NOT happen with the average end user!). The difference here is that since they’re all internal addresses, and a router notices when two or more addresses or blocks occur contiguously, they are often aggregated into a single larger block. Think of it like this. Router A is connected to B C D and E, E is connected to F and G. F has 1 2 and G has 3 4. E knows this, instead of telling A about 1 2 3 4 (and A further telling B C and D about 1 2 3 4) it just tells A 1-4. Imagine this for a few hundred, and you can see the savings. Instead of passing along each individual number it just tells it a range of numbers. There are restrictions on how these ranges are made up (for the geeks out there it has to be on a bit boundary), but that’s the basic idea.
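For the geeks, here is a small Python sketch of that aggregation using the standard library’s ipaddress.collapse_addresses. The router names follow the example above; the 10.0.x.0/24 blocks are numbers I picked for illustration. The second call shows the bit boundary restriction: four adjacent blocks that start in the wrong place cannot be merged into one.

```python
import ipaddress

def aggregate(blocks):
    """Merge adjacent blocks into the fewest announcements possible."""
    nets = [ipaddress.ip_network(block) for block in blocks]
    return [str(net) for net in ipaddress.collapse_addresses(nets)]

# Router F and router G each serve two small blocks.
router_f = ["10.0.0.0/24", "10.0.1.0/24"]
router_g = ["10.0.2.0/24", "10.0.3.0/24"]

# Router E tells router A about one block instead of four.
announced_by_e = aggregate(router_f + router_g)
print(announced_by_e)

# Same count of adjacent blocks, but not aligned on a bit boundary.
misaligned = aggregate(
    ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24", "10.0.4.0/24"]
)
print(misaligned)
```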

Wait, what’s so different about inside an ISP versus outside?! Simple: inside the ISP they know the adjacent addresses STAY adjacent and are inside the same entity, themselves. Out in the bigger internet you can’t do that. You might own 1 and 2, but someone else owns 3 and 4. And you don’t want packets for 3 and 4 arriving at your doorstep, now do ya? Well, that’s what would happen if the big ISPs aggregated routes together like that, because once a route is aggregated it loses its own unique identity.
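A quick Python sketch of that doorstep problem, with made-up blocks: if somebody lumps your two blocks and a stranger’s two blocks into one big route pointing at you, the stranger’s traffic comes along for the ride.

```python
import ipaddress

# Illustrative blocks only: you own the first two, someone else the rest.
mine = [ipaddress.ip_network("10.0.0.0/24"), ipaddress.ip_network("10.0.1.0/24")]
theirs = [ipaddress.ip_network("10.0.2.0/24"), ipaddress.ip_network("10.0.3.0/24")]

# An over-eager aggregate that covers all four blocks.
too_big = ipaddress.ip_network("10.0.0.0/22")
stray = [str(net) for net in theirs if net.subnet_of(too_big)]
print("Packets for these would land on your doorstep:", stray)

# Aggregating only what you actually own stays safe.
safe = list(ipaddress.collapse_addresses(mine))
leaked = [str(net) for net in theirs if any(net.subnet_of(s) for s in safe)]
print("Safe aggregate:", [str(s) for s in safe], "leaks:", leaked)
```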

What’s so wrong with having lots of routes then? Two things: memory and speed. Memory is finite. And the memory used in big “core routers” is far more expensive (and far faster too) than your desktop or laptop memory. Speed is the other reason. Big routers have less than a microsecond to decide where a packet is supposed to be going, and do something about it. They make a huge number of these decisions in parallel too, and each of these decisions has to reference some part of the database of what-goes-where that the router has built up for itself based on who it’s connected to, and what they say they are connected to.

Earlier you said there’s no such thing as a “core router”?

Indeed I did. For this discussion, you don’t have a router. Indeed, we at ISPs call what you have CPE, Customer Premises Equipment, or an End User Gateway Device. They’re meant to connect one machine, or a very small number of machines (4-5 at most, typically), to the ISP’s router and from there to the internet at large.

The internet is a bit more like a web. A cobweb. Lots of different parts connected in lots of different ways. You as an end user are only connected at one point, to your ISP, via your cable modem, DSL line, satellite, smart phone, or old-fashioned dial-up modem. Your ISP, if it’s a small local ISP, will be connected to 2 or more (usually at least 3 or 4) larger ISPs, and possibly some other small local ISPs or local business customers that have their own routers. These routers all tell each other who they’re connected to. As connections between ISPs are made, and broken, this changes. Each of these changes ripples through the internet, so when an ISP in, say, Missoula, MT disconnects from another ISP here in Montana, one that has been telling all of its neighbors that it can reach the Missoula ISP, every big ISP knows in seconds, and every small ISP some seconds after that. So what just happened in Missoula, MT is known in Beijing, China in very short order.

This is also another reason why individuals can’t have unique addresses that move between ISPs. You may not move from one ISP’s territory to another very often, but there are billions of people out there. Imagine now that those updates too have to be propagated and stored. Starting to see the problem?

Larger businesses with dozens or hundreds of workstations, or on-site servers, or other special high-reliability requirements connect to ISPs in much the same way as ISPs connect to each other. They just don’t say to ISP B “hey, I am connected to ISP A so you can reach ISP A through me,” but they do tell both A and B that they have, say, the addresses 6, 7, and 8. This is called multihoming. Why? Well, think of an ISP as a “home” for an address. Your address exists at multiple “homes” when you connect with multiple ISPs and advertise your block of addresses to each of them. There’s an intentional barrier to entry here because ISPs do not want, and cannot support, an unlimited number of these connections, because each of these connections requires the Internet as a whole to see and remember the unique block of addresses assigned to that business. And whenever that business disconnects from one ISP or the other (say they’re upgrading their network or have a long-lasting power outage), the whole Internet hears about it: each router tells all its neighbors about that change in connectivity.

There’s a LOT of research going on into better ways of dealing with the global routing table (that’s what it’s called, but there really isn’t one table; it’s more like each router has its own idea of what the routing table looks like right *now*, and if you wait even half a second, it’s going to change, probably several times), but no one has found a silver bullet yet. And even if/when they do, there’s still the same problem we have with IPv6: all the ISPs have to adopt and deploy it, everywhere.

If there’s interest I’ll go into TCP/IP, UDP/IP, DNS, and BGP/OSPF/Routing in a separate article (or articles). How a connection is established, what NAT is, what a Firewall is/does and why NAT and firewalling are different, and why routing is different than those two.