Prolixium Communications Network

Prolixium Communications Network Logo

The Prolixium Communications Network (known also as PCN, mynet, My Network, Prolixium .NET, and My Hobby Network) is a collection of small, geographically dispersed computer networks that provide IPv4 and IPv6, VPN, and VoIP services to the Kamichoff family. Owned and operated solely by Mark Kamichoff, PCN often serves as a testbed for various network experiments. Some of the PCN nodes are connected via residential data services (cable modem), while others, located in data centers, have Gigabit Ethernet (or better) connections to the Internet.

Current State

Overview

PCN WAN Architecture
PCN World Map

As of March 10, 2024, PCN is composed of several networks in the United States and across the globe, connected via OpenVPN and WireGuard, with the IPv6 backbone carried over 6in4 tunnels:

  • North Brunswick, NJ: nat.prolixium.com on FTTH via Verizon FiOS
  • Piscataway, NJ
    • excalibur.prolixium.com on Virtual I/O via Vultr
    • dax.prolixium.com on Virtual I/O via Vultr
  • Toronto, Canada: tiny.prolixium.com on Virtual I/O via atlantic.net
  • Dallas, TX: nox.prolixium.com on Virtual I/O via Linode
  • Dallas, TX: concorde.prolixium.com on Virtual I/O via Vultr
  • Ashburn, VA: pegasus.prolixium.com on Virtual I/O via Free Range Cloud
  • Ashburn, VA: daedalus.prolixium.com on Virtual I/O via Tier.Net
  • Ashburn, VA: matrix.prolixium.com on Virtual I/O via Oracle Cloud
  • Ashburn, VA: elise.prolixium.com on Virtual I/O via Oracle Cloud
  • Ashburn, VA
    • discovery.prolixium.com via Verizon FiOS
    • sprint.prolixium.com via Verizon Wireless (LTE)
  • Seattle, WA: orca.prolixium.com on Virtual I/O via Vultr
  • Seattle, WA: interstellar.prolixium.com on Virtual I/O via Vultr
  • Sarasota, FL: scimitar.prolixium.com on DOCSIS via Comcast Xfinity
  • Los Angeles, CA: trident.prolixium.com on Virtual I/O via ARP Networks
  • Clover, SC: trefoil.prolixium.com on ADSL via Spectrum
  • York, SC: exodus.prolixium.com on ADSL via AT&T
  • Austin, TX: photonic.prolixium.com on FTTH via Google Fiber
  • Charlotte, NC: storm.prolixium.com on FTTH via AT&T
  • Arlington, VA: merlin.prolixium.com on Ethernet via Comcast Business / Zayo
  • Agawam, MA: galactica.prolixium.com on DOCSIS via Comcast Xfinity
  • Amsterdam, Netherlands: firefly.prolixium.com on Virtual I/O via DigitalOcean
  • Singapore: centauri.prolixium.com on Virtual I/O via Amazon EC2

Each site has multiple OpenVPN tunnels to other locations, supporting both IPv4 and IPv6. The network is primarily powered by Free Range Routing (FRR), with some sites using BIRD.
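
As a rough illustration (not PCN's actual configuration or addressing), a 6in4 tunnel riding inside one of the IPv4 VPN tunnels can be built with iproute2 roughly like this; the interface name and all addresses below are made up for the example:

  # 172.31.0.1/172.31.0.2 stand in for the two VPN tunnel endpoints (illustrative)
  ip tunnel add tun6-siteb mode sit local 172.31.0.1 remote 172.31.0.2 ttl 64
  ip link set tun6-siteb up
  # /126 point-to-point addressing on the tunnel (documentation prefix, not PCN's)
  ip -6 addr add 2001:db8:ffff::1/126 dev tun6-siteb
  # reach the remote site's /48 via the far end of the tunnel
  ip -6 route add 2001:db8:b::/48 via 2001:db8:ffff::2

In PCN's case the routes themselves come from BGP rather than static entries; the commands above only show the tunnel plumbing.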

Routing

The routing infrastructure consists of several autonomous systems taken from the IANA-allocated private range, 64512 through 65534. Each site runs IBGP, possibly with a route reflector, and its own IGP for local next-hop resolution. EBGP is used between sites and for peering connections. IPv4 Internet connectivity for each site is achieved by advertising default routes from the boxes performing NAT. The lab is connected to starfire (core router) in Ashburn, VA. The PCN used to be one large OSPF area with no EGP. It was converted to a BGP confederation setup, which was a bad idea (but educational!), then reconverted to its current state.

BGP on PCN
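
As a hedged sketch of the per-site arrangement described above (the AS numbers, addresses, and interface names are invented for illustration, and this is not PCN's actual configuration), an FRR site router might combine OSPF for local next-hop resolution, IBGP toward the site's route reflector, and EBGP toward a neighboring site across a tunnel:

  router ospf
   ! carry loopbacks and transit links for local next-hop resolution
   network 10.99.0.0/24 area 0
  !
  router bgp 64512
   bgp router-id 10.99.0.1
   ! IBGP to the site's route reflector
   neighbor 10.99.0.2 remote-as 64512
   neighbor 10.99.0.2 update-source lo
   ! EBGP to another site, across a VPN tunnel interface
   neighbor 172.31.1.2 remote-as 64513
   address-family ipv4 unicast
    network 10.99.0.0/24
   exit-address-family

On the box performing NAT, a default-information originate statement under router ospf is the usual way to advertise the default route into the site, matching the behavior described above.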

IPv6 Connectivity

IPv6 connectivity is provided by direct connections to four upstream providers: Vultr (The Constant Company), ARP Networks, Free Range Cloud, and Tier.Net. Hurricane Electric BGP tunnels off excalibur and trident serve as backups but are depreferenced. The border transit network portion of the PCN, described below, provides this connectivity.

IPv6 addressing is out of 2620:6:2000::/44, which is a direct allocation from ARIN.
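For scale, the /44 contains 16 /48s (2620:6:2000::/48 through 2620:6:200f::/48), which lines up with the per-site /48 advertisements mentioned below.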

Border Transit Network

The border transit network operates in AS395460 and consists of excalibur, trident, orca, pegasus, daedalus, and concorde. Connectivity is provided by the following transit peers:

  • trident: AS25795 and AS6939
  • excalibur: AS20473 and AS6939
  • orca: AS20473
  • concorde: AS20473
  • pegasus: AS53356
  • daedalus: AS397423

This network injects a default route into the rest of the PCN, which can be referred to as the PEN (Prolixium Enterprise Network). The border network receives a full table from all transits and advertises 2620:6:2000::/44 to each peer, with some sites also advertising /48 more-specifics for nearby networks.

Hurricane Electric (AS6939) is only used as backup because it is a tunneled connection and is suspected to be throttled.
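
As a non-authoritative sketch of one border router's IPv6 BGP session layout under FRR: only the local AS (395460), the upstream AS numbers, and the 2620:6:2000::/44 aggregate come from this article; the peer addresses, route-map name, and local-preference value are invented for illustration:

  router bgp 395460
   ! transit session (e.g., toward AS20473); peer address is illustrative
   neighbor 2001:db8:100::1 remote-as 20473
   ! Hurricane Electric tunnel session, kept as a depreferenced backup
   neighbor 2001:db8:200::1 remote-as 6939
   address-family ipv6 unicast
    network 2620:6:2000::/44
    neighbor 2001:db8:100::1 activate
    neighbor 2001:db8:200::1 activate
    neighbor 2001:db8:200::1 route-map HE-BACKUP in
   exit-address-family
  !
  route-map HE-BACKUP permit 10
   set local-preference 50

In practice the /44 network statement needs a matching route in the table (a static discard route, for example) before it is announced, and the border routers also originate the default route that the rest of the PCN learns.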

Border Transit Network

Border Transit Network Map

The following hosts do not default route to the border transit network and use their own native IPv6 connectivity:

  • centauri
  • firefly
  • storm

The following hosts may have IPv6 connectivity, but it is not currently enabled (at the time of writing):

  • exodus
  • galactica
  • photonic

DNS

DNS is done with two views: internal and external. PCN has two external nameservers and four internal ones, all of which perform zone transfers from the master nameserver, ns3.antiderivative.net. antiderivative.net is used for all NS records, as well as glue records at the GTLD servers. The internal nameservers are ns{1-4} and the external ones are ns{2,3}. Each zone has two views, internal and external, plus a common file that is included in both views (SOA, etc.). The zones include the following (see the configuration sketch after the list):

  • Internal view, answering to 10/8, 172.16/12, and 192.168/16 addresses
    • 3.10.in-addr.arpa. and 3.16.172.in-addr.arpa. reverse zones
    • prolixium.com, prolixium.net, antiderivative.net, etc.'s internal A/CNAME records
  • External view, answering to everything !RFC1918
    • prolixium.com, prolixium.net, antiderivative.net, etc.'s external A/CNAME records
  • Common information, answering for all hosts
    • 0.0.0.2.6.0.0.0.0.2.6.2.ip6.arpa., and other reverse zones
    • prolixium.com, prolixium.net, antiderivative.net, etc.'s common MX records
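
A minimal named.conf sketch of the view split described above, assuming BIND9; the zone names come from this article, while the ACL name, file paths, and comments are illustrative:

  acl "internal-nets" { 10.0.0.0/8; 172.16.0.0/12; 192.168.0.0/16; };

  view "internal" {
      match-clients { "internal-nets"; };
      zone "prolixium.com" {
          type master;
          file "zones/prolixium.com.internal";  // internal A/CNAME records plus the common include
      };
  };

  view "external" {
      match-clients { any; };
      zone "prolixium.com" {
          type master;
          file "zones/prolixium.com.external";  // external A/CNAME records plus the common include
      };
  };

The other zones (prolixium.net, antiderivative.net, the reverse zones, and so on) would follow the same pattern inside each view.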

Previously, the Xicada DNS Service (developed by Mark Kamichoff) kept track of all the forward delegations as well as IPv4 reverse delegations on Xicada. The administrator of each node enumerated their zones into a web form, and then configured their DNS server to pull down a forwarders definition for all Xicada zones. It supported BIND and djbdns, but could also output a CSV file if someone decided to use another DNS server. It was originally intended that each DNS server would pull down a fresh copy of the forwarders definition file nightly, but there were really no rules.

Mark Kamichoff has a policy on his network of having DNS entries (A, AAAA, and PTR) for each and every active IP address. If a host goes offline, the DNS records should be immediately expunged. This eliminates the need for a host management system or a collection of poorly-maintained spreadsheets. If an IP is needed, the PTR should be checked. All DHCP-assigned IP addresses are named {site ID}-{last octet}.prolixium.com. Again, no confusion. DNS itself is a database, so why not use it?

All transit links on PCN are addressed using the prolixium.net domain. The format is {unit/VLAN}.{interface}.{host}.prolixium.net. For example, the xl1 interface on starfire would be xl1.starfire.prolixium.net. There is a DNS entry for every IPv4 and IPv6 transit link: every hop in the network has a PTR record, and every PTR record has a corresponding A or AAAA record. Each router has a loopback interface with IPv4 and IPv6 addresses (if supported).
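
To make the convention concrete, the records for a single transit interface might look like the following; the addresses are documentation/illustrative values, not PCN's real ones:

  ; prolixium.net forward zone
  xl1.starfire             IN A      192.0.2.17
  xl1.starfire             IN AAAA   2001:db8:20::1

  ; matching in-addr.arpa reverse zone (an ip6.arpa PTR exists for the AAAA as well)
  17.2.0.192.in-addr.arpa. IN PTR    xl1.starfire.prolixium.net.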

Ashburn-Specific Setup

Ashburn LAN

The network setup in Ashburn (formerly Seattle, WA, and Charlotte, NC, before that) is slightly different from the other sites, which typically have a single router with a dynamic address. The Ashburn location has two ISPs, each terminated in a separate LXC instance (each with VPNs to at least one of interstellar, nox, dax, or elise - the "enterprise" network):

  • discovery (on evolution) - Verizon FiOS
  • sprint (on evolution) - Verizon Wireless (LTE)

starfire and evolution are the two core routers with multiple Gigabit Ethernet interfaces. The current routing setup is as follows:

  • IPv6 (Internet & internal) inbound & outbound traffic traverses discovery (Verizon FiOS) via VPN
  • IPv4 Internet inbound & outbound traffic traverses discovery (Verizon FiOS) via NAT
  • All LXCs above advertise an IPv4 default route into OSPFv2
  • LOCAL_PREF and AS_PATH prepending influence the traffic flow

When failover is needed, discovery is replaced by the sprint LXC.
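
A hedged sketch of how this primary/backup split could be expressed in FRR; only the mechanisms (OSPFv2 default origination, LOCAL_PREF, AS_PATH prepending) come from this article, while the route-map names, AS number, and values are invented:

  ! on discovery (primary, Verizon FiOS): inject the IPv4 default route into OSPFv2
  router ospf
   default-information originate
  !
  ! on sprint (backup, Verizon Wireless LTE): make its paths less attractive
  route-map BACKUP-IN permit 10
   set local-preference 50
  !
  route-map BACKUP-OUT permit 10
   set as-path prepend 65530 65530 65530

The BACKUP-IN and BACKUP-OUT route-maps would be applied inbound and outbound on sprint's BGP sessions toward the enterprise network, so discovery's paths win whenever it is healthy.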

In the past, NetFlow collection was done on atlantis, as depicted in the drawing below:

PCN NetFlow Setup

The NetFlow collector ran ntop, but this was uninstalled due to instability.

Printing

The whole printing/CUPS/lpd setup is mostly an annoyance. Most people would want to run CUPS on every Unix client on the network. Mark Kamichoff believes it's better to have a lightweight client send a PostScript file via lpd to a CUPS server than to send a huge RAW raster stream across the network and have both the client and server do print processing. See the diagram below:

PCN Printing Setup
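
A minimal sketch of the lightweight-client side, using a classic /etc/printcap entry that hands PostScript to the central CUPS server over lpd; the server and queue names are made up for illustration, and the CUPS side is assumed to have its LPD compatibility service (cups-lpd) enabled:

  # /etc/printcap on a client; rm/rp/sd values are illustrative
  lp|laserjet|remote queue on the CUPS server:\
          :rm=printserver.prolixium.com:\
          :rp=laserjet:\
          :sd=/var/spool/lpd/laserjet:\
          :sh: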

SmokePing

For monitoring, PCN uses a combination of Nagios, SmokePing, and MRTG. The SmokePing setup itself is a combination of slaves and masters, both IPv4 and IPv6.

SmokePing

nox is the master for a few slaves:

  • tiny - VPS connected to atlantic.net
  • storm - RPi 3 connected to AT&T Fiber
  • exodus - RPi 3 connected to AT&T DSL
  • galactica - RPi 4 B connected to Comcast Xfinity
  • photonic - RPi 4 B connected to Google Fiber
  • merlin - RPi 3 B connected to Comcast Business / Zayo
  • trefoil - RPi 5 connected to Spectrum
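
A rough sketch of how the master/slave split above is typically wired up in SmokePing's configuration; this is not PCN's actual config - the secrets path, colors, and target layout are assumptions, and only two of the slaves are shown:

  *** Slaves ***
  secrets=/etc/smokeping/smokeping_secrets

  +storm
  display_name=storm (AT&T Fiber, Charlotte)
  color=0000ff

  +galactica
  display_name=galactica (Comcast Xfinity, Agawam)
  color=00ff00

  *** Targets ***
  probe = FPing
  menu = Top
  title = PCN Latency

  + PCN
  menu = PCN
  title = PCN Probes

  ++ nox
  menu = nox
  title = nox.prolixium.com
  host = nox.prolixium.com
  slaves = storm galactica

Each slave runs SmokePing with the master's URL and shared secret and pushes its measurements back to nox, so a single set of graphs shows latency from every vantage point listed above.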

History

Warning: This entire section is written in the first-person (Mark Kamichoff's) point of view

Beginnings

After joining the Xicada network back at RPI, I decided to continue linking all of my networks and sites together via various VPN technologies. At first, the network was just a simple VPN between my network at home and a few computers in my dorm room at RPI. The connection tunnelled through RPI's firewall like a knife through warm butter, using OpenVPN's UDP encapsulation mode. Actually, a site-to-site UDP tunnel was the only thing OpenVPN offered, back then. My router at RPI was a blazing-fast Pentium 166MHz box running Debian GNU/Linux. At that point, my Xicada tunnels were terminated on another box I found in the trash, an old AMD K6-300, which eventually ran FreeBSD 4.

The network quickly started expanding, and I was able to move the K6-300 box (starfire) into the ACM's lab, which was given a 100mbit link, in the basement of the DCC. At this point in time, my network had three sites: home, the lab, and my dorm room. Since I didn't stick around RPI during most summers, I reterminated the Xicada links on starfire, since it sported a more permanent link.

Shortly after starfire was moved to the lab, I started toying with IPv6, and acquired a tunnel via Freenet6 (now Hexago, since they're actually trying to sell products, or something). RPI's firewall wouldn't allow IP protocol 41 through the firewall, and my attempts at getting this opened up for my IP failed. So, I terminated the IPv6 tunnel on my box at home, which sat on Optimum Online. Freenet6 gave me a /48 block out of the 3ffe::/16 6bone space, and I started distributing /64's out to all of my LAN segments. I started running Zebra's OSPFv3 daemon, and realized it was buggy as all get out. It mostly worked, though. Since Freenet6 gave me an ip6.int. delegation, I spent some time applying tons of patches to djbdns, my DNS server of choice, back then. After tons of patching, I got IPv6 support, which was fairly neat at the time. What did I use this new-found IPv6 connectivity for? IRC and web site hosting. www.prolixium.com has had an AAAA record since at least 2003.

Sometime in 2003 (I forget when), I moved my IPv6 tunnel to BTExact, British Telecom's free tunnel broker that actually gave out non-6bone /48's and ip6.arpa. DNS delegations. I quickly moved to them, and enjoyed quicker speeds than Freenet6 for about a year. Of course, after a year, my parents had a power outage at home, and my server lost the IP it had with OOL for the past two years. BTExact, at that time, had frozen their tunnel broker service, and didn't allow any modifications or new tunnels to be created. I went back to Freenet6, who had changed to 2001::/16 space.

After leaving RPI, and getting a job, I decided to purchase a dedicated server from SagoNet. I extended my network down to Tampa, FL, where the server was located.

Fast-forwarding to the present day, I currently have six sites, and native IPv6 from Voxel dot Net. Almost every host on the network is IPv6-aware, and the IPv6 connectivity is controlled completely by pf.

Xicada connectivity at this point has been terminated, due to lack of interest.

VLAN Conversion (Laundry Room Data Center)

VLAN Setup
I'm lucky to have CAT5(e?) cabled to every room in my condo, all aggregated in the laundry room, so I figured it was time to deploy a couple of different VLANs on my network. Initially, I just had a dumb switch connecting all of the various ports in different rooms together. Since that was too simple of a solution, I picked up a Cisco 2940 switch on eBay, and set up a 1 Gbit trunk between starfire and the laundry room. I set up four VLANs:
  • 2: Various wall jacks
  • 3: Media center link (connected to kamikaze)
  • 4: Linksys link (connected to mercury)
  • 5: Lab link (connected to hysteresis)

I ended up throwing some other gear in the laundry room along with the switch, and eventually moved my lab (3.0) there.

BGP (Confederations) Conversion

History

Starting with the Xicada project, my network was one big OSPF backbone area. Entirely flat, except for some route redistribution for the lab connection. When I added OSPFv3 for IPv6 reachability, it was no different - one big area: no stub areas, no frills. It worked, but was boring, and didn't provide the flexibility required if I wanted to start redirecting Internet traffic.

After reading up on BGP, I realized I could make my network 1000% more complex, while gaining some real-world experience. Sounds like a plan, huh?

Preparation and Design

Due to some Quagga instability issues, I originally tested out some alternate BGP/OSPF implementations, including XORP. Unfortunately, none of them fit the bill, and XORP, although promising, was horribly unstable and appeared to suffer from configuration file parsing issues, more than anything else. So I decided to stick with Quagga. I also decided to keep two separate BGP connections, one for IPv4 and one for IPv6 (so I didn't run into any nasty next-hop accessibility problems).

One of the goals of the redesign was to eliminate the large network-wide IGP process and break down each site into sub-ASes, using BGP confederations and route reflectors. This required a partial mesh of CBGP (confederation BGP - like EBGP, but more attributes are retained) between all the sites, to take advantage of the tunnels. Unfortunately, this meant that I had to renumber all of my IPv6 tunnels, since they were all /128's. Not a big deal. I didn't want to do this with the IPv4 (OpenVPN) tunnels, since the documentation strongly recommended against the use of anything other than a 32-bit netmask. This required the use of the ebgp-multihop command, since according to most [E]BGP implementations, /32's or /128's connecting to each other is not classified as 'directly connected' for some reason. (doesn't make sense to me, since even a TTL of 1 should theoretically allow communication to succeed)

At each site, I wanted to run IBGP internally, and designate one box to be the route reflector, in order to loosen the IBGP full-mesh requirement. Some of the OpenWrt devices did not have loopbacks at the time, so I needed to shuffle around some addresses and fix this.

I'd still run an IGP internal to each site (not nox or dax, since they are only one router), and advertise a default route via OSPFv2 within the site, for Internet access. I could also advertise default routes from two different routers within a site, for redundancy and failover Internet access.

So, here's some of the tasks I performed prior to making any routing changes:

  1. Add loopbacks to all routers
  2. Redo all IPv6 tunnel interfaces, converting them to /126's to avoid subnet-router anycast issues
  3. Redo tunnel naming standards (was too long before)

IPv6 Migration

I figured, since on most platforms, IGP routes take precedence over BGP routes, I could add all the peering relationships and get everything set up without skipping a beat. Quagga's zebra process wouldn't insert or remove anything from the FIB (the kernel routing table). Then I could remove OSPFv3 from all the WAN links, and zebra would just shuffle around the routes, but reachability would come back within a few minutes, maybe?

So I started building the BGP neighbors, and quickly ran into a problem. For some reason, no IPv6 BGP routes were being sent to other peers from Quagga's bgpd. I posted a message to the mailing list, and quickly got a helpful response. Apparently I was hitting a bug that's been in Quagga for a while (a typo) that dealt with the address-family negotiation between peers. The quick fix was to add 'override-capability' to each neighbor (or peer group) and it would accept all advertised address families.

After all the peers were set up, I disabled OSPFv3 on all the WAN links, and everything reconverged... oddly. It looked like BGP was doing path-selection based on tiebreakers, and picking the higher peer address as the best path for a destination, even if it meant not utilizing the directly connected link. After scratching my head for a few minutes, I realized my stupidity. Normal BGP treats AS_CONFED_SEQUENCE and AS_CONFED_SET as a length of one, so all paths through my network looked like they had an AS path length of *1*. Luckily, Quagga had a nice bgp bestpath as-path confed command that modified the path selection algorithm, and gave me what I wanted. I described this in a blog entry.

Since I wanted all loopbacks and transit interfaces reachable from anywhere, I added a ton of network statements to bgpd. It felt like a hack, but isn't too bad, since there's really no other way of doing it, without using a network-wide IGP.

IPv4 Migration

Since the IPv6 migration was successful, I figured the IPv4 migration would turn out the same - and it did, mostly.

I started setting up the IPv4 BGP neighbors, and ran into a strange issue with ScreenOS. I've documented it here. Basically, my two Juniper firewalls wouldn't establish IBGP connections unless they were configured as passive neighbors (wait for a connection).

After all the IPv4 BGP connections were up and running, I killed the network-wide IGP process entirely (shut off ospfd/ospf6d on dax and nox), and let everything reconverge. It worked out of the box - success!

I removed the static default routes on my OpenWrt routers, and advertised defaults at each site. No problem there.

Finish

Although I ran into a number of problems, and probably complicated troubleshooting of my network by an order of magnitude, I think the conversion was worth it. Now if anyone wants to start Xicada 2.0, we can do it right, this time...

EBGP Conversion

I got sick of confederations, so I just removed the confederation statements and converted all of the inter-site links to straight EBGP.

Applications

PCN enables several applications:

  • VoIP (via SIP / G.711u)
  • IPv6 Internet access
  • Streaming audio

Lab

Main Article: PCN Lab

The PCN lab is Mark Kamichoff's network proving ground and general hacking arena.

External Links

  • PCN MRTG: https://www.prolixium.com/mrtgfe
  • PCN Home Page: http://www.prolixium.net/