Oct 4, 2007

Bidirectional Forwarding Detection (BFD)

Bidirectional Forwarding Detection (BFD) is a network protocol used to detect faults between two forwarding engines. It provides low-overhead detection of faults even on physical media that don't support failure detection of any kind, such as Ethernet, virtual circuits, tunnels, and MPLS LSPs.

BFD establishes a session between two endpoints over a particular link. If more than one link exists between two systems, multiple BFD sessions may be established to monitor each one of them. The session is established with a three-way handshake, and is torn down the same way. Authentication may be enabled on the session. A choice of simple password, MD5 or SHA1 authentication is available.
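The three-way handshake can be sketched as a small state machine. This is a minimal illustration, assuming the state names and transition rules later standardized in RFC 5880 (AdminDown, Down, Init, Up); the function names `next_state` and `handshake` are hypothetical and not part of any BFD implementation.

```python
from enum import Enum


class State(Enum):
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def next_state(local: State, received: State) -> State:
    """Return the new local state after a packet advertising `received`."""
    if local is State.ADMIN_DOWN:
        return local  # administratively disabled: ignore the peer
    if received is State.ADMIN_DOWN:
        return State.DOWN
    if local is State.DOWN:
        if received is State.DOWN:
            return State.INIT
        if received is State.INIT:
            return State.UP
        return local
    if local is State.INIT:
        return State.UP if received in (State.INIT, State.UP) else local
    # local is UP: a peer reporting Down tears the session down
    return State.DOWN if received is State.DOWN else local


def handshake() -> list:
    """Trace endpoints A and B from Down/Down until both are Up."""
    a, b = State.DOWN, State.DOWN
    trace = [(a.name, b.name)]
    b = next_state(b, a)  # packet 1: A (Down) -> B
    trace.append((a.name, b.name))
    a = next_state(a, b)  # packet 2: B (Init) -> A
    trace.append((a.name, b.name))
    b = next_state(b, a)  # packet 3: A (Up) -> B
    trace.append((a.name, b.name))
    return trace
```

Calling `handshake()` walks the session through three packets, which is why the exchange is described as a three-way handshake.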

BFD does not have a discovery mechanism; sessions must be explicitly configured between endpoints. BFD may be used on many different underlying transport mechanisms and layers, and operates independently of all of these. Therefore, it needs to be encapsulated by whatever transport it uses. For example, monitoring MPLS LSPs involves piggybacking session establishment on LSP-Ping packets. Protocols that support some form of adjacency setup, such as OSPF or IS-IS, may also be used to bootstrap a BFD session. These protocols may then use BFD to receive faster notification of failing links than would normally be possible using the protocol's own keepalive mechanism.

A session may operate in one of two modes: asynchronous mode and demand mode. In asynchronous mode, both endpoints periodically send Hello packets to each other. If a number of those packets are not received, the session is considered down.
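How many missed packets count as a failure is not stated above. As a hedged sketch, the following assumes the scheme used in RFC 5880, where the detection time is a multiplier times the agreed packet interval; the parameter names are illustrative only.

```python
def detection_time_ms(detect_mult: int, local_min_rx_ms: int, remote_min_tx_ms: int) -> int:
    """Detection time in asynchronous mode (assumed RFC 5880 style).

    The effective interval is the slower of what we are willing to receive
    and what the peer wants to send; the session is declared down after
    `detect_mult` such intervals pass without a packet.
    """
    if detect_mult < 1:
        raise ValueError("detect_mult must be at least 1")
    return detect_mult * max(local_min_rx_ms, remote_min_tx_ms)


def session_is_down(now_ms: int, last_rx_ms: int, detect_ms: int) -> bool:
    """True once no packet has arrived for longer than the detection time."""
    return now_ms - last_rx_ms > detect_ms
```

For example, a multiplier of 3 with 50 ms intervals gives a 150 ms detection time, far shorter than typical routing-protocol keepalive timers.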

In demand mode, no Hello packets are exchanged after the session is established; it is assumed that the endpoints have another way to verify connectivity to each other, perhaps on the underlying physical layer. However, either host may still send Hello packets if needed.

Regardless of which mode is in use, either endpoint may also initiate an Echo function. When this function is active, a stream of Echo packets is sent, and the other endpoint then sends these back to the sender via its forwarding plane. This is used to test the forwarding path on the remote system.

Generalized MPLS (GMPLS)

a. What is "Generalized MPLS" or "GMPLS"?
From "Generalized Multi-Protocol Label Switching Architecture": "Generalized MPLS extends MPLS to encompass time-division (e.g. SONET ADMs), wavelength (optical lambdas) and spatial switching (e.g. incoming port or fiber to outgoing port or fiber)."

GMPLS is a natural extension of MPLS that allows MPLS to be used as the control mechanism for configuring not only packet-based paths, but also paths in non-packet-based devices such as optical switches, TDM muxes, and SONET ADMs.

For an overview of GMPLS, see "Generalized Multiprotocol Label Switching: An Overview of Routing and Management Enhancements".

b. What are the components of GMPLS?
GMPLS introduces a new protocol called the "Link Management Protocol" or LMP. LMP runs between adjacent nodes and is responsible for establishing control channel connectivity as well as failure detection. LMP also verifies connectivity between channels.

Additionally, the IETF's "Common Control and Measurement Plane" (CCAMP) working group is defining extensions to interior gateway routing protocols such as OSPF and IS-IS to enable them to support GMPLS operation.

c. What are the features of GMPLS?
GMPLS supports several features including:

Link Bundling - the grouping of multiple, independent physical links into a single logical link
Link Hierarchy - the issuing of a suite of labels to support the various requirements of physical and logical devices across a given path
Unnumbered Links - the ability to configure paths without requiring an IP address on every physical or logical interface
Constraint Based Routing - the ability to automatically provision additional bandwidth, or change forwarding behavior based on network conditions such as congestion or demands for additional bandwidth

d. What are the "Peer" and "Overlay" models?
GMPLS supports two methods of operation, peer and overlay. In the peer model, all devices in a given domain share the same control plane. This provides true integration between optical switches and routers. Routers have visibility into the optical topology and routers peer with optical switches. In the overlay model, the optical and routed (IP) layers are separated, with minimal interaction. Think of the overlay model as the equivalent of today's ATM and IP networks, where there is no direct connection between the ATM layer and the IP routing layer.

The peer model is inherently simpler and more scalable, but the overlay model provides fault isolation and separate control mechanisms for the physical and routed network layers, which may be more attractive to some network operators.

e. What is the "Optical Internetworking Forum"?
The Optical Internetworking Forum (OIF) is an open industry organization of equipment manufacturers, telecom service providers, and end users dedicated to promoting the global development of optical internetworking products and fostering the development and deployment of interoperable products and services for data switching and routing using optical networking technologies.

An Introduction to the Optical Internetworking Forum White Paper can be found at http://www.oiforum.com/

f. Where can I get more information on GMPLS?
For information about GMPLS standards development, visit the IETF Common Control and Measurement Plane (CCAMP) working group web page at http://www.ietf.org/html.charters/ccamp-charter.html as well as the White Papers section of this web site.

INTERNET PROTOCOL VERSION 6 MULTICAST ADDRESSES

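Compared with IPv4, IPv6 changes more than the number of available IP addresses. Another important change is that IPv4 broadcast has been removed and replaced entirely by multicast and the newly added anycast. Below are some of the multicast addresses currently defined by IANA. There is no need to memorize them, but they certainly help in understanding how IPv6 works!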

(last updated 2007-08-30)

IPv6 multicast addresses are defined in "IP Version 6 Addressing
Architecture" [RFC4291]. This defines fixed scope and variable scope
multicast addresses.

IPv6 multicast addresses are distinguished from unicast addresses by the
value of the high-order octet of the addresses: a value of 0xFF (binary
11111111) identifies an address as a multicast address; any other value
identifies an address as a unicast address.
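The high-order-octet rule is easy to check in code. The sketch below uses Python's standard `ipaddress` module; the helper name `is_ipv6_multicast` is hypothetical, and the standard library's own `is_multicast` property is shown for comparison.

```python
import ipaddress


def is_ipv6_multicast(text: str) -> bool:
    """True when the first octet of the address is 0xFF (binary 11111111)."""
    return ipaddress.IPv6Address(text).packed[0] == 0xFF


for candidate in ("ff02::1", "2001:db8::1", "fe80::1"):
    addr = ipaddress.IPv6Address(candidate)
    # The manual octet test and the library property should always agree.
    print(candidate, is_ipv6_multicast(candidate), addr.is_multicast)
```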

The rules for assigning new IPv6 multicast addresses are defined in
[RFC3307]. IPv6 multicast addresses not listed below are reserved.

Current IPv6 multicast addresses are listed below.


Fixed Scope Multicast Addresses
-------------------------------

These permanently assigned multicast addresses are valid over a specified
scope value.

Node-Local Scope
----------------

FF01:0:0:0:0:0:0:1 All Nodes Address [RFC4291]
FF01:0:0:0:0:0:0:2 All Routers Address [RFC4291]
FF01:0:0:0:0:0:0:FB mDNSv6 [Cheshire]

Link-Local Scope
----------------

FF02:0:0:0:0:0:0:1 All Nodes Address [RFC4291]
FF02:0:0:0:0:0:0:2 All Routers Address [RFC4291]
FF02:0:0:0:0:0:0:3 Unassigned [JBP]
FF02:0:0:0:0:0:0:4 DVMRP Routers [RFC1075,JBP]
FF02:0:0:0:0:0:0:5 OSPFIGP [RFC2328,Moy]
FF02:0:0:0:0:0:0:6 OSPFIGP Designated Routers [RFC2328,Moy]
FF02:0:0:0:0:0:0:7 ST Routers [RFC1190,KS14]
FF02:0:0:0:0:0:0:8 ST Hosts [RFC1190,KS14]
FF02:0:0:0:0:0:0:9 RIP Routers [RFC2080]
FF02:0:0:0:0:0:0:A EIGRP Routers [Farinacci]
FF02:0:0:0:0:0:0:B Mobile-Agents [Bill Simpson]
FF02:0:0:0:0:0:0:C SSDP [UPnP]
FF02:0:0:0:0:0:0:D All PIM Routers [Farinacci]
FF02:0:0:0:0:0:0:E RSVP-ENCAPSULATION [Braden]
FF02:0:0:0:0:0:0:F UPnP [UPnP]
FF02:0:0:0:0:0:0:16 All MLDv2-capable routers [RFC3810]
FF02:0:0:0:0:0:0:6A All-Snoopers [RFC4286]
FF02:0:0:0:0:0:0:6B PTP-pdelay [IEEE1588, K.Lee] 02 February 2007
FF02:0:0:0:0:0:0:6C Saratoga [Wood] 30 August 2007
FF02:0:0:0:0:0:0:FB mDNSv6 [Cheshire]

FF02:0:0:0:0:0:1:1 Link Name [Harrington]
FF02:0:0:0:0:0:1:2 All-dhcp-agents [RFC3315]
FF02:0:0:0:0:0:1:3 Link-local Multicast Name Resolution [RFC4795]
FF02:0:0:0:0:0:1:4 DTCP Announcement [Vieth, Tersteegen]

FF02:0:0:0:0:1:FFXX:XXXX Solicited-Node Address [RFC4291]

FF02:0:0:0:0:2:FF00::/104 Node Information Queries [RFC4620]
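The Solicited-Node entry above (FF02:0:0:0:0:1:FFXX:XXXX) is a template: per RFC 4291, the XX:XXXX part is the low-order 24 bits of the unicast or anycast address. A minimal sketch follows; the function name `solicited_node` is illustrative.

```python
import ipaddress

# FF02:0:0:0:0:1:FF00:0 is the fixed 104-bit prefix of the template.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(unicast: str) -> ipaddress.IPv6Address:
    """Map a unicast address to its solicited-node multicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


print(solicited_node("2001:db8::1:800:200e:8c6c"))  # ff02::1:ff0e:8c6c
```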


Site-Local Scope
----------------

FF05:0:0:0:0:0:0:2 All Routers Address [RFC4291]
FF05:0:0:0:0:0:0:FB mDNSv6 [Cheshire]

FF05:0:0:0:0:0:1:3 All-dhcp-servers [RFC3315]
FF05:0:0:0:0:0:1:4 Deprecated (2003-03-12)
FF0X:0:0:0:0:0:1:1000 Service Location, Version 2 [RFC3111]
-FF0X:0:0:0:0:0:1:13FF


Variable Scope Multicast Addresses
----------------------------------

These permanently assigned multicast addresses are valid over all scope
ranges. This is shown by an "X" in the scope field of the address that
means any legal scope value.

Note that, as defined in [RFC4291], IPv6 multicast addresses which
are only different in scope represent different groups. Nodes must
join each group individually.
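To make the "X" placeholder concrete, the sketch below expands a variable-scope template into one address per scope and reads the scope back out of an address. The scope names follow this registry's section headings; the helper names are hypothetical.

```python
import ipaddress

# Scope values used as section headings in this registry.
SCOPE_NAMES = {0x1: "node-local", 0x2: "link-local", 0x5: "site-local"}


def multicast_scope(text: str) -> int:
    """Return the 4-bit scope field (low nibble of the second octet)."""
    packed = ipaddress.IPv6Address(text).packed
    if packed[0] != 0xFF:
        raise ValueError(f"{text} is not a multicast address")
    return packed[1] & 0x0F


def expand_variable_scope(template: str, scopes) -> list:
    """Replace the first 'X' in e.g. 'FF0X:0:0:0:0:0:0:101' with each scope."""
    return [
        ipaddress.IPv6Address(template.upper().replace("X", format(s, "X"), 1))
        for s in scopes
    ]


# Each result is a distinct group that a node must join separately.
for group in expand_variable_scope("FF0X:0:0:0:0:0:0:101", SCOPE_NAMES):
    print(group, SCOPE_NAMES[multicast_scope(str(group))])
```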

The IPv6 multicast addresses with variable scope are listed below.

FF0X:0:0:0:0:0:0:0 Reserved Multicast Address [RFC4291]
FF0X:0:0:0:0:0:0:C SSDP [UPnP]
FF0X:0:0:0:0:0:0:FB mDNSv6 [Cheshire]

FF0X:0:0:0:0:0:0:100 VMTP Managers Group [RFC1045,DRC3]
FF0X:0:0:0:0:0:0:101 Network Time Protocol (NTP) [RFC1119,DLM1]
FF0X:0:0:0:0:0:0:102 SGI-Dogfight [AXC]
FF0X:0:0:0:0:0:0:103 Rwhod [SXD]
FF0X:0:0:0:0:0:0:104 VNP [DRC3]
FF0X:0:0:0:0:0:0:105 Artificial Horizons - Aviator [BXF]
FF0X:0:0:0:0:0:0:106 NSS - Name Service Server [BXS2]
FF0X:0:0:0:0:0:0:107 AUDIONEWS - Audio News Multicast [MXF2]
FF0X:0:0:0:0:0:0:108 SUN NIS+ Information Service [CXM3]
FF0X:0:0:0:0:0:0:109 MTP Multicast Transport Protocol [SXA]
FF0X:0:0:0:0:0:0:10A IETF-1-LOW-AUDIO [SC3]
FF0X:0:0:0:0:0:0:10B IETF-1-AUDIO [SC3]
FF0X:0:0:0:0:0:0:10C IETF-1-VIDEO [SC3]
FF0X:0:0:0:0:0:0:10D IETF-2-LOW-AUDIO [SC3]
FF0X:0:0:0:0:0:0:10E IETF-2-AUDIO [SC3]
FF0X:0:0:0:0:0:0:10F IETF-2-VIDEO [SC3]

FF0X:0:0:0:0:0:0:110 MUSIC-SERVICE [Guido van Rossum]
FF0X:0:0:0:0:0:0:111 SEANET-TELEMETRY [Andrew Maffei]
FF0X:0:0:0:0:0:0:112 SEANET-IMAGE [Andrew Maffei]
FF0X:0:0:0:0:0:0:113 MLOADD [Braden]
FF0X:0:0:0:0:0:0:114 any private experiment [JBP]
FF0X:0:0:0:0:0:0:115 DVMRP on MOSPF [Moy]
FF0X:0:0:0:0:0:0:116 SVRLOC [Guttman]
FF0X:0:0:0:0:0:0:117 XINGTV
FF0X:0:0:0:0:0:0:118 microsoft-ds
FF0X:0:0:0:0:0:0:119 nbc-pro
FF0X:0:0:0:0:0:0:11A nbc-pfn
FF0X:0:0:0:0:0:0:11B lmsc-calren-1 [Uang]
FF0X:0:0:0:0:0:0:11C lmsc-calren-2 [Uang]
FF0X:0:0:0:0:0:0:11D lmsc-calren-3 [Uang]
FF0X:0:0:0:0:0:0:11E lmsc-calren-4 [Uang]
FF0X:0:0:0:0:0:0:11F ampr-info [Janssen]

FF0X:0:0:0:0:0:0:120 mtrace [Casner]
FF0X:0:0:0:0:0:0:121 RSVP-encap-1 [Braden]
FF0X:0:0:0:0:0:0:122 RSVP-encap-2 [Braden]
FF0X:0:0:0:0:0:0:123 SVRLOC-DA [Guttman]
FF0X:0:0:0:0:0:0:124 rln-server [Kean]
FF0X:0:0:0:0:0:0:125 proshare-mc [Lewis]
FF0X:0:0:0:0:0:0:126 dantz [Yackle]
FF0X:0:0:0:0:0:0:127 cisco-rp-announce [Farinacci]
FF0X:0:0:0:0:0:0:128 cisco-rp-discovery [Farinacci]
FF0X:0:0:0:0:0:0:129 gatekeeper [Toga]
FF0X:0:0:0:0:0:0:12A iberiagames [Marocho]
FF0X:0:0:0:0:0:0:12B X Display [McKernan]
FF0X:0:0:0:0:0:0:12C oap-multicast [Eastham]
FF0X:0:0:0:0:0:0:12D DvbServDisc [Willigen]
FF0X:0:0:0:0:0:0:12E Ricoh-device-ctrl [Ohhira]
FF0X:0:0:0:0:0:0:12F Ricoh-device-ctrl [Ohhira]

FF0X:0:0:0:0:0:0:130 UPnP [UPnP] 21 September 2006
FF0X:0:0:0:0:0:0:131 Systech Mcast [Jakubiec] 21 September 2006
FF0X:0:0:0:0:0:0:132 omasg [Lipford] 21 September 2006

FF0X:0:0:0:0:0:0:181 PTP-primary [IEEE1588, K.Lee] 02 February 2007
FF0X:0:0:0:0:0:0:182 PTP-alternate1 [IEEE1588, K.Lee] 02 February 2007
FF0X:0:0:0:0:0:0:183 PTP-alternate2 [IEEE1588, K.Lee] 02 February 2007
FF0X:0:0:0:0:0:0:184 PTP-alternate3 [IEEE1588, K.Lee] 02 February 2007


FF0X:0:0:0:0:0:0:201 "rwho" Group (BSD) (unofficial) [JBP]
FF0X:0:0:0:0:0:0:202 SUN RPC PMAPPROC_CALLIT [BXE1]

FF0X:0:0:0:0:0:0:300 Mbus/Ipv6 [RFC3259]

FF0X:0:0:0:0:0:2:0000
-FF0X:0:0:0:0:0:2:7FFD Multimedia Conference Calls [SC3]
FF0X:0:0:0:0:0:2:7FFE SAPv1 Announcements [SC3]
FF0X:0:0:0:0:0:2:7FFF SAPv0 Announcements (deprecated) [SC3]
FF0X:0:0:0:0:0:2:8000
-FF0X:0:0:0:0:0:2:FFFF SAP Dynamic Assignments [SC3]

FF3X:0::0-FF3X:0000:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF (FF3X:0000:/32) Source-Specific Multicast block
Registration Rules:
Addresses in FF3X:0000:/32 but not listed below are reserved for future SSM
address use, but are currently invalid.
FF3X::4000:1-FF3X::7FFF:FFFF - IETF consensus
FF3X::8000:0-FF3X::FFFF:FFFF - Dynamically allocated by hosts when needed [RFC4607].

Address/Range Description Reference
--------------------------- ---------------------------------- ---------
FF3X::0:0-FF3X::3FFF:FFFF Invalid addresses [RFC4607]
FF3X::4000:0 Reserved [RFC4607]
FF3X::4000:1-FF3X::7FFF:FFFF Reserved for IANA allocation [RFC4607]
FF3X::8000:0-FF3X::FFFF:FFFF Reserved for local host allocation [RFC4607]
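The SSM table can be expressed as a small classifier. This is an illustrative sketch only: the function name and the returned labels are made up here, while the range boundaries are taken from the table above.

```python
import ipaddress


def classify_ssm(text: str) -> str:
    """Classify an address in the FF3X:0000::/32 block using the table above."""
    packed = ipaddress.IPv6Address(text).packed
    if packed[0] != 0xFF or packed[1] >> 4 != 0x3 or packed[2:4] != b"\x00\x00":
        raise ValueError(f"{text} is not inside the SSM block")
    if any(packed[4:12]):
        # In the /32 but not of the form FF3X::<32-bit group ID>.
        return "reserved for future SSM use"
    group_id = int.from_bytes(packed[12:], "big")
    if group_id <= 0x3FFFFFFF:
        return "invalid"
    if group_id == 0x40000000:
        return "reserved"
    if group_id <= 0x7FFFFFFF:
        return "IANA allocation"
    return "local host allocation"
```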


References
----------

[RFC2462] Thomson, S., and T. Narten, "IPv6 Stateless Address
Autoconfiguration", RFC 2462, December 1998.

[RFC1045] Cheriton, D., "VMTP: Versatile Message Transaction Protocol
Specification", RFC 1045, February 1988.

[RFC1075] Waitzman, D., Partridge, C., and S. Deering, "Distance
Vector Multicast Routing Protocol", RFC 1075, November
1988.

[RFC1119] Mills, D., "Network Time Protocol (Version 2)
Specification and Implementation", STD 12, RFC 1119,
September 1989.

[RFC1190] Topolcic, C., Editor, "Experimental Internet Stream
Protocol, Version 2 (ST-II)", RFC 1190, October 1990.

[RFC2080] Malkin, G., and R. Minnear, "RIPng for IPv6", RFC 2080,
January 1997.

[RFC3111] Guttman, E., "Service Location Protocol Modifications for
IPv6", RFC 3111, May 2001.

[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

[RFC3259] J. Ott, C. Perkins, and D. Kutscher, "A Message Bus for
Local Coordination", RFC 3259, April 2002.

[RFC3315] R. Droms, J. Bound, B. Volz, T. Lemon, C. Perkins, and M. Carney,
"Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
RFC 3315, July 2003.

[RFC3810] R. Vida, L. Costa, Eds., "Multicast Listener Discovery Version 2
(MLDv2) for IPv6", RFC 3810, June 2004.

[RFC4286] B. Haberman and J. Martin, "Multicast Router Discovery",
RFC 4286, December 2005.

[RFC4291] Hinden, R., and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, February 2006.

[RFC4620] M. Crawford and B. Haberman, Ed. "IPv6 Node Information Queries",
RFC 4620, August 2006.

[RFC4607] H. Holbrook and B. Cain, "Source-Specific Multicast for IP",
RFC 4607, August 2006.

[RFC4795] B. Aboba, D. Thaler, L. Esibov, "Link-local Multicast Name
Resolution (LLMNR)", RFC 4795, January 2007.

[IEEE1588] http://ieee1588.nist.gov/


People
------

[Aboba] Bernard Aboba, May 2004.

[AXC] Andrew Cherenson

[Braden] Bob Braden, April 1996.

[Bob Brenner]

[Bressler] David J. Bressler, April 1996.

[Bound] Jim Bound

[BXE1] Brendan Eic

[BXF] Bruce Factor

[BXS2] Bill Schilit

[Casner] Steve Casner, January 1995.

[Cheshire] Stuart Cheshire, 05 October 2005.

[CXM3] Chuck McManis

[Tim Clark]

[DLM1] David Mills

[DRC3] Dave Cheriton

[DXS3] Daniel Steinber

[Eastham] Bryant Eastham, April 2005.

[Farinacci] Dino Farinacci

[GSM11] Gary S. Malkin

[Guttman] Erik Guttman, May 2001.

[Harrington] Dan Harrington, July 1996.

[IANA] IANA

[Jakubiec] Dan Jakubiec, 21 September 2006.

[Janssen] Rob Janssen, January 1995.

[JBP] Jon Postel

[JXM1] Jim Miner

[Kean] Brian Kean, August 1995.

[KS14]

[Lee] Choon Lee, April 1996.

[K.Lee] Kang Lee, 02 February 2007.

[Lewis] Mark Lewis, October 1995.

[Lipford] Mark Lipford, 21 September 2006.

[Malamud] Carl Malamud, January 1996.

[Andrew Maffei]

[Marocho] Jose Luis Marocho, <73374.313&compuserve.com>, July 1996.

[McKernan] John McKernan, May 2003.

[Moy] John Moy

[MXF2] Martin Forssen

[Ohhira] Kohki Ohhira, 20 June 2006.

[Perkins] Charlie Perkins

[Guido van Rossum]

[SC3] Steve Casner

[Simpson] Bill Simpson, November 1994.

[Joel Snyder]

[SXA] Susie Armstrong

[SXD] Steve Deering

[Tersteegen] Hanno Tersteegen, May 2004.

[Toga] Jim Toga, May 1996.

[Tynan] Dermot Tynan, August 1995.

[Uang] Yea Uang, November 1994.

[UPnP] UPnP Forum, April 2002, 17 August 2006, 21 September 2006.

[Vieth] Moritz Vieth, May 2004.

[Willigen] Bert van Willigen, 16 September 2005.

[Wood] Lloyd Wood, 30 August 2007.

[Yackle] Dotty Yackle, February 1996.

Oct 3, 2007

Cisco Nonstop Forwarding for BGP: Deployment & Troubleshooting


When the NSF-capable router performs a route processor switchover, the TCP connection to the Peer Router is cleared; a Peer Router that does not support BGP restart then clears all routes associated with the Restarting Router and no longer forwards packets to it. With BGP Graceful Restart, the Peer Router marks all routes to the Restarting Router as stale, but continues to use them for packet forwarding, based upon the knowledge that the Restarting Router will re-establish the BGP session shortly and that it maintains the capability to forward packets in the interim.

When the Restarting Router's newly active RP opens the new BGP session, it will again send the Graceful Restart capability (#64). However, this time, the restart bit in the Restart Flags portion of the capability exchange will be set. This notifies the Peer Routers that the restart of the BGP process on the Restarting Router caused the disconnect/reconnect.

While continuing to forward packets, the Peer Router refreshes the Restarting Router with any relevant BGP updates. The Peer Router indicates completion of this process by sending an End-of-RIB (EOR) marker. The EOR marker for IPv4 is a BGP update message of the minimum length, 23 bytes. The EOR does not contain any routes to be added or withdrawn. Essentially, it is an "empty" update, whose sole purpose is to indicate that all available routes have been sent. The EOR marker helps speed convergence, because it allows the router to begin best-path selection as quickly as possible, without waiting for the timer to expire.
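The 23-byte figure follows directly from the BGP message layout: a 19-byte header (16-byte marker, 2-byte length, 1-byte type) plus two 2-byte length fields that are both zero. The sketch below builds such a message; the function names are hypothetical, and the layout is that of a standard BGP UPDATE.

```python
import struct

BGP_TYPE_UPDATE = 2


def build_ipv4_eor() -> bytes:
    """Build an empty IPv4 UPDATE, which serves as the End-of-RIB marker."""
    marker = b"\xff" * 16
    # Withdrawn Routes Length = 0, Total Path Attribute Length = 0, no NLRI.
    body = struct.pack("!HH", 0, 0)
    length = len(marker) + 2 + 1 + len(body)
    return marker + struct.pack("!HB", length, BGP_TYPE_UPDATE) + body


def is_ipv4_eor(message: bytes) -> bool:
    """True when `message` is exactly the minimum-length empty UPDATE."""
    return message == build_ipv4_eor()


print(len(build_ipv4_eor()))  # 23
```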

Once the Restarting Router has received all available routes from each peer, it can conduct best-path selection, and send any updates to its Peer Routers. The Restarting Router will also use the EOR to indicate the completion of this process.

Figure 2 provides a graphical representation of this process.

Figure 2. BGP Graceful Restart Procedures



Consider the step-by-step protocol exchange to clarify the implementation of End of RIB (EOR) and Graceful Restart (GR). The goal is to restart a BGP session without the Restarting Router's peers redirecting traffic around the Restarting Router.

1. The BGP process on Router A (RTR_A) starts and establishes a peering relationship with Router B (RTR_B). It sends an OPEN message to Router B that includes the Graceful Restart Capability (Code 64) with an Address Family of IPv4 and a Subsequent Address Family of unicast. Because Router B also supports GR, it acknowledges by sending its own OPEN message, which contains GR=64 and AF=IPv4.

2. An RP switchover occurs and Router A's BGP process restarts on the newly active RP. Router A does not have a routing information base on this RP, and must reacquire it from its Peer Routers. Router A will continue to forward IP packets destined for (or through) Router B using the last updated FIB and CEF table.

3. When the Receiving router (Router B) detects that the TCP session between it and the Restarting Router is cleared, it immediately marks routes learned from the Restarting Router as STALE. Router B only marks routes learned from Router A as STALE; if B had other peers, the routes learned from those peers would remain in the UP state. Router B also initializes a Restart-timer for the Restarting Router. The default setting for this timer is 120 seconds. The Restart-timer is the amount of time that a Receiving router will wait for an OPEN message from the Restarting Router. A Receiving router will remove all STALE routes unless it receives an OPEN message from the Restarting Router within the specified Restart-time. Once Router B receives Router A's OPEN message, the Restart-timer is reset. During this time, Routers A and B continue to forward traffic using the last updated CEF table.

4. Router A's BGP process has initialized. It will now attempt to re-establish a BGP session with Router B. It first establishes a new TCP session, and then it sends an OPEN message (Restart State bit set, Restart Time = n, and Forwarding State = IPv4). By default, the Restart Time is 120 seconds, and it is configurable. When Router B receives this OPEN message, it resets its own Restart-timer and starts a Stale-path timer. The Stale-path timer defaults to 360 seconds and is also configurable.

5. Both routers successfully re-establish their session. At this point, if Router B recognizes that the Forwarding State in Router A's OPEN message is not set for IPv4, it immediately removes any STALE routes that it had learned from the Restarting Speaker and re-computes its routing database. (Normally, the Forwarding State will be set for IPv4.)

6. Router B will begin to send UPDATE messages to Router A. These messages contain IP prefix information, and Router A will process them accordingly. Until an EOR indication is received from all peers (or the bgp update-delay timer expires), Router A will not start the BGP Route Selection Process. A new routing information database is available after the Route Selection Process is finished and the CEF information is updated accordingly. Router A starts an update-delay timer and waits up to 120 seconds to receive EOR from all of its NSF-peers.

7. Once Router A has received EOR from all its peers, it will begin the BGP Route Selection Process. Once this process is complete, it will begin to send UPDATE messages, which contain prefix information, to router B. Router A concludes this process by sending an EOR indication to Router B so that B, in turn, can start its Route Selection Process. Once Router B receives an EOR from A, and it has completed its Route Selection Process, then any STALE entries in BGP will be refreshed with newer information or removed from the BGP RIB and FIB. Router B is now converged. While Router B waits for an EOR, it also monitors stalepath-time. If the timer expires, all STALE routes will be removed and "normal" BGP processes will be started.
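The OPEN-message fields mentioned in steps 1 and 4 (capability code 64, Restart State bit, Restart Time, and the per-address-family Forwarding State) can be sketched as bytes. The layout below is assumed from RFC 4724: 4 bits of restart flags, a 12-bit restart time, then AFI, SAFI, and a flags octet per address family. The function name is hypothetical.

```python
import struct

GR_CAPABILITY_CODE = 64
AFI_IPV4 = 1
SAFI_UNICAST = 1


def encode_gr_capability(restart_state: bool, restart_time: int,
                         forwarding_preserved: bool) -> bytes:
    """Encode a Graceful Restart capability for IPv4 unicast only."""
    if not 0 <= restart_time < 4096:
        raise ValueError("restart_time must fit in 12 bits")
    # Most significant bit of the 16-bit field is the Restart State (R) bit.
    flags_and_time = (0x8000 if restart_state else 0) | restart_time
    # Most significant bit of the per-family flags is the Forwarding State bit.
    af_flags = 0x80 if forwarding_preserved else 0
    value = struct.pack("!HHBB", flags_and_time, AFI_IPV4, SAFI_UNICAST, af_flags)
    return struct.pack("!BB", GR_CAPABILITY_CODE, len(value)) + value


# Step 1: initial OPEN, no restart in progress.
print(encode_gr_capability(False, 120, False).hex())
# Step 4: OPEN after the RP switchover, forwarding state preserved.
print(encode_gr_capability(True, 120, True).hex())
```

If the last octet came back without the Forwarding State bit set, the peer would discard its STALE routes immediately, as step 5 describes.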

4.0 Router Preparation and Network Configuration:
In order to ensure a successful migration to a Graceful Restart-capable router, there are a few important principles to consider.

The router must have compatible RPs installed. In addition, care should be taken when mixing RP types:

• Cisco 12000 Series Internet Router: GRP and GRP-B RPs can be used together. If using a PRP on this router, it must be paired with another PRP.

• Cisco 10000 Series Internet Router: two PRE-1s must be used. The original PRE for this router is not supported for purposes of Cisco NSF with SSO.

• Cisco 7500: RSP-2 and RSP-4 can be used in combination. RSP-8 and RSP-16 can also be used in combination. However, an RSP-8 or RSP-16 cannot be mixed with an RSP-2 or an RSP-4.

• For all RP types on all supported platforms, the active and standby RPs must have the same amount of memory.

A wide variety of line cards support Cisco NSF with SSO, but—for optimum performance of BGP Graceful Restart—every card in the router chassis should support Cisco SSO. For a list of supported line cards, please visit: http://www.cisco.com/en/US/products/sw/iosswrel/ps1829/1221748

Cisco SSO may not be supported on any line card not specifically listed in the aforementioned document. In this case, that specific line card will operate in RPR+ mode. At the time of the RP switchover, the dCEF table on the card will be cleared. This will cause Cisco NSF to destinations reachable through that card to fail.

Subsequent releases of Cisco IOS Software provide additional hardware support for Cisco SSO on specific line cards. Please check the release notes for later releases of Cisco IOS Software to determine if support for a particular line card may be available.

The referenced document also supplies detailed instructions on enabling SSO on the platforms that support Cisco NSF. Cisco SSO is an absolute requirement for enabling Cisco NSF; it will not work unless both are concurrently enabled.

On the Cisco 12000 Series Internet Router, there is a method to validate whether all line cards within a chassis are supported. Load a software image enabled with Cisco NSF with SSO and then issue the command "show redundancy mode-supported". Each card in the chassis will be listed, and indicate the highest level of system redundancy it supports (RPR, RPR-Plus, Cisco SSO).

To achieve the full benefit of Cisco NSF with SSO, all line cards should support Cisco SSO. Furthermore, depending on platform, Distributed Cisco Express Forwarding (dCEF) must be enabled for the line cards in order for NSF to work.

The correct software image must be loaded on the flash disks of both route processors. Currently, mixing software versions between the active and standby route processors is not supported, even if both software images support Cisco NSF with SSO.

The software boot image in bootflash should also be upgraded and should correspond to the software image being loaded on the RP.

BGP Graceful Restart is configured under the global "router bgp" configuration command. The most basic configuration is "bgp graceful-restart":

Router(config-router)# [no] bgp graceful-restart

Router(config-router)# [no] bgp graceful-restart restart-time n

Router(config-router)# [no] bgp update-delay n

Router(config-router)# [no] bgp graceful-restart stalepath-time n

The "bgp graceful-restart" command must be entered on the Cisco NSF-capable router, and also must be entered on any NSF-aware peer that will be participating in Graceful Restart. Graceful Restart is not enabled by default, and must be explicitly configured on both the Restarting Router and all Peer Routers.

The "bgp graceful-restart restart-time n" is the maximum amount of time that a peer will wait for a reconnection of the TCP session and a new BGP OPEN message following the detection of a failure on the Restarting Router. If the TCP and BGP sessions are not re-established before this timer expires, the BGP session is deemed a failure, and normal BGP recovery procedures take effect. The default value for restart-time is 120 seconds.

The "bgp update-delay n" command may be entered on the Cisco NSF-capable router. The update-delay specifies the time interval, starting after the first peer has reconnected, during which the restarting router expects to receive all BGP updates and the EOR marker from all of its configured peers. The default value of n is 120 seconds, and n is always measured in seconds. If the restarting router has a large number of peers, each with a large number of updates to be sent, this value may need to be increased from its default value.

The "bgp graceful-restart stalepath-time n" command may be entered on the NSF-aware peer(s) of the restarting router. This timer sets an upper limit on how long the peer will continue to use stale routes for forwarding after it has re-established the BGP session with the restarting router. The default value is 360 seconds. While this should allow an adequate amount of time to allow for complete convergence, on very large networks it may be necessary to increase this value.
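How restart-time and stalepath-time interact on the peer can be summarized in a few lines. This is a simplified model of the behavior described above, not Cisco's implementation: the class and function names are invented, all times are in seconds, and any EOR time passed in is assumed to be later than the OPEN time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PeerTimers:
    restart_time: int = 120    # bgp graceful-restart restart-time
    stalepath_time: int = 360  # bgp graceful-restart stalepath-time


def stale_routes_in_use(now: int, lost_at: int,
                        open_at: Optional[int] = None,
                        eor_at: Optional[int] = None,
                        timers: PeerTimers = PeerTimers()) -> bool:
    """True while the peer still forwards on STALE routes at time `now`."""
    if now < lost_at:
        return False  # session still up, nothing is stale yet
    restart_deadline = lost_at + timers.restart_time
    if open_at is None or open_at > restart_deadline:
        # No OPEN in time: stale routes are flushed when restart-time expires.
        return now < restart_deadline
    # OPEN arrived in time: the stale-path timer starts at that moment.
    stale_deadline = open_at + timers.stalepath_time
    if eor_at is not None:
        # EOR from the restarting router ends the stale period early.
        stale_deadline = min(stale_deadline, eor_at)
    return now < stale_deadline
```

With the defaults, a peer that never hears an OPEN stops using stale routes after 120 seconds, while one that hears an OPEN at 60 seconds keeps them until an EOR arrives or until 420 seconds at the latest.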

RFC 4364 - BGP/MPLS IP Virtual Private Networks (VPNs) (About Route Distinguisher)



4.1. The VPN-IPv4 Address Family

The BGP Multiprotocol Extensions [BGP-MP] allow BGP to carry routes
from multiple "address families". We introduce the notion of the
"VPN-IPv4 address family". A VPN-IPv4 address is a 12-byte quantity,
beginning with an 8-byte Route Distinguisher (RD) and ending with a
4-byte IPv4 address. If several VPNs use the same IPv4 address
prefix, the PEs translate these into unique VPN-IPv4 address
prefixes. This ensures that if the same address is used in several
different VPNs, it is possible for BGP to carry several completely
different routes to that address, one for each VPN.

Since VPN-IPv4 addresses and IPv4 addresses are different address
families, BGP never treats them as comparable addresses.

An RD is simply a number, and it does not contain any inherent
information; it does not identify the origin of the route or the set
of VPNs to which the route is to be distributed. The purpose of the
RD is solely to allow one to create distinct routes to a common IPv4
address prefix. Other means are used to determine where to
redistribute the route (see Section 4.3).

The RD can also be used to create multiple different routes to the
very same system. We have already discussed a situation in which the
route to a particular server should be different for intranet traffic
than for extranet traffic. This can be achieved by creating two
different VPN-IPv4 routes that have the same IPv4 part, but different
RDs. This allows BGP to install multiple different routes to the
same system, and allows policy to be used (see Section 4.3.5) to
decide which packets use which route.

The RDs are structured so that every Service Provider can administer
its own "numbering space" (i.e., can make its own assignments of
RDs), without conflicting with the RD assignments made by any other
Service Provider. An RD consists of three fields: a 2-byte type
field, an administrator field, and an assigned number field. The
value of the type field determines the lengths of the other two
fields, as well as the semantics of the administrator field. The
administrator field identifies an assigned number authority, and the
assigned number field contains a number that has been assigned, by
the identified authority, for a particular purpose. For example, one
could have an RD whose administrator field contains an Autonomous
System number (ASN), and whose (4-byte) number field contains a
number assigned by the SP to whom that ASN belongs (having been
assigned to that SP by the appropriate authority).

RDs are given this structure in order to ensure that an SP that
provides VPN backbone service can always create a unique RD when it
needs to do so. However, the structure is not meaningful to BGP;
when BGP compares two such address prefixes, it ignores the structure
entirely.

Rosen & Rekhter, Standards Track, RFC 4364, BGP/MPLS IP VPNs, February 2006

A PE needs to be configured such that routes that lead to a
particular CE become associated with a particular RD. The
configuration may cause all routes leading to the same CE to be
associated with the same RD, or it may cause different routes to be
associated with different RDs, even if they lead to the same CE.

4.2. Encoding of Route Distinguishers

As stated, a VPN-IPv4 address consists of an 8-byte Route
Distinguisher followed by a 4-byte IPv4 address. The RDs are encoded
as follows:

- Type Field: 2 bytes
- Value Field: 6 bytes

The interpretation of the Value field depends on the value of the
type field. At the present time, three values of the type field are
defined: 0, 1, and 2.

- Type 0: The Value field consists of two subfields:

* Administrator subfield: 2 bytes
* Assigned Number subfield: 4 bytes

The Administrator subfield must contain an Autonomous System
number. If this ASN is from the public ASN space, it must have
been assigned by the appropriate authority (use of ASN values
from the private ASN space is strongly discouraged). The
Assigned Number subfield contains a number from a numbering space
that is administered by the enterprise to which the ASN has been
assigned by an appropriate authority.

- Type 1: The Value field consists of two subfields:

* Administrator subfield: 4 bytes
* Assigned Number subfield: 2 bytes

The Administrator subfield must contain an IP address. If this
IP address is from the public IP address space, it must have been
assigned by an appropriate authority (use of addresses from the
private IP address space is strongly discouraged). The Assigned
Number subfield contains a number from a numbering space which is
administered by the enterprise to which the IP address has been
assigned.

- Type 2: The Value field consists of two subfields:

* Administrator subfield: 4 bytes
* Assigned Number subfield: 2 bytes

The Administrator subfield must contain a 4-byte Autonomous
System number [BGP-AS4]. If this ASN is from the public ASN
space, it must have been assigned by the appropriate authority
(use of ASN values from the private ASN space is strongly
discouraged). The Assigned Number subfield contains a number
from a numbering space which is administered by the enterprise to
which the ASN has been assigned by an appropriate authority.
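
The three encodings above can be sketched in a few lines of Python. This is only an illustration and not part of RFC 4364: the function names are hypothetical, and the sample values in the usage note (ASNs 64496 and 65536, address 192.0.2.1) are documentation-range placeholders chosen for the example.

```python
import ipaddress
import struct


def encode_rd_type0(asn: int, assigned: int) -> bytes:
    """Type 0: 2-byte type, 2-byte ASN administrator, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, assigned)


def encode_rd_type1(ip: str, assigned: int) -> bytes:
    """Type 1: 2-byte type, 4-byte IPv4 administrator, 2-byte assigned number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(ip).packed, assigned)


def encode_rd_type2(asn4: int, assigned: int) -> bytes:
    """Type 2: 2-byte type, 4-byte ASN administrator, 2-byte assigned number."""
    return struct.pack("!HIH", 2, asn4, assigned)


def rd_type(rd: bytes) -> int:
    """Read the 2-byte Type field back out of an encoded RD."""
    return struct.unpack("!H", rd[:2])[0]


def vpn_ipv4_address(rd: bytes, ipv4: str) -> bytes:
    """Build a 12-byte VPN-IPv4 address: 8-byte RD followed by a 4-byte IPv4 address."""
    if len(rd) != 8:
        raise ValueError("a Route Distinguisher is exactly 8 bytes")
    return rd + ipaddress.IPv4Address(ipv4).packed
```

For example, encode_rd_type0(64496, 100) and encode_rd_type0(64496, 200) can both be combined with the same IPv4 address 10.1.1.0 through vpn_ipv4_address, and the two results differ. This is the point of the RD: identical IPv4 prefixes in different VPNs become distinct VPN-IPv4 addresses. struct.pack raises struct.error if a value does not fit its subfield.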

A Brief Overview of SONET Technology

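I have recently started preparing, step by step, for the SP (Service Provider) CCIE Written exam, so I will post my study directions and reference material here for everyone. The big headache in preparing for the CCIE Written is not knowing which books to read, because the exam scope is practically unbounded (only slightly better than before the CCIE was split into tracks). All one can do is pick likely topics, pull out the key points, and broaden one's knowledge along the way: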

SONET Basics
SONET defines optical signals and a synchronous frame structure for multiplexed digital traffic. It is a set of standards that define the rates and formats for optical networks specified in ANSI T1.105, ANSI T1.106, and ANSI T1.117.

A similar standard, Synchronous Digital Hierarchy (SDH), is defined by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and is used in Europe. SONET equipment is generally used in North America, and SDH equipment is generally accepted everywhere else in the world.

Both SONET and SDH are based on a structure that has a basic frame format and speed. The frame format used by SONET is the Synchronous Transport Signal (STS), with STS-1 as the base-level signal at 51.84 Mbps. An STS-1 frame can be carried in an OC-1 signal. The frame format used by SDH is the Synchronous Transport Module (STM), with STM-1 as the base-level signal at 155.52 Mbps. An STM-1 frame can be carried in an OC-3 signal.

Both SONET and SDH have a hierarchy of signaling speeds. Multiple lower-level signals can be multiplexed to form higher-level signals. For example, three STS-1 signals can be multiplexed together to form an STS-3 signal, and four STM-1 signals multiplexed together to form an STM-4 signal.
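
The multiplexing arithmetic can be checked with a short Python sketch. The function names are hypothetical, and rates are kept as integer bits per second to avoid floating-point rounding. The sketch assumes only what the text states: STS-1 runs at 51.84 Mbps, and STM-1 runs at the STS-3 rate of 155.52 Mbps.

```python
STS1_BPS = 51_840_000  # STS-1 base rate: 51.84 Mbps


def sts_rate_bps(n: int) -> int:
    """Line rate of an STS-n signal: n multiplexed STS-1 signals."""
    return n * STS1_BPS


def stm_rate_bps(m: int) -> int:
    """Line rate of an STM-m signal; STM-1 runs at the STS-3 rate."""
    return sts_rate_bps(3 * m)


for label, rate in [
    ("STS-1", sts_rate_bps(1)),
    ("STS-3 / STM-1", stm_rate_bps(1)),
    ("STS-12 / STM-4", stm_rate_bps(4)),
]:
    print(f"{label}: {rate / 1_000_000:.2f} Mbps")
```

Four STM-1 signals multiplexed into an STM-4 therefore run at the same rate as twelve STS-1 signals, 622.08 Mbps.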

SONET and SDH are technically comparable standards. The term SONET is often used to refer to either.

SONET Transport Hierarchy
Each level of the hierarchy terminates its corresponding fields in the SONET payload, as follows:

Section
A section is a single fiber run that can be terminated by a network element (Line or Path) or an optical regenerator.

The main function of the section layer is to properly format the SONET frames, and to convert the electrical signals to optical signals. Section Terminating Equipment (STE) can originate, access, modify, or terminate the section header overhead. (A standard STS-1 frame is nine rows by 90 bytes. The first three bytes of each row comprise the Section and Line header overhead.)

Line
Line-Terminating Equipment (LTE) originates or terminates one or more sections of a line signal. The LTE does the synchronization and multiplexing of information on SONET frames. Multiple lower-level SONET signals can be mixed together to form higher-level SONET signals. An Add/Drop Multiplexer (ADM) is an example of LTE.

Path
Path-Terminating Equipment (PTE) interfaces non-SONET equipment to the SONET network. At this layer, the payload is mapped and demapped into the SONET frame. For example, an STS PTE can assemble 28 DS1 signals of 1.544 Mbps each and insert path overhead to form an STS-1 signal.

This layer is concerned with end-to-end transport of data.

Configuration Example
The optical interface layers have a hierarchical relationship; each layer builds on the services provided by the next lower layer. Each layer communicates to peer equipment in the same layer and processes information, and passes it up or down to the next layer. As an example, consider two network nodes that are to exchange DS1 signals, as shown in this figure:



At the source node, the path layer (PTE) maps 28 DS1 signals and path overhead to form an STS-1 Synchronous Payload Envelope (SPE) and hands this to the line layer.

The line layer (LTE) multiplexes STS-1 SPE signals and adds line overhead. This combined signal is then passed to the section layer.

The section layer (STE) performs framing and scrambling and adds section overhead to form an STS-n signal.

Finally, the electrical STS signal is converted to an optical signal for the photonic layer and transmitted over the fiber to the distant node.

Across the SONET network, the signal is regenerated in optical regenerators (STE-level devices), passed through an ADM (an LTE-level device), and eventually terminated at a node (at the PTE level).

At the distant node, the process is reversed from the photonic layer to the path layer where the DS1 signals terminate.

SONET Framing
A standard STS-1 frame is nine rows by 90 bytes. The first three bytes of each row represent the Section and Line overhead. These overhead bits comprise framing bits and pointers to different parts of the SONET frame.

There is one column of bytes in the payload that represents the STS path overhead. This column frequently "floats" throughout the frame. Its location in the frame is determined by a pointer in the Section and Line overhead.

The combination of the Section and Line overhead comprises the transport overhead, and the remainder is the SPE.

For STS-1, a single SONET frame is transmitted in 125 microseconds, or 8000 frames per second. 8000 frames/s * 810 bytes/frame * 8 bits/byte = 51.84 Mbps, of which the payload is roughly 49.5 Mbps, enough to encapsulate 28 DS-1s, a full DS-3, or 21 CEPT-1s.
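
The STS-1 figures can be reproduced from the frame geometry. The sketch below is a rough sanity check, not a description of how tributaries are actually mapped. The constant and function names are hypothetical. The DS-3 (44.736 Mbps) and CEPT-1 (2.048 Mbps) rates are the standard nominal values and do not come from this text, and the comparison ignores the extra overhead that real tributary mappings add.

```python
ROWS = 9                        # rows per STS-1 frame
COLUMNS = 90                    # byte columns per row
FRAMES_PER_SECOND = 8000        # one frame every 125 microseconds
TRANSPORT_OVERHEAD_COLUMNS = 3  # Section and Line overhead
PATH_OVERHEAD_COLUMNS = 1       # STS path overhead inside the SPE


def column_rate_bps(columns: int) -> int:
    """Bit rate carried by the given number of byte columns."""
    return columns * ROWS * 8 * FRAMES_PER_SECOND


line_rate = column_rate_bps(COLUMNS)
spe_rate = column_rate_bps(COLUMNS - TRANSPORT_OVERHEAD_COLUMNS)
payload_rate = column_rate_bps(
    COLUMNS - TRANSPORT_OVERHEAD_COLUMNS - PATH_OVERHEAD_COLUMNS
)

# Nominal tributary rates in bits per second (assumed standard values).
tributaries = {
    "28 x DS-1": 28 * 1_544_000,
    "1 x DS-3": 44_736_000,
    "21 x CEPT-1": 21 * 2_048_000,
}

print(f"line rate: {line_rate / 1e6:.3f} Mbps")
print(f"SPE rate: {spe_rate / 1e6:.3f} Mbps")
print(f"payload rate: {payload_rate / 1e6:.3f} Mbps")
for name, rate in tributaries.items():
    print(f"{name}: {rate / 1e6:.3f} Mbps, fits: {rate <= payload_rate}")
```

The payload rate comes out at 49.536 Mbps, which matches the "roughly 49.5 Mbps" quoted above, and each of the three tributary combinations fits within it.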

An STS-3 is very similar to STS-3c. The frame is nine rows by 270 bytes. The first nine columns contain the transport overhead section, and the rest is SPE. For both STS-3 and STS-3c, the transport overhead (Line and Section) is the same.

For an STS-3 frame, the SPE contains three separate payloads and three separate path overhead fields. In essence, it is the SPE of three separate STS-1s packed together, one after another.

In STS-3c, there is only one path overhead field for the entire SPE. The SPE for an STS-3c is a much larger version of a single STS-1 SPE.

STM-1 is the SDH (non-North American) equivalent of a SONET (North American) STS-3 frame (STS-3c to be exact). For STM-1, a single SDH frame is also transmitted in 125 microseconds, but the frame is 270 bytes long by nine rows wide, or 155.52 Mbps, with a nine-byte header for each row. The nine-byte header contains the Multiplexer and Regenerator overhead. This is nearly identical to the STS-3c Line and Section overhead. In fact, this is where the SDH and SONET standards differ.

SDH and SONET are not directly compatible, but only differ in a few overhead bytes. It is very unlikely that Cisco will ever use a framer that does not support both.

SONET is very widely deployed in telco space, and is frequently used in a ring configuration. Devices such as ADMs sit on the ring and behave as LTE-layer devices; these devices strip off individual channels and pass them along to the PTE layer.

All current Cisco line cards and Port Adapters (PAs) act as PTE-layer devices; these devices terminate the full SONET session and L2 encapsulation. They are Packet Over SONET (POS) cards, a name that indicates serial transmission of data over SONET frames. Two RFCs describe the POS process: RFC 1619, PPP over SONET/SDH, and RFC 1662, PPP in HDLC-like Framing.

These Cisco products cannot sit directly on a SONET or SDH ring. One of them must hang off of some LTE-layer device, such as an ADM. Equipment such as an Integrated SONET Router (ISR) has both PTE and LTE functionality, so it can terminate and pass through data.

Configuration Issues
These parameters affect configuration of SONET devices:

Clocking: The default clock source is line, which is used whenever clocking is derived from the network. The clock source internal command is typically used when two Cisco 12000 Series Internet Routers are connected back-to-back, or are connected over dark fiber where no clocking is available. In either case, each device must have its clock source set to internal. For a more detailed explanation, refer to Configuring Clock Settings on POS Router Interfaces.

Loopback: Loopback can be set to line or internal (DTE). When configured on the controller, it is a SONET section loopback. When configured on an individual interface, it is a path loopback for that interface.

Framing: Most Cisco framers support both SONET and SDH.

Payload scrambling: This value is normally set to On.

S1S0 flag: This value must be between 0 and 3; the default value is 0. With SONET, s1s0 must be set to 0, and with SDH it must be set to 2. Value 3 corresponds to the received Alarm Indication Signal (AIS).

J0 flag (0-255): This setting is the section trace identifier. It is only required for section tracing.

C2 flag (0-255): This setting specifies the STS path signal label (5 to 7 are configured with the pos flag command).

Alarm reporting: Alarm reporting allows you to specify which alarms are reported. The permitted values are b1-tca, b2-tca, sf-ber, sd-ber, los, lof, ais-l, and rdi-l. (This value is configured with the pos report command.)

Alarm thresholds: The alarm thresholds setting specifies the Bit Error Rate (BER) thresholds that signal an alarm. (This value is configured with the pos threshold command.)

Sep 30, 2007

Cisco (CSCO) today 100 times bigger than 3Com (COMS) -- it wasn't in 1994

This morning 3Com (NASDAQ: COMS) announced that private equity firm, Bain Capital, would put it out of its misery and pay $2.2 billion in cash for the company. 3Com has lagged so far behind that it has been painful to watch. 3Com and Cisco Systems (NASDAQ: CSCO) indeed could provide at least two to three chapters in an investing teaching and history book. Here's the CliffsNotes version:

Summer of 1994 was a tough technology environment. Technology had a great run from 1990 through 1994, till summer that is. Valuations contracted and investor fatigue set in for about four to five months. I was traveling through Silicon Valley with a couple of British portfolio managers visiting companies. One day we had a breakfast meeting with then-CEO Eric Benhamou of 3Com and lunch with a senior VP at Cisco (whose name escapes me). Benhamou was an intellectual, a refined man, but did not possess the street smarts necessary for a tech company CEO. He was arrogant and bluntly declared that Cisco's days were numbered and 3Com would acquire any tech company necessary to achieve total domination. OK, great, and we went on to Cisco for lunch.

The senior VP was a classy guy, never said a bad word about any competitor and just explained Cisco's game plan and execution philosophy. Here is the funny part: In July 1994, BOTH companies had a market capitalization of $9 billion.

3Com went on to make some stupid acquisitions like US Robotics, paying top dollar for a company in serious decline with evaporating margins. 3Com has never been the same since. Eric Benhamou went on to pursue "other interests" and 3Com has languished at the bottom of the tech food chain.


Cisco went on to make over 120 acquisitions, most very strategic and successfully integrated. Cisco at its peak topped $450 billion in market capitalization. That number was frothy and unrealistic in 2000-2001. Today, Cisco's market value is a more earthly $202 billion, nearly 100 times the value of what Bain Capital is paying for 3Com.

Cisco is the clear winner in the networking world: game, set and match.

In real estate the expression is location, location, location. In evaluating stocks, the expression is management, management, management...