Jul 2, 2014

BGP Routing Issues Case Study 2 - Unreachable to the learned BGP route

If you have only one BGP router with multihomed ISP uplinks, it's much easier to maintain the BGP routing table because it is entirely under centralized control.

But as the network continues to expand and redundancy becomes a concern, you will add more BGP speakers, and you must enable IBGP between these BGP routers.





Jun 23, 2014

BGP Routing Issues Case Study 1 - BGP configuration without filter

I have been learning BGP since 1998. Of course, like many other people, I made some human errors because I did not fully understand the BGP protocol: I would just copy and paste a sample configuration from the Cisco website, modify it, and apply it to the production BGP router. However, it is dangerous to implement something on a production network when you only know part of it.

This is the reason I want to start sharing my knowledge and experience of the BGP protocol. Maybe it can help some people avoid causing ridiculous BGP incidents on the Internet (e.g., advertising private IP addresses or a default route to the Internet).

May 30, 2014

[POC] Cisco vs Juniper running OSPF w/o Backbone Area 0

As everyone knows, OSPFv2 is a standard routing protocol (http://www.ietf.org/rfc/rfc2328.txt), but not all vendors' devices implement it in exactly the same way. Especially when the network scenario does not follow the standard design, different vendors' devices may behave differently.

In order to compare the behavior of Cisco and Juniper, I designed a special OSPF topology, shown below, so we can see that Cisco and Juniper produce different route-exchange results.

May 24, 2014

Learning JUNOS from IOS - Day3 (View/Modify Configuration)

A bird in the hand is worth two in the bush

Day 3 - How to view or modify JUNOS configuration ?


Entering Configuration Mode

When you stand behind an engineer, you can easily tell whether the engineer is a Cisco or a Juniper person.

Most Cisco engineers like to use the command 'conf t' to enter the configuration mode of a router or switch.

router> enable
Password:
router# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
router(config)#


When you want to show any results, you don't need to exit to privileged EXEC mode (#). You can use the 'do' command to check the status.

router(config)# do sh ip int brief
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       10.17.14.195    YES manual up                    up      
GigabitEthernet2       unassigned      YES unset  administratively down down    
GigabitEthernet0       unassigned      YES manual up                    up      
Loopback0              5.5.5.5         YES manual up                    up     

Mar 28, 2014

[POC] Junos script Operations Automation (op script) - show-bgp-policy

Junos Script Automation is a powerful and flexible on-box toolset which provides customization of network behavior, adaptation to what your application expects to configure, manage and diagnose if and when needed. It sits right above the Junos OS, with a northbound interface to Junos Space applications, and southbound access to Junos SDK applications and native management plane instrumentation. This customized programmable solution makes your application smarter and better in real-time.

Juniper's official website provides many sample scripts that cover common requests. In my company, we have deployed many EBGP/IBGP interconnections between routers and layer 3 switches. So I picked one op script from the JUNOS Script Library - show bgp policy: display all routing policies in sequential order for a selected BGP peer.

Mar 25, 2014

Learning JUNOS from IOS - Day2 (Configuration Management)

Configuration Management

Day 2 - How to review router configuration ?

Cisco IOS has two default configuration files:
(1) startup-config: used to initialize the router during the boot process
(2) running-config: the live configuration, which changes as soon as you type any configuration command in IOS.



And how do you tell whether the screen output is the startup-config or the running-config?



Mar 23, 2014

Learning JUNOS from IOS - Day1 (Show Interface)

Once a use, forever a custom

My first experience of installing a Cisco router was in 1997, when I was a junior network engineer in a small company. I remember that I finished installing a customer router on-site in only 15 minutes and left, then had to go back 2 hours later to configure the router via the console again, because I had forgotten to configure a password under line vty. (I told myself I would never make such a stupid mistake again - Password required, but none set)

13 years later, in 2010, I started to learn JUNOS. Because I was so familiar with Cisco IOS, I know how it feels to change your habits from IOS to JUNOS. The hierarchical structure is not so easy to read when you see it for the first time (especially when you have no programming experience).

However, having experience with Cisco IOS is not a bad thing before you start to learn JUNOS. I believe that if you can leverage your previous IOS command knowledge and map it to the corresponding JUNOS statements, it will help you learn JUNOS quickly. This is the reason why I want to write this series of articles to share my personal learning experience and tips with you.


The comparison below comes from a great blog article, "Cisco IOS vs. Juniper JUNOS: The technical differences", and can help you understand the differences between these NOSes (Network Operating Systems):

IOS traditionally is a monolithic operating system, which means it runs as a single operation and all processes share the same memory space. Because of the latter feature, bugs in one operation can have an impact on or corrupt other processes. In addition, if a user wishes to add features or functions to the operating system, IOS has to be deactivated while a completely new version with the desired features is loaded.
JUNOS, on the other hand, was constructed as a modular operating system. The kernel is based on the open source FreeBSD operating system, and processes that run as modules on top of the kernel are segregated in exclusive, protected, memory space. Users thus can add features and functions to the version of JUNOS running on their systems without disabling the entire operating system — a characteristic known as in-service software upgrades that also enhances uptime and availability. 
The goal of Cisco's new IOS variants (IOS XR, IOS XE and NX-OS) is to overcome the monolithic limitations of the traditional IOS while addressing critical needs for increased uptime and availability in the service-provider core and edge, and enterprise data center, respectively. All these operating systems are modular, in that IOS services run as modules on top of a Linux-based kernel (in IOS XE and NX-OS), or as a third-party Portable-Operating-System-Interface (POSIX)-based real-time kernel (in IOS XR).


Mar 19, 2014

[POC] Juniper SRX IPSec tunnel (Aggressive mode) SOP configuration

To prepare for the future migration from Juniper SSG to SRX, I tried the SRX GUI to see how easy it would be for the operations team to maintain.

This is the first time I have tried to use a GUI to manage a router. If you are not familiar with Juniper SRX features and functions, I have to say the web interface is a quick way to get an overview of the Juniper SRX.

For many junior engineers, a what-you-see-is-what-you-get interface helps them accept new technology quickly; otherwise they might refuse to try or learn new technology unless there is time pressure or a direct instruction from senior managers.

We still use the CLI to control most routing and switching devices today, but I believe that may change someday if network virtualization comes true. (I think no one would like to control a firewall by CLI, would they?)

Mar 14, 2014

How to use SecureCRT to access your AWS EC2 instance ?

The cloud era is coming, so it's time to learn the things you are not familiar with.

Amazon Web Services, aka AWS, is needless to say the No.1 cloud service you should get to know right now.


Mar 11, 2014

Setup Openstack in a VM w/ Devstack Step-by-Step

Learning OpenStack is not an easy task for me, because I don't have much Linux knowledge. While setting up OpenStack by following the official installation guide from Openstack.org, I spent more than 3 hours installing the necessary modules and modifying the configuration files one by one.
But I failed, and I could not figure out what the problem was... maybe I should spend more time understanding each action and verifying it one by one.

But I don't have that much time to spend on the installation procedure; I need to get familiar with OpenStack as soon as possible to test its features.
So I tried to leverage the Devstack all-in-one install script to help me learn what OpenStack is and see how it works.

However, if you are installing OpenStack for the first time, it is still not as simple as Devstack.org says:

Mar 7, 2014

JUNOS CoS processing building block with related CLI commands

Learning the Juniper CLI is a bit of a challenge for junior network engineers or Cisco IOS engineers because of the modular and hierarchical design of JUNOS.
Some features need several commands configured under different hierarchy levels, which are then combined at another hierarchy level.
This kind of CLI design is especially hard to learn when applying CoS on a Juniper device. (I believe many Cisco IOS engineers don't want to switch to JUNOS because of this...)


The figure above shows my understanding of the related JUNOS commands that are used in our production network.

Mar 6, 2014

[POC] Use Juniper Firefly Perimeter to support RTBH BGP scale with 120 BGP Peers

Juniper FIREFLY-PERIMETER is an ideal virtual router candidate for the RTBH router role, because that role only needs control plane capacity and memory (it is not limited by hardware) to exchange BGP routes with communities. Not much data forwarding plane packet processing is needed.

So I rebuilt the lab with Juniper Firefly, using the topology below, to see the difference from physical routers.



In my VMware Workstation lab, I assigned two interfaces to each Firefly: ge-0/0/0 was used for BGP connections and ge-0/0/1 was used for SSH only (to make config copy/paste easier).

The most obvious advantage of Firefly is the response time of the commit action: when I committed the initial clean-up configuration, it finished almost immediately after I pressed the Enter key. It's great!
...But after I copied and pasted all my configurations into it, the commit response time became longer again.

[edit]
lab@FIREFLY-PERIMETER-1# run show chassis hardware
Hardware inventory:
Item             Version  Part number  Serial number     Description
Chassis                                22cbfad3dcef      FIREFLY-PERIMETER
Midplane       
System IO      
Routing Engine                                           FIREFLY-PERIMETER RE
FPC 0                                                    Virtual FPC
 PIC 0                                                  Virtual GE
Power Supply 0

[edit]
lab@FIREFLY-PERIMETER-1# run show chassis forwarding
FWDD status:
  State                                 Online   
  Microkernel CPU utilization        28 percent
  Real-time threads CPU utilization   0 percent
  Heap utilization                   21 percent
  Buffer utilization                  3 percent
  Uptime:                               15 hours, 10 minutes, 32 seconds
 

I think Firefly is a great candidate for this kind of role (BGP Route Reflector): not much forwarding traffic passes through it, so you don't need to worry about forwarding performance.
It only handles BGP signaling and route maintenance, so it can always keep CPU load low.

lab@FIREFLY-PERIMETER-2# run show bgp summary | match 0/0/0/0 | count
Count: 120 lines
lab@FIREFLY-PERIMETER-1# run show chassis routing-engine
Routing Engine status:
    Total memory              2048 MB Max   655 MB used ( 32 percent)
      Control plane memory    1150 MB Max   460 MB used ( 40 percent)
      Data plane memory        898 MB Max   189 MB used ( 21 percent)
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      99 percent
    Model                          FIREFLY-PERIMETER RE
    Start time                     2014-03-05 18:49:02 UTC
    Uptime                         15 hours, 11 minutes, 42 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.00       0.00       0.00

So I tried to enable an additional BGP feature, BFD (Bidirectional Forwarding Detection), on all 120 BGP sessions to test the impact on CPU load:

[edit]
lab@FIREFLY-PERIMETER-1# run show bfd session
                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
1.1.1.2                  Up        ge-0/0/0.1     3.000     1.000        3  
2.2.2.2                  Up        ge-0/0/0.2     3.000     1.000        3  
3.3.3.2                  Up        ge-0/0/0.3     3.000     1.000        3  
...
119.119.119.2            Up        ge-0/0/0.119   3.000     1.000        3  
120.120.120.2            Up        ge-0/0/0.120   3.000     1.000        3  

120 sessions, 120 clients
Cumulative transmit rate 120.0 pps, cumulative receive rate 120.0 pps
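
The numbers in the BFD output above can be cross-checked with simple arithmetic. Here is a minimal Python sketch of that check; the function names are my own for illustration, and it assumes the usual relationship where detect time is the interval multiplied by the multiplier, and every session sends one packet per transmit interval.

```python
# Sketch only: sanity-check the BFD timers shown in the output above.
# The function names are hypothetical helpers, not part of any Junos tooling.

def bfd_detect_time(interval_s: float, multiplier: int) -> float:
    """Detect time = interval x multiplier (both taken from the CLI output)."""
    return interval_s * multiplier


def cumulative_pps(sessions: int, tx_interval_s: float) -> float:
    """Each session sends one BFD packet per transmit interval."""
    return sessions / tx_interval_s


if __name__ == "__main__":
    # Values copied from the output: 1.000 s interval, multiplier 3, 120 sessions.
    print("detect time (s):", bfd_detect_time(1.0, 3))
    print("cumulative transmit rate (pps):", cumulative_pps(120, 1.0))
```

With a 1.000 second interval and a multiplier of 3, this gives the 3.000 second detect time and the 120 pps cumulative rate that the router reports.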

Then the result surprised me... the CPU load (0%) was even lower than before ???
Cool!

[edit]
lab@FIREFLY-PERIMETER-1# run show chassis routing-engine   
Routing Engine status:
    Total memory              2048 MB Max   655 MB used ( 32 percent)
      Control plane memory    1150 MB Max   460 MB used ( 40 percent)
      Data plane memory        898 MB Max   198 MB used ( 22 percent)
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     0 percent
      Interrupt                  0 percent
      Idle                     100 percent
    Model                          FIREFLY-PERIMETER RE
    Start time                     2014-03-05 18:49:02 UTC
    Uptime                         15 hours, 31 minutes, 33 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.00       0.00       0.00

Compared with the previous Firefly version, the difference I found is that I no longer see an expiring license when I run show system license:
[edit]
lab@FIREFLY-PERIMETER-1# run show system license
License usage: none

Licenses installed: none


Maybe it's a gift from Juniper without an expiry date?
Try it and you will know!


Mar 5, 2014

[POC] Use Juniper SRX100H to support RTBH BGP scale with 120 BGP Peers

Since our company's current RTBH router (Cisco 1800) was EOL, and our security team would like to expand the RTBH scope to the SSL VPNs of all our offices around the world (more than 100), we are looking for a good candidate for this role.

We have a spare Juniper M10i, and I believe it can certainly meet the requirement, but it's too big, so our operations team wanted to use a lab device, the Juniper SRX100H, for this purpose. That's why I did this POC to prove the BGP scalability of the SRX100H.

Below are the Juniper SRX100H hardware features. It is such a small device, but it has 1 GB of RAM, so its control plane can do much more than I expected:
  • DDR Memory: 1 GB
  • Power supply adapter: 30 watts
  • AC input voltage: 100 to 240 VAC
  • Fast Ethernet ports: 8
  • Console port: 1
  • USB port: 1
  • LEDs: 4
  • NAND flash: 1 GB

My POC topology, shown below, is very simple and straightforward: I used a single cable to connect two SRX100Hs, then set up a trunk with 120 VLANs between them, with a directly connected EBGP session in each VLAN.
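
Typing 120 VLAN units and 120 neighbors by hand is no fun, so a small script can generate the set commands. The following Python sketch is only an illustration and not my actual lab configuration: the interface name fe-0/0/0, VLAN ID equal to the unit number, the /30 prefix length, and the group name RTBH-LAB are all assumptions. Only the n.n.n.1 peer addressing and peer AS 1 come from the show output in this post, and the local n.n.n.2 address is inferred from it. The local AS and the SRX security zone settings are left out.

```python
# Sketch only: generate Junos "set" commands for N VLAN units and EBGP neighbors.
# Assumptions (hypothetical, not from the lab config): interface name,
# VLAN ID == unit number, /30 prefix length, and the group name "RTBH-LAB".

def build_peer_config(count=120, ifd="fe-0/0/0", group="RTBH-LAB",
                      peer_as=1, prefix_len=30):
    """Return a list of set commands for `count` VLANs, one EBGP peer per VLAN."""
    lines = [
        f"set interfaces {ifd} vlan-tagging",
        f"set protocols bgp group {group} type external",
    ]
    for n in range(1, count + 1):
        local_ip = f"{n}.{n}.{n}.2"  # this side (inferred from the peer addresses)
        peer_ip = f"{n}.{n}.{n}.1"   # far side, as listed in show bgp summary
        lines.append(f"set interfaces {ifd} unit {n} vlan-id {n}")
        lines.append(
            f"set interfaces {ifd} unit {n} family inet address {local_ip}/{prefix_len}"
        )
        lines.append(
            f"set protocols bgp group {group} neighbor {peer_ip} peer-as {peer_as}"
        )
    return lines


if __name__ == "__main__":
    # Print a short preview with 3 peers instead of all 120.
    print("\n".join(build_peer_config(count=3)))
```

Review the generated lines before pasting them into configuration mode, and adjust the assumed values to match your own lab.
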
After all the configuration was done, all 120 BGP neighbors were up without issues:
lab@SRX100-2# run show bgp summary 
Groups: 1 Peers: 120 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0              2400         20          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
1.1.1.1                   1        215        216       0       1     3:23:49 20/20/20/0           0/0/0/0
2.2.2.1                   1        214        214       0       1     3:23:45 0/20/20/0            0/0/0/0
3.3.3.1                   1        213        214       0       1     3:23:41 0/20/20/0            0/0/0/0
...
118.118.118.1             1        213        214       0       1     3:23:39 0/20/20/0            0/0/0/0
119.119.119.1             1        213        214       0       1     3:23:35 0/20/20/0            0/0/0/0
120.120.120.1             1        213        214       0       1     3:23:31 0/20/20/0            0/0/0/0

lab@SRX100-2# run show bgp summary | match 0/0/0/0 | count
Count: 120 lines
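
If you save the show bgp summary output to a text file, the same count can be done off-box. This is a minimal Python sketch of the idea behind "| match 0/0/0/0 | count"; the function name is my own, the sample holds only three peer lines copied from the output above, and it uses a plain substring test where the Junos match filter takes a regular expression.

```python
# Sketch only: emulate "| match <text> | count" on saved CLI output.
# count_matching_lines is a hypothetical helper, not a Junos or library function.

SAMPLE = """\
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
1.1.1.1                   1        215        216       0       1     3:23:49 20/20/20/0           0/0/0/0
2.2.2.1                   1        214        214       0       1     3:23:45 0/20/20/0            0/0/0/0
3.3.3.1                   1        213        214       0       1     3:23:41 0/20/20/0            0/0/0/0
"""


def count_matching_lines(text: str, needle: str) -> int:
    """Count the lines of `text` that contain `needle` (simple substring match)."""
    return sum(1 for line in text.splitlines() if needle in line)


if __name__ == "__main__":
    print("Count:", count_matching_lines(SAMPLE, "0/0/0/0"), "lines")
```

A peer that is not established shows a state such as Active or Connect instead of the route counters, so it would not match and the count would drop below 120.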

And I configured 20 BGP network announcements per neighbor (120 x 20 = 2400 paths):


lab@SRX100-2# run show route protocol bgp | count
Count: 2400 lines

Then I checked the SRX CPU and memory usage, and it looks great!


lab@SRX100-2# run show chassis routing-engine
Routing Engine status:
    Temperature                 60 degrees C / 140 degrees F
    Total memory              1024 MB Max   461 MB used ( 45 percent)
      Control plane memory     560 MB Max   330 MB used ( 59 percent)
      Data plane memory        464 MB Max   135 MB used ( 29 percent)
    CPU utilization:
      User                       4 percent
      Background                 0 percent
      Kernel                     8 percent
      Interrupt                  0 percent
      Idle                      88 percent
    Model                          RE-SRX100H
    Serial ID                      AT1612AF0205
    Start time                     2014-03-05 09:40:12 UTC
    Uptime                         4 hours, 29 minutes, 8 seconds
    Last reboot reason             0x1:power cycle/failure 
    Load averages:                 1 minute   5 minute  15 minute
                                       0.11       0.13       0.07 

If you have a similar case and real-world resource limitations, maybe you can consider reusing your spare Juniper SRX for this kind of job :)
Good luck!