Examples of automated fault recovery
2.2.5.3 Examples of automated fault recovery
Consider the redundant router shown in Figure 2-2 and Figure 2-4 . Suppose that one of the user workstations shown in the diagram is communicating with one of the servers. So packets from the workstation are intended for the IP address of the server. But at Layer 2 the destination address in these packets is the Ethernet MAC address of the primary router. This router is the default gateway for the VLAN. So it receives all packets with destination addresses not on the subnet associated with this VLAN, and it forwards them to the appropriate destination VLAN. Figure 2-4. A simple LAN with redundancy—network-layer view Now, suppose that this primary routers power supply has failed. Smoke is billowing out of the back, and it can no longer send or receive anything. Its gone. Meanwhile, the secondary router has been chattering back and forth with the primary, asking it whether it is working. It has been responding dutifully that it feels fine and is able to continue forwarding packets. But as soon as the power supply failed, it stopped responding to these queries. After a couple of repeated queries, the secondary router decides that it must step in to save the day. It suddenly adopts the Ethernet MAC address and the IP address of the primary on all of the ports that they have in common. Chapter 3 will discuss the details of how these high-availability protocols work. The workstation has been trying to talk to the server while all of this is happening. It has sent packets, but it hasnt seen any responses. So it has resent them. Every one of these packets has a destination Ethernet MAC address pointing to the primary router and a destination IP address pointing to the server. For a few seconds while the secondary router confirmed that it was really appropriate to take over, these packets were simply lost. But most applications can handle the occasional lost packet without a problem. If they couldnt, then ordinary Ethernet collisions would be devastating. 24 As soon as the secondary router takes over, the workstation suddenly finds that everything is working again. It resends any lost packets, and the conversation picks up where it left off. To the users and the application, if the problem was noticed at all, it just looks like there was a brief slow-down. The same picture is happening on the server side of this router, which has been trying to send packets to the workstations IP address via the Ethernet MAC address of the router on its side. So, when the backup router took over, it had to adopt the primary routers addresses on all ports. When I pick up the discussion of these Layer 3 recovery mechanisms in Chapter 3 , I talk about how to ensure that all of the routers functions on all of its ports are protected. This is how I like my fault tolerance. As I show later in this chapter, every time a redundant system automatically and transparently takes over in case of a problem, it drastically improves the networks effective reliability. But if there arent automatic failover mechanisms, then it really just improves the effective repair time. There may be significant advantages to doing this, but it is fairly clear that it is better to build a network that almost never appears to fail than it is to build one that fails but is easy to fix. The first is definitely more reliable.2.2.5.4 Fault tolerance through load balancing
Parts
» Money Geography Business Requirements
» Installed Base Bandwidth Business Requirements
» Layer 1 Layer 2 The Seven Layers
» Layer 3 Layer 4 The Seven Layers
» Layer 5 Layer 6 Layer 7 The Seven Layers
» Routing Versus Bridging Networking Objectives
» Top-Down Design Philosophy Networking Objectives
» Failure Is a Reliability Issue
» Performance Is a Reliability Issue
» Guidelines for Implementing Redundancy
» Redundancy by Protocol Layer
» Multiple Simultaneous Failures Complexity and Manageability
» Always let network equipment perform network functions Intrinsic versus external automation
» Examples of automated fault recovery
» Fault tolerance through load balancing
» Avoid manual fault-recovery systems
» Isolating Single Points of Failure
» Multiple simultaneous failures Predicting Your Most Common Failures
» Combining MTBF values Predicting Your Most Common Failures
» Traffic Anomalies Failure Modes
» Software Problems Human Error
» Ring topology Basic Concepts
» Star topology Basic Concepts
» Mesh Topology Basic Concepts
» Spanning Tree eliminates loops Spanning Tree activates backup links and devices
» Protocol-Based VLAN Systems VLANs
» Why collapse a backbone? Backbone capacity
» Backbone redundancy Collapsed Backbone
» Trunk capacity Distributed Backbone
» Trunk fault tolerance Distributed Backbone
» Ancient history Switching Versus Routing
» One-armed routers and Layer 3 switches
» Filtering for security Filtering
» Filtering for application control
» Containing broadcasts Switching and Bridging Strategies
» Redundancy in bridged networks Filtering
» Trunk design VLAN-Based Topologies
» VLAN Distribution Areas VLAN-Based Topologies
» Sizing VLAN Distribution Areas
» Multiple Connections Implementing Reliability
» Routers in the Distribution Level Routers in Both the Core and Distribution Levels
» Connecting Remote Sites Large-Scale LAN Topologies
» General Comments on Large-Scale Topology
» Cost Efficiency Selecting Appropriate LAN Technology
» Installed Base Maintainability Selecting Appropriate LAN Technology
» Ethernet addresses Ethernet Framing Standards
» Collision Detection Ethernet and Fast Ethernet
» Transceivers Ethernet and Fast Ethernet
» FDDI Local Area Network Technologies
» Wireless Local Area Network Technologies
» Firewalls and Gateways Local Area Network Technologies
» Horizontal Cabling Structured Cabling
» Vertical Cabling Structured Cabling
» Network Address Translation IP
» Multiple Subnet Broadcast IP
» Unregistered Addresses General IP Design Strategies
» Easily summarized ranges of addresses
» Sufficient capacity in each range
» Standard subnet masks for common uses
» The Default Gateway Question
» Types of Dynamic Routing Protocols
» Split Horizons in RIP Variable Subnet Masks
» Basic Functionality IGRP and EIGRP
» Active and Stuck-in-Active Routes
» Interconnecting Autonomous Systems IGRP and EIGRP
» Interconnecting Autonomous Systems OSPF
» Redistributing with Other Routing Protocols
» IP Addressing Schemes for OSPF OSPF Costs
» Autonomous System Numbers BGP
» IPX Addressing Schemes General IPX Design Strategies
» RIP and SAP Accumulation Zones
» Using Equipment Features Effectively
» Hop Counts Elements of Efficiency
» Bottlenecks and Congestion Elements of Efficiency
» Filtering Elements of Efficiency
» QoS Basics Quality of Service and Traffic Shaping
» Layer 2 and Layer 3 QoS Buffering and Queuing
» Assured Forwarding in Differentiated Services
» Traffic Shaping Quality of Service and Traffic Shaping
» Defining Traffic Types Quality of Service and Traffic Shaping
» RSVP Quality of Service and Traffic Shaping
» Network-Design Considerations Quality of Service and Traffic Shaping
» Configuration Management Network-Management Components
» Fault Management Performance Management Security Management
» Designing a Manageable Network
» VLAN structures Architectural Problems
» LAN extension Architectural Problems
» Redundancy features Architectural Problems
» Out-of-Band Management Techniques Management Problems
» Multicast Addressing IP Multicast Networks
» Multicast Services IP Multicast Networks
» Group Membership IP Multicast Networks
» Multicast administrative zones Network-Design Considerations for Multicast Networks
» Multicast and QoS Network-Design Considerations for Multicast Networks
Show more