About Me

Scott Arnett is an Information Technology & Security Professional Executive with over 30 years of experience in IT. Scott has worked in various industries such as health care, insurance, manufacturing, broadcast, printing, and consulting and in enterprises ranging in size from $50M to $20B in revenue. Scott’s experience encompasses the following areas of specialization: Leadership, Strategy, Architecture, Business Partnership & Acumen, Process Management, Infrastructure and Security. With his broad understanding of technology and his ability to communicate successfully with both Executives and Technical Specialists, Scott has been consistently recognized as someone who not only can "Connect the Dots", but who can also create a workable solution. Scott is equally comfortable playing technical, project management/leadership and organizational leadership roles through experience gained throughout his career. Scott has previously acted in the role of CIO, CTO, and VP of IT, successfully built 9 data centers across the country, and is an expert in ITIL, PCI Compliance, SOX, HIPAA, FERPA, FRCP and COBIT.

Monday, October 8, 2012

WAN Design: Building a Resilient WAN for BCP

It is time to refresh your Business Continuity Plan (BCP), and while you are doing that, let's make sure your network can support your plan.  Perhaps it is time to roll out a WAN upgrade project - let's not forget to include our BCP requirements in the new WAN design.

Wide area networks (WANs) provide connectivity to local area and other networks over long distances. Users, data centers and corporate assets alike are dependent on the WAN these days.


WANs have a multi-faceted role in an organization: They can support voice and data communications and Internet connectivity, provide connectivity for company email and virtual private networks (VPNs), and link to other organizations doing business with the company.

In a disaster situation, WANs become essential tools for an organization to communicate internally among its employees and externally with stakeholders and other third parties. Loss of a WAN infrastructure, without suitable backup and recovery capabilities, can seriously disrupt business operations and have a financial impact.

WAN technologies have evolved dramatically from the days of fixed point-to-point circuits. Depending on the applications being transported, a variety of network protocols and technologies may be supported by a WAN, such as MPLS (Multiprotocol Label Switching), SIP (Session Initiation Protocol), SONET (Synchronous Optical Networking), Ethernet (e.g., 10 GbE) and, of course, the TCP/IP suite. Transport is typically over fiber-optic networks coupled with high-capacity copper- and fiber-based local access facilities.

When building or managing WANs, a primary activity is to keep them running with minimal disruptions. A principal WAN design goal, therefore, is resilience, which ensures that any potential disruptions are found and resolved quickly and efficiently.  Depending on the size of the organization and the network, a Network Operations Center is usually essential for real-time monitoring and support of the WAN.

When developing WAN resilience plans, your most important ongoing activity is to work with your carriers to take full advantage of their recovery and restoration capabilities. In addition to getting details on their service recovery and restoration offerings, find out how they approach service-level agreements (SLAs) that specifically address how they will respond during a service disruption. Make sure that their time frames align with your business requirements. For instance, if you have a four-hour recovery time objective (RTO) for a specific system that needs Internet access, be sure that your carrier can restore access within your RTO. I also like having more than one carrier in your network - some of the best WAN designs have a primary carrier and a secondary carrier.  Your business has critical applications or transactions on the WAN - you can't afford a significant disruption.
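
To make the RTO check concrete, here is a minimal Python sketch that compares each carrier's committed restoration time against the RTO for the service it carries. The carrier names, the `CarrierSla` type, and the six-hour and two-hour commitments are hypothetical, made up for illustration; only the four-hour RTO comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class CarrierSla:
    carrier: str          # hypothetical carrier name
    service: str          # service the SLA covers, e.g. "internet-access"
    restore_hours: float  # restoration time the carrier commits to (assumed value)


def sla_gaps(rto_hours_by_service, slas):
    """Return (carrier, service, restore_hours, rto_hours) for every SLA
    whose committed restoration time is longer than the business RTO."""
    gaps = []
    for sla in slas:
        rto = rto_hours_by_service.get(sla.service)
        if rto is not None and sla.restore_hours > rto:
            gaps.append((sla.carrier, sla.service, sla.restore_hours, rto))
    return gaps


# Four-hour RTO for a system that needs Internet access.
rto_hours_by_service = {"internet-access": 4.0}

# Illustrative SLA commitments, not real carrier data.
slas = [
    CarrierSla("primary-carrier", "internet-access", 6.0),
    CarrierSla("secondary-carrier", "internet-access", 2.0),
]

for carrier, service, restore, rto in sla_gaps(rto_hours_by_service, slas):
    print(f"{carrier}: {service} restore commitment of {restore}h exceeds the {rto}h RTO")
```

With these made-up numbers, only the primary carrier is reported as a gap, which is exactly the kind of finding to take back to the SLA negotiation.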

To build resilient WANs, access to real-time information about network performance is essential for spotting potential disruptions. That information must be end-to-end, and not limited to individual network segments. To obtain visibility across WANs, your network management system must be able to “see” all network segments and how well they are performing. Ideally, you should have an automated tool that can be programmed to analyze cross-WAN performance data. Use that data to compare current network performance against specific metrics and/or SLAs. The tool should also be able to flag situations that indicate impending problems. I would also like that tool to integrate with your incident ticketing system and open a priority one incident ticket for immediate notification and response.
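
As a rough sketch of that kind of automated check, the Python below compares end-to-end measurements for a WAN path against SLA limits, flags values that are approaching a limit, and builds a ticket for each finding. Everything here is an assumption for illustration: the metric names, the limits, the 80% warning ratio, and the `open_ticket` function, which stands in for whatever API your ticketing system actually offers. Treating a breach as priority one and a warning as a lower priority is also my assumption.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str               # e.g. "latency_ms"; higher values are worse
    limit: float              # SLA limit for this metric (assumed value)
    warn_ratio: float = 0.8   # flag an impending problem at 80% of the limit


def evaluate(path, measurements, thresholds):
    """Return (path, metric, value, status) for each metric that has
    breached its limit or is close enough to it to be worth a warning."""
    findings = []
    for t in thresholds:
        value = measurements.get(t.metric)
        if value is None:
            continue  # no data for this metric on this path
        if value > t.limit:
            findings.append((path, t.metric, value, "breach"))
        elif value >= t.limit * t.warn_ratio:
            findings.append((path, t.metric, value, "warning"))
    return findings


def open_ticket(finding):
    """Hypothetical stand-in for a ticketing integration: returns the
    ticket that would be created instead of calling a real system."""
    path, metric, value, status = finding
    priority = 1 if status == "breach" else 3
    return {"priority": priority, "summary": f"{path}: {metric}={value} ({status})"}


thresholds = [Threshold("latency_ms", 100.0), Threshold("packet_loss_pct", 1.0)]
measurements = {"latency_ms": 85.0, "packet_loss_pct": 2.5}  # illustrative values

tickets = [open_ticket(f) for f in evaluate("HQ-to-DC1", measurements, thresholds)]
for ticket in tickets:
    print(ticket)
```

The measurements passed in should be end-to-end figures for the whole path, not per-segment numbers, for the reason given above.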

The most resilient network topology is a mesh network, in which all network end points connect to each other. This, of course, is also the most expensive configuration, so you may wish to use network design software (work with your service provider on this) to define a configuration that balances cost-effectiveness and resilience. Ensure that channels with the highest traffic volumes have alternate routes available, from different carriers if possible, that can be rapidly activated to maintain performance levels. If your WAN uses undersea cables and/or satellite channels, be sure to consider alternate cable and satellite systems for diversity and resilience. A mesh design is also key to your corporate VoIP solution, because calls can be routed directly from point to point.  No need to bring all that voice traffic back to the data center.
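
To show why a full mesh gets expensive, and how to spot links that have no alternate route, here is a small Python sketch. A full mesh of n sites needs n(n-1)/2 links. The second function removes each link in turn and checks whether every site can still reach every other site. The site names and the partial-mesh link list are hypothetical, and this is not a substitute for real network design software.

```python
def full_mesh_links(sites):
    """Number of point-to-point links needed to connect every site to every other."""
    n = len(sites)
    return n * (n - 1) // 2


def is_connected(sites, links):
    """True if every site is reachable from the first one over the given links."""
    adjacency = {site: set() for site in sites}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = set(), [sites[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adjacency[node] - seen)
    return len(seen) == len(sites)


def links_without_alternate_route(sites, links):
    """Links whose failure, on its own, would cut at least one site off."""
    return [
        link
        for link in links
        if not is_connected(sites, [other for other in links if other != link])
    ]


# Hypothetical partial mesh: three sites in a ring, plus a single-homed branch.
sites = ["HQ", "DC1", "DC2", "Branch"]
links = [("HQ", "DC1"), ("DC1", "DC2"), ("DC2", "HQ"), ("Branch", "HQ")]

print(full_mesh_links(sites))                       # links a full mesh would need
print(links_without_alternate_route(sites, links))  # links with no backup path
```

In this example a full mesh would need six links while the partial mesh uses four, and the only link without an alternate route is the one to the branch. That is the link to consider backing up with a second carrier.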

At your data centers and offices, install redundant network connection devices, such as routers and switches, and also have an inventory of spares that can be brought into service quickly if a device fails. Be sure to rotate spare devices into production networks to ensure they perform properly. I would also recommend having a process or procedure for keeping your spare hardware updated and current on firmware or IOS.
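
One simple way to support that procedure is a periodic comparison of spare inventory against what is running in production. The Python sketch below assumes you can export, per device model, the firmware version used in production and a list of spares with their installed versions. The model names, serial numbers, and version strings are invented for illustration.

```python
def stale_spares(production_firmware, spares):
    """Return (serial, model, spare_version, production_version) for each spare
    whose firmware does not match what production runs for that model."""
    stale = []
    for serial, model, version in spares:
        expected = production_firmware.get(model)
        if expected is not None and version != expected:
            stale.append((serial, model, version, expected))
    return stale


# Hypothetical inventory data.
production_firmware = {"edge-router-x": "15.2", "access-switch-y": "12.4"}
spares = [
    ("SN001", "edge-router-x", "15.2"),
    ("SN002", "edge-router-x", "14.9"),
    ("SN003", "access-switch-y", "12.4"),
]

for serial, model, have, want in stale_spares(production_firmware, spares):
    print(f"Spare {serial} ({model}) runs {have}; production runs {want}")
```

Run on a schedule, a report like this tells you which spares need an update before they are rotated into production.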

Ensure that your WAN’s primary commercial power supplies have backup power (e.g., uninterruptible power supplies) so the network will remain operational in the aftermath of a commercial power outage or lightning strike. I would also locate network infrastructure equipment in secure, HVAC-equipped rooms that are accessible to only a limited number of employees and vendors.

Establish network disaster recovery (DR) plans that provide step-by-step activities to diagnose problems, establish bypass and recovery arrangements, recover failed network components and return WAN operations to normal. Periodically test these plans to ensure that they are appropriate for your WAN as configured, that the procedures work and are in the correct sequence, and that your service providers are in sync with your network resilience requirements.  One more thing: don't forget staff training and skill development, so your team can quickly troubleshoot and repair WAN issues.

Summary

Resilient wide area networks can be achieved through a combination of partnering with service providers, intelligent network design, proactive network management, a disaster recovery program combining plans and regular testing, and an operational philosophy that blends performance with resilience and survivability. In addition, test your plan on a regular basis - make sure not only that your design works, but also that staff know and understand the design and have the skills to respond.

Keep it positive!

Scott Arnett
scott.arnett@charter.net






