Originally published as a guest article on Luxatia International's website.
Even in the most structured environments with clear operational strategies, complexity build-up in infrastructure and operations is unavoidable as businesses grow. To help these environments thrive in a consistent, reliable way, it’s vital to optimize the IT function: the backbone that supports every business application and ensures service excellence across all other functions. Automating operations is the only way for IT to scale and support today’s business demands.
In many ways, IT needs to be able to move faster than the speed of business.
As modern networks grow, you need to increase your networking capacity to handle that growth. It would be ideal to simply acquire new hardware, turn it on, and start working right away. We’re reaching a stage where some hardware can configure itself, but the need for provisioning endures.
As we take stock of the network use cases Resolve’s customers are automating, a few common themes have emerged. While not every use case listed here is an end-to-end process, the manual time drain involved in some of them is high enough to warrant mention.
Use Case #1: Device Provisioning
Device provisioning currently involves the setup and distribution of IT equipment, an initial step in the IT lifecycle. Each device must function seamlessly because additional devices rely on it as part of the IT infrastructure. Imagine a world where IT could simply procure new hardware, plug it in, and turn it on to realize seamless functionality.
In reality, however, a lot of work must take place before a device is ready for production. These devices can include physical hardware such as network switches, routers, and wireless access points, or their virtual counterparts (in the cloud, for example, these could be VLANs, virtual switches, VPCs, etc.). This multi-step process may involve discovering devices, setting them up per vendor instructions, assigning roles, preparing security controls, and other activities that enforce organizational policies; configuration audits are also necessary to ensure optimal performance.
Depending on the organization and its size, these steps may be performed by SMEs on multiple teams with different levels of privilege.
Automation can minimize setup times to achieve a zero-touch model that handles not only setup but the entire provisioning process, as well as conducting regular audits to ensure optimal performance. And that is just the tip of the iceberg.
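To make the zero-touch idea concrete, here is a minimal sketch of provisioning as an ordered pipeline of steps ending in an audit. It is an illustration only, not any vendor's implementation: the `Device` fields, the step functions, and the hostname convention are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    hostname: str
    vendor: str
    role: str = ""
    hardened: bool = False
    log: list = field(default_factory=list)


def assign_role(device: Device) -> None:
    # Hypothetical naming convention: an "sw-" prefix means access switch.
    device.role = "access-switch" if device.hostname.startswith("sw-") else "router"
    device.log.append(f"role={device.role}")


def apply_security_baseline(device: Device) -> None:
    # Stand-in for pushing the organization's security policy.
    device.hardened = True
    device.log.append("security baseline applied")


def audit(device: Device) -> None:
    # Fail loudly if an earlier step was skipped or did not take effect.
    if not device.role or not device.hardened:
        raise RuntimeError(f"{device.hostname} failed audit")
    device.log.append("audit passed")


PIPELINE = [assign_role, apply_security_baseline, audit]


def provision(device: Device) -> Device:
    """Run every provisioning step in order, with no manual hand-offs."""
    for step in PIPELINE:
        step(device)
    return device


switch = provision(Device(hostname="sw-lobby-01", vendor="example-vendor"))
print(switch.log)
```

Keeping the steps in a single list means the same sequence runs for every device, and the audit step catches a device that was only partially set up.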
Use Case #2: Incident Response
An IT incident refers to anything that negatively impacts the IT infrastructure and, by extension, the business itself. Incidents most often originate as alerts from monitoring tools and are classified as incidents with the help of an ITSM tool.
Lacking standardized procedures for handling IT infrastructure incidents puts an organization at risk of poor customer experience. Fragmented, sub-optimal workflows lengthen time-to-resolution and increase the impact of outages.
With IT SLAs that often mandate 99.999% uptime across IT infrastructure, there is no room for inefficiency. NOCs or network operations teams need to work with their peers to ensure the root cause is identified and a fix is deployed, all while keeping the business informed of progress.
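To see how little room a 99.999% target leaves, it helps to convert it into a yearly downtime budget. The short calculation below assumes a 365-day year; the function name is ours, chosen for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime target."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% uptime -> {allowed_downtime_minutes(target):.2f} minutes/year")
```

At "five nines" the budget works out to roughly 5.26 minutes per year, which is less time than most manual escalations take to begin.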
Automation can help with a lot of the heavy lifting involved in eliminating false positives, performing triage, diagnosing network alarms, and automatically triggering the necessary response. Automating incident response is the only way to streamline this process—reporting ongoing activities while keeping humans in the loop.
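As a rough sketch of the triage step, the snippet below drops alerts that have already cleared, merges duplicates, and orders what remains by urgency. The alert fields and the severity table are hypothetical; a real workflow would read them from your monitoring and ITSM tools.

```python
from collections import Counter

# Hypothetical alert records, as they might arrive from a monitoring tool.
alerts = [
    {"device": "core-rtr-1", "alarm": "link_down", "cleared": False},
    {"device": "core-rtr-1", "alarm": "link_down", "cleared": False},
    {"device": "edge-sw-7", "alarm": "high_cpu", "cleared": True},
    {"device": "edge-sw-9", "alarm": "fan_failure", "cleared": False},
]

# Assumed severity ranking: 1 is the most urgent.
SEVERITY = {"link_down": 1, "fan_failure": 2, "high_cpu": 3}


def triage(raw_alerts):
    """Drop self-cleared alerts, merge duplicates, and sort by urgency."""
    active = [a for a in raw_alerts if not a["cleared"]]
    counts = Counter((a["device"], a["alarm"]) for a in active)
    incidents = [
        {"device": device, "alarm": alarm, "occurrences": n}
        for (device, alarm), n in counts.items()
    ]
    return sorted(incidents, key=lambda i: SEVERITY.get(i["alarm"], 99))


incidents = triage(alerts)
for incident in incidents:
    print(incident)
```

Four raw alerts collapse into two incidents here, so the people in the loop start from a short, prioritized list rather than the raw alarm stream.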
Use Case #3: Troubleshooting
Networks are becoming more complex, comprising multiple locations and vendors. Throwing cloud into this mix makes them very cumbersome to maintain manually.
Troubleshooting a network problem across a diverse IT infrastructure landscape involves managing and applying reasoning over very large volumes of data. This is a high-touch, time-consuming, and error-prone endeavor reserved for SMEs or network administrators. These professionals offer the expertise and knowledge to diagnose symptoms, identify root causes, and perform the remediation steps necessary to resolve network issues.
While most network incidents involve troubleshooting, it’s worth specific mention because it is a time-consuming and critical part of root cause analysis. The problem could be something as simple as a faulty or damaged cable, a routing problem due to an over-utilized link, misconfiguration of IP addresses, or other issues associated with network expansion.
Troubleshooting can be codified into a series of steps which makes it a perfect candidate for automation. Automated troubleshooting can be triggered based on a monitoring alert; and automation can be used to not only collect data but also analyze that data to highlight discrepancies. The only way to accelerate this process is through automated workflows.
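Here is a minimal sketch of what "codified" troubleshooting might look like: each check is a small function, and the workflow reports only the discrepancies. The interface records and the 85% utilization threshold are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical data a workflow might collect after a monitoring alert fires.
interfaces = [
    {"device": "rtr-1", "name": "ge-0/0/1", "ip": "10.0.0.1", "utilization": 0.93},
    {"device": "rtr-2", "name": "ge-0/0/4", "ip": "10.0.0.2", "utilization": 0.41},
    {"device": "sw-3", "name": "vlan10", "ip": "10.0.0.2", "utilization": 0.12},
]

UTILIZATION_LIMIT = 0.85  # assumed threshold for an "over-utilized" link


def find_overutilized(ifaces):
    """Return 'device:interface' labels for links above the threshold."""
    return [
        f"{i['device']}:{i['name']}"
        for i in ifaces
        if i["utilization"] > UTILIZATION_LIMIT
    ]


def find_duplicate_ips(ifaces):
    """Return each IP address that is configured on more than one device."""
    owners = defaultdict(list)
    for i in ifaces:
        owners[i["ip"]].append(i["device"])
    return {ip: devices for ip, devices in owners.items() if len(devices) > 1}


def troubleshoot(ifaces):
    """Run each codified check and collect only the discrepancies."""
    return {
        "overutilized_links": find_overutilized(ifaces),
        "duplicate_ips": find_duplicate_ips(ifaces),
    }


findings = troubleshoot(interfaces)
print(findings)
```

New checks can be added as further functions without touching the existing ones, which is how an SME's diagnostic habits gradually turn into a reusable runbook.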
Use Case #4: Health Checks
Health checks are commonly performed by network engineers to ensure that their networks are up and running optimally. Regular health checks are the only way to keep networks performing reliably and securely.
Health-check routines can be complicated and involve a lot of moving parts, which raises the risk of leaving out necessary components. These may include physical network infrastructure, firewalls, network switches, access points, and VPNs, among other components. And this list doesn’t include virtual components, which add their own complexity.
By the end of this process, network operations teams should have increased confidence in their IT infrastructure, resulting in a solidly performing network. In some cases the health check can detect and identify minor situations or incidents that, if ignored, could turn into outages.
Automation enables programming of health checks to run at regular intervals. This not only allows network operations teams to become more proactive, but can dramatically reduce those “1:00 a.m. network incidents” that impact the business. Automation also takes over routine, mundane health checks, liberating you from manual tasks to focus your attention on more complex activities.
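A scheduled health check can be sketched as a registry of checks plus a loop that runs them at a fixed interval. The two checks below are placeholders that return fixed values; real ones would query devices or a monitoring API, and in practice a scheduler or automation platform would own the timing.

```python
import time


# Hypothetical checks; real ones would query devices or a monitoring API.
def check_firewall():
    return True


def check_vpn():
    return False  # simulate a failing component


CHECKS = {"firewall": check_firewall, "vpn": check_vpn}


def run_health_checks(checks=CHECKS):
    """Run every registered check and return the names of those that failed."""
    return [name for name, check in checks.items() if not check()]


def run_on_schedule(interval_seconds, rounds, sleep=time.sleep):
    """Repeat the health checks at a fixed interval for a number of rounds."""
    history = []
    for round_number in range(rounds):
        history.append(run_health_checks())
        if round_number < rounds - 1:
            sleep(interval_seconds)  # wait before the next round
    return history


print(run_health_checks())
```

Because every component lives in the registry, nothing gets left out of a run, which addresses the "moving parts" risk described above.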
Use Case #5: Configuration Management
Network configuration management (NCM) is a comprehensive process that every device is subjected to throughout its lifecycle. It spans device discovery, inventory maintenance, configuration backup, monitoring of configuration changes and compliance, tracking user activity, and troubleshooting. These are put through appropriate network operations whenever necessary.
Automation can help manage the entire lifecycle of network devices and configurations, solving and streamlining network configuration, and simplifying change and compliance management. The right tool can support and automate crucial network functions such as scheduling backups, tracking user activity, generating detailed reports, and many additional processes.
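As one small example, a scheduled backup job can double as change detection by storing a configuration only when its content differs from the previous backup. This is a sketch under simple assumptions: backups are kept in memory, and the device name and config text are made up.

```python
import hashlib

backups = {}  # device name -> list of (sha256 digest, config text)


def backup_config(device, config_text):
    """Store a config only if it differs from the latest backup.

    Returns True when a new backup was stored, False when nothing changed.
    """
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    history = backups.setdefault(device, [])
    if history and history[-1][0] == digest:
        return False
    history.append((digest, config_text))
    return True


first = backup_config("edge-sw-1", "hostname edge-sw-1")
second = backup_config("edge-sw-1", "hostname edge-sw-1")  # unchanged
third = backup_config("edge-sw-1", "hostname edge-sw-1\nntp server 10.0.0.10")
print(first, second, third)
```

Each `True` result marks a configuration change worth recording against a user or change ticket.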
Use Case #6: Configuration Compliance
The term “networking” reflects the foundation of physical devices, features and protocols, and CLI configurations. Today that word encompasses cloud networking infrastructure as well, handling cloud-native networking services such as virtual switches and VPCs.
Processes involved in managing networks are built upon a foundation of compliance. Regulatory observance is designed to ensure that optimal configuration is maintained on all devices, whether physical or virtual, and that business applications are delivering optimal performance.
A configuration standard is traditionally built to a configuration template for a device (or set of devices) and then compared to a snapshot of the device’s current configuration. If any changes don’t match up with the standard, they must be fixed. This “check-fix” process is reactive, with multiple steps manually handled by network practitioners.
Automation becomes extremely relevant in maintaining compliance; any drifts in configuration can be easily spotted by contrasting them with the “golden configuration.” The highly systematic nature of this process makes it ideal for automation; IT should not have to perform this function manually.
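The comparison against a golden configuration can be sketched with Python's standard `difflib` module. The two configurations below are invented for the example, and a real check would normalize device output (timestamps, ordering) before diffing.

```python
import difflib

golden = """hostname edge-sw-1
ntp server 10.0.0.10
snmp-server community private RO
""".splitlines()

running = """hostname edge-sw-1
ntp server 10.0.0.99
snmp-server community private RO
logging host 10.0.0.50
""".splitlines()


def config_drift(golden_lines, running_lines):
    """Return only the lines that differ: '-' is missing, '+' is unexpected."""
    diff = difflib.unified_diff(
        golden_lines, running_lines,
        fromfile="golden", tofile="running", lineterm="", n=0,
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


drift = config_drift(golden, running)
for line in drift:
    print(line)
```

An empty result means the device matches its standard; anything else is drift that a workflow can report or remediate.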
Use Case #7: Patching/Patch Management
Upkeep or patch management is a responsibility that people often schedule below more immediate priorities, unless the patch applies to a serious security threat. As a result, even with the best of intentions, the average total patch time is 102 days!
Patch management refers to the process of obtaining, verifying, and installing patches. These resources protect or mitigate against vulnerabilities. Unpatched or behind-schedule software can turn a device into a target for exploits and threats, so software patching is critical to IT and security operations. Threat actors are always looking for ways to release malware to exploit any vulnerability they can detect. Security patches substantially lower the risk of exploitation and breach.
Automating patching is key to successful defense against cybercrime. Automated patch management software allows companies to schedule regular update scans. They can ensure patches are automatically applied and thus prevent exploitation of critical vulnerabilities with published exploit codes.
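A scheduled update scan boils down to comparing what is installed against the minimum patched version. The sketch below assumes simple dotted numeric versions and uses made-up inventory and advisory data.

```python
def parse_version(text):
    """Turn '15.2.4' into (15, 2, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical inventory and advisory data.
installed = {"edge-sw-1": "15.2.4", "core-rtr-1": "17.3.1", "fw-2": "9.1.10"}
minimum_patched = {"edge-sw-1": "15.2.7", "core-rtr-1": "17.3.1", "fw-2": "9.1.9"}


def devices_needing_patch(installed_versions, required_versions):
    """Return the devices running a version older than the patched release."""
    return sorted(
        device
        for device, version in installed_versions.items()
        if parse_version(version) < parse_version(required_versions[device])
    )


to_patch = devices_needing_patch(installed, minimum_patched)
print(to_patch)
```

Comparing versions as numbers matters: a plain string comparison would wrongly rank "9.1.10" below "9.1.9" and flag `fw-2` for a patch it already has.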
Use Case #8: Upgrades
As a rule, network infrastructure should be upgraded every few years. Reasons to upgrade range from improving security by swapping out legacy hardware for newer versions to scaling speed and reliability to support business growth. Whatever the reason, the manual effort required is significant, making upgrades a prime candidate for automation.
Beyond a robust plan to minimize the business impact of bringing services down, the upgrade process requires meticulous execution to ensure that services in production are promptly returned to functionality.
Network automation helps to:
Use Case #9: Orchestration
Network orchestration is actually a more advanced level of network automation. It manages high-level sequences of interdependent tasks across multiple, diverse, multi-vendor systems.
Network orchestration is a must-have when you move beyond limited point automations. It can help you stitch individual task-based automations into a cohesive workflow; executions can then be standardized and recorded in detail for audit and compliance. Because orchestration is more network-aware, it executes workflows based on device states and configurations.
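One way to picture the stitching is as a dependency graph that determines a safe execution order. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the task names are hypothetical, and the `run_task` callback stands in for whatever actually executes each automation.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks, each mapped to the tasks that must finish first.
dependencies = {
    "backup_config": set(),
    "drain_traffic": {"backup_config"},
    "upgrade_firmware": {"drain_traffic"},
    "verify_health": {"upgrade_firmware"},
    "restore_traffic": {"verify_health"},
    "update_ticket": {"restore_traffic"},
}


def run_workflow(deps, run_task):
    """Run tasks in dependency order and return an audit log of what ran."""
    audit_log = []
    for task in TopologicalSorter(deps).static_order():
        run_task(task)
        audit_log.append(task)
    return audit_log


log = run_workflow(dependencies, run_task=lambda name: None)
print(log)
```

The returned log is the kind of detailed execution record that audit and compliance reviews can draw on.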
Use Case #10: Rollback
The burgeoning number of network devices and services, plus increasing network changes across the infrastructure, makes manual processes like “Stare and Compare” and “Copy and Paste” impractical for managing the network. These manual activities slow down bringing new applications and services online and troubleshooting existing network issues. Manual processes also raise the potential for human error. With configuration files growing longer and more complex, finding changes between versions can become a nightmare.
Successful rollback requires secure access to a platform that provides the right people with the right tools, as well as access to the appropriate network devices or services. Trusting a “Copy and Paste” approach for config changes and rollbacks can introduce errors and take far longer than needed.
Automating configuration rollback can help network engineers quickly restore the last known-good configuration. When an update produces unintended consequences, or an unauthorized or premature change is made, automation can speed recovery by quickly locating and applying the last good config.
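Locating and applying the last good config can be sketched as a search through a version history. The `Snapshot` structure, the `known_good` flag, and the `push_config` callback are assumptions for illustration; a real platform would push the configuration to the device over a secured channel.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    version: int
    config: str
    known_good: bool


# Hypothetical history for one device, oldest first.
history = [
    Snapshot(1, "mtu 1500", known_good=True),
    Snapshot(2, "mtu 9000", known_good=True),
    Snapshot(3, "mtu 90000", known_good=False),  # the change that broke things
]


def last_known_good(snapshots):
    """Return the most recent snapshot that was marked as good."""
    for snap in reversed(snapshots):
        if snap.known_good:
            return snap
    raise LookupError("no known-good configuration on record")


def rollback(snapshots, push_config):
    """Push the last known-good config and return the version restored."""
    target = last_known_good(snapshots)
    push_config(target.config)
    return target.version


pushed = []
restored_version = rollback(history, push_config=pushed.append)
print(restored_version, pushed)
```

No one has to compare files by eye: the workflow finds version 2 and restores it in a single step.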
Network automation uses runbooks that are approved, tested, and well-documented by your SMEs for convenient, automatic triggering.
Increase speed of execution. Automation works at a far quicker pace than people and enables 24x7 operations when used in conjunction with IT self-service and self-help capabilities.
Reduce operational costs. Not only is automation quicker and more available, it’s also cheaper relative to the cost of human labor that would otherwise need to be employed.
Extend and enhance human capabilities. Automation, especially in conjunction with AI, can help to improve human performance, reduce error, and minimize intervention while providing greater flexibility and adaptability.
Extend your existing automations. Automation needn’t totally replace manual operations; rather, it opens up opportunities for additional improvements and creates efficient new ways to work.
If you're interested in seeing how these automations play out in the real world, schedule your demo today.