For Network and IT Ops teams in most industries, the ability to handle thousands of daily incidents quickly, especially before they become service outages, is essential to survival. This is most critical for Communication Service Providers, such as Mobile Operators, Internet Service Providers or Cable Service Providers. For these businesses, the network is the business. Outages and performance issues are not merely an inconvenience that needs to be attended to; they can mean lost revenue and lost customers in a hyper-competitive environment. Proactive prevention and accelerated resolution of incidents are paramount to the success of these organizations.
Industry research suggests that the Communication Service Provider industry is one of the largest spenders on Contact Center and Customer Care solutions. Yet these providers also rank at the bottom for customer satisfaction with their support organizations. It is not uncommon for customers to experience serious service issues, such as degraded internet service performance, and endure unacceptably long wait times just to reach a support agent. When they do reach a live person, the agent is often unable to provide a speedy solution, because the issues are technical and difficult to debug given the agent's limited training.
Front line agents are then forced to escalate the customer to more experienced and more expensive level-2 engineers or dispatch agents. This long, drawn-out resolution process helps neither the service provider, who bears the cost, nor the disgruntled customer, who just wants to use the service they are paying for.
So, why exactly has it been so difficult for service providers to manage outages and provide speedy diagnosis and resolution of incidents when they do occur?
The answer to this lies in the approaches that operations centers have historically taken to solve the problem:
1. End-to-end Automation to diagnose and resolve all incidents and issues
2. Sophisticated Knowledge Management and Search applications to provide information to frontline agents to address incidents and customer escalations
Both of these approaches have failed to deliver the desired results for these businesses, which continue to sink deeper into the quagmire of incidents.
End-to-end automation can deliver results in the diagnosis and resolution of some incidents. However, it fails as the only comprehensive strategy. Many incidents cannot be resolved in a fully automated fashion because they need significant human intervention. In other cases, developing and maintaining end-to-end automations drains scarce resources, and the cost savings and benefits achieved may not justify the effort given how often the incident occurs and how quickly it must be resolved. Many businesses have, in hindsight, invested excessively in the wrong end-to-end automation scenarios and have abandoned the automation initiative altogether as a result of the perceived fiasco. This is, unfortunately, a classic case of throwing the baby out with the bathwater, because automation has a powerful role to play in the incident resolution strategy of the operations team.
The other approach to incident resolution, based on Knowledge Management and Search, has also not helped these operations centers deal with the issue effectively. There are many reasons for this.
The first line agents in this environment have simply become escalation points to more expensive level-2 engineers. Clearly this is not an approach that scales with time.
What should service providers do to address the incident resolution problem? The answer lies in interactive automation. RESOLVE from gen-E has been leading the market with this innovative approach and numerous service providers have completely transformed their incident resolution with RESOLVE.
Interactive automation takes a dynamic approach to incident resolution, recognizing that no single approach can work for every type of incident. To be successful, the resolution approach needs to be nimble: it must adapt to the incident type and assign the optimal resolution method to each incident.
Analytics of existing incident logs can clearly indicate which incidents are most frequent, have a high impact and are fully automatable from diagnosis to remediation. RESOLVE provides the tools to design and develop these automations to be deployed quickly.
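As a rough illustration of this kind of log analysis, the sketch below counts incidents by type and keeps the types that are frequent, high impact and fully automatable. It is not RESOLVE code: the `Incident` record, its 1-to-5 `impact` scale, the `fully_automatable` flag and the thresholds are all hypothetical names and values chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical record parsed from an incident log."""
    incident_type: str
    impact: int               # assumed severity scale, 1 (low) to 5 (high)
    fully_automatable: bool   # assumed flag set by an engineer during review


def full_automation_candidates(incidents, min_count=3, min_avg_impact=3.0):
    """Return (type, count, average impact) for types worth automating end to end.

    The thresholds are illustrative defaults, not recommended values.
    """
    counts = Counter(i.incident_type for i in incidents)
    impact_totals = Counter()
    automatable = {}
    for i in incidents:
        impact_totals[i.incident_type] += i.impact
        # A type qualifies only if every logged occurrence was automatable.
        automatable[i.incident_type] = (
            automatable.get(i.incident_type, True) and i.fully_automatable
        )

    candidates = []
    for incident_type, count in counts.most_common():
        avg_impact = impact_totals[incident_type] / count
        if (count >= min_count
                and avg_impact >= min_avg_impact
                and automatable[incident_type]):
            candidates.append((incident_type, count, avg_impact))
    return candidates
```

Calling `full_automation_candidates(log_records)` on a list of such records returns the candidate types ordered from most to least frequent, which gives a first shortlist to review before any automation is built.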
Next, incidents that are high frequency but cannot be fully automated are handled with a combination of automation and human touch. The first event handling step, validation of the incident, can be fully automated so that false alarms are eliminated before a costly human escalation occurs. Initial steps such as creating and populating a ticket can also be completely automated. Automation can then be written in RESOLVE to collect a rich base of contextual information and to determine in which part of the service (e.g. optical line, server, or set top box) the issue has actually occurred. Once this contextual information is obtained, RESOLVE creates custom Guided Wizards or Decision Trees that provide step-by-step procedural guidance so that less trained front line agents can solve the problem. Based on the front line agent's inputs, additional diagnostics and smaller resolution steps, such as resetting a network interface, can be automated and embedded in the manual resolution procedure.
Throughout the entire manual journey, the agent can obtain real-time assistance from other subject matter experts and also read comments and notes left behind by other agents who have gone through this journey in the past. The final set of steps, such as confirming that the resolution has worked as well as closing the ticket, can also be completely automated.
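The flow described above, with automated steps wrapped around a guided human decision, might be sketched as follows. This is a minimal illustration and not RESOLVE's API: the `Ticket` class and the four callables passed to `handle_incident` are hypothetical stand-ins for real monitoring, ticketing and agent-facing systems.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical ticket record used only for this sketch."""
    incident_id: str
    context: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    closed: bool = False


def handle_incident(incident_id, is_real, collect_context, ask_agent, verify_fix):
    """Run one incident through automated and human steps in order.

    All four callables are assumptions supplied by the caller.
    """
    # Automated: validate the incident and drop false alarms before escalation.
    if not is_real(incident_id):
        return None

    # Automated: create and populate the ticket.
    ticket = Ticket(incident_id)
    ticket.log.append("ticket created")
    ticket.context = collect_context(incident_id)
    ticket.log.append("context collected")

    # Human: the front line agent picks an action, guided by the context.
    action = ask_agent(ticket.context)
    ticket.log.append(f"agent chose: {action}")

    # Automated: confirm the resolution worked, then close the ticket.
    if verify_fix(incident_id):
        ticket.closed = True
        ticket.log.append("fix verified, ticket closed")
    return ticket
```

The point of the sketch is the ordering: only the middle step requires a person, and the ticket is closed only after the automated check confirms that the fix worked.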
Finally, incidents that are infrequent and relatively easy to fix manually can be addressed in manual mode. Lean resolution procedures can be defined in RESOLVE, automatically linked to the incident in the event management or ticketing system, and displayed to the front line agent when the incident occurs. Front line agents do not have to search for a procedure in a massive knowledge management system. Instead, they are led to the precise, tested and certified manual resolution procedure to fix the problem.
The interactive automation approach has many unique strengths that make it so effective for Operations Centers.
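The three handling modes described in this article can be summarized as a small decision function. This is only a sketch under stated assumptions: the four boolean inputs and the `Mode` enum are hypothetical, and any combination the article does not cover is returned as `UNCLASSIFIED` instead of being guessed.

```python
from enum import Enum


class Mode(Enum):
    FULL_AUTOMATION = "full automation"
    INTERACTIVE = "automation plus human touch"
    MANUAL = "manual procedure"
    UNCLASSIFIED = "not covered by the three cases"


def choose_mode(frequent, high_impact, fully_automatable, easy_manual_fix):
    """Map incident characteristics to a handling mode.

    The boolean flags are illustrative inputs; in practice they would come
    from analysis of existing incident logs.
    """
    if frequent and high_impact and fully_automatable:
        return Mode.FULL_AUTOMATION
    if frequent and not fully_automatable:
        return Mode.INTERACTIVE
    if not frequent and easy_manual_fix:
        return Mode.MANUAL
    return Mode.UNCLASSIFIED
```

For example, a frequent, high-impact incident that cannot be fully automated maps to `Mode.INTERACTIVE`, while a rare incident with an easy manual fix maps to `Mode.MANUAL`.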
Interactive automation is truly the future of Incident Resolution strategies at Operations Centers. Take 15 minutes of your day to chat with one of our automation experts about what interactive automation can do for your organization.