Typing the question “How do you accelerate incident resolution?” directly into Google search returns 2,040,000 results. Even more impressive, it took only 0.46 seconds to fetch them. Most impressive of all, Google’s number 1 result out of more than 2 million is a post by my company, Resolve Systems. Yes, we are on top when it comes to the question: how do you accelerate incident resolution? Browse Google’s results on how to resolve incidents, however, and you will more than likely come up with 2 answers: manual procedures and fully automated, closed-loop automation.
If these are the two answers, and that is what everyone is telling you to do, then why are your incidents still not being resolved faster? Are your support engineers feeling relief? Are you seeing a positive impact on your business? Perhaps your business is unique and a manual solution works, or an automated solution works. For our customers, manual resolution and closed-loop automation are not the answer to their question. The answer is the third answer, or the first answer that Google returned: partial automation with human-guided procedures.
Some procedures are done manually, and they work. You have some automations that are end-to-end, or “closed loop,” that you stick in the black box, and they sometimes work too. What if you could combine the two, create a standard process for resolving incidents, and increase efficiency within your operations? Let’s discuss all three options for accelerating incident resolution.
Manual procedures are hard: they take a long time, and they are difficult to pass down to a Level 1 agent to handle on their own. They require a lot of escalation, and the incidents are not resolved very quickly. It’s a slow process, but it eventually works.
Let’s think about a common incident that Level 1 agents are currently working to resolve. Let’s say resolution involves a 10-step procedure. Using a manual procedure, each step requires an engineer’s time. If the Level 1 agent working the incident gets through steps 1 to 5 without a problem and gets stuck on step 6, what happens? The incident is escalated to a Level 2 agent. What if the Level 2 agent cannot figure out step 6 either? After many escalations, someone finally resolves step 6 of the 10-step procedure and sends the incident back to the Level 1 agent to continue working through steps 7 to 10. This process is extremely slow and inefficient. Many do it because it’s the only option other than trying to fully automate, which can be expensive, time consuming and not worth the effort.
Closed-loop automation software is amazing: you push a button and out comes a fully automated procedure. Well, amazing until it doesn’t work. Yes, that means actually looking inside that black box to see what went wrong. Yes, that means escalation. Yes, that means manual procedures (see the previous paragraph). Fully automating a process takes time, development hours and a lot of dispersed knowledge lassoed into a project manager’s lap. Some projects are just too difficult and, simply stated, will never be fully automated.
Again, let’s explore the 10-step procedure, this time using a fully automated procedure. Level 1 agents receive the same incident, and simply push a button to start the automation (that was easy!). An error pops up. Where was the error? Was it step 1, step 10, or somewhere in between? The incident is escalated to someone who can see inside the black box. The Level 2 agent is able to identify where the automation failed: it was step 6. Identifying the issue was hard enough, and the Level 2 agent can’t resolve it. Again, it is escalated and eventually resolved. This process is great when it works, but building automations can be difficult and time consuming, and it can require the group effort of some highly paid individuals who are not willing to participate. Even when automations are created, they will often fail.
Finally, a chance to really answer the question. How do you ACCELERATE incident resolution? The last two answers were merely handling incidents. Accelerating incident resolution takes more than just an engineer and some software. In order to truly accelerate the resolution of an incident, the two worlds need to be combined. Take the good from manual and fully automated procedures, and marry them into a permanent, scalable solution.
Taking the same scenario that was played out using the other two procedures, let’s see how a partial automation with human-guided procedures would resolve the incident. Steps 1 through 5 are simple, so they are automated with no problem. Then the difficult step 6 of the procedure comes into play. Unlike the previous two options, the partial automation will pause after step 5 and allow the human-guided procedure to take over at step 6. The Level 1 agent then follows a simple decision tree until step 6 is resolved. Steps 7 through 10 are automated, and the incident is marked as resolved.
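To make the flow above concrete, here is a minimal Python sketch of the 10-step scenario. It is an illustration only and not how Resolve’s product is implemented: the names `Step`, `run_procedure`, `build_ten_step_procedure` and `follow_decision_tree` are all hypothetical, and the automated actions and the decision tree are stand-ins that simply report success.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    number: int
    automated: bool
    # Hypothetical automated action; returns True when the step succeeds.
    action: Optional[Callable[[], bool]] = None


def run_procedure(
    steps: List[Step], ask_human: Callable[[int], bool]
) -> Tuple[str, List[Tuple[int, str, bool]]]:
    """Run steps in order, pausing for a human on every non-automated step."""
    log = []
    for step in steps:
        if step.automated:
            ok = step.action()
            log.append((step.number, "automated", ok))
        else:
            # The automation pauses here and the Level 1 agent takes over.
            ok = ask_human(step.number)
            log.append((step.number, "human-guided", ok))
        if not ok:
            # Only a step that cannot be completed leads to escalation.
            return "escalated", log
    return "resolved", log


def build_ten_step_procedure() -> List[Step]:
    # Assumption from the scenario: every step except step 6 is automated.
    return [
        Step(n, automated=(n != 6), action=(lambda: True) if n != 6 else None)
        for n in range(1, 11)
    ]


def follow_decision_tree(step_number: int) -> bool:
    # Stand-in for the agent walking a guided decision tree to completion.
    return True


status, log = run_procedure(build_ten_step_procedure(), follow_decision_tree)
print(status)
for number, mode, ok in log:
    print(number, mode, ok)
```

The point of the design is that the log records which mode handled each step, so a failure is tied to a specific step rather than hidden inside a black box.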
Imagine your operations team using this approach to incident resolution. I ask the same questions posed earlier. Are your incidents being resolved faster? Are your support engineers feeling relief? Are you seeing a positive impact on your business?
Fewer escalations, faster mean time to repair, and less downtime are a few reasons why implementing a scalable solution like Resolve Software System for accelerating incident resolution makes so much sense. How do you accelerate incident resolution? It’s simple: partial automation with human-guided procedures.