Providing breakthrough IT operations requires ITSM, DevOps, and other ITOps leaders to deliver exceptional employee experiences and high-performing business applications efficiently and economically. Meeting these expectations isn't easy. Today, IT organizations must support both legacy and cloud-native applications, deploy to hybrid clouds, target more frequent releases and changes, scale infrastructure for growing data sets, and harden security measures against a growing number of threats.
It's difficult to lead IT operations today, and improving experiences and services while driving efficiencies across larger, more complex landscapes is a hard equation to solve. To meet this charter, many ITOps leaders look to streamline workflows by augmenting IT staff's skills and decision-making with a library of automations that supplement people's capabilities. Advancing these initiatives toward hyperautomation will be critical for IT success in an increasingly complex environment.
Hyperautomation aims to unify people’s skills with system capabilities for automating repetitive tasks, discovering information from complex data, orchestrating large-scale operational procedures, and implementing rule-based decisions. The goal of hyperautomation and platforms that support it is to make people more successful at their jobs by leveraging, augmenting, and extending their expertise.
Hyperautomation doesn't mean handing everything over to machines, and IT isn't so simple that engineers can fully automate every task and decision. To better understand where hyperautomation can enable breakthrough IT operations, here are three practical examples that IT leaders should consider today: automating CMDB updates and maintenance, low-code automation, and AIOps.
Tracking assets and configurations was complicated enough when IT environments included data centers and desktops that didn’t change frequently. Today, IT infrastructure consists of public clouds such as AWS, Azure, and GCP, and private clouds where systems are virtualized and managed with hypervisors such as vSphere, Hyper-V, and RHEV. Organizations investing in IoT and 5G applications might deploy to edge computing infrastructures with technologies such as AWS Wavelength and Azure Edge Zones.
Auto-discovery and dependency mapping (DDM) is a hyperautomation capability that collects system, storage, network, and application information on a defined schedule and can be used to uncover insights, populate a CMDB, and support ITSM processes. With agentless DDM, IT doesn't have to preinstall services or applications on the infrastructure to make it discoverable. Instead, lightweight data collectors scan networks to quickly identify compute, network, and storage configurations.
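To make the agentless idea concrete, here is a deliberately minimal sketch of how a lightweight collector might begin building a configuration record: it probes a host for open TCP ports using nothing installed on the target. The `probe_host` function and the record shape are illustrative assumptions, not the API of any real DDM product, and real collectors use far richer fingerprinting than a port probe.

```python
import socket

def probe_host(host, ports, timeout=0.5):
    """Probe a host for open TCP ports, the way an agentless
    collector might start building a configuration record.
    (Hypothetical helper; real DDM tools fingerprint much more.)"""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means connect succeeded
                open_ports.append(port)
    return {"host": host, "open_ports": open_ports}

# Self-contained demo: listen on an ephemeral local port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # OS assigns a free port
listener.listen(1)
port = listener.getsockname()[1]

record = probe_host("127.0.0.1", [port])
listener.close()
print(record)
```

Records like this, collected on a schedule, are what feed the CMDB and keep it current without agents on every box.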
IT teams use DDM-collected information to deepen their understanding of how applications and business services consume infrastructure (and are impacted by that infrastructure's performance and availability). They can group flows using simple user interfaces to create a single pane of glass for each business-critical app. And by automating DDM for near-real-time accuracy, the current state of the infrastructure gets captured and can be used by ITSM teams to improve employee experience and provide proactive support services.
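The "group flows per application" idea can be sketched in a few lines. The flow tuples and app names below are invented for illustration; a real DDM product would supply far richer flow records.

```python
from collections import defaultdict

# Hypothetical DDM flow records: (source, destination, application).
flows = [
    ("web-01", "app-01", "payments"),
    ("app-01", "db-01", "payments"),
    ("web-02", "app-02", "storefront"),
]

def group_flows_by_app(flows):
    """Group discovered flows into a per-application topology --
    the raw material for a single-pane-of-glass view per app."""
    topology = defaultdict(list)
    for src, dst, app in flows:
        topology[app].append((src, dst))
    return dict(topology)

print(group_flows_by_app(flows))
```

Each application's edge list is exactly the flow topology an incident manager would want in front of them when that app misbehaves.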
Let’s consider a simple but very realistic scenario where there is a sudden increase in an application’s usage. Rapidly fluctuating demand is common for ecommerce applications and can be a challenge for pharmacies supporting COVID vaccines, banks supporting new loan applications, or any enterprise adjusting infrastructure to support remote working. Rapidly scaling applications can bring on performance issues, and application flow topologies generated by agentless DDM can help IT incident managers quickly pinpoint problems and resolve issues.
Having richer and up-to-date information on the flows connecting business services, applications, and infrastructure provides IT the insights required for proactive actions and automations.
Automating common tasks and orchestrating more complex workflows is a good way to improve mean time to recovery (MTTR) for incidents, fuel faster responses to employee requests, and increase the reliability of deploying changes. Common tasks include steps to restart web servers, replicate databases, patch systems, provision infrastructure, predict low disk space, assign application entitlements, and automate change requests. These automations all help improve reliability by ensuring consistency in a task's execution.
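One of the common tasks listed above, predicting low disk space, is simple enough to sketch. This is a bare linear extrapolation under the assumption that usage grows at a roughly steady rate; production automations would use better forecasting and act on the prediction (open a ticket, expand the volume).

```python
def days_until_full(samples, capacity_gb):
    """Linear extrapolation of disk usage.
    samples: list of (day, used_gb) observations, oldest first.
    Returns projected days until the volume fills, or None if
    usage is flat or shrinking (nothing to predict)."""
    (d0, u0), (d1, u1) = samples[0], samples[-1]
    rate = (u1 - u0) / (d1 - d0)          # GB per day
    if rate <= 0:
        return None
    return (capacity_gb - u1) / rate

# Usage grew from 400 GB to 460 GB over 30 days on a 500 GB volume:
# 2 GB/day against 40 GB of headroom leaves 20 days.
print(days_until_full([(0, 400.0), (30, 460.0)], capacity_gb=500.0))  # 20.0
```

Wiring a check like this to a ticketing or provisioning workflow is what turns a monitor into an automation.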
There are many ways to develop automations, but ITOps teams achieve breakthrough results when they implement hyperautomations on platforms with capabilities such as low-code development, reusable and modular automations, and large-scale orchestration.
These capabilities can drive significant ROI for NOC and IT Operations that must improve the reliability of systems and responsiveness of service desks while reducing costs. IT organizations realize the financial returns by improving network uptime, resolving incidents faster, improving customer satisfaction on IT requests, assigning more IT people to work on projects, and enabling growth by automating more procedures.
The third area of hyperautomation for breakthrough IT operations teams is implementing an AIOps solution that aggregates operational data from log files and monitoring tools. AIOps solutions centralize the data and then use machine learning to correlate alerts from multiple systems into a time-sequenced incident.
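To show what "correlating alerts into a time-sequenced incident" means mechanically, here is a toy stand-in: alerts that fire close together in time are grouped into one incident. Real AIOps platforms use machine learning over topology, text, and history; this fixed sliding window is only an illustration of the output shape.

```python
from datetime import datetime, timedelta

def correlate(alerts, window=timedelta(minutes=5)):
    """Group time-sorted alerts into incidents: an alert within
    `window` of the previous alert joins that incident; otherwise
    it starts a new one. (A crude stand-in for ML correlation.)"""
    alerts = sorted(alerts, key=lambda a: a["time"])
    incidents = []
    for alert in alerts:
        if incidents and alert["time"] - incidents[-1][-1]["time"] <= window:
            incidents[-1].append(alert)
        else:
            incidents.append([alert])
    return incidents

alerts = [
    {"source": "db-01",  "time": datetime(2021, 4, 1, 9, 0)},
    {"source": "app-01", "time": datetime(2021, 4, 1, 9, 2)},
    {"source": "web-01", "time": datetime(2021, 4, 1, 9, 3)},
    {"source": "net-01", "time": datetime(2021, 4, 1, 14, 30)},  # hours later
]
print([len(i) for i in correlate(alerts)])  # [3, 1]
```

The morning database, app, and web alerts collapse into one incident; the unrelated afternoon alert stays separate. That collapse, at enterprise scale, is the point of AIOps.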
Here’s why AIOps is vital in today’s complex IT environments.
Everyone who’s worked in IT long enough knows that enabling monitors and alerts is only the starting point to providing more reliable IT services. In the past, whenever there were recurring system issues, IT implemented monitors and set up alerts to get notified on system issues before end-users noticed and opened tickets. When one monitoring tool wasn’t sufficient for a specific application or technology, IT procured a new tool to support it operationally. When business leaders wanted to improve uptime and MTTR, IT tried and often failed to integrate monitoring tools and ITSM platforms sufficiently.
There are just too many systems, applications, log files, and monitoring tools, and too much operational data, to manually create an end-to-end process for capturing alerts, resolving incidents, and researching root causes.
In some cases, centralizing the alert data enables IT to automate the most probable actions. If a website isn’t responding, then it can trigger restarts and close the incident if successful. When there are more complex incidents, the team can efficiently conduct a top-down analysis through the AIOps correlated view of all the underlying events.
So, during a major incident, instead of joining a bridge call where numerous subject matter experts analyze uncorrelated data from different monitoring tools, the analysis with an AIOps solution in place starts with a centralized view of all the alerts and information from related systems.
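The probable-action pattern described above (a site stops responding, the automation restarts it, and the incident is closed only if the retry succeeds) can be sketched as a small decision function. The callable names here are hypothetical hooks, not the API of any real platform; in practice they would call a monitoring check, a runbook, and an ITSM ticket API.

```python
def remediate(check_health, restart_service, close_incident):
    """Probable-action automation sketch: if the health check fails,
    restart the service; close the incident only if the re-check
    succeeds, otherwise escalate to a human."""
    if check_health():
        return "healthy"
    restart_service()
    if check_health():
        close_incident()
        return "restarted-and-closed"
    return "escalated"

# Simulated run: the first check fails, and the restart fixes it.
state = {"up": False, "closed": False}
result = remediate(
    check_health=lambda: state["up"],
    restart_service=lambda: state.update(up=True),
    close_incident=lambda: state.update(closed=True),
)
print(result, state["closed"])  # restarted-and-closed True
```

The important design choice is the third branch: when the automated action doesn't work, the incident escalates rather than silently closing.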
While there are many ways to implement machine learning on IT operational data, the real transformation is integrating AIOps with DDM and low-code automation. The DDM provides context for the AIOps machine learning algorithms and connects alerts to flows and business services. The integration with low-code and modular automation enables IT to hyperautomate insights into actions.
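The DDM-provides-context idea can be shown with a tiny enrichment step: given the hosts alerting in a correlated incident, a DDM-derived map connects them to the business services they support. The mapping below is invented for illustration; in practice it would come from the discovered flow topology.

```python
# Hypothetical DDM output: which business service each host supports.
ddm_map = {
    "db-01": "payments",
    "app-01": "payments",
    "web-02": "storefront",
}

def enrich_incident(alert_hosts, ddm_map):
    """Attach DDM context to a correlated incident so machine-learning
    output connects directly to business impact."""
    services = {ddm_map[h] for h in alert_hosts if h in ddm_map}
    return {"hosts": alert_hosts, "impacted_services": sorted(services)}

print(enrich_incident(["db-01", "app-01"], ddm_map))
```

An enriched incident like this ("payments is impacted", not just "db-01 is alerting") is what lets low-code automations pick the right remediation and the right audience to notify.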
IT is full of complexities and opportunities. When IT leaders select platforms that auto-discover and map system information, aggregate alerts, apply machine learning, and connect to automation, then it leads to breakthrough operational results.