Buzzwords being the currency of tech marketing, the application of artificial intelligence and automation to IT operations has given rise to a new one — AIOps, short for Artificial Intelligence for IT Operations.
Gartner describes it as using big data, modern machine learning and other advanced analytics technologies “to, directly and indirectly, enhance IT operations (monitoring, automation and service desk) functions with proactive, personal and dynamic insight.” And it sees AIOps platforms as taking data from all those different domain-specific monitoring tools to provide a centralized, unified view of operations.
CA Technologies is among the vendors touting AIOps, having just released its own AIOps platform: a combination of CA Digital Experience Insights, which correlates data across users, applications, infrastructure and network services; CA Operational Intelligence, which analyzes diverse structured and unstructured data sources from the cloud to the mainframe; and CA Automic Service Orchestration, which automates service requests and can provide automated remediation.
“Predicting something before it happens is one thing, but being able to remediate without human intervention – if you’re running out of storage, if you need to reset something in the middle of the night – that’s another,” said Ashok Reddy, CA’s general manager of DevOps solutions.
While the platform benefits from the company’s experience with performance management including mainframes, mobile and cloud, he said, it wasn’t just a matter of repackaging three products.
The digital operations intelligence piece has been in beta with certain clients.
“It’s really, ‘How do I learn from the different things happening in operations?’ and looking at logs, events, alarms, different types of things with the network and storage. Being able to look at the patterns and predict things. We just went GA with this launch,” he said.
“We had to do some work on our Automic automation platform… it connects to the machine learning and AI in Digital Operations Intelligence. It’s able to take swift actions on [certain] types of things. It’s kind of like a self-driving app or self-driving data center. You don’t have to have a human deciding something and doing something.”
It looks at data coming in from IT operations — storage, network, capacity, performance, configuration — and can go back to look at historical data, he said.
The platform is built on the CA Jarvis analytics engine, which incorporates open source technologies such as Elasticsearch, Kafka and Spark. It also integrates with a range of third-party monitoring, management, analytics, and visualization tools, such as Splunk, IBM, Elastic, ServiceNow, Dynatrace, AppDynamics, SolarWinds, Puppet, Chef, Tableau and more.
Reddy said the team has been working with clients over the past three months on questions such as “How do you know this actually would work?” “How do you know it predicts the right things?” “How do I know whether what it finds is true or if remediation can actually fix the problem?”
Its approach has been to ask clients for their data from the previous three months and, if there were outages or other problems in that period, to have them withhold those details from CA so that the algorithms can find the problems and determine the root cause on their own. One financial institution client had three problems it knew about, but the algorithms also found others.
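That kind of blind test can be illustrated with a small backtest. The sketch below is not CA's algorithm, which the company did not detail; it assumes a simple rolling z-score detector, and the function names, window size, threshold and sample data are all hypothetical. It flags unusual points in a historical metric series and then compares them with the incidents the client already knew about.

```python
from statistics import mean, stdev


def find_anomalies(values, window=12, threshold=3.0):
    """Return indices whose value sits more than `threshold` standard
    deviations from the mean of the preceding `window` samples.
    (A stand-in detector, assumed for illustration only.)"""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: a z-score is undefined, so skip
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


def backtest(values, known_incidents, **detector_args):
    """Compare detector output with incidents the client held back."""
    found = set(find_anomalies(values, **detector_args))
    known = set(known_incidents)
    return {
        "rediscovered": sorted(found & known),
        "missed": sorted(known - found),
        "new": sorted(found - known),
    }


# Made-up response-time samples: a steady baseline with two spikes.
samples = [100, 102] * 20
samples[20] = 180  # an outage the client knew about
samples[33] = 175  # a problem nobody had reported

print(backtest(samples, known_incidents=[20]))
```

With this invented data, the known spike at index 20 is rediscovered and the unreported one at index 33 shows up as new, which mirrors the outcome Reddy described at the financial institution.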
Setup does not require writing rules, he said, because IT operations people shouldn’t have to be data scientists. As it learns and provides recommended actions, humans further train the system based on their knowledge and experience.
Among the system’s capabilities:
- It ingests structured and unstructured data from multiple IT performance monitoring sources — metric, alarm, log, topology, text and API data — into a single, resilient data lake.
- Predictive analytics help IT staffs get ahead of potential problems such as performance and capacity; issues such as misconfiguration can be fixed automatically.
- It includes built-in machine-learning-driven algorithms, dashboards and integrations.
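To make the predict-then-remediate idea concrete, here is a minimal sketch of the storage scenario Reddy mentioned. It is an illustration under stated assumptions, not the CA platform's method: it fits a straight line to hourly disk-usage readings to estimate when a volume will fill, then calls a remediation hook if that moment falls within a chosen horizon. The names `hours_until_full`, `check_and_remediate` and the `remediate` callback are hypothetical.

```python
def hours_until_full(usage_pct, capacity_pct=100.0):
    """Fit a least-squares line to hourly usage readings and extrapolate.

    Returns the estimated hours from the latest reading until the line
    reaches `capacity_pct`, or None if usage is flat or falling."""
    n = len(usage_pct)
    if n < 2:
        return None
    x_mean = (n - 1) / 2
    y_mean = sum(usage_pct) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(usage_pct))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    full_at = (capacity_pct - intercept) / slope  # hour index when full
    return max(0.0, full_at - (n - 1))


def check_and_remediate(usage_pct, remediate, horizon_hours=24.0):
    """Call `remediate` (a hypothetical hook into an automation tool)
    when the volume is forecast to fill within `horizon_hours`."""
    remaining = hours_until_full(usage_pct)
    if remaining is not None and remaining <= horizon_hours:
        remediate(remaining)
        return True
    return False


# Made-up readings: usage climbing two points per hour.
readings = [70, 72, 74, 76, 78, 80]
check_and_remediate(
    readings,
    remediate=lambda hrs: print(f"Expanding volume: about {hrs:.0f} hours left"),
)
```

A production system would use richer models and would also need guardrails around what it is allowed to change automatically; the sketch only shows how a forecast can be wired to an action without a human in the loop.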
Being able to own the problem end-to-end across complex environments is what sets CA’s platform apart, Reddy said, as well as its ability to provide relevant context.
“Over the last five years, the infrastructure and application stack has become increasingly modularized, dynamic and distributed, i.e., it is built out of ever-smaller, ever-more-loosely coupled components whose relations to one another are in perpetual flux, and whose location is increasingly scattered, both geographically and organizationally,” states the Gartner report on AIOps.
“The increased dynamism of the stack renders the boundaries among the domains increasingly porous and causes one to question the cogency of monitoring any one domain independently of the others,” it concluded.
Analyst Chris Riley recalled that when he first heard the term “AIOps” earlier this year, he said to himself, “Oh great, here we go again.”
In full disclosure, Riley acknowledges that his site Fixate.io has published content for CA.
“My problem with the term is most of the time, it says nothing, and relies heavily on the magic found in the words ‘artificial intelligence’ and ‘machine learning.’ So I insist all vendors in the market — CA, Tricentis, OpsRamp [and others] — bring meaning to the word thru tactics,” he said. “And I think that is starting to happen.”
“There are real-world implications to bring more intelligence to DevOps and infrastructure,” he said.
“The areas that interest me the most right now are self-healing/automatic remediation — basically having the first ‘person’ who is on call for any issue being an algorithm. And that algorithm learns over time the common solutions to repetitive issues,” he said.
“But next I hope to see examples of AIOps impacting infrastructure orchestration — living infrastructure that adapts to not only the demands of applications but also to the entire delivery chain,” he said, pointing to the Automic tool’s strength in providing a holistic view.
“There is a lot of waste in IT infrastructure that is based on poor human judgment, and as shift-left starts to really impact developers’ abilities to get their job done, more intelligent environments are going to be the only solution.”