3 Smart Ways to Use Gen AI for IT Operations Health

Generative AI can help IT teams increase capacity and productivity while saving money, but it may require culture, skills, process and security changes to succeed.
Nov 15th, 2023 6:26am by

It seems generative artificial intelligence (Gen AI) is on everyone’s mind these days, and with good reason. In the current macroeconomic environment, AI can help resource-constrained organizations scale and be more effective. According to McKinsey, Gen AI could add trillions of dollars in value to the global economy, impacting a wide range of industries.

Gen AI can also power nimbler customer service experiences. The struggle to keep up with the ever-increasing demand for high-quality delivery and faster responses is real. On top of that, customers are becoming familiar with the technology themselves, so why wouldn’t they expect the same agility from any digital business?

For IT operations (ITOps) teams, Gen AI is a phenomenal opportunity to increase capacity and productivity while saving money. Yet taking the step toward Gen AI adoption can feel overwhelming. It might require changes in the organization’s culture, skills, processes and security. The way to begin is with a tangible set of strategies and use cases.

A Step-by-Step Approach to Gen AI

Gen AI’s strength lies in structuring and summarizing information. It’s meant to assist with otherwise time-consuming tasks, like crafting status update drafts, creating incident timelines or taking the first pass at a runbook. Gen AI is a place to start, not the ultimate solution for your business strategy and hurdles.

Also, as with any new methodology or technology, starting small is the best strategy. Narrowing the focus makes it easier to test and learn. Here are four steps to take to get started on the right foot:

  1. Assess current needs: Starting with a small team or project, identify its specific needs and challenges, and determine which could be addressed by Gen AI.
  2. Set clear goals: Once you define your scope of action, it’s time to define your scope of success. What benefits are you expecting from Gen AI? Establish a set of key performance indicators (KPIs) accordingly.
  3. Research and educate: Evaluate which available Gen AI tools and platforms fit your team’s or project’s needs, budget, scalability and security requirements. Research and document Gen AI best practices so you can train your team adequately. Security, compliance and data quality are key topics.
  4. Measure and report: Implement a feedback loop to track Gen AI’s performance according to the previously established KPIs. Evaluate improvements after each experiment.
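
As an illustration of steps 2 and 4, the following is a minimal sketch of a KPI feedback loop, not a prescribed implementation. The `Kpi` class, the `percent_improvement` helper, the KPI name and all of the numbers are hypothetical and chosen only to show the comparison against a baseline.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Kpi:
    """A hypothetical KPI: a name, a baseline value, and a direction."""
    name: str
    baseline: float
    lower_is_better: bool = True


def percent_improvement(kpi: Kpi, observed: list[float]) -> float:
    """Return improvement over the baseline in percent (positive means better)."""
    if not observed:
        raise ValueError("need at least one observation")
    if kpi.baseline == 0:
        raise ValueError("baseline must be non-zero")
    change = (mean(observed) - kpi.baseline) / kpi.baseline * 100
    # For "lower is better" KPIs, a drop in the value counts as an improvement.
    return -change if kpi.lower_is_better else change


# Illustrative numbers only: minutes spent drafting each status update.
drafting_time = Kpi(name="status_update_drafting_minutes", baseline=12.0)
pilot_observations = [6.0, 9.0, 3.0]

improvement = percent_improvement(drafting_time, pilot_observations)
print(f"{drafting_time.name}: {improvement:.1f}% improvement")
```

Running the same comparison after each experiment gives the team a consistent number to report against the goals it set at the start.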

Three Use Cases to Consider Right Now

Here are three examples of how your organization can start using Gen AI to get small, quick, impactful wins:

  • Create workflows and runbooks: Gen AI can be a co-author for automation, helping ITOps teams manage and resolve unplanned work more quickly. Not only can Gen AI create detailed, well-structured runbooks, it can also fetch real-time data from monitoring tools, logs or other sources to provide context and recommend possible updates. That makes Gen AI a powerful ally for removing manual toil from a range of tasks, from running routine maintenance to triggering diagnosis and remediation actions that resolve common incidents. Before getting there, make sure you have a solid grasp of prompt engineering.
  • Foster healthy stakeholder communication: Particularly during major incidents, keeping internal and external stakeholders in the loop is an essential part of resolution. Industry best practices recommend sending status updates to stakeholders and leadership at least every 30 minutes. But teams need to focus on finding and fixing issues, and it takes time to craft those updates and align their tone to the target audience. This is another place where Gen AI can save time and resources. By automatically collecting and collating relevant incident data, it can generate a status update draft in seconds. It can even pull information from Slack and other popular team chat apps to enrich the AI output.
  • Facilitate continuous documentation and learning: Cultivating a culture of (blameless) learning in the organization is vital to improving processes and avoiding recurring mistakes. Implementing incident postmortems is a big step in that direction. But it takes time for teams to thoroughly and manually document the when, what and how of an incident. With Gen AI, that can take only a few clicks. By pulling from the incident’s timeline, logs, metrics and incident-specific chats, it can provide a detailed account of the incident’s progress and, more importantly, recommended next steps.
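
To make the status update and postmortem cases above more concrete, here is a rough sketch of how collected incident data might be collated into a prompt. It deliberately stops before calling any model, since that part depends on the tool you choose. The `TimelineEvent` class, the `build_status_update_prompt` function, the prompt wording and the sample events are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TimelineEvent:
    """A hypothetical incident event gathered from monitoring, logs or chat."""
    timestamp: datetime
    source: str
    message: str


def build_status_update_prompt(
    title: str, audience: str, events: list[TimelineEvent]
) -> str:
    """Collate events in chronological order into a prompt for a draft update."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    timeline = [f"- {e.timestamp:%H:%M} [{e.source}] {e.message}" for e in ordered]
    return "\n".join([
        f"Draft a status update for {audience} about the incident: {title}.",
        "Use only the facts in the timeline below. Do not speculate.",
        "Timeline:",
        *timeline,
    ])


# Sample data only; the events arrive out of order on purpose.
events = [
    TimelineEvent(datetime(2024, 1, 1, 6, 40), "chat", "Rollback started by on-call"),
    TimelineEvent(datetime(2024, 1, 1, 6, 26), "monitoring", "Checkout error rate high"),
]
prompt = build_status_update_prompt("Checkout errors", "internal stakeholders", events)
print(prompt)
```

Naming the audience in the prompt is one simple way to steer tone, and the same ordered timeline could feed a postmortem draft.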

Always Keep a Human in the Loop

While AI can help to generate content, human oversight is crucial to ensure its accuracy and relevance every step of the way. Always have the appropriate subject matter expert on the team review the content and edit it where needed. Again: Gen AI is there for some of the heavy lifting, but it can’t replace human expertise.
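
One way to enforce that review step in tooling is a simple approval gate. This is a sketch under assumed names: `Draft`, `approve` and `publish` are hypothetical and do not belong to any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    """AI-generated content that stays unpublished until a human approves it."""
    text: str
    approved_by: Optional[str] = None


def approve(draft: Draft, reviewer: str, edited_text: Optional[str] = None) -> Draft:
    """Record the reviewer and keep any edits the reviewer made."""
    final_text = edited_text if edited_text is not None else draft.text
    return Draft(text=final_text, approved_by=reviewer)


def publish(draft: Draft) -> str:
    """Refuse to publish anything that no human has reviewed."""
    if draft.approved_by is None:
        raise PermissionError("draft has not been reviewed by a human")
    return f"{draft.text}\n(Reviewed by {draft.approved_by})"


ai_draft = Draft(text="Checkout errors are resolved.")
reviewed = approve(
    ai_draft,
    reviewer="on-call SME",
    edited_text="Checkout errors are mitigated; monitoring continues.",
)
print(publish(reviewed))
```

Because `publish` raises on an unreviewed draft, skipping the expert review becomes an explicit error rather than a silent shortcut.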

Transform Your Operations with Gen AI

Start tapping into the power of Gen AI to manage unplanned work faster and more efficiently. Learn how PagerDuty is using Gen AI for the PagerDuty Operations Cloud.

TNS owner Insight Partners is an investor in: Pragma.