GenAI risks, rewards arise for DevOps and platform engineers

From chatbots that alleviate pressure on IT help desks to full-fledged LLMOps, DevOps and platform teams are at the forefront of enterprise generative AI adoption.

As generative AI transitions from hype to practical reality, platform engineers and DevOps teams are on the front lines, uncovering the technology's efficiency benefits and addressing its mounting risks.

Generative AI has dominated tech industry discussions since late 2022, but market researchers forecast that 2024 would be the year enterprise IT pros begin to put it into production. It's still early in that adoption trend, but some organizations have begun to see the effects of generative-AI-driven chatbots on the day-to-day work of DevOps teams; contend with the security, privacy, pipeline integration and cost challenges of operating large language models (LLMs); and uncover fresh opportunities for generative AI tools to automate platform engineering tasks, such as onboarding new developers.

Communications platform-as-a-service company Nylas, for example, launched a generative AI chatbot for its customers in August 2023 using a service from Mendable. Since then, it has seen a 25% decline in support tickets opened in its help desk system, despite growing its customer base more than 30%.

"We're seeing a reduction of a support ticket volume, but it's not necessarily reducing the amount of time that our support team spends on tickets," said Isaac Nassimi, senior vice president of product at the San Francisco-based company. "It's those really simple, straightforward tickets that they get over and over and over again that are bulwarked by the Nylas Assist chatbot, so they get to spend their time on the hard stuff, on the interesting stuff."

Capturing chatbot interactions has also yielded helpful data for the company's DevOps team to feed back into its development process, Nassimi said.

"You get questions and feedback that you wouldn't really get otherwise without doing literally thousands of customer interviews," he said. "It's good to find the areas of your product that people are running into problems with and patch them up."

For example, questions developers frequently asked the chatbot helped the DevOps team realize it needed to give clearer instructions on how to manage authentication with user email accounts, Nassimi said.

Generative AI shows promise -- with big caveats

Chatbots are among the earliest generative AI tools to be adopted so far in part because conversational assistants are a relatively mature area of focus in AI research already, said Andy Thurai, an analyst at Constellation Research. But they still require judicious use.

"Generative AI in general is still on version 1.0," he said. "And when it comes to [existing] conversational AI agents, in case of an issue, you can escalate to a live operator. … [With generative AI,] people think it's a superhuman brain power that can answer everything … but there's this fine line issue of when you should bring in a human agent."

Without human oversight, customer-facing chatbot errors can have a negative business impact, Thurai said, citing the example of Air Canada, which was held liable last month for an erroneous answer its chatbot gave a customer in November 2022.

In the last 18 months, generative AI tech "has made tremendous leaps and bounds," Thurai said.

"I tell my customers, if you're not looking at LLMs, you're stupid -- you should be experimenting with them," he said. "But you have to figure out the right use cases."

Careful LLM training was a big part of Nassimi's experience with his company's chatbot, he said, and there's still room for improvement.

"Very few of these things are at the point where you can just kind of take your hands off and be like, 'I hope the customers have a great time,' because it will occasionally give them things that are wrong," he said. "We still have that even today, where a small percentage of the users get incorrect advice, which is frustrating for them. We do want to remedy that, but it kind of comes with the territory."

In addition to the risks associated with chatbots that serve up incorrect information, enterprise concerns about security and privacy also linger as organizations begin to embrace generative AI tools and, in some cases, customize or host their own LLMs. For example, IDC's Future Enterprise Resiliency & Spending survey in July 2023 found that 44% of 890 respondents said security was the top barrier to using generative AI, while 38% ranked privacy concerns first.

Security concerns around generative AI include the potential for the leakage of sensitive company data through answers to prompts, as well as the security and privacy risks associated with company data being used to train third-party LLMs. Ongoing copyright lawsuits about training data could lead to legal liability exposure or license poisoning for software code generated by AI, among other potential business risks.

As a result, among 158 senior executives at large financial services and insurance companies surveyed by management consultancy EXL Service for its 2024 Enterprise AI Study, 58% said they are deeply concerned about generative AI and 63% have established policies limiting its use.

LinkedIn, Credit Karma platform teams establish LLMOps

Regardless of the risks, at some large companies, operating LLMs -- also known as LLMOps -- and integrating them into the development of applications are already a part of daily life for platform engineering teams.

LinkedIn has revamped its engineering practices over the last year to support generative AI apps and features for end users, such as writing first drafts of InMail messages and summarizing account information for users of its Sales Navigator tool. Along the way, the social network provider's platform engineering team has brokered access for developers to OpenAI models via Azure OpenAI Service, as well as internally hosted and open source models, according to a LinkedIn blog post last month.

The LinkedIn platform team also pre-built libraries for developers that turn structured API responses into standardized API prompts, and developed a GenAI Gateway that governs interactions with cloud-based generative AI models such as GPT, with features such as rate-limiting outbound requests and resource quota enforcement, the post stated.
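To make the gateway idea concrete, here is a minimal sketch of how rate limiting and quota enforcement might sit in front of outbound model requests. It is not LinkedIn's implementation, which the post does not detail; the class names, the token-bucket approach, the per-team token quotas and the status strings are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Hypothetical rate limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class GenAIGatewaySketch:
    """Hypothetical gateway check: per-team quota first, then a shared rate limit."""

    def __init__(self, bucket: TokenBucket, quotas: Dict[str, int]) -> None:
        self.bucket = bucket
        self.remaining = dict(quotas)   # assumed unit: LLM tokens left per team

    def check(self, team: str, tokens_requested: int) -> str:
        # Teams without a configured quota are rejected.
        if self.remaining.get(team, 0) < tokens_requested:
            return "quota_exceeded"
        if not self.bucket.allow():
            return "rate_limited"
        self.remaining[team] -= tokens_requested
        return "ok"
```

In this sketch, a request is forwarded to the cloud model only when `check` returns `"ok"`. A real gateway would also need persistence, per-model limits and concurrency control, none of which are shown here.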

Some of the pipeline orchestration tools LinkedIn platform engineers created to integrate machine learning and AI data for developers have also been donated to open source, including Flyte in 2019 and last month's release of FlyteInteractive. The latter "provides engineers with an interactive environment inside Kubernetes Pods to … easily debug their [machine learning] model in the 'production-like' environment," according to a company blog post.

While most of LinkedIn's generative AI efforts have been geared toward external use by customers and the open source community, platform engineers have also begun to use generative AI internally to improve efficiency, said Animesh Singh, executive director of LinkedIn's AI and machine learning platform, in an interview with TechTarget Editorial.

"For example, we're using generative AI in our Slack channels to be able to answer a lot of questions [about] typical software migration efforts that have become very commonplace," Singh said. "So, even if you have a doc, a chatbot could be able to pinpoint an answer to the question, exactly."

So far, that bot is in the early stages of development, where users can upvote and downvote its answers, he said, but some developers have already been able to free up time by using it. Similarly, machine learning engineers at LinkedIn use a generative AI assistant with Jupyter notebooks to write database queries using natural language.

As with LinkedIn, which leans on its parent company Microsoft for some of its access to LLMs, platform engineers at fintech Credit Karma are drawing on parent Intuit's generative AI efforts, such as the Gen Studio developer tools built for Intuit's GenOS AI platform.

Also similar to LinkedIn, Credit Karma's early platform engineering efforts for generative AI have focused on developer experience, including coordinating developers' authentication in Gen Studio, said Jeremy Unruh, senior director of engineering at Credit Karma.

"We're using different authentication [from Intuit] as [Credit Karma] employees, so for the dev side of things, we had to build our own layer that can translate," Unruh said. "Now, when [they] talk to Gen Studio, we can track and log … who's using what."

Charting the LLMOps frontier

Next, the Credit Karma platform team is working on a stacked ranking-like tool to improve AI feedback on pull requests in its developer pipelines, and will work this quarter on a chatbot that can answer questions about platform documentation for developers. Eventually, generative AI might also play a role in automatically spinning up platform infrastructure resources on demand, Unruh said.

"We have a bunch of stuff that we're looking at doing, like creating all the scaffolding for various types of requests based on our platform standards," he said. "Things like, 'I need a new microservice,' things that GitHub Copilot isn't really fine-tuned for."

At LinkedIn, more IT ops-focused uses for generative AI are in progress internally as well, such as AIOps-driven incident remediation augmented by LLMs and natural-language interfaces, according to Singh.

"Multiple scenarios are emerging where you can have a combination of these models acting as agents, working amongst themselves to be able to coordinate root cause analysis, create tickets and alerts to be sent to the corresponding teams," Singh said. "Some of these steps are very programmatic in a sense that you know what to do, and then have some reasoning capability to do that task well within that context."

Among the open source projects Singh said he's following for this kind of AIOps scenario are LangChain's Agents and Microsoft's AutoGen.

Another emerging LLMOps challenge is how to effectively evaluate the quality of natural language outputs, which are less precise than traditional AI model results, he said.

"For similar input, generative models can give you different outputs at different points in time, depending on where they are in the learning curve," Singh said. "If there is a variation in the prompt, it becomes a lot more challenging to ensure there is consistency in the response of these models."
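One rough way to quantify the variation Singh describes is to sample a model several times on the same or similar prompts and score how much the answers overlap. The sketch below is illustrative only and is not a method attributed to LinkedIn; the function names and the choice of word-level Jaccard similarity are assumptions, and production evaluations would likely use semantic comparisons or human review instead.

```python
from itertools import combinations
from typing import Sequence


def jaccard(a: str, b: str) -> float:
    """Word-level overlap between two answers, from 0.0 (disjoint) to 1.0 (same word set)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty answers are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def consistency_score(outputs: Sequence[str]) -> float:
    """Mean pairwise similarity across repeated outputs (hypothetical metric)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0  # zero or one output: nothing to disagree with
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score that drops when a prompt is reworded slightly would be one signal that responses are not stable enough to ship without further guardrails.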

Generative AI's complicated costs

GenAI early adopters have also encountered issues with managing its costs. First and foremost, operating LLMs is expensive, according to Singh. Tools such as FlyteInteractive were developed to reduce the expense of training machine learning models and have already saved thousands of developer hours on such tasks.

Next, Singh said his team will refine how the company uses self-hosted open source models versus cloud-based LLMs to further control costs.

"We'll be developing some methodology around open source models versus hosted models, because generally, they're expensive to maintain [and] expensive to run [with enough] GPU horsepower," he said. "We are working to ensure that the product teams are able to get the right data to make the right decisions [without high costs]."

So far, generative AI has failed to deliver cost savings for most respondents to EXL's survey. The survey divided respondents into groups of "leaders" with advanced use of AI and "strivers," who are catching up. Fewer than half of each group -- 46% of leaders and 37% of strivers -- said they had achieved cost savings.

It can be particularly tricky for software vendors to determine how to distribute costs and savings for AI-generated products, said Constellation Research's Thurai, especially when extensive LLM training time and the need for human supervision are factored in.

"Generative AI makes code producers more efficient, but how do you measure efficiency?" he said. "More importantly, how do you figure out how to convert that efficiency into production when there's another client [company] involved?"

Beth Pariseau, senior news writer for TechTarget Editorial, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out on X @PariseauTT.
