

HPE Networking Chief AI Officer Bob Friday and I recently participated in a podcast with Tech Field Day. The starting premise of the show was, “Data center networking needs AI,” with which we wholeheartedly agree. Bob has spent the last 10 years as a pioneer in bringing AIOps to networking. He kicked off the discussion by articulating HPE Networking’s journey toward the Self-Driving Network™, a vision he explained in his recent 6-part blog series. We then dug deeper into what AI means for data center networks in particular and outlined some of the work we’ve been doing in this area. This is what we’ll cover in this short, two-part blog series.
Is there a problem to be solved?
People are afraid to touch their networks. It sounds ridiculous, but most network engineers are nervous before, during, and after a change, whether that’s provisioning a new service, performing a firmware upgrade, or making a routine modification. The process is stressful and operators worry that if they touch the network, they may break it.
The fundamental problem underlying this epidemic in data center networking is complexity: the alphabet soup of dozens of protocols to understand, configurations of perhaps thousands of physical and logical devices, multiple infrastructure vendors to manage. The list goes on. Combine this complexity with the endless flood of data that operators are subjected to, coupled with many inadequate troubleshooting tools on the market today, and data center networking teams are often overwhelmed. They end up drowning in data but starved of insights.
This is precisely the type of situation where clever AI and machine learning algorithms can be useful: mountains of data that are tough for humans to sift through, but for the most part the data is actually fairly well-structured.
It’s not about the technology, it’s about what you need
Too often, technology discussions start in the wrong place: with the technology itself. Many of us have heard that edict from above in our organizations: “We as a company need to use AI or we’ll get left behind.” Or maybe you’re hearing it from a vendor: “You need to use AI in your data center network.” But these are aimless starting points. The starting point of any technology discussion should be: what are your goals? What are the problems you have that need to be solved?
Our goal has always been very clear—deliver the best possible user experience, whether that “user” is the data center network operator, or the end user that relies on the data center, whether they know it or not, for the applications they use. Over the last 10 years, Juniper has developed and applied AI, incredibly successfully, first to Wi-Fi and then to the rest of the campus and branch domain. In the data center we have a second-mover advantage, able to glean all of the insights from lessons learned in the campus.
Data center network operational challenges
If the goal is to optimize the user experience, then identifying the challenges to be solved for data center practitioners takes close collaboration with our customers. We group these challenges into three categories: limited insights, insufficient speed, and poor reliability.
Before we blindly jump into AI as a panacea, we have to thoroughly understand these problems and ask ourselves, can AI help solve them? The answer is, absolutely, yes.
AI is necessary, but not sufficient
But AI cannot solve every NetOps problem that you have. AI is necessary, but not sufficient. We need systems that use both AI, which is fundamentally probabilistic, and other deterministic approaches such as intent-based networking.
Is it OK to be 99% right on a configuration? No, you want to be 100% right, which requires rules-based, deterministic software. But on Day 2 when the data center is operating out in the wild in unpredictable environments, it’s a different calculation. If you have a system that can tell you with 99% accuracy, based on a myriad of symptoms, the root cause of a problem that you’re seeing, then that’s a solution that’s probably better than what you have today. And this is the power of AI, to sift through massive amounts of data and extract the correlations that humans cannot easily do.
Put the two technologies together, AI and intent-based networking—then you can deliver that unbeatable network operator experience and application experience for end users.
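To make the pairing concrete, here is a minimal, hypothetical sketch of the two modes side by side: a deterministic check that a configuration matches intent (where 99% right is not good enough), and a probabilistic ranking of likely root causes from observed symptoms. All names, symptoms, and evidence weights below are illustrative, not any vendor's actual API or model.

```python
# Deterministic side: every intended setting must match exactly.
def validate_intent(config, intent):
    """Returns True only if the config satisfies the full intent."""
    return all(config.get(key) == value for key, value in intent.items())

# Probabilistic side: toy symptom-to-cause evidence weights.
# In a real system these would be learned from telemetry, not hand-set.
EVIDENCE = {
    "crc_errors":   {"bad_optic": 0.9, "mtu_mismatch": 0.1},
    "packet_drops": {"bad_optic": 0.4, "congestion": 0.6},
}

def rank_root_causes(symptoms):
    """Scores candidate causes by accumulated evidence, best first."""
    scores = {}
    for symptom in symptoms:
        for cause, weight in EVIDENCE.get(symptom, {}).items():
            scores[cause] = scores.get(cause, 0.0) + weight
    return sorted(scores, key=scores.get, reverse=True)

intent = {"mtu": 9100, "vlan": 200}
print(validate_intent({"mtu": 9100, "vlan": 200, "lldp": True}, intent))  # True
print(rank_root_causes(["crc_errors", "packet_drops"])[0])  # bad_optic
```

The point of the split: configuration correctness is binary and rules-based, while root-cause analysis tolerates, and benefits from, a ranked best guess.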
We’ll dig into how we’re using AIOps to solve stubborn data center networking challenges in the second and final part of this blog series.
HPE Networking Chief AI Officer Bob Friday and I recently participated in a podcast with Tech Field Day. The starting premise of the show was, “Data center networking needs AI,” which we absolutely agree with. In part one of this two-part blog series we emphasized that discussions about whether and how to use AI must be rooted in your particular goals and the problems you have that need to be solved. After examining the challenges data center networking operators have, it is clear that AI can in fact help, and here’s how…
AI-native innovations extending our leadership in the data center
Improvements in AI are coming so rapidly that it will undoubtedly become an increasingly large part of the entire data center lifecycle, from Day 0 Design, to Day 1 Deployment, to Day 2 Ongoing Operations. We recently announced several new AIOps capabilities for data center networking.
Predictive maintenance enables network operators to identify future problems and correct them before they occur.
- System Health. Predict when a switch will fail based on analyzing data around processor and memory utilization, temperature, etc.
- Capacity. Predict when you need to expand the fabric based on data around link utilization, traffic growth, etc.
- Optics. Predict when an optical transceiver will fail based on Tx/Rx throughput, power, voltage, etc. Gray failures in optics are always a problem and they can be worse (harder to detect) than a complete failure.
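As a hedged illustration of the capacity case above, one simple form of prediction is fitting a linear trend to recent utilization samples and extrapolating to a planning threshold. The function name and the 80% threshold below are assumptions for the sketch, not a description of any shipping product.

```python
def days_until_threshold(samples, threshold=0.8):
    """Least-squares linear fit over daily utilization samples (0.0-1.0).
    Returns the estimated number of days after the last sample until the
    trend crosses `threshold`, or None if utilization is flat/declining."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    if slope <= 0:
        return None  # no growth trend, so no predicted exhaustion
    intercept = mean_y - slope * mean_x
    return (threshold - intercept) / slope - (n - 1)

# Utilization growing 2% per day from 50%: the trend hits 80% at day 15,
# i.e. 6 days after the last of the 10 samples.
util = [0.50 + 0.02 * day for day in range(10)]
print(round(days_until_threshold(util), 1))  # 6.0
```

Real capacity forecasting would account for seasonality and traffic bursts; the value of even this crude version is turning raw link counters into an actionable "expand by date X" signal.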
With many of these examples, when the capability is first launched it is not using AI in a dynamic, responsive way. Initially, the system often sets a static threshold that triggers an alarm. But as good grapes make good wine, good data makes good AI. It takes some time to accumulate data, and this is why Juniper has such an advantage over our competitors—we’ve been doing AIOps for 10 years with the Mist® platform. Good AI needs time to accumulate data and learn-train-learn-train and adapt, all for the goal of optimizing the user experience. In the data center, AIOps is still embryonic, but it is improving very quickly.
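The static-threshold-to-learned-baseline progression can be sketched in a few lines. This is a generic anomaly-detection pattern, not Juniper's or HPE's actual implementation; the fixed limit, the 30-sample warm-up, and the 3-sigma rule are all illustrative assumptions.

```python
from statistics import mean, stdev

STATIC_LIMIT_C = 70.0  # day-one rule: alarm above a fixed temperature

def static_alarm(temp_c):
    """Initial behavior: a hand-set threshold for every device."""
    return temp_c > STATIC_LIMIT_C

def learned_alarm(temp_c, history, k=3.0):
    """Once enough telemetry accumulates, alarm when a reading sits more
    than k standard deviations above this device's own baseline."""
    if len(history) < 30:              # not enough data yet:
        return static_alarm(temp_c)    # fall back to the static rule
    return temp_c > mean(history) + k * stdev(history)

# A switch that normally runs 55-57 C. A reading of 64 C clears the
# static 70 C limit but is clearly abnormal for this particular device.
history = [55.0 + (i % 5) * 0.5 for i in range(60)]
print(static_alarm(64.0))            # False
print(learned_alarm(64.0, history))  # True
```

The per-device baseline is what the accumulated data buys you: the same reading that a global static rule ignores is flagged once the system has learned what "normal" means for that switch.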
Every company, in the broadest sense of enterprise business transformation, should be grabbing foundational LLMs and fine-tuning them. Enterprises should be vectorizing the treasure troves of corporate data that they are sitting on to feed into an AI model through retrieval augmented generation (RAG). Every company that sells software should be experimenting with tying that software to LLMs and other AI models. Closer to the networking industry, we expect model context protocol (MCP) to be a key facilitator for agentic AI. If you haven’t built an MCP server for your enterprise software, do it now!
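For readers new to RAG, the retrieval step it describes can be sketched in miniature: represent documents as vectors and rank them by cosine similarity to the query, then feed the top matches to the LLM. Production systems use learned embeddings and a vector database; the bag-of-words embedding below is a deliberately simplified stand-in to show the idea of vectorizing corporate data.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy embedding: word-count vector (real RAG uses learned embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, top_k=1):
    """Returns the top_k documents most similar to the query; these are
    the passages that would be prepended to the LLM prompt."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

docs = [
    "switch firmware upgrade procedure for the data center fabric",
    "quarterly sales figures for the EMEA region",
    "optical transceiver failure troubleshooting guide",
]
print(retrieve("how do I upgrade switch firmware", docs))
```

Swapping the toy `embed` for a real embedding model and the sorted list for a vector index is, conceptually, all that separates this sketch from a production RAG pipeline.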
A significant amount of AI innovation over the coming years will be customer-led. When vendors put open systems into the hands of customers, you can get amazing and even unexpected results. Throughout corporate history, many industries have been transformed by revolutionary innovation driven not by suppliers, but by end users.
May you live in exciting times
Most of us are buried in information about AI and the speed at which it is moving. We want to stay informed but not overwhelmed. However, AI is also more accessible than ever. Anyone can download just about any of the thousands and thousands of AI models available from Hugging Face, for free! A newbie can easily build an MCP server and connect it to a number of data sources. And if you get stuck, just ask Claude to help out. LLMs are beginning to feel almost like human entities that you can interact with. It’s an exciting time to be a network engineer.