The Cloud Gambit

Roundtable: GenAI, Smart Agents, Automation, and What’s Next for Networking

June 04, 2024 | Episode 23 | William Collins

William sits down with Chris Wade, Co-Founder and CTO of Itential, and Craig Johnson, Technical Solutions Architect at Forward Networks, on the roof of the Hyatt in sunny Tampa Bay for a spirited roundtable. The focus is practical and the conversation is raw. How is generative AI impacting network infrastructure? Why has network automation seen slow adoption in the enterprise? Is AI driving a more verticalized tech stack? And what is next for network infrastructure?

Where to find Chris
LinkedIn: https://www.linkedin.com/in/chris-wade-9100136/
Twitter: https://x.com/chris_a_wade

Where to find Craig
LinkedIn: https://www.linkedin.com/in/captainpacket/
Twitter: https://twitter.com/captainpacket
TikTok: https://www.tiktok.com/@captainpacket

Follow, Like, and Subscribe!
Podcast: https://www.thecloudgambit.com/
YouTube: https://www.youtube.com/@TheCloudGambit
LinkedIn: https://www.linkedin.com/company/thecloudgambit
Twitter: https://twitter.com/TheCloudGambit
TikTok: https://www.tiktok.com/@thecloudgambit


Intro:

William sits down with Chris Wade, Co-Founder and CTO of Itential, and Craig Johnson, Technical Solutions Architect at Forward Networks, on the roof of the Hyatt in sunny Tampa Bay for a spirited roundtable. The focus is practical and the conversation is raw. How is generative AI impacting network infrastructure? Why has network automation seen slow adoption in the enterprise? Is AI driving a more verticalized tech stack? And what is next for network infrastructure?

William:

Welcome to another episode of The Cloud Gambit. Today, coming from beautiful Tampa, Florida, I have Chris Wade from Itential. You want to introduce yourself, Chris? Thank you.

Chris:

My name is Chris Wade. I'm the CTO at Itential. We build network automation and orchestration software.

William:

And I also have a Cloud Gambit alum, Craig Johnson, from, what, like six or seven episodes ago. Yeah, I remember. So you want to introduce yourself?

Craig:

Absolutely. Craig Johnson, Technical Architect at Forward Networks, and I lead our cloud team. We are a network digital twin for your multi-cloud network.

William:

Awesome. So none of us are from Tampa.

Craig:

Nope.

William:

You're closer. You are in Dallas?

Craig:

Dallas.

Craig:

So two and a half hours or so. Similar kind of weather, so yeah, not too bad.

William:

Not too bad. So we were here for the USNUA event. They call it FLONUG.

Intro:

Florida NUG, yeah. Show them the shirt.

William:

Yeah, it's great. Good shirt. So, a community event around network infrastructure, just all things networking. We had a blast last night, and we decided, hey, we'll record an episode here on the roof. So, getting into some topics: generative AI.

Craig:

Oh no.

William:

It's Craig's favorite topic in the world. You really can't walk into grandma's house anymore without having a discussion about generative AI; you're hearing about it everywhere. It's getting crammed into every product out there, and it's monopolizing the news cycle. I know with new tech innovation, when cloud hit the scene, like ZTNA, Zero Trust, serverless is a big one too, they just become polarizing and folks almost get tunnel vision. But nothing compares to the hype machine that we're seeing with generative AI. How do you both feel about the actual value, in real-life application at this point, of generative AI in the domain of network infrastructure and automation?

Craig:

So, having lived through the SDN and OpenFlow days, this is a little deja vu for me. I think there is absolutely some value. The value I see is more in some of the copilots that are out there. Instead of a generative AI to, you know, solve your problems and do the configuration for you, I think there's a lot of value in things like config completion, giving you config snippets and blocks, suggesting a config that you might need to solve a task. I'm a little more skeptical at this stage of the game of a full closed loop that will actually suggest, run, and validate the config, the entire automation, for you. I don't think we're quite at the level where we can remove humans there, but I do see a lot of value in copilots, starting out in the coding space and now moving a little more into the network automation and infrastructure side. Started off with an easy one, huh? Going right for the throat, exactly.

Chris:

So, you bring up OpenFlow; I would compare this a lot to what we've been through with the CLI. I think a lot of these first applications have really been built for humans. So, whether it's augmentation in code development or prompt engineering, it's very human driven. You asked how it's going to apply to networking; I think we really need to think about machine interfaces. It's kind of a bummer for me. As we automate and orchestrate the network, we're trying to remove that human tax, and to think that we're going to orchestrate prompts seems a little strange to me. It seems very interesting, but I'm looking forward to the machine interface to these things. Copilots are so generic. I'm looking forward to the next couple of weeks as we see very smart agents that actually know what they're doing, that we can ask smart questions and get smart answers. So I think this first generation is very interesting, but maybe not as applicable in the automation loop.

William:

It's more like augmentation for humans as we get started. Yeah, I totally agree, because really, at the end of the day, especially as we've seen with cloud adoption, it's about what enterprises are actually adopting, what businesses are willing to pay for. And right now there are a lot of AI products out there and, from the looks of it to me, they're still trying to figure things out. There's so much exploratory stuff going on. And of course, you bring up automation and orchestration. Many organizations out there are just struggling to automate their networks. I think in a lot of cases there has been progress with individual, task-based automation, like, hey, we need to add some VLANs or we need to do these individual things, and those are all extremely low-risk changes, usually. What is the right way to think about the separation of automation and orchestration, and sort of fitting those in, in the enterprise space?

Chris:

So I typically like to think about our cloud friends and how, you know, I think in a more controlled environment we've been able to make more progress.

Chris:

So automation is very domain specific. From my perspective, how we automate a firewall, or how we automate a data center, versus how we automate cloud, it's quite different. When we think about orchestrating things, the typical pattern, from an infrastructure-as-code perspective, is really pipeline driven. So as people think about what do I put in my Terraform plan versus what do I put in my pipeline, it's kind of task-based execution that you want, that you don't want to touch much, and then some of the fallout logic, testing logic, end-to-end logic, we tend to put in that orchestration layer. So I think, as people think about tying multiple domains together, typically that task-based automation you're talking about gets very convoluted, with a lot of business logic. And then there's our script heritage of fork and modify. Trying to keep that separation allows us to keep our automation very reusable; you can share it, and you can use it for multiple use cases. So it's really that separation, I think, of domain automation, with orchestration tying those together.
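
The split Chris describes can be sketched in a few lines. This is purely illustrative, not any product's API: the per-domain tasks stay small and reusable, while sequencing, end-to-end validation, and fallout handling live in the orchestration layer. All names here (Orchestrator, add_vlan, advertise_route) are hypothetical.

```python
# A minimal sketch of domain automation vs orchestration, under the
# assumptions above. Domain tasks carry no business logic of their own;
# the orchestrator owns ordering, validation, and fallout.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Orchestrator:
    # Reusable, domain-specific tasks registered by name (firewall, datacenter, cloud...)
    tasks: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def register(self, domain: str, task: Callable) -> None:
        self.tasks[domain] = task

    def run(self, plan: list, validate: Callable) -> bool:
        """Execute domain tasks in order; end-to-end validation and the
        rollback decision live here, not inside the per-domain automations."""
        state: dict = {}
        for domain, params in plan:
            state = self.tasks[domain]({**state, **params})
            self.log.append(f"{domain}: done")
        if not validate(state):
            self.log.append("end-to-end validation failed, raising fallout")
            return False
        return True

# Two tiny "domain automations", simulated as pure functions over a context dict.
def add_vlan(ctx: dict) -> dict:
    return {**ctx, "vlan": ctx["vlan_id"]}

def advertise_route(ctx: dict) -> dict:
    return {**ctx, "advertised": True}

orch = Orchestrator()
orch.register("datacenter", add_vlan)
orch.register("cloud", advertise_route)

ok = orch.run(
    plan=[("datacenter", {"vlan_id": 100}), ("cloud", {})],
    validate=lambda s: s.get("vlan") == 100 and s.get("advertised", False),
)
```

The point of the shape, not the details: forking and modifying `add_vlan` per use case is what makes task automation convoluted; keeping the use-case logic in `run` keeps the tasks shareable.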

William:

I guess to sort of add on to that. I worked on, and I remember this very specifically because it was a pretty rough project, a big automation initiative. The business had a few different network teams, so we actually had a separate team for campus, a separate team for colos, and we sort of had a separate team for cloud at the time.

William:

You know, it was when automation was super hot, and we were trying to find ways to, I guess, bring our change management into the future. We were still really heavily ITIL based. And we had this thing, I think we named it Greenlight API, and then it just got renamed to Greenlight. Basically, we would choose things, like a whole process, and some of it was automating prefix lists and things in a colo, or route maps, changing route advertisements to a specific cloud provider, but then also building net new things within the cloud. So we put all these things in a pipeline, worked with change management, and that became a process.

William:

But then, even after we started doing this, they wanted us to represent these things on change advisory boards, and there wasn't ever a case, even with net new stuff, where we could just run the pipeline, have somebody hit the rubber stamp, and then go. You know, this change advisory board, this traditional way of managing changes to really mitigate risk, I totally understand it, and it's a really hard thing to get away from. Do you have any thoughts on culture, on change management, on where these things are at and how they can kind of come into the future?

Chris:

Yeah, I would say, I mean, I think "technology easy, people hard" kind of comes in here. I think some of the struggles you speak about with automation come down to the fact that we actually have to change how we operate. We have to change the operations, and if we try to automate the way human processes work, we end up with a lot of kind of awkward things.

Chris:

One of the first use cases we always see with automation is automating the existing process, and as soon as you see it automated with computers or machines, you look at it and say, that makes no sense. So we actually have to modify how we do that. And I think the question is, we still have to orchestrate humans. I think CCBs are human orchestration; we're not going to get rid of that. But I think the key is to separate which things are human orchestration and which things can be machine orchestration, and really focus on the use cases and the aspects of automation and orchestration that can be accomplished fully end-to-end. In those examples where you kind of run into the wall, people start to say, well, we can't automate other things either. So I think use-case choice is often overlooked, along with changing how we operate versus automating the way we work today.

Craig:

I guess I take a little bit of a different tack on that, because I look at some of our partners on the other side of the IT world, and I think, by the nature of networking itself, we lack the validation.

Craig:

How often do we run something similar to a unit test on any sort of network automation or change?

Craig:

It's really not possible in a lot of cases, and I think that's where a lot of the skittishness comes in. I think we need to have something like that, some way to validate, so that you can give that change advisory board the surety: yes, we've tested this change; not simply, yes, I know this config will apply correctly, but I know that it will actually have the impact that I expected it to. And I think that's where we've all been burned way too many times. So even when you do that very domain-specific piece, where, yes, I can push this config change with Ansible or something and make sure the config applies correctly, it's still far too complicated, with all the protocol soup we have out there, to really make sure I know what the impact of that change is. And until we have a way to create that level of validation, I don't think that will change, at least across the entire business logic.
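
The "unit test for a network change" idea Craig raises can be made concrete with a toy model. Everything here is a stand-in (the device is a dict, the ACL semantics are simplified first-match); a real check would query the device or a digital twin. The point is testing the intent, not just that the config applied.

```python
# Hedged sketch: a pre/post-flight check around a simulated ACL change.
# Addresses and rule shapes are invented for illustration.

def apply_acl_rule(device: dict, rule: dict) -> dict:
    """Push a config change (simulated): prepend a rule to the ACL."""
    return {**device, "acl": [rule] + device["acl"]}

def permits(device: dict, src: str, dst: str) -> bool:
    """Evaluate simplified first-match ACL semantics against a flow."""
    for rule in device["acl"]:
        if rule["src"] == src and rule["dst"] == dst:
            return rule["action"] == "permit"
    return False  # implicit deny

device = {"acl": [{"src": "10.0.0.5", "dst": "10.1.0.9", "action": "deny"}]}

# Pre-flight: confirm the flow is blocked before the change.
assert not permits(device, "10.0.0.5", "10.1.0.9")

# The change itself.
device = apply_acl_rule(device, {"src": "10.0.0.5", "dst": "10.1.0.9", "action": "permit"})

# Post-flight: the intent check, not just "the config line is present".
assert permits(device, "10.0.0.5", "10.1.0.9")
```

The pre-flight assertion is what distinguishes this from a config-compliance check: it proves the post-flight result was caused by the change, not already true.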

William:

That's a good segue. You know, the sheer idea of, hey, we want to automate to standardize and to make things faster, it's not enough. You need a way to validate whether the changes you're going to execute are going to impact the larger environment, the larger infrastructure. And now enterprises are not just on premises; they've got infrastructure kind of everywhere, multiple clouds. It's no longer the days where it's the four walls of our data center and we own everything from top to bottom. So I would say, Forward Networks is leading in sort of the digital twin space, and y'all do some really awesome work. How are you viewing the future of this problem? Does the tooling need to change to support the generative AI stuff? Is that still getting figured out? Or does it not really impact the value proposition of a digital twin, and you just kind of hook it into what's happening right now?

Craig:

So, to your point on the machine interfaces, I think that's getting better among the vendors, and I think we are seeing the ability to have that level of automation so you're not having to do screen scraping and CLI. So I think we're getting a little better at being able to model what those devices do. And as we look inside the cloud, yes, because I can set up parallel infrastructures, it's easier to look at it from that perspective, so I don't have to modify my current infrastructure. I think you need to have some level of modeling and parallelization of your network, and vendors are getting better about providing tools and labs to do that.

Craig:

Obviously, you'll never get down to exactly what an ASIC will do, but I still think you're going to need some sort of parallel infrastructure where you can run those unit tests, run those pre- and post-flight checks, to make sure not just that I applied the config correctly, but that my intent is correct. So whether it is Forward Networks or any other type of twin out there, I think you have to have something like that, and to me that's the enabler of a cultural change, because then I can prove one way or the other that, yes, this change had the impact I wanted it to, or no, it didn't. So I don't exit the change with a broken network. I mean, it's far too complex these days; you think you've made a change correctly, you've automated it, you've pushed it out, but it's not until some weird app that you had no idea about stops working on Monday morning.
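
One way to read the parallel-infrastructure point is as a snapshot diff: compute reachability on a model of the network before and after a change, and flag every pair whose reachability changed, so collateral damage shows up before Monday morning. This is purely illustrative; a digital twin would derive the model from real device state, and the host names and links below are invented.

```python
# Hedged sketch of a pre/post reachability diff over a toy network model.

from itertools import product

def reachable(links: set, a: str, b: str) -> bool:
    """BFS over undirected links (each link is a frozenset of two hosts)."""
    seen, frontier = {a}, [a]
    while frontier:
        node = frontier.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return b in seen

def snapshot(links: set, hosts: list) -> dict:
    """Full reachability matrix for all ordered host pairs."""
    return {(a, b): reachable(links, a, b) for a, b in product(hosts, hosts) if a != b}

hosts = ["app", "db", "web"]
before_links = {frozenset({"web", "app"}), frozenset({"app", "db"})}
before = snapshot(before_links, hosts)

# The change: remove the app-db link (say, a firewall rule tightening).
after_links = {frozenset({"web", "app"})}
after = snapshot(after_links, hosts)

# Diff: every pair whose reachability changed. Anything outside the pairs
# we intended to affect is the "weird app on Monday morning".
impacted = {pair for pair in before if before[pair] != after[pair]}
```

Here the change was aimed at app-db, but the diff also surfaces web-db, which only reached db through app: exactly the unintended impact a pre-flight check is meant to catch.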

Chris:

And because nobody can check every application on the network, and nobody can be on the call during a change at night to check it all, as well. Just to add one thing there. I mean, I like to think we're in the trust-and-confidence business sometimes, and I keep going back to, when we're talking about this, you're talking about unit tests; these are all machine interfaces, right? So I think historically we've had a lot of blinky lights and other human-focused network management concepts, and if we're going to get to the point where we can run things without a human in the loop, I think we have to think machine first and change how we operate.

William:

Yeah, totally agree with that. And so, roles and responsibilities. There's a lot that the vendor community can do. I know the three of us work for Forward Networks, Itential, and Alkira; we're sort of in the integration play, where no enterprise I've ever worked for, or ever seen, is going to use a single vendor's product for all security or all networking. And with the progress of APIs, the progress of software, it's a lot easier to integrate with other vendors, as we all know. Nowadays, with APIs, we're kind of speaking the same language most of the time, which is really awesome. It's not a hey, I need to file a ticket or a feature request with this vendor that's been around forever, that maybe manages boxes, and, oh, it just disappeared into the product management suite and you don't hear anything for a year. So I guess the question is: is there something more that vendors can do to make this transition better for enterprises? And also, what can enterprises do, as far as their thought process and adoption, to make things smoother with these changes?

Chris:

A lot to unpack in that question. So I would say I find it unacceptable for modern software vendors not to be able to interoperate, and to require a huge army of integrators to make that happen. But it is a little different in the sense that we have standard technologies for integration; we can talk about REST APIs, some of the gRPC stuff going on at the lower layers, NETCONF, and other things. If you look at hyperscaler documentation and APIs, I can read the docs and figure out what to do a lot of the time. But when we do integration, amongst the three of us and also industry-wide, I often have to ask: which APIs do I call, and in what order, to accomplish the outcome I'm looking for?

Chris:

So I think we all like to post our APIs; some of us hide them behind, you know, NDAs and other things. I wish everybody just put them on the Internet. But I really think the documentation, and understanding how to use the APIs, is the key bit. So as far as enterprises go, I would find it unacceptable for vendors not to figure that stuff out amongst themselves. I think we've reached a technology point where that is table stakes. But as far as how to use it more, I think we have to collectively work on making it easier for people to know what our APIs are and how to use them, and I think documentation is number one, two, and three.

William:

Totally agree.

Craig:

I do sympathize with the vendors a little bit because it's not easy.

Craig:

If you don't have an architecture that's API-first, it's hard to bolt it on afterwards. I mean, we've all probably worked with APIs that are simply wrappers around CLI commands and things like that, where you can tell they bolted it on afterwards. And if you don't have a data model that's built from the beginning, if you don't have a good API-first application, it's hard to retrofit. That being said, I totally agree on, you know, when you paywall your APIs and you make it difficult, that just makes everything difficult for everyone else, and I understand that's a demand-driven, product-driven thing. But I do think that is one place vendors can absolutely improve, even though I understand they don't want to divulge potentially competitive info, things like that. If it's not API-first, and if a vendor can't show an example of customers consuming it without the fancy GUI that I'm sure they're very proud of, then yeah, you should absolutely demand that of your vendors.
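
The difference between a CLI-wrapper API and an API-first design can be shown with made-up payloads. Both the CLI text and the JSON below are invented for illustration: the wrapper forces every consumer to re-implement fragile screen scraping, while the API-first response is the data model itself.

```python
# Hedged sketch: consuming a "wrapper around CLI" response vs an
# API-first structured response. Payloads are fabricated examples.

import json
import re

# What a CLI-wrapper API tends to return: a blob of screen text.
cli_wrapper_response = "GigabitEthernet0/1 is up, line protocol is up\n  MTU 1500 bytes"

def parse_cli(text: str) -> dict:
    # Every consumer re-implements brittle parsing like this, and it breaks
    # the moment the vendor reformats the output.
    status = re.search(r"is (up|down), line protocol is (up|down)", text)
    mtu = re.search(r"MTU (\d+)", text)
    return {"oper": status.group(1), "protocol": status.group(2), "mtu": int(mtu.group(1))}

# What an API-first design returns: the data model directly, no scraping.
api_first_response = json.dumps(
    {"name": "GigabitEthernet0/1", "oper": "up", "protocol": "up", "mtu": 1500}
)

scraped = parse_cli(cli_wrapper_response)
modeled = json.loads(api_first_response)
```

Both paths yield the same facts here, but only one of them survives a cosmetic change to the CLI output, which is the retrofit problem Craig is describing.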

William:

Yeah, I agree. I think in the very early stages of a startup, I can see them wanting to keep things walled off and everything. But once things are real, once the rubber hits the road and things are happening, why isn't it out there? It's actually going to help you; it's going to help you with marketing and sales to have it out there. It's actually an advantage to you to make those things public.

Chris:

In my opinion, I would say my sympathies are at an all-time low on that topic. But if I could say something positive about it, it's that I think the networking layer has stabilized, which I think is the most positive outcome that's happened over the past maybe 12 to 24 months, and with that stability people can start thinking about what they're going to build on top of it. You know, if I'm adding a new SD-WAN, if I'm adding a new middle mile, if I'm adding a new colo, if I'm doing multi-cloud, if I'm putting all these pieces of infrastructure in place, I don't know what's changing. I'm constantly thinking: I'm adding maybe zero trust on top of my firewalls, I have all these things changing, I don't understand what's connected to what. So my level of automation maybe isn't there. But with modern infrastructure and with stability in the networking, there's a real focus on building operational platforms among the people I speak with.

William:

Yeah. Chris, you've brought up the hyperscalers a few times, and I know that for some of the folks that have been in networking at large, some of the pioneers, some of the heavy hitters, there was a lot of reluctance for a long time: can you actually automate a network end-to-end? Oh, it's too complicated, it's this or it's that, everything is a snowflake. But the hyperscalers did it, and they did it pretty daggone well. I would say that AWS, Azure, GCP, Oracle, they've got a good thing going. Is there any merit to learning from those bigger companies? Even though, under the hood, it's not like you have total insight into exactly how they execute and do everything, you kind of get the gist: we know the technologies, we know the options, and it's not like TCP/IP, BGP, and all these things have changed drastically in the last 20 or 30 years. So any thoughts there? Can we learn from the cloud and not be scared of it?

Chris:

I think we learn more from the people using the cloud. I think most enterprises said they wanted data centers to run like public cloud. There is some merit there, but what I'm always impressed with is just the scale and usability that's driven. So I like to look at how people leverage the cloud for scale and ask: how can we recreate that scale? You know, in the networking world we always love abstractions, we love data models. That comes from our historical pain of multi-vendor networks and vendors not having standards on top.

Chris:

So we've tried to apply that thought process, I think, in the cloud domain, and, right or wrong, there aren't 27 public cloud vendors that most of us use, like we used to have with vendor infrastructure: firewalls, load balancers, route-switch, everything. Trying to build abstractions, trying to make things abstract so I can move payloads around, that's the type of stuff we used to hear like five years ago. But the cloud providers have continued to build more aggressive automation and operational excellence, which allows them to scale. And I think it's worth looking at how we operate the cloud and asking how we can apply those principles to our infrastructure. And then, Craig, on what you were talking about earlier, with some of the nuances of brownfield and uniqueness in the network and trying to build unit tests around that: we have to really evaluate the time, effort, and energy versus the value prop there; the mass-scale value benefit versus the kind of bespoke uniqueness that we like to project a lot of the time.

Craig:

So I have a slightly different take. I think we've lost something in abstracting everything that we've done from the network. If I look at what the cloud providers do, they of course have plenty of abstraction layers, but they get huge amounts of data from all of their devices, all the way down to the ASIC level, and they demand that from their vendors. They don't just say, give me your CLI; they want access to the data that comes from every port, everything that happens. And I think we've somehow sacrificed getting the data that we need from the network, from the flows, from how everything's going, into

Craig:

you know, lots of different abstracted protocols and abstracted APIs, which is great for automating and pushing config, but at the same time you lose something in that level of validation when you don't have that data and you're trusting that everything happening on, you know, a 40-teraflop ASIC comes down to simply five lines of CLI. You lose a lot of the context with it. So you're left to trust whatever the vendor says is true, and I know the hyperscalers definitely don't do that; they go all the way down to the metal. So I think there's something we lost there that we need to get back, to get that level of confidence in the network.

William:

That's a good point. And honestly, if things don't meet the status quo, or the demand, or the innovation that hyperscalers want to see, they roll their own. They'll build their own, they'll work their own supply chain, they'll do it all themselves and bring it in-house, and they have the deep pockets to do that, of course. So yeah, those are some really good thoughts there. So we're in 2024 now, and every vendor, every network vendor specifically, is also an AI company, and is also a network-as-a-service company. We kind of discussed this last night: what does network as a service even mean anymore? So what are you both seeing from network vendors as they attempt to actually incorporate AI, machine learning, the smart agents you brought up earlier? Are they going to take over, or is it going to start more practical, like maybe using LLMs to scrape documentation, generate config snippets, starting off small and slowly growing? What do you all see there? And actually, as a follow-up to that: Cisco Live is coming up. Are you both going? Absolutely. You going? Yeah, awesome. So I'm sure there's going to be some absolutely epic, amazing announcements in the keynote, of course.

Chris:

I'd love to hear y'all's thoughts on those things. So I guess I will be shocked if Cisco doesn't announce some smart agent next week. But it's going to be interesting to see how this plays out, because, I mean, Cisco has every TAC case, every Cisco config; I can't imagine anybody on the planet building a better intelligent agent for Cisco equipment. It would be awesome, in my flows and my unit testing, if I get an error message, to be able to ask the agent how this happened, versus me writing 84,000 unit tests. I'd write the unit tests that pertain to my business, but then I'd get to run some of those checks against a smart agent that knows generic Cisco equipment.

Chris:

But I think where it ends up is that it verticalizes the stack even more, because these AI agents are going to be very specific to the vendors. Cisco is not going to train on Juniper and Arista switches, right? So what that means is, you know, we have forever tried to abstract at the device level, this operational level, and I think AI is actually going to force it up. With controllers, we've had kind of this operational layer put on top of the infrastructure.

Chris:

The original intent of controllers was to be domain specific and vendor agnostic. That kind of went away, I don't know, around 2016 or '17, and I think AI is only going to accelerate that. So I think we have to think about how we live in a world where there's more verticalization in these technology domains, and we might be integrating at a slightly higher level, unless you're just going to scrape the controllers and the AI and all that operational intelligence off and deal with the bare metal, as you were talking about, and build it up from there. But I think the average enterprise would get great benefit. I mean, we've seen great benefit from controllers: mass pushes, a single point of software control, lots of benefits. My opinion is AI is only going to accelerate that.

Craig:

Yeah, I tend to agree, and, you know, you talk about Cisco: if I say I run a Cisco shop, which one of five different Cisco operating systems do I run, and what exactly does that mean? So even when you talk about a single vendor, you don't get out of this problem. And I tend to agree you need to push this up. Does that mean there will be LLMs on top of LLMs to do this?

Craig:

Very likely you'll see something like that. And I think in the last probably 12 to 24 months we have seen the "help me generate this config" or "validate that this config does what I think it does". But I think the next level is definitely those kinds of agents that maybe proactively tell me, hey, I see these log messages being hit, or I see these counters being hit, here's what that means, and give you some level of proactivity. So I'm hoping there will be some announcements out there; I would be surprised if there's not. But yeah, I think the next step is going to be not just what it does for ACI or what it does for Viptela, but how we abstract that across vendors. Because, like I said, even if you say you run one vendor, that one vendor has five different operating systems.

Chris:

Maybe turn the tables. I'm not a CCIE. I don't know if either of you are; I didn't look at your bios. Craig has like a thousand of them.

William:

A thousand of them, like all of them. He's got, like, a bajillion CCIEs over there.

Chris:

So, if I could, maybe a question. Something I think about is, when you talk about LLMs, I think the next version of AI is hopefully beyond that; hopefully we have some machine interfaces and some cool stuff. But we're talking about the obvious ChatGPTs, you know, scoring 1580s on the SAT; I think about my kids taking those types of tests. So how would the CCIEs on the street feel about a smart agent that's able to produce some of that content that's between people's ears? I'm just very interested in how people are going to think about that. Is it augmentation? Will Cisco slow-roll it because there's a concern about that? How does that work?

William:

So, before we get to the real answer from the expert over there, my take on that: most of the CCIEs I've hired and worked with over the years, and I never got the lab, I got the written at some point and then I actually had kids, and that test is a sacrifice beyond measure with your life. But with most of the CCIEs I've worked with, it's very deep. They do network design, network engineering, network architecture at a very fundamental level, many layers deeper. They are the real architects of some of the biggest networks.

William:

And it's almost like, you know, you hire a general contractor to come in and do your bathroom, who does a few different designs, and they don't really go that deep. They can't do really custom tile work; they're not going to do custom fixtures or actually build anything. They're going to buy everything, slam it in there, and give you the bill. CCIEs are like the really in-depth contractors who come in: custom tile, custom wall, custom everything. And a lot of times there are a lot of CCIEs out there, I take it, that are into automation and have learned to code and stuff, but we're all human. There's only so much we can really go deep on, and CCIEs are definitely deeper in the network design stuff. Craig?

Craig:

Yeah, it definitely goes to a very deep level, and, like you mentioned, I have far too many over the years. But the idea that I think is most relevant here is that even the best CCIE doesn't deal with a 300,000-line firewall config any better than anybody else does. So yes, you may understand what's going on, and that's where I think those AI agents can help. Even if I know exactly what every line or, theoretically, what all the rules will do, that's still a lot of data to parse and process, and it's really just too much for any human to know. Even the best CCIE in the world, once you get above maybe a 100-device network, can't do it on their own. There are just too many variables at play, even if you understand it. And it's very helpful in the troubleshooting space.

Craig:

Now, for better or for worse, I do think that is where a lot of the AI agents will really help, because the forensic analysis you have to do when a network's broken takes that level of CCIE, to know exactly how everything is forwarding, because when it's not working, you know exactly what you need to fix. I think that's going to be a huge application, along with those kinds of early-warning analyses. Not everyone can see some error message on some weird switch somewhere and know that nine hours later it's going to take my entire data center down because it causes a loop or something. Those kinds of early-warning systems, you know, a NOC would see the message but not know what it means. An AI agent would absolutely be able to start to interpret that in conjunction with everything the vendor has, assuming they open those kinds of things up and know that, oh yeah, when you see this message, we've seen 20 other TAC cases that are P1 down issues. So, absolutely, I think so.
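
The early-warning idea Craig describes can be sketched as a simple rule-matching pass over syslog lines. A minimal sketch, with entirely hypothetical log signatures and interpretations; a real agent would draw on vendor TAC data, as Craig suggests:

```python
import re

# Hypothetical knowledge base mapping log patterns to likely impact.
# These signatures are illustrative, not from any vendor's database.
KNOWN_SIGNATURES = [
    (re.compile(r"STP.*BLOCK"), "possible spanning-tree loop precursor"),
    (re.compile(r"%BGP-5-ADJCHANGE.*Down"), "BGP neighbor lost; check reachability"),
]

def triage(log_lines):
    """Return (line, interpretation) pairs for lines matching known signatures."""
    findings = []
    for line in log_lines:
        for pattern, meaning in KNOWN_SIGNATURES:
            if pattern.search(line):
                findings.append((line, meaning))
    return findings
```

The interesting part of the agents Craig anticipates is not the matching itself but the interpretation step, correlating a low-severity message now with an outage pattern hours later.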

William:

I have another thought here; those are some really good points. One thing I see too, and I'm a product of this, unfortunately: I used to be a full-time network engineer. My first network project actually was decommissioning CSS load balancers. When I was a network engineer a long time ago, I used to eat, breathe, live Cisco networking, TCP/IP, BGP, all the time.

William:

So the more abstraction that gets injected into network products over time, the less deep you have to go to fix things, to build things, and to do things on an enterprise network. What I'm seeing a lot of is network engineers being asked to do more, so a lot of them have almost been forced into becoming generalists in a lot of ways, because you have to keep the lights on, you've got to keep the network running, and there's a lot of cross-adjacency stuff now. Still, I wouldn't say that I'm a complete generalist. I still write code every day, I still have a Kubernetes cluster running in my basement, I still have BGP and stuff, but I do it a lot less.

William:

And the less you do that stuff... it's not like riding a bike. You don't use OSPF, and I haven't touched OSPF in a long time, and I had to set some up. I did some OSPF with FRR and Libreswan IPsec the other day; we set up some stuff with Containerlab. Getting back into it, I'm just slower. It takes a lot longer to spin this stuff up, but I still understand the fundamental mechanics. Just interoperating everything is not an easy task when you're not in it day in and day out.
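
A lab like the one William describes is usually defined in a Containerlab topology file. Here's a minimal sketch that builds the same structure as a clab YAML file in Python; the node names, FRR image tag, and `linux` kind are illustrative assumptions, not taken from William's actual setup:

```python
def containerlab_topology(name, image, nodes, links):
    """Build a minimal Containerlab-style topology as a Python dict.

    `nodes` is a list of node names; `links` is a list of
    ("node-a:eth1", "node-b:eth1") endpoint pairs, mirroring the
    layout of a clab topology file. Serialize to YAML before use.
    """
    return {
        "name": name,
        "topology": {
            "nodes": {n: {"kind": "linux", "image": image} for n in nodes},
            "links": [{"endpoints": list(pair)} for pair in links],
        },
    }

# Two FRR routers connected back to back, ready for an OSPF adjacency.
topo = containerlab_topology(
    "ospf-lab",
    "frrouting/frr:latest",          # illustrative image tag
    ["r1", "r2"],
    [("r1:eth1", "r2:eth1")],
)
```

Part of what makes re-learning faster today, per William's point, is that spinning the whole lab up is one declarative file rather than racked hardware.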

Craig:

Yeah, that's absolutely true. I'm very much the same way, and I think that's really the hard part about networking: nothing ever goes away, nothing ever gets retired. In software development, in all other IT spheres, there are trends and then things get supplanted: okay, nobody uses Perl anymore, we move away from the old stuff. We still use spanning tree; we still use everything I started with in networking in the late nineties. Everything is still there; nothing's gone.

Craig:

We've added a whole bunch of layers and abstractions on top of that. And yes, I may be talking SRv6 today, but everything under there is still all the same, and I think that's really hard for people getting into this. When I took my CCIE back in 2001, sure, it was BGP, OSPF, and all that stuff is still there. Now, if I take something similar, I have to add a whole bunch of things on top of that, plus I still have to know all that stuff below it. So it makes it really difficult for anyone at this level to not be a generalist, because you're having to keep up with all that old technology.

Chris:

Yeah, just one thing to add: the complexity is almost going in the opposite direction. You know, when we were integrating boxes, whatever was inside the Cisco router was inside. Now we're cracking that open: we're doing Linux networking, we're doing service meshes, we're doing container networking. It's almost like the complexity in networking used to be the raw volume, and now it's going the opposite direction, maybe some chemistry reference or something, but now we're breaking it apart. So I mean, the positive part is that, protocol-wise and otherwise, we haven't had a ton of new stuff.

Chris:

Most of the innovation, I would say, over the last eight years was macro management, controller abstraction, and now it's going the opposite direction. I think the complexity on compute is now making its way into networking, and the variety of skills it takes to understand the full stack, to your point, warps your brain a little bit as you start doing Containerlab and then you're thinking about macro WAN networks and then it's all tied together. So, at the end of the day, the provocative questions were not about any sort of replacement of ideas or concepts. It's really about augmentation and the need for all of this knowledge for us to have a grasp of what's going on. And I don't see an end to the complexity factor. I do see an end to kind of the raw network innovation that's been happening.

William:

Yeah, yeah, I agree, you're right. I remember I had to troubleshoot this issue where there was a problem between a data center that we had and a Kubernetes cluster and a cloud and all the things. It's not as simple as following a packet through some network devices: layer two frames in this interface, out that interface, redistributed into that routing protocol, the whole packet-walk thing. No, I'm going over this circuit into this environment, into this routing table, okay, up to the cloud, trying to figure out what the heck a cloud route table is, figuring that out, and then, okay, going up into service mesh land, which is a completely different world, completely new skills.

Chris:

It's just a different place. As much as we wanted v6 end to end, you know, not everything. I mean, the application teams are like, I care about my Kubernetes pod, and I don't need to worry about the rest of the world, so I'm going to make this super simple.


Chris:

Yeah, like, you know, we're cleaning up the app teams' garbage. I mean, at the end of the day, we're here to support apps and humans, right? So unfortunately, or fortunately, sometimes these app teams drive the bus, and they want their world simple. And to your point earlier, that complexity gets pushed to a spot. The complexity exists; we've got to deal with it at the right place. Well, and I think... I thought you were going to start dancing.

Craig:

For me, it's why I enjoy working in the cloud so much, because I know the cloud providers, under the hood, run all the same MPLS and SR networks, and PE and P routers; they do all of that stuff. But at my interface, the most complicated thing I have to do is BGP, and only a very small amount of it. I mean, it's always the dirty secret I tell people: it's really quite simple. Everything is static routes for the most part. There's a little bit more there, but it makes it very easy to work with.

Craig:

And so I think, to your point, obviously this complexity can be solved, because they do solve it. I mean, they run sometimes more complex networks than most enterprises do. They write their own NOSs; in some cases they write their own protocols to mesh everything together and don't just use stock BGP. So obviously it can be done. I think, maybe to your earlier point, perhaps the vendors just don't provide the right level of interfaces to abstract that just yet.

Chris:

Yeah, just one use case that I think is always interesting: OpenConfig. I'm just throwing out lots of stuff here, but who drove that, right? It was AT&T and the hyperscalers. So what they did was they said, these are the products we're going to support.

William:

We're only going to support these APIs and you have to comply. There's a lot of smarts behind doing that. You set yourself up for success.

Chris:

Exactly, and I think that's kind of your underlying point: they said, we're going to scale, we're going to simplify, and that was the starting point, right. And then if you say, hey, I want this API call because I have this esoteric requirement, they say no.

Craig:

Yeah, I have to wonder. You say OpenConfig, but why hasn't that translated down to the rest of the enterprises? Of course the hyperscalers got it for their use cases, but we don't see OpenConfig in every protocol and every configuration out there. It's only that very small set. What's prevented it from becoming something the rest of the world uses, and why hasn't it been adopted? Just your thoughts.

Chris:

Now that's super loaded. I brought it up, so I guess I get to answer it. I mean, at the end of the day, it was a way to simplify operations, right? And they knew what products they wanted to offer. A lot of times we're wiring up stuff for applications we don't know are coming, right? We're trying to guess what the requirements will be in the future. They come at it from the product down: we're going to offer this product in the marketplace, what features do I need? And then they go to the vendors and say, you're going to support this.

Chris:

Now, did every vendor support OpenConfig the right way, or did they write a shim layer on top of their existing YANG models? That's vendor-by-vendor specific, and kind of not the point. But they say no more than they say yes.

Chris:

If you go to the OpenConfig sessions, they are very opinionated in what they want to support. And in the enterprise, the people making the requirements of those teams can't satisfy all those requirements with the limited scope of OpenConfig. So I don't think every enterprise can get to the point where they support OpenConfig. But at the end of the day, we need to understand what our customers want, we need to understand what the application teams want, and then build the simplified infrastructure to support that. Not that brownfield is going to go away, not that spanning tree is going to be eliminated, but if we start thinking about where we're going, if we've got to push the boat forward, it's a little bit top-down.

Chris:

And if you can back into supporting OpenConfig, great. I just use it as an example; it's a bit of a lightning rod for a lot of people. But it's like, why won't they bend to my use case? And at that point, that's how some of these NOSs got so bloated, because of every feature request that they stuffed in there.
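
For readers unfamiliar with it, OpenConfig models configuration as vendor-neutral YANG trees, typically serialized as JSON. A minimal sketch, using a simplified subset of the published openconfig-interfaces model (the real model has many more leaves):

```python
def oc_interface(name, description, enabled=True):
    """Build a minimal OpenConfig-style interfaces payload.

    Simplified subset of openconfig-interfaces, for illustration only.
    """
    return {
        "openconfig-interfaces:interfaces": {
            "interface": [
                {
                    "name": name,
                    "config": {
                        "name": name,
                        "description": description,
                        "enabled": enabled,
                    },
                }
            ]
        }
    }

# The same payload shape works against any vendor that implements the
# model, which is the operational simplification being described here.
```

The hyperscalers' opinionated stance Chris describes is essentially this: one payload shape, pushed over gNMI or NETCONF, instead of a per-vendor CLI dialect.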

William:

That's such a good point. I mean, think back to the earlier days of AWS and EC2, VPC, Virtual Private Cloud. If they had said, okay, we're going to support everything a customer asks for, okay, we're going to hairpin traffic with every VPC, now we're going to insert this here, do this there, taking on all these things, their scale would not be what it is today.

Chris:

We do love our nerd knobs, yeah, exactly. And they could have come out with Transit Gateway a long time ago, right? They held their ground; they were trying to understand the right product and the right fit for the vast majority of enterprises. And now what do we do? We build our infrastructure around those constraints. We build it, and the byproduct of that is the simplification; the byproduct of that is the operational scale and some of the other stuff we talked about.

William:

Well, that opens the door. So AWS is like building with Legos: the sky's the limit for the things you could potentially build on cloud and scale across the planet. And having that sort of architecture and those principles from a hyperscaler enables vendors like Alkira to build really intelligent, awesome things on it, things that scale across that cloud provider to other cloud providers and such. So it's also a good thing. It provides you the tools to build really unique things that scratch the itch and fill in some of these gaps for larger companies out there.

Craig:

Yeah, I think that's correct, but I do think something kind of gets lost there. I mean, I hate to use the lock-in word, but if I have a Cisco router today and I'm upset with them, swapping it for a Juniper is, whatever, who cares. So it does wind up being, I am locked into a solution. Now, granted, there's not a whole lot of competing solutions, so that's not an issue. I can see why that gives people pause, and that's more of a traditional model, people thinking that they don't want to be locked in. And there really isn't that kind of lock-in; there's none of that level of vendor pricing that you're going to get from AWS.

Craig:

It's whatever you're going to scale up to. So I think that's why people are clamoring for these kinds of abstraction layers on top of the clouds. Do I think it's the greatest idea? No. I think you get yourself the worst of both worlds: you get the extra complexity of doing something totally different, the way Azure does it, and you don't get any pricing support for it either.

William:

Yeah, yeah, good point, and I know both of you actually have to catch a flight, so I'm going to ask one more question before we have to get out. I remember one time I was working on a large-scale automation project. It was cross-team, cross-discipline, and we were just having issues because each team was using their own tools; they were speaking their own language.

William:

Not everybody understood Git and, you know, software delivery, some of the methodologies. And I remember it's one of the times I was giving excuses to higher-ups; I was like an excuse box for a little while. And they were like, stop. The outcome I want is end to end: I want this whole thing to be automated. I don't care what the problems are with individual teams; figure out a way to make this cohesive across teams. That is the outcome. And it's much easier said than done, obviously. So when you think of the whole infrastructure domain of networking, you have cloud, you have network security, you have the wide area network, the campus, and application networking. How do we bridge the gap between all these teams so we can get there? Just any thoughts.

Craig:

Well, not to be self-serving, but the idea is everyone does need to speak the same language. Everyone has their own domain, their own particular config quirks and interfaces, but networking is remarkably the same no matter what: a packet goes in, you change some headers, it goes out a particular port. So if you can get everyone to speak the same language, whether you're a security team, a cloud team, or a network team, and if you can, I think I've used the word abstract way too many times this session, but if you can put in that sort of interface so that everyone is at least seeing end to end, whether you build your own type of digital twin in your lab environment or rely on something commercial or open source out there. Not to be self-serving on that, but to me that's really the key to it, because I don't think there are going to be any fewer vendors coming out. Some will come and some will go away.

Craig:

And, you know, economics. I've noticed the economics only have to change slightly for businesses to radically flip around. When compute gets a little cheaper, when storage gets a little cheaper, when cloud gets a little cheaper, there are vast shifts. So, you know, repatriation is the word right now; next year, maybe something totally different. We may go back to distributed computing or something like that. That churn will never stop as the economics change. But everyone needs to be able to speak the same language.

Chris:

So the words you used to have these people work together were a lot of software words, right? You know, we build software every day. The database team and the web server team and the message bus team don't necessarily need to truly understand what each other is doing, right? So the firewall team and the data center team have to be excellent in their domain, but we have to have interfaces so we know how to work with each other, and that means both technical interfaces and operational interfaces. So I think the question is how much software development principle is the answer here, versus the balance. We do want the most people to participate in automation.

Chris:

And sometimes the barrier of some of these software skills is maybe a bridge too far in some organizations. But I do think, you know, we talked about learning from hyperscalers; I think there's learning from how software gets built. We talk about millions of lines of config, millions of lines of code, and how do we all participate in that in a very controlled way, so that when we ship product we can manage the level of defects.

Chris:

So I still think maybe a little less Git and a few fewer pipelines, and maybe more SDLC-type concepts. These teams need to build functional areas of excellence. We can learn from how Linux was developed: each area needs to be excellent, and you need to expose your interfaces in a meaningful way so you can tie them together. Too many times there are pockets of nothing, and we try to peanut-butter over it with, you know, I know this tool or I know this concept and I'm going to apply it across the board so I can get to end to end. That might not be the right tool for each domain; we're not going to solve the message bus with a database strategy. I'm trying to apply software principles, but I think that's ultimately the thought process that gets us there, and trying to shove everything into one system or use one tool is ultimately going to be a problem, as you brought up at the beginning.

William:

Great points, both of you. Hey, look, this has been fun. Thank you for joining me on this beautiful roof. Thanks for scheduling the weather. So, do you have any socials anybody can follow you on? I can include them in the notes.

Craig:

Yeah, LinkedIn. Yep, you can find me on LinkedIn or at @captainpacket on X/Twitter.

William:

All right, I will include those in the show notes. Thank you, gentlemen, and good luck getting to your flights on time.

Generative AI Impact on Network Infrastructure
Improving Vendor Interfaces and Automation
Networking Trends and Cloud Infrastructure
AI's Impact on Network Engineering
Networking Complexity and Innovation Trends
Bridging the Gap in Networking Infrastructure
LinkedIn and Twitter Connections