Infinite ML with Prateek Joshi

Breaking New Ground With Collaborative Robots

Prateek Joshi

Brad Porter is the founder and CEO of Collaborative Robotics, where they are building robots that will seamlessly blend into our surroundings. They've raised funding from Sequoia Capital, Khosla Ventures, General Catalyst, and Lux Capital. He was previously the CTO of Scale AI. Prior to that, he was the VP of Robotics at Amazon. He has bachelor's and master's degrees from MIT.

Brad's favorite book: Mike Mulligan and His Steam Shovel (Author: Virginia Lee Burton)

(00:00) Introduction
(02:11) Collaborative Robots Explained
(05:17) Building Blocks in Robotics
(11:01) Architecture of a Cobot
(14:12) Safety in Industrial Settings
(18:08) Sensors in Cobots
(20:20) Power Consumption and Optimization
(23:31) Zonal Compute Architecture
(26:34) AI Models for Task Planning
(30:32) Reasoning and Human Interaction
(35:00) Simulation to Real-World Deployment
(38:49) Multi-Robot Coordination
(41:57) Technological Breakthroughs in Robotics
(45:29) Rapid Fire Round

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

Prateek Joshi (00:01.202)
Thank you so much for joining me today.

Brad Porter (00:04.797)
Yeah, nice to be chatting.

Prateek Joshi (00:07.858)
Let's start with the fundamentals. Can you explain the idea behind collaborative robots? And also, how do you distinguish or where's the line between just a normal robot and a collaborative robot?

Brad Porter (02:11.165)
Sure. The idea is that when I was at Amazon, we led their robotics expansion. We took advantage of a lot of the structure that exists in those environments. Amazon innovated in pouring very flat concrete floors. And we were able to fence in robots. And it just...

We need robots everywhere. And I think the pandemic really kind of showed us the fragility of our labor system and our supply chains and our logistics infrastructure that leads us to need robots in more spaces than just in Amazon warehouses. And yet the challenge of deploying robots into all these spaces is that you don't have the

the freedom Amazon had to design new buildings or new process pads. You need robots that can work in and around us and generally in and around humans. And so the idea behind collaborative robotics is how do we bring a new type of mobile collaborative robot into the world around us that can work in lots of different commercial environments from

warehouses, logistics, manufacturing, but also hospitals, airports, stadiums, and have something that's, you know, has human level capabilities, particularly in the area of moving, you know, material movement, boxes, totes, and carts, without needing the complexity of a humanoid, but with something that's quite a bit more capable than today's traditional

mobile robots, which are generally very low profile, generally depend on fairly flat floors. So that's the idea behind collaborative robotics. I think that to me, when I think about collaborative, I think there are multiple levels of collaboration. There's side by side work, there's turn taking work, there's joint work. And ultimately I think

Brad Porter (04:34.43)
robots are going to do all of that. But the progression right now is from kind of where side by side equals fenced off and in a different space to side by side means sharing the same space that humans share and then ultimately progressing to more turn taking, more joint work. And I think our thesis there is that the robots need to have an understanding of what the humans around them might do.

But also humans need to have a reasonable expectation of what the robots might do. And so that collaboration comes down to a lot of expectation setting.

Prateek Joshi (05:17.906)
That's amazing. And you've published a wonderful set of articles describing your vision around collaborative robots or cobots for short. I want to dive into that. First, you've talked about the concept of building blocks in robotics. So can you explain what these building blocks are? And also, how do you see these blocks being deployed in practice?

Brad Porter (05:49.789)
Yeah, I think about building blocks. I think, so we often start from like, how do we automate a particular process, right? And maybe that process is, you know, bin picking, picking, you know, an item from a tote. That's one process building block. Maybe the process building block is sorting something. You have something, you want to put it in

multiple different bins or destinations, right? Maybe that process is moving a pallet or loading a pallet onto a shelf. What we then do is we start to think about how do we compose those different processes as we automate them? How do we go from unloading something from a truck to taking that pallet to sticking it on top of a

shelf to then bringing that pallet back down from the shelf to taking the boxes off the pallet to unloading, you know, extracting things out of those boxes, packing them into a new box, shipping them out. Right. You can draw a whole process map. When we talk about building blocks, the question is how many different mechatronic technologies are you going to need to do all of that? Right. And what I just described, you might need, you know, an autonomous forklift

and an autonomous, you know, palletizer and de-palletizer and something that can extract from a box. Our observation is that you would ideally like to have as few different robotic technologies as you can to do that.

You know, you look at other robotics companies websites and they show this like beautiful portfolio of like 30 different robots. But the reality is like 30 different robots is 30 different things to integrate to get working together, right? And you're trying to assemble and create these process paths. And then you don't get a lot of reuse. And so everything becomes a little bit brittle. It's a little bit like having, you know, a Lego kit, which is all specialized pieces versus, you know, just this, you know, the four by two.

Brad Porter (08:13.469)
bricks and blocks. With the 4x2 brick, you can build a lot, right? With the specialized pieces, you can kind of build just that one specialized thing. And so when I think about building blocks, that is kind of the analogy. I'm thinking about, okay, you know, if you think about it from Lego bricks, if I have a 4x8 brick and I have some wheels and I have a big platform, like I can build all kinds of little cars, right? Versus if I just have a car kit with very specialized things, you can just build that one type of car. So...

So that's what I encourage people to think about when they think about approaching robotic automation is to think about, okay, if you're gonna go solve this one problem, let's say you're gonna solve kitting, right? You're gonna take a bunch of things from an inventory stock and place them into a box or a bag or a tote that is an assembled set of things for a task. You're gonna use potentially a robotic arm

If you use something that is very, very custom to that task, then you're not going to take the lessons from that experience to the next type of manipulation tasks that you have. Where if you can build that off of a common building block, like a, you know, a FANUC or an ABB or KUKA or Universal Robots arm, then the experience you gain from how to use that robot arm in there,

and the qualifications you did and the IT certifications you went through and the staff training and the maintenance, all of that becomes leverageable in those other process paths. And then you can start to redesign your whole system saying, I've got a great building block for manipulation. We think that the building blocks for mobility, for moving human scale loads around are underdeveloped, right? They tend to be fairly simplistic. They run on very flat floors.

very low to the ground. They tend to only interface with their own types of containers or carts. And so it's not a very easy building block to take into different parts of an operation. So take a hospital operation. They have a, you know, most hospitals have some kind of warehouse where they're storing offsite stuff and then they're bringing those up to patient floors. Well, if you have to have different robots in the warehouse than you have in the hospital,

Brad Porter (10:36.445)
than you have on the patient floor. That's now three systems you've got to figure out how to manage. Whereas if you get good at one and it works in all those environments, then you're leveraging all that expertise. So that's what I mean when I say trying to be really smart about picking your building blocks so that you find robots that can work in multiple process paths and that gives you the most flexibility over time.
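As an aside, the Lego-brick analogy maps naturally onto software. Here is a minimal sketch, assuming a hypothetical set of primitives (move, grasp, release) rather than Cobot's actual stack, of how a few reusable building blocks compose into very different process paths:

```python
# Illustrative sketch: composing a small set of reusable robot "building
# blocks" into different process paths. Primitive and location names are
# made up for illustration.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    action: Callable[[], None]

def move_to(location: str) -> Step:
    return Step(f"move_to({location})", lambda: print(f"driving to {location}"))

def grasp(item: str) -> Step:
    return Step(f"grasp({item})", lambda: print(f"grasping {item}"))

def release(item: str) -> Step:
    return Step(f"release({item})", lambda: print(f"releasing {item}"))

def run(process_path: List[Step]) -> None:
    for step in process_path:
        step.action()

# The same three primitives compose into very different process paths.
warehouse_transload = [move_to("dock 3"), grasp("cart"),
                       move_to("dock 9"), release("cart")]
hospital_delivery = [move_to("warehouse"), grasp("supply tote"),
                     move_to("patient floor 4"), release("supply tote")]

if __name__ == "__main__":
    run(warehouse_transload)
    run(hospital_delivery)
```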

Prateek Joshi (11:01.234)
With that framework in mind, when people ask you about the architecture of a cobot, how do you explain it? Not just by saying, hey, here's the hardware components, here's the software components, but more like a robot has to sense its surroundings and then extract some information from it and then take an action. So in this case, how do you think about the architecture of a cobot?

Brad Porter (11:33.725)
Yeah, I mean, I think, you know, to a large extent, it's a systems problem, right? You're trying to bring together electrical, mechanical, hardware. You're bringing in sensing, you're bringing in compute, you're bringing in software, very low level software, higher level software. You're bringing together machine learning, AI, you know, you're both bringing together kind of classic techniques and

you know, SLAM planning and newer techniques, maybe large language models and higher level planning. And so when I think about the architecture, I mean, the first thing to acknowledge is it's a systems problem. You need to bring all of that together. And then you need to do it in a way where you understand the operating environment in which the robot's going to work and how humans are going to interact with it. And so

All of those things have to come together to develop the architecture.

I mean, it's not an easy exercise to try to bring together something that's new and different. I think what we look at is, for our robot, we've looked at, we want to work in human spaces alongside humans, moving human scale loads. Volumetrically, that means you're kind of constrained to the similar constraints of a human. If you want to be able to go through a door,

you can't be much more than 28 to 30 inches wide, right? If you wanna be able to scan the environment around you, you kinda wanna scan the environment from, I don't know, five foot 10 inches in the air, kinda the place where humans would scan. If you wanna deal with ramps and bumps and other things, and you, I don't think you need bipedalism, but you need some flexibility in the mobility, maybe you need some

Brad Porter (13:40.445)
slightly larger wheels or a little bit more suspension. So we think about the architecture as looking at the, you know, the class of problems you're trying to solve and then how do you put together a system that is best set up to solve those. And then you obviously have to bring in factors like, you know, how long will the batteries last? How often, you know, do you need to sit on the charger? How much does it cost? How much does it cost to operate? How do the reliability and maintenance

provide confidence? There's a lot that has to come together.

Prateek Joshi (14:12.306)
Right. And within an industrial setting, you talked about how cobots and humans have to be in the same space and work together. And those industrial robots are strong. When they're moving, if you're like a second late, the arm might accidentally hit somebody in the head and that could be catastrophic. So what are the primary challenges you encounter

in real-time interaction between cobots and humans, especially when it comes to things like this, like safety?

Brad Porter (14:51.133)
Yeah, I mean, I think it's really important to start from the physics of the power you have in the motors, the mass you have in the robot, the ways those forces might translate into the world around you, and then making sure that you have the sensing and perception and you're kind of managing all those forces, right?

We as humans just kind of naturally navigate the world without realizing that it doesn't take that much force to hurt yourself. I was out biking yesterday and I hit a pothole that I wasn't expecting, and the handlebar jammed my hand, right? And yeah, I mean, still a day later, it's pretty sore and not very flexible. And it's like,

All that was was like a two inch pothole, but you know, two inch pothole going 15 miles an hour, stiff frame, handlebar goes in your hand. You know, you can injure yourself. So it's really critical to be aware of the physics of the system that you're building and then to make sure that you have the

the controls, the software, the stop procedures, everything designed, with those forces in mind so that the experience of working in and around a robot shouldn't be like, we shouldn't have to teach people that they shouldn't stand next to a humanoid robot because it might fall over. Like we shouldn't have to teach them that because you and I, if we were standing next to each other, aren't worried about one of us collapsing onto the other.

Prateek Joshi (16:43.378)
All right.

Prateek Joshi (16:49.17)
Right, right, right.

Brad Porter (16:49.181)
Right? And even if we did, we're not made of metal. And so actually, if you collapsed onto me, I'd, you know, try to catch you, right? Try to break your fall. Cause your fall could be harmful, but you're not that likely to seriously injure me if you were to fall. But something metal standing next to you that glitches and collapses is different. And so we think a lot about that, like it just intrinsically, if you've got four points of contact on the ground,

and the center of mass of the robot is over those four points, then it's stable, then it's not going over. And so there's other things you can do to make sure that physics is helping you and not working against you.
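To make the stability point concrete, here is a minimal sketch of the static check Brad is alluding to: the robot is statically stable when the projection of its center of mass stays inside the support polygon formed by the wheel contacts. The footprint numbers are purely illustrative:

```python
# Minimal sketch of "center of mass over the support polygon": a statically
# stable robot keeps the 2D projection of its center of mass inside the
# rectangle formed by its four wheel contact points.
def is_statically_stable(com_xy, contacts_xy):
    """Return True if the CoM projection lies inside the convex support
    polygon defined by the contact points (given in order)."""
    x, y = com_xy
    n = len(contacts_xy)
    sign = 0
    for i in range(n):
        x1, y1 = contacts_xy[i]
        x2, y2 = contacts_xy[(i + 1) % n]
        # Cross product tells us which side of edge (p1 -> p2) the CoM is on.
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False  # CoM is outside the support polygon
    return True

# Four wheel contacts forming a 0.6 m x 0.8 m footprint (illustrative numbers).
wheels = [(0.0, 0.0), (0.6, 0.0), (0.6, 0.8), (0.0, 0.8)]
print(is_statically_stable((0.3, 0.4), wheels))  # True: CoM centered
print(is_statically_stable((0.9, 0.4), wheels))  # False: CoM past the wheels
```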

Prateek Joshi (17:34.162)
Right, that's a very good way of looking at it: just understanding the physics and starting from there to make sure that nothing catastrophic happens. And also on top of that, based on the needs of a cobot, what sensors are incorporated in a cobot? And also, is it more than an average robot? Or is it the same? Or maybe it's the way you process

the data from the sensors that's different. So how do you think about the sensing aspect here?

Brad Porter (18:08.925)
Yeah, I think,

You know, my general philosophy is you start with making sure you have enough sensing to give you the type of coverage that you want. You don't really want blind spots. If you want to precisely know how far something is away from you, a laser scanner, a LIDAR is far more accurate than a radar or a vision system that's trying to interpolate 3D. So that's how we think about it, is we think about

You know, lidars have an important role to play. Stereo depth cameras have a very important role to play. 3D lidars and 2D lidars. And then you're, you're fusing those different modalities into, you know, software that can then process that to determine, am I close to, you know, where am I in the world and how far away am I from something that

you know, I'm trying to, you know, make sure that I don't bump into. And so that's how we approach sensing: use the right sensors for the type of understanding of the world that you're trying to get. And then over time, you know, there's always a pressure to bring down costs, right? And so

you know, over time you may come to learn that there are different ways to cover some of those blind spots that don't require quite as many cameras, maybe go to a bigger fisheye camera, maybe, you know, your need for lidar is more narrow and you don't need as much of a 3D world and you can then kind of bring the cost down as you invest more in the sensing and perception algorithms.

Brad Porter (20:07.069)
Our philosophy is to start from having enough sensing that you feel confident that you know what's going on in the world around you and you're not having to, the robot's not having to guess.
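As a rough illustration of the fusion idea, and not Cobot's actual perception pipeline, here is a tiny sketch that combines a 2D lidar scan and a depth-camera point cloud into a single conservative clearance estimate:

```python
# Illustrative sketch: fuse a 2D lidar scan and a stereo depth camera's point
# cloud into a single "closest obstacle" estimate the planner can act on.
# Sensor layout and numbers are made up for illustration.
import math

def lidar_min_range(ranges, max_valid=25.0):
    """Closest return from a 2D lidar scan (list of ranges in meters)."""
    valid = [r for r in ranges if 0.05 < r < max_valid]
    return min(valid) if valid else float("inf")

def depth_min_range(points_xyz):
    """Closest point from a depth camera cloud, measured in the robot frame."""
    dists = [math.sqrt(x * x + y * y + z * z) for (x, y, z) in points_xyz]
    return min(dists) if dists else float("inf")

def fused_clearance(lidar_ranges, depth_points):
    # Conservative fusion: trust whichever sensor reports the nearer obstacle.
    return min(lidar_min_range(lidar_ranges), depth_min_range(depth_points))

lidar_scan = [4.2, 3.9, 1.8, 2.5]              # meters, a few beams of a scan
depth_cloud = [(1.2, 0.1, 0.4), (2.0, -0.3, 0.2)]
clearance = fused_clearance(lidar_scan, depth_cloud)
print(f"closest obstacle: {clearance:.2f} m")
if clearance < 0.5:
    print("slow down or stop")
```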

Prateek Joshi (20:20.306)
Power consumption is a critical aspect of robot design. And you mentioned that you want to have enough sensing, enough compute to make sure that the robots are actually doing well. So how do you optimize the computational load and power consumption of cobots while maintaining performance?

Brad Porter (20:44.221)
Yeah, so one simple thing is that it turns out the motors tend to consume more power, you know, mass, acceleration, you're moving things around the world, that takes more power, typically than the compute does. Now, that doesn't mean the compute isn't material as well. It just means you have more dimensions and more factors that you can...

you can play with when trying to manage the power budget. I think the other advantage to being something that's mobile and on the ground is, I've worked on drones as well, I worked in the Amazon Prime Air program, and on a drone, your weight budget, your forces, you're overcoming the force of gravity, right? And so the amount of power you can put in those motors is dependent on the amount of weight. And so you're always in this,

this kind of tricky trade-off where more batteries would be great, but more batteries means more weight, which means more, and there's kind of an optimal point in that curve. And it's not ideal yet. We would still like to have lighter weight batteries, higher energy density at the same weight. You know, automobiles, we think about, maybe it's the power of moving the 6,000-pound Tesla. It really turns out that, you know,

wind resistance is velocity squared, right? So the faster you're going, it doesn't go up linearly, it goes up as a square. And so that's why it's so important that those be very aerodynamically efficient. And for a mobile robot, really, most of the power is going into

How much are you trying to lift? How much are you trying to move? How far are you trying to traverse? Are you going up a ramp, down a ramp? Wind resistance, overcoming the force of gravity are less your concerns. And you can put higher density batteries in. So it can be hard to get an hour's worth of performance out of a drone battery, but it's not too hard to get four to six hours of lifetime out of a

Brad Porter (23:09.405)
robot battery.
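For context on the velocity-squared point: aerodynamic drag force is roughly F = 0.5 * rho * Cd * A * v^2, so drag power grows with the cube of speed. A back-of-the-envelope sketch with illustrative numbers shows why drag dominates for a highway car but is negligible for a robot moving at walking pace:

```python
# Back-of-the-envelope sketch of the "wind resistance is velocity squared"
# point: drag force F = 0.5 * rho * Cd * A * v^2, so drag power F * v grows
# with the cube of speed. All numbers below are rough illustrative values.
rho = 1.2  # air density, kg/m^3

def drag_power_watts(cd, frontal_area_m2, speed_mps):
    force = 0.5 * rho * cd * frontal_area_m2 * speed_mps ** 2
    return force * speed_mps

# A highway car: Cd ~0.23, frontal area ~2.3 m^2, 30 m/s (~67 mph).
print(f"car drag power:   {drag_power_watts(0.23, 2.3, 30.0) / 1000:.1f} kW")

# A mobile robot at walking pace: ~1.5 m/s, small frontal area.
print(f"robot drag power: {drag_power_watts(1.0, 0.5, 1.5):.1f} W")
```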

Prateek Joshi (23:11.89)
You've talked about zonal compute architecture and how it's advantageous over a centralized architecture. So can you explain what zonal compute architecture is and specifically compare and contrast that with the centralized architecture?

Brad Porter (23:31.357)
Sure. So a zonal compute architecture is what you might imagine. It's that you have computation in different zones of the robot, right? And so as I described, in our robot, we're sensing the world from kind of human height, human vantage. And so we have an NVIDIA compute module that is fusing all that sensor data, processing it, and then

understanding the world at a slightly higher level and sharing that information generally over ethernet with another computer onboard, which is handling the motion control and those kind of controls and systems. And then you sometimes have a computer just dedicated to lower level safety functions.

If you drive a Tesla, every once in a while you have to reboot your Tesla. But that's only the console, only the display. You're never rebooting the motor controllers, right? There's another layer of compute in there that is engineered a little bit differently. You don't have to reboot it. And so that's the idea behind zonal compute is that we can have a computer dedicated to a particular function of the robot.

sensing or planning or motion control or grasping, loading and unloading carts, boxes and totes. And each of those computers has enough dedicated compute to handle the task at hand. And then they communicate with each other whatever relevant information maybe needs to be integrated between the sensing and motion control. But this is a way to

subdivide the work and make sure that you're not, again, you don't want a system where, man, your Bluetooth just failed, so your car is trying to reconnect the Bluetooth. Meanwhile, like, Wi-Fi went out and the user's pressing the radio station and like, a whole bunch of things are going on. You don't want that to affect whether or not the accelerator works, right?

Prateek Joshi (25:47.538)
You

Prateek Joshi (25:52.722)
Right.

Prateek Joshi (25:56.082)
Right, right, right, right.

Brad Porter (25:57.789)
And so that's the idea of decoupling the kind of layers of the system and compute. And it's also a way to make sure you have enough compute on board for the various different tasks you're trying to do.
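A toy sketch of the zonal idea, with made-up zones and messages rather than Cobot's real architecture: each zone runs in its own process with its own compute budget, and only small summaries cross the boundary, so a hiccup in one zone doesn't stall the other:

```python
# Toy sketch of zonal compute: independent processes for a perception zone and
# a motion-control zone, exchanging only the summaries they need over a queue.
# A real robot would use dedicated compute modules and Ethernet; this just
# illustrates the decoupling.
import multiprocessing as mp
import time

def perception_zone(out_q):
    for i in range(3):
        # Pretend we fused lidar + cameras into a distance-to-nearest-obstacle.
        out_q.put({"t": i, "nearest_obstacle_m": 2.0 - 0.7 * i})
        time.sleep(0.1)
    out_q.put(None)  # sentinel: done

def motion_zone(in_q):
    while (msg := in_q.get()) is not None:
        speed = 1.0 if msg["nearest_obstacle_m"] > 1.0 else 0.2
        print(f"[motion] t={msg['t']} obstacle={msg['nearest_obstacle_m']:.1f} m -> cmd {speed} m/s")

if __name__ == "__main__":
    q = mp.Queue()
    procs = [mp.Process(target=perception_zone, args=(q,)),
             mp.Process(target=motion_zone, args=(q,))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```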

Prateek Joshi (26:14.706)
For task planning and execution, do you use AI models that are installed on the device, or maybe a combo of on-prem and cloud? So can you explain the AI models being used for task planning and execution?

Brad Porter (26:34.045)
Yeah, so there are, as we said, a couple layers of task planning as well, right? I mean, one level of, so the higher level of task planning is just understanding, what am I being asked to do? Where do I need to go? What do I, you know, and sometimes that's, so we're working in a transload facility right now, we're moving transload carts. We have a vision system that detects that there's a cart in a spot that says it's ready to go to another destination.

Right? And so that task assignment is self-assignment. The robot notices that there's a task to be done and then it assigns itself to go do that task, drives up to the cart, recognizes a tag that tells it where the destination is, grabs the cart, takes it to that other destination. Then it might notice that there's an empty cart in a ready to go spot and take it back. And so in that world, task selection and task assignment

is fully autonomous. The robot notices there's work to be done. There's a queue of work and it just completes it and does it. Task assignment could come from a warehouse management system that's saying, hey, I need you to move here or there. Task assignment can also come from a human talking to the robot and saying, hey, can you grab the cart over at dock three and take it over to dock nine?

And so there are lots of ways to instruct the task. And so for each of those, the task planner is a little bit different. Like in that warehouse management example, the task planner might be kind of in the cloud, the high level one, and it's just very precisely telling the robot what to do. In another world, the robot is self-identifying tasks. And in another one, the robot needs to understand human language; if you said, hey, grab that cart over there, maybe it's vision

picking up your gesture, right? And saying, okay, over there, it looks like it's dock three, right? Go over to dock three, see if there's something. And so we think tasks, this high level task plan, is a really, really interesting area for AI right now, because we can do far more with large language models and with these vision models and kind of incorporate when you say over there with that gesture.

Brad Porter (28:54.493)
You know, and that's what makes, that's what's gonna make robots much more natural and intuitive, more trustworthy, more collaborative, where you don't have to be super precise in specifying exactly how you instruct the robot to do things. Then obviously at the next level of planner is understanding a motion plan, which says, okay, I see there's free space, you know, over here, I'm gonna, you know, I'm gonna turn left up this little.

cart alley and like turn right to get to the cart and kind of forming its own motion plan that way. So there are a few levels to the planner, but I think what we're most excited about is AI is really helping at that kind of human communication and then resolving those interactions with humans down to, you know, a high level plan that the robot can then go execute as a lower level plan.
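A hedged sketch of what that high-level planning step could look like in code. The call_llm function, JSON schema, and task names are hypothetical placeholders, not Cobot's API; the point is that the LLM output is constrained to a structured plan the lower-level planners can execute:

```python
# Hedged sketch of the high-level planning step: turn a spoken instruction
# (plus a gesture-derived target) into a structured task the robot can execute.
# `call_llm` is a placeholder for whatever LLM endpoint you use; the schema
# and task names are hypothetical.
import json

PLAN_SCHEMA = """Respond with JSON only:
{"task": "move_cart", "pickup": "<dock id>", "dropoff": "<dock id>"}"""

def call_llm(prompt: str) -> str:
    # Placeholder: in practice this would call a hosted or on-device model.
    return '{"task": "move_cart", "pickup": "dock 3", "dropoff": "dock 9"}'

def plan_from_instruction(utterance: str, gesture_target: str) -> dict:
    prompt = (f"{PLAN_SCHEMA}\n"
              f"Operator said: {utterance!r}\n"
              f"Operator pointed at: {gesture_target!r}")
    plan = json.loads(call_llm(prompt))
    # Reject anything outside the allowed task set before execution.
    assert plan["task"] in {"move_cart"}, "plan outside the allowed set"
    return plan

plan = plan_from_instruction("Grab that cart over there and take it to dock nine",
                             gesture_target="dock 3")
print(plan)  # {'task': 'move_cart', 'pickup': 'dock 3', 'dropoff': 'dock 9'}
```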

Prateek Joshi (29:51.41)
That's actually gonna be a huge leap, integrating LLMs into robots. And the first thing that comes to mind is, hey, I can talk to a robot. I can tell the robot what to do and the robot does it. But as you mentioned, it's not that simple; humans don't talk at a robotic instruction level. We just say, hey, pick that box over there. So in addition to understanding speech and converting that to text, that part is simple.

It's a solved problem. What reasoning problems do you encounter or do you have to solve for the robot to feel natural?

Brad Porter (30:32.125)
Yeah, I mean, I think the, obviously the level of reasoning capabilities in large language models is an active area of a lot of debate. But they do have base level reasoning capabilities that are quite reasonable, right? And so, you know, when you say,

there's a package that arrived down at the mail room, can you go get it? Being able to translate that to, okay, I should go to elevator bank three, I should go down, I should go to the mail room, and then I should expect that I'm gonna get a box, right? The large language model can deductively anticipate

that that's what it should do and decide that that's the plan that it should follow. And really, large language models have that capability. So we're not,

Brad Porter (31:41.373)
you know, it's not the same as like the end game in chess where you're trying to look six moves ahead and trying to figure out how to use space. And it's not that type of reasoning complexity or even, you know, now people are trying to solve these like math Olympiad style word problems. Like that's, we haven't had to solve those types of reasoning problems. And so the type of reasoning problems that we have are, you know,

There is some spatial understanding. You need to understand what an elevator is, things like that. But it turns out large language models have a lot of that capability already. I think what we're interested in is can we give the robots more context? So for instance, maybe you want the robot when it's going by a bank of elevators to go a little bit wider around just because, or you know what, you know,

in a hospital, people with more mobility challenges tend to go slower and they tend to be more concerned if something's going quickly, right? And so as humans, we will almost naturally, if we can sense that there's someone who's kind of has mobility challenges, is going a little bit slower and maybe is slowing down, we will naturally slow down in kind of empathy, but it's almost subconscious. We naturally slow down with that kind of empathy.

We don't expect that, you know, a large language model developing a high level plan is gonna necessarily have that type of awareness. Or it can, right? We could build vision language models that can interpret and have that kind of empathy, but the ones out of the box aren't necessarily gonna be able to do that. And so we think there's, you know, an extra layer of refinement in some of these environments that helps people, you know,

I think about like, you know, if there's a code blue in a hospital, you know that not just because there's an announcement, right? You also can see how people are reacting. You can see lights, other things. And so we want robots to be able to understand the universe that they are operating in, and to be able to incorporate that into how they think about the plan. Maybe I should go around, right? You or I, if we came up on that and we're not on an urgent mission, the mail courier is not just going to walk through a

Brad Porter (34:08.157)
crisis situation, they're going to back off, find another way around, maybe come back later. You kind of want robots to have that same level of sophistication and understanding.
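A small sketch of that extra layer of context, with invented context labels and speed limits: the same motion plan gets a different speed cap depending on what the robot perceives around it:

```python
# Illustrative sketch: the perceived context around the robot adjusts the
# speed cap applied to whatever motion plan is running. Context labels and
# limits are assumptions, not real parameters.
def speed_limit_mps(base_limit: float, context: set) -> float:
    limit = base_limit
    if "slow_pedestrian_nearby" in context:
        limit = min(limit, 0.5)  # ease off around people moving slowly
    if "code_blue" in context:
        limit = 0.0              # stop, yield the corridor, replan a route around
    return limit

print(speed_limit_mps(1.5, set()))                                    # 1.5
print(speed_limit_mps(1.5, {"slow_pedestrian_nearby"}))               # 0.5
print(speed_limit_mps(1.5, {"slow_pedestrian_nearby", "code_blue"}))  # 0.0
```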

Prateek Joshi (34:20.146)
Simulation plays a big role in robotics development, and going from simulation to real world, sim to real, that's always a challenge because not every single situation can be simulated. And we have tools like NVIDIA's Isaac Sim and other tools to help with that. Can you describe the approach you take to go from simulation to real world deployment? And also,

is something like NVIDIA's Isaac Sim sufficient, or do you have to do something additional on top of that to make sure that you catch as much as you can in simulation?

Brad Porter (35:00.829)
Yeah, I think simulation is an incredibly important tool for robotics. And I think the investment NVIDIA is making in Isaac Sim and those tools is incredible. And they just announced Isaac Sim 4.0, and it looks like some amazing capabilities in that. Obviously, I think that there are other tools, Gazebo, other things that work well in particular tasks, MuJoCo.

We're not a kind of one simulator to rule them all. I think simulation is, in a lot of these ML tasks, reinforcement learning tasks, there's a phase where you're kind of exploring the policy space, where you haven't kind of got into the fast learning portion of the curve.

Diffusion models and the attention mechanism and transformers help you to get out of that broad-based explore phase much more quickly. But when you are exploring, it is far cheaper to do that in simulation than in the physical world. Now the nice thing is, once you've explored and you've kind of developed a policy that roughly works,

right, loosely, or works in sim. Then refining that policy with, you know, a smaller number of samples from the real world. And then

zero shot, one shot, few shot. The reality is how quickly can you move from simulation into a policy on a robot in the real world that works well. And yeah, we're continuing to see that the amount of time you have to spend in the explore phase is getting less and less. Imitation learning is one of the popular techniques right now to try to get you

Brad Porter (37:08.861)
out of that kind of explore phase and into an initial policy that's pretty good. And so, yeah, that's the progression we're seeing right now: some combination of imitation learning, to reinforcement learning in sim, to refinement with real-world samples, to transfer onto the robot. There are other things, just like modeling out what the process flow is gonna look like. We built

an Isaac Sim simulation for this cart movement task so that we could make sure that the planner worked before we could get into that warehouse and start doing it. So there are lots of different ways that you can use sim. Obviously, when we're developing our controllers and we're doing some of these force math studies, we're using simulation to understand, you know, is there a tipping risk for this robot? How, you know,

if we brake this fast, you know, if we apply the brakes on the motors this fast, what's gonna happen? And so, yeah, simulation just gets used in a lot of different ways and it's a super important tool.
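A toy illustration of the progression described above, using a linear policy and synthetic data in place of real learned controllers: fit an initial policy from simulated demonstrations (imitation learning), then refine it with a small, upweighted batch of real-world samples:

```python
# Toy sketch: behavior-clone an initial policy from simulated demonstrations,
# then refine it with a handful of real-world samples. A linear policy fit by
# least squares stands in for real learned controllers; all data is synthetic.
import numpy as np

rng = np.random.default_rng(0)

# 1) "Simulated demonstrations": state -> expert action (imitation learning).
sim_states = rng.normal(size=(500, 4))
true_weights = np.array([0.5, -1.0, 0.3, 0.8])
sim_actions = sim_states @ true_weights + rng.normal(scale=0.05, size=500)
policy, *_ = np.linalg.lstsq(sim_states, sim_actions, rcond=None)

# 2) "Real-world refinement": a few real samples with a slightly different
#    dynamics offset (the sim-to-real gap), upweighted and blended in.
real_states = rng.normal(size=(20, 4))
real_actions = real_states @ (true_weights + 0.2) + rng.normal(scale=0.05, size=20)
X = np.vstack([sim_states, np.repeat(real_states, 10, axis=0)])
y = np.concatenate([sim_actions, np.repeat(real_actions, 10)])
refined_policy, *_ = np.linalg.lstsq(X, y, rcond=None)

print("sim-only policy:", np.round(policy, 2))
print("refined policy: ", np.round(refined_policy, 2))
```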

Prateek Joshi (38:24.146)
And as you scale up the number of robots that are getting deployed, in the future, in the same warehouse, there'll be many robots. So how do you think about multi-robot coordination? Or rather, let's say there's a team that's in charge of that. What are the things they need to keep in mind while designing a system like this?

Brad Porter (38:49.053)
Yeah, I think this, the multi-robot coordination kind of comes back to our planning conversation a little bit, which is how are the robots figuring out their plan? And if the answer is there's a warehouse management system that's orchestrating everyone, which is kind of how Amazon's architecture works, then multi-robot coordination is pretty straightforward.

the warehouse management system said, bring this, you know, this tote over to this location. It arrives at that location. Maybe there's a sensor that says it arrives and then, you know, it puts the tote on the station and then the robot arm activates and does the work and it's all coordinated kind of in the cloud. The way humans sometimes do these processes though is we just, you bring the tote up to another human, the human says, there's a tote there and then starts doing the task.

I think robots will be increasingly able to do that as well. And so it comes to this, like, coordination, collaboration: how much of it is orchestrated from some oracle in the cloud that's planning out everybody's steps and workflows, and how much of it is organic, where the robot knows there's something. And I think what we're excited about is to really build flexible,

fluid workflows, you want to get closer to where the humans interact than some kind of rigid warehouse architecture that's scripting everything. And the reality is the world's going to be a little bit of a hybrid, right? Where I would call it loosely coupled hybrid, where maybe there's a system that is controlled by the warehouse management, but then when it gets connected from here to here, the warehouse management system just says, take it over to this. And that other subsystem notices it's

ready and works autonomously. And so we think that it's interesting. I'm a large scale distributed systems guy by background and training. And in those worlds, we have loosely coupled asynchronous systems, we have services and we have tightly coupled synchronous systems. We've moved away from tightly coupled synchronous systems where we can to more loosely coupled asynchronous systems because they're more flexible. They're more. And so

Brad Porter (41:14.941)
I think the same trend is going to happen in robotics, increasingly happening in robotics, where we're going to see subsystems loosely coupled and able to adapt more flexibly like humans would. And it's going to require less of kind of a fragile warehouse management orchestrator that has to get everything right every time.
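A minimal sketch of the loosely coupled pattern, purely illustrative: instead of a central orchestrator scripting every step, robots pull work from a shared queue as they notice it:

```python
# Sketch of loosely coupled, asynchronous coordination: robots self-assign
# work from a shared queue rather than being scripted step by step by a
# central orchestrator. Task names are made up for illustration.
import asyncio

async def robot(name: str, work_queue: asyncio.Queue):
    while True:
        task = await work_queue.get()
        if task is None:              # shutdown sentinel
            work_queue.task_done()
            break
        print(f"{name} self-assigns: {task}")
        await asyncio.sleep(0.1)      # pretend to move the cart
        print(f"{name} finished:     {task}")
        work_queue.task_done()

async def main():
    q = asyncio.Queue()
    for route in ["dock 3 -> dock 9", "dock 7 -> dock 2", "dock 1 -> dock 5"]:
        q.put_nowait(f"move cart {route}")
    robots = [asyncio.create_task(robot(f"robot-{i}", q)) for i in range(2)]
    await q.join()                    # wait for all carts to be moved
    for _ in robots:
        q.put_nowait(None)            # tell each robot to shut down
    await asyncio.gather(*robots)

asyncio.run(main())
```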

Prateek Joshi (41:41.394)
I have one final question before we go to the rapid fire round. As we sit here in mid-2024, what technological breakthroughs in robotics are you most excited about?

Brad Porter (41:57.981)
So the biggest challenge we have in robotics right now is very high dimensional controllers. Say, the human hand has somewhere on the order of 46 degrees of freedom, right? The Shadow mechanical hand has about 26 degrees of freedom. Trying to write the controller for those is very hard. I mean, OpenAI did some very impressive early work with their

learned dexterity models. And again, that's managing a 26 dimensional controller. The problem is there isn't a controls engineer who can really solve a, you know, an open 26 dimensional controller. And if you get something wrong, if you get something wrong in an LLM, maybe it spits out some hate speech or something. If you get something wrong in a robotic controller,

if it's a balancing system, it probably falls over. If it's a grasping system, it might crush the thing that it's holding. So we need very robust controllers and we need to know that they're generally safe. And that reduces the dimensionality. That reduces how many degrees of freedom we can build those control policies for. And so if you look at a quadcopter,

four spinning blades, maybe six, IMU, like it's fairly constrained. Even like landing a 787 is not that many different degrees of freedom that you're trying to control, far less than controlling your hand. And so what I get excited about is, you know, as we develop more, as we can collect more and more data, there's just kind of this.

chicken and egg problem, which is if we don't have enough data to really train a general model, we don't have 13 trillion tokens of robotic motion, right? Like we do for text on the internet. And so there's this question, how much can we do without that amount of data? And then how are we gonna get that data if we can't do it? So that problem is really interesting because once we do solve it, once we could have something with the dexterity of a human hand, like I give this example all the time, right?

Brad Porter (44:24.765)
AirPods, right? How do you even open this thing? Right? Like, we teach each other: you flip the lid, right? But then as you look at how you extract it, if you try to just grab it on top, you can't get it out. You learn eventually that you put your thumb here and you pull it out and you transfer it over to your thumb. It's a very, very, very hard thing for a robotic manipulator. And by the way, we have incredibly sophisticated touch sensors in our fingers, and we just do it naturally.

We don't have that quality of touch sensor either. And so, to me, that dexterous manipulation, that bimanual manipulation, as we get that capability, robots are gonna be able to do incredibly sophisticated things in the real world. But right now, you can see that cable behind me, it's like an Ethernet cord; asking a robot to extract that Ethernet cord and push that little tab and pull that, we don't have that capability. And so,

I get really excited about what robots are going to be able to do when they have that capability, but I also think we're still a little bit out from that.

Prateek Joshi (45:29.17)
That's amazing. Actually, at this point, it's a hard problem, but I think it can unlock so many things. So that's wonderful. All right. With that, we're at the rapid fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready? All right. Question number one. What's your favorite book?

Brad Porter (45:48.573)
Sure.

Brad Porter (45:52.797)
My favorite book.

There's a children's book, Mike Mulligan and His Steam Shovel. It's still my favorite book. The, yeah, rapid fire, we'll keep going, but there's a reason.

Prateek Joshi (46:08.05)
All right, let's hear the reason.

Brad Porter (46:11.261)
Well, the book is about how the steam shovel gets antiquated and then becomes the furnace for this building in the city. And I would say my biggest lesson from that is like, never build yourself into the architecture. I still wanna be out doing work in the world. I don't wanna be, and so I've looked at that from a career lesson but also from a technology standpoint. How do you build flexible building blocks, or how do you build your career so that you're not locked

into just one thing that's rigid and immovable? How do you build that flexibility into your engineering, into your career, into everything?

Prateek Joshi (46:49.202)
I love that. Glad we dug into the reason on that one. All right, next question. What has been an important but overlooked robotics trend in the last 12 months?

Brad Porter (47:01.405)
I think what people are...

The use of imitation learning to learn new tasks very quickly with some of these end-to-end models is super impressive, but I think most people don't understand that it doesn't generalize yet. We can learn new tasks very, very quickly and they make really impressive videos and demos, but it doesn't generalize. So it's both really exciting progress and kind of misleading as to how much progress we've really made.

Prateek Joshi (47:29.074)
What's the one thing about robotics that most people don't get?

Brad Porter (47:34.653)
That robotics is, deploying robots is very, very hard, and it is change agent work. You need the whole organization to embrace it. You can't just have a department somewhere trying to bring in a robot. You need to get the whole organization behind things. And that kind of change leadership, what happened in SaaS is we learned how to just do product-led growth, adopt things very quickly and kind of bypass all that.

Robotics isn't like that. You have to get everyone on board.

Prateek Joshi (48:07.218)
What separates a great robotics product from a merely good one?

Brad Porter (48:14.333)
I think the great robotics products really think hard about not just how the robot gets the task done, but how do humans experience the robot? You know, Roombas are great, but those early Roombas kind of randomly wander around, which isn't really what you want. You want something that moves in a more, you know, predictable way. And so, you know, it's taken more

development to do that, but I think we don't think enough about how the humans experience the robot.

Prateek Joshi (48:50.066)
So what have you changed your mind on recently?

Brad Porter (48:56.189)
I think my timelines for how soon we might get very, very highly capable learned controllers, like bimanual manipulators, have come in significantly. I think I was in the kind of 10 to 15 year window, which by some folks was still wildly optimistic.

I think it's shorter than that now. I don't know if it's two years or six years or 10 years, but I don't think it's 15 years anymore. And so, yeah. And we're doing things at Cobot, assuming that those things are gonna come along sooner and making sure that we have the robot that can take advantage of all that.

Prateek Joshi (49:43.314)
What's your wildest AI prediction for the next 12 months?

Brad Porter (49:50.397)
AI prediction. I think we're going to see maybe some kind of step function improvements in reasoning. That's my hypothesis. And I think it'll be similarly mind blowing when we do have that.

Prateek Joshi (50:06.514)
Our final question, what's your number one advice to founders who are starting out today?

Brad Porter (50:16.957)
I mean, it's a long journey, so make sure that it's one you're really, really excited about. And then if you're really excited about it, just go do it. Don't worry about what everyone else says. If you're excited about it, if you believe in it, then just go do it. But if you're not sure, it's, you know, I mean, Amazon, you know, a trillion dollar company, but like 20 years later, right? You gotta be excited about that 20 year journey. But if you're excited about it, then go do it.

Prateek Joshi (50:44.338)
Yeah, that's an amazing nugget on which to end the podcast. Brad, thank you so much for sharing your insights here. Loved your wisdom and just the product building muscle that you've developed. I think it just shows. And I'm so glad that Cobot has a grand vision of building and deploying these robots. So thanks again for coming onto the show and sharing your insights.

Brad Porter (51:07.933)
Awesome, great to chat.

Prateek Joshi (51:11.826)
Alright.