Infinite ML with Prateek Joshi

Building Floor Cleaning Robots

Prateek Joshi

Mehul Nariyawala is the cofounder of Matic, where they are building the world's most advanced floor cleaning robots. He was previously the cofounder of Flutter, which was acquired by Google. During his career, he has held roles at Google, Like.com, and Salesforce.

Mehul's favorite books:
- The Martian (Author: Andy Weir)
- Lyndon Johnson series (Author: Robert Caro)
- Shoe Dog (Author: Phil Knight)
- The Making of the Atomic Bomb (Author: Richard Rhodes)

(00:00) Introduction and Core Technologies
(01:26) The Evolution of Floor Cleaning Robots
(09:42) Constantly Updating Maps for Optimal Cleaning
(12:53) Differentiating Between Known and Unknown Objects
(16:03) On-Device Processing for Privacy and Energy Efficiency
(22:31) Data Collection and Privacy in Home Robotics
(24:32) Reducing Noise in Home Robots
(30:04) The Importance of Simplicity in AI Products
(32:17) Overlooked Trends in Robotics: The Rise of Rust
(40:51) Challenges and Opportunities in the Commoditization of Consumer AI

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

Prateek Joshi (00:01.206)
Mehul, thank you so much for joining me today.

Mehul Nariyawala (00:04.72)
Likewise, thank you for having me. Really appreciate it.

Prateek Joshi (00:07.382)
Let's start with the fundamentals. You're building something amazing. Can you break down the core technologies that enable robots to clean the floors in a house?

Mehul Nariyawala (01:02.48)
Sure, so I'll take a little bit of a step back and give you some understanding of why we jumped into this space, which will help answer this as well. The floor cleaning robot space itself got started in the late 90s, early 2000s when the Roomba was launched. And actually the Roomba wasn't necessarily the first robot; Electrolux had launched the first one.

And it was an extremely amazing and clever innovation back then, because in those days there was no computer vision; it was a very nascent field. There was no ability to build a map. So they built this robot that can bounce its way through the home and clean. They just had a simple bounce sensor, and the idea was, like a Pong game, that it can cover the entire space. Very clever. They also did two other things: making it very flat, so it's hard for it to get stuck underneath things, and making it a circle, because a circle by definition can pivot

on the same axis if it gets in trouble. So both of those innovations were amazing steps. Then around 2005, 2006 I believe, Neato, another Silicon Valley startup, came up with the idea that instead of using just the bounce as a sensor, what if we use a laser pointer? And they used a single-pixel LiDAR, so think of it as what self-driving cars like Waymo have, but very, very rudimentary, just one laser. And it bounces off the walls and tells the robot where the obstacles are.

So those were the two steps of innovation in this robotics space. And since then, that's pretty much it. Along with that, they did have things like cliff sensors and proximity sensors or ultrasonic sensors, things along those lines, but that's really it. So from 2002 to now, 2024, for 22 years, these have been the two primary ways that robots have cleaned our spaces. Now the challenge is, and this is really why we

started this out, is that we were at Nest back in the day, and we were familiar with homes and all the chores we do in homes, and we knew that floor cleaning is a big problem. Some of these robots, because they don't have any understanding and contextual awareness, get stuck all the time, they chew wires, they chew Legos, all that stuff. And we just kind of said, look, there are 200-plus self-driving car startups in the world, there are 200-plus industrial robotics companies in the world, yet no one is paying attention to homes.

Mehul Nariyawala (03:22.64)
It's as if BlackBerry never improved, right? BlackBerry was super popular when the Roomba launched, and it's as if no one ever innovated on it. And so we just got really, really curious as to why. At a very high level, we came to the conclusion that if you're trying to build self-driving cars, you do need Google Maps and GPS. Those are enabling technologies, because the car has to know where it's located on the road and where the road is going. But when it comes to home or indoor environments, these robots have zero understanding of where they are,

Prateek Joshi (03:25.494)
Hahaha.

Mehul Nariyawala (03:51.504)
which part of the home or which part of the room they are located in. There are no Google Maps for the home and there is no GPS. So in order for a robot to navigate this space in a very accurate way, we have to give it the ability to map the indoor world on the fly, adapt to it, and then go from point A to point B precisely 10 out of 10 times. And that can only happen, in our view, using vision as a sensor, cameras as the sensor. The reason being that

the world is built entirely by humans, for humans, to fit our perception system, which is vision-based. So if you can give a robot vision as a sensor, then this is a problem we can solve. And that's how we started thinking about it. To be honest, this insight isn't rocket science, all the roboticists sort of know it, but the ability to build just based on vision, we think, only became possible around the 2018 to 2020 timeframe,

because what you needed was, if you're trying to bring vision and cameras into the home, privacy becomes an issue, number one. Latency becomes an issue, because you don't want a robot that says, Wi-Fi is weak, so I'm going to fall down the stairs, or oops, I can't function. And the third one is that indoor spaces are actually far more dynamic than outdoor spaces. They change all the time, so you have to constantly update these maps; they can't be static like Google Maps. So that's where we thought the enabling technologies were needed, which is the AI chips as well as

self-supervised learning and a few other algorithms. They really started becoming prominent in 2018-2019. Obviously, we've seen the advances ever since then.

Prateek Joshi (05:26.582)
Vision is a fascinating angle and there are so many questions here. But first, just to set the baseline, can you explain the common sensors that go into a robot like this? Obviously a camera is one of them, but just to function, what sensors do we need here?

Mehul Nariyawala (05:31.824)
Mm-hmm.

Mehul Nariyawala (05:42.512)
Mm-hmm. Mm-hmm.

So for us, at least for the Matic robot that we've built, we only have five simple RGB cameras, four omnidirectional microphones, and an IR illuminator for nighttime, because it's just simple eyes, it needs light. And we use IR light, which robots can see but humans can't. So those are the only sensors we have on our robot. Now, this is the very opposite of traditional robotics.

Traditional robots are usually a Christmas tree of sensors. They have proximity sensors, cliff sensors, so on and so forth. And this is because understanding the world based on vision alone is actually quite hard. That's challenging software. And that's where, as I mentioned, some of the new enabling technologies really arrived from 2018, 2019 onwards. But we took this contrarian approach where in some ways we don't have proximity sensors, we don't have cliff sensors, we just use vision to solve the problem. So our entire thesis

was that, look, if we can teach a robot to see the world the way we do, perceive the world the way we do, just using cameras and software, then software-based autonomy actually improves much faster. So we took very much a Tesla route to solving indoor autonomy, even for the floor cleaning space, compared to a lot of people out there.

Prateek Joshi (07:03.862)
Can you compare and contrast the mapping aspect for a robot like this? The two buckets I'm thinking of are doing it with other sensors, without relying too much on vision, versus a vision-first approach.

Mehul Nariyawala (07:08.208)
Mm-hmm.

Mehul Nariyawala (07:20.432)
Absolutely. So first I'll talk about the mapping itself and then we'll do the vision-first part. Fundamentally, historically, if you go and look at Google's Waymo project, which was called Chauffeur before it got started: Sebastian Thrun, who was the original co-founder of that project inside Google, built Google Street View maps because he knew it needed an understanding of the 3D world it was trying to navigate.

In the same exact way, we realized that if you want to navigate an indoor space precisely, you have to understand where things are, what the objects are, how far away they are. And for that, you can't have just two-dimensional floor-plan-like maps. The way to think about the bounce sensor is, imagine putting a blindfold over your eyes and then just walking, and as you bump into something, you just head off in a different direction. So that's really the bounce sensor.

The LiDAR-based maps are literally: you still have the same blindfold on, but now you have your hand out. So every single time you touch something, you draw a border; there is an obstacle here, so I'll draw a border. But that means anything above your hand's height or below your hand's height, you're not going to see, and you're going to run into it. So the LiDAR-based system typically allows you to build two-dimensional floor-plan-like maps. They usually have an understanding of

where the walls are or where the big counters and obstacles are. But they don't necessarily know if there is a pencil or wires or Legos or toys in the way. And this is the differentiation. What we are doing instead is building a full-blown 3D map, which means that as the robot moves around the floor, it's able to differentiate what's on the floor, all the way from whether it's a rug or a table to toys or dirt. And

our entire thesis, as I mentioned earlier, has been that if you can see it, then you can navigate around it. If you can see it, then you can actually adjust your cleaning efficacy, your cleaning method, everything along those lines. So that's really it. The differentiation is that instead of building a two-dimensional floor plan, we're giving robots the same understanding of the map that humans have, so that they can manipulate it and do things with it. And sorry, you also mentioned the vision-first approach, why are we doing that?
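
An illustrative sketch of the distinction described here (not Matic's actual code; every type and field name is hypothetical): a single-plane LiDAR yields something like a 2D occupancy grid of "something is here at sensor height", while the vision-based approach builds a 3D map whose cells carry semantic labels a planner can act on.

```rust
use std::collections::HashMap;

/// What a single-plane LiDAR can give you: occupied / free at one height.
type FloorPlan2D = HashMap<(i32, i32), bool>;

/// What the vision-based approach builds instead: a 3D, semantically
/// labeled map of what is actually on the floor.
#[derive(Debug, Clone, PartialEq)]
enum Semantic {
    Wall,
    Rug,
    Table,
    Wire,
    Lego,
    Dirt,
    Unknown,
}

#[derive(Debug, Clone)]
struct Voxel {
    occupied: bool,
    height_mm: u32,  // how tall the thing is
    label: Semantic, // what the robot thinks it is
}

type SemanticMap3D = HashMap<(i32, i32, i32), Voxel>;

fn main() {
    // The 2D map only knows "something is here at sensor height".
    let mut lidar_map: FloorPlan2D = HashMap::new();
    lidar_map.insert((0, 0), true); // a wall, a wire, a Lego? it can't tell

    // The 3D semantic map knows what it is, so the planner can decide
    // to vacuum it, mop it, avoid it, or ask the user.
    let mut vision_map: SemanticMap3D = HashMap::new();
    vision_map.insert(
        (0, 0, 0),
        Voxel { occupied: true, height_mm: 8, label: Semantic::Wire },
    );

    for (cell, voxel) in &vision_map {
        println!("{cell:?} -> {voxel:?}");
    }
    println!("2D cells known: {}", lidar_map.len());
}
```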

Prateek Joshi (09:36.502)
You mentioned.

Mehul Nariyawala (09:42.544)
So this is really coming from our Nest background, and this is where we'll jump into it: shipping any hardware product is hard, right? Hardware is hard, that's a cliche. At Nest, we had this rule of thumb that for every single sensor you add to the hardware, assume three software engineers on the flip side as a permanent cost. So the more sensors you have, the bigger the team. The more sensors you have, the more complex the product. More sensors, more calibration, a more complex supply chain, higher the BOM cost, so on and so forth.

So using a multi-sensor approach to build a robot as a demo is actually really easy. But then shipping that robot becomes very, very hard. And this is where you can find Kickstarter project after Kickstarter project of consumer robots that got funded, but by the time they were supposed to ship, either they were prohibitively expensive or they never got shipped, because complexity rises with each sensor you add to the hardware. So for us,

because we wanted to build in the consumer space, we had to make the robot affordable as well. The affordability piece of the puzzle means that we want to limit the sensors and absorb all the complexity into the software. So that's what we did.

Prateek Joshi (10:54.454)
Now, when you buy a new floor cleaning robot, it goes around, it builds a map. And as you mentioned earlier, indoors is way more dynamic. We keep moving things around, so the floor keeps changing for a robot: the chair was here, now it's there. So how do you dynamically update the map once it's made?

Mehul Nariyawala (11:05.136)
Mm-hmm. Mm-hmm. Mm-hmm.

Mehul Nariyawala (11:18.224)
So the analogy we use to solve this problem is very similar to how we as humans behave. What I mean by that is, imagine you go into an open house to see a home you want to potentially buy. The first thing you'll do is walk around. And as you're walking around, you're not actually optimizing for precise navigation or a precise point A to point B trajectory. You'll just explore and build a map. Now, if you go through that home for the first time and you go back the next day, you don't remember all the details.

But as soon as you spend about 10 hours in it, or a day in it, or two weeks in it, your information gets very, very precise. So in the same exact way, we let the robot first explore the home and start building up that information, but as it moves around, it is constantly observing the changes in the environment. It's constantly saying, hey, here's how the map was, and here's the delta: yesterday the object wasn't here, today it is. That's what we do, right? We observe all the changes in the environment as we walk around. So as it's doing the task,

it's continuously updating the map. So in some ways, from a user's perspective, they'll understand that, hey, mapping happens at the beginning, the first time the robot comes to your home. But the reality is that it's constantly mapping and constantly looking at the space and then adjusting it.
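
An illustrative sketch of that constant diffing (not the actual system; names and the toy map contents are invented): compare what was just observed against the remembered map, record the delta, and fold it back in.

```rust
use std::collections::HashMap;

type Cell = (i32, i32);

#[derive(Debug, Clone, PartialEq)]
enum Observation {
    Free,
    Object(String), // e.g. "chair", "toy"
}

/// Merge today's observations into the persistent map and report what changed.
fn update_map(
    persistent: &mut HashMap<Cell, Observation>,
    observed: &HashMap<Cell, Observation>,
) -> Vec<(Cell, Observation)> {
    let mut deltas = Vec::new();
    for (cell, seen) in observed {
        if persistent.get(cell) != Some(seen) {
            deltas.push((*cell, seen.clone()));
            persistent.insert(*cell, seen.clone());
        }
    }
    deltas
}

fn main() {
    // Yesterday the chair was at (2, 3).
    let mut map = HashMap::from([
        ((2, 3), Observation::Object("chair".to_string())),
        ((4, 1), Observation::Free),
    ]);

    // Today the robot sees the chair has moved to (4, 1).
    let observed = HashMap::from([
        ((2, 3), Observation::Free),
        ((4, 1), Observation::Object("chair".to_string())),
    ]);

    let deltas = update_map(&mut map, &observed);
    println!("changes since last pass: {deltas:?}");
}
```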

Prateek Joshi (12:33.686)
So let's say the robot's been doing this for 10, 20, 100 hours. It knows the house now. Then one fine day you wake up and you want it to clean the house. How does it determine the cleaning path, or how does it pick the most optimal cleaning path based on today's mess in the house?

Mehul Nariyawala (12:38.512)
Mm -hmm. Mm -hmm.

Mehul Nariyawala (12:53.388)
Yeah, that's a great question. So there are a few metaphors we're using here. There are three ways users can clean. One is scheduled cleaning that you can set up, and the robot actually works in pitch dark as well, so you can set it up for 3 a.m. if you really want to and wake up to a clean home. That's the first part, scheduled cleaning. With scheduled cleaning, you're able to label the rooms and prioritize them.

And what I mean by label is, actually, sorry, it auto-labels the rooms. It automatically figures out where the kitchen is, where the bedroom is, where the living room and family room are. And now you can just prioritize them. You can tap and say, I want to clean the kitchen first, I want to clean the front door area second, et cetera. That's the first one, and then it will just follow the path. The second one that we are implementing is the continuous clean metaphor. Continuous cleaning is where it takes off a few times a day, very quietly,

explores your home, and looks for dirt on the floor. And the way it does it is literally: here's what the clean floor looks like, and if there's something on top of it, it's clearly not clean. Now, whether that's dirt or actual Legos, those are the two questions it asks. Then it says, if I find dirt, I'm just gonna clean the dirt; if I find a stain, I'm just gonna mop it. So that's the second metaphor. And that second metaphor was an interesting one because we as users wanna live in a perpetually clean home with perpetually clean floors,

not once a day or once in the evening, right? So that's the second metaphor. And the third one is that you can actually tap anywhere on the map and just draw a quick area. Let's say you just cooked and there is some dirt you want cleaned right below the stove. You can select that area and say, just clean this part, not my entire kitchen, I just want you to clean this one thing, and it goes and does it. Now, if there is a mess, it is observing it. So what we have is two separate things, a global map and a local map.

And then we also do global planning and local planning. The global map and global planning are based on memory, based on history: okay, here's the map I saw yesterday, so I'm gonna plan with it. But as the robot moves, it is constantly looking and saying, okay, is this still clear? Is there a new object? Is there any change in the environment? And that's the local planning and the local map that it is constantly building and diffing against to solve the problem.
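
A toy sketch of that global-versus-local split (editorial illustration only; the waypoints and the "obstacle" are invented): the global plan comes from the remembered map, and a local check driven by the live cameras can veto or reroute each step.

```rust
/// A waypoint on the globally planned cleaning path.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Waypoint {
    x: i32,
    y: i32,
}

/// "Here's the map I saw yesterday, so I'm going to plan with it."
fn global_plan() -> Vec<Waypoint> {
    vec![
        Waypoint { x: 0, y: 0 },
        Waypoint { x: 1, y: 0 },
        Waypoint { x: 2, y: 0 },
    ]
}

/// "Is this still clear? Is there a new object in it?"
fn locally_clear(w: Waypoint) -> bool {
    // Pretend the live cameras just spotted a toy at (1, 0).
    !(w.x == 1 && w.y == 0)
}

fn main() {
    for waypoint in global_plan() {
        if locally_clear(waypoint) {
            println!("clean {waypoint:?}");
        } else {
            // Local planner takes over: detour around the obstacle and
            // record the change so the global map gets updated.
            println!("obstacle at {waypoint:?}, detouring and updating map");
        }
    }
}
```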

Prateek Joshi (15:09.814)
If you look at all the floor cleaning robots on the market, one of the key problem areas is heavy debris. Meaning they don't know that, hey, you're not supposed to pick that up. And it picks it up and it gets stuck, and then you have to sit there, open it up, and repair the thing. So how do you handle areas with heavy debris?

Mehul Nariyawala (15:24.112)
of course.

Mehul Nariyawala (15:33.04)
Yeah, that's a great question. And that is fundamentally the reason why vision was the right way to go. One analogy I use is that if you have pets, dogs and cats, they can go ahead and navigate your home. They can navigate and walk around the mess without touching it. But you can't tell them, hey, there is a Lego, go clean it up, or go sit on the couch, or go hang out in my kitchen, because they don't have a semantic understanding.

So in the same exact way, we not only build this 3D view of the world, we actually give the robot a semantic understanding. That semantic understanding means it knows where the kitchen is, it knows where the refrigerator is, it knows what's on the floor. It can understand that there are wires, there are straps, there are Legos, and then, hey, this doesn't look like anything I know, so this might be an unknown object. So we have three categories that we teach. The first is known dirt, things we already know are dirt, like popcorn, popcorn kernels, cereal,

any sort of liquid. The second is known non-dirt that we're likely to find on the floor, so that's Legos, crayons, wires, papers, books. And the third is unknown objects. That's how we differentiate. If it's known dirt or known non-dirt, the robot can make decisions automatically. But if it's an unknown object, one of the features we will implement is that it will just take a picture and ask, hey, do I clean this? Do I touch this? And that's a very easy metaphor. The way we think about it is, even kids do that.

Even if you have someone helping you inside your home, if they don't know what it is, they'll say, hey, what is this? Am I supposed to touch it or am I supposed to clean it? What do you want me to do? So that's how we think about it, that even with humans, we give them instructions in our home on how to operate, how to do it. So sometimes the robot needs that help and it just asks the questions.
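
A small sketch of that three-bucket decision logic (an editorial illustration, not Matic's implementation; the ask-the-user path is described above as a planned feature, and all names are hypothetical):

```rust
/// The three categories described above, and the action each one triggers.
#[derive(Debug)]
enum FloorObject {
    KnownDirt { liquid: bool }, // popcorn kernels, cereal, spills
    KnownNonDirt(&'static str), // Legos, crayons, wires, papers
    Unknown,                    // nothing the model recognizes
}

#[derive(Debug)]
enum Action {
    Vacuum,
    Mop,
    Avoid,
    AskUser, // snap a photo and ask: "do I clean this or touch this?"
}

fn decide(obj: &FloorObject) -> Action {
    match obj {
        FloorObject::KnownDirt { liquid: true } => Action::Mop,
        FloorObject::KnownDirt { liquid: false } => Action::Vacuum,
        FloorObject::KnownNonDirt(_) => Action::Avoid,
        FloorObject::Unknown => Action::AskUser,
    }
}

fn main() {
    let seen = [
        FloorObject::KnownDirt { liquid: false },
        FloorObject::KnownDirt { liquid: true },
        FloorObject::KnownNonDirt("Lego"),
        FloorObject::Unknown,
    ];
    for obj in &seen {
        println!("{obj:?} -> {:?}", decide(obj));
    }
}
```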

Prateek Joshi (17:18.934)
Now, if you look at the number of sensors and how it relates to the power of perception, meaning more sensors, you get more data, so again, it can infer better, but you're using a minimal number of sensors. So how do you balance between needing more information versus the amount of intelligence or perception you can get with a limited number of sensors?

Mehul Nariyawala (17:30.864)
Mm -hmm.

Mehul Nariyawala (17:44.4)
That's a great question again. We have this rule of thumb inside the company: solve problems with software willingly, hardware grudgingly. Which is to say, we're not necessarily using a minimal number of sensors because we have some religion about it. It's more along the lines that as long as you can solve a problem with software, absorb it in software, because it might take a little bit longer, but in the long run it's much easier to improve and make reliable. So that's one. Now, information, to a certain extent, you know...

Prateek Joshi (17:53.27)
I'm gonna go to bed.

Mehul Nariyawala (18:14.544)
We as humans have five senses, right? We have ears that are sensors, we speak obviously, and we can see and smell. But if you think about it, 60% of our brain power is actually just used for visual information. Our primary sensor is vision, and vision actually is enough for a lot of different things. We do have microphones on the device itself, so we can potentially use sound as information as well.

But this idea that you need a lot of information, or all the precise information, we don't necessarily agree with that philosophy. And sensor fusion itself is actually quite hard. Even for us as humans, the best way to think about this is to put an eye patch over one eye for a day and try to walk around and shake hands. There's a good chance your hand might be two inches away from the other person's hand, right? So even a single sensor loss for humans

is not immediately adjustable. So if you have multiple sensors and a lot of information, and one of the sensors fails, the entire system goes haywire. So actually, more sensors can become a crutch instead of a helpful thing; too much data can become a crutch instead of really helping. You have to find the right balance.

Prateek Joshi (19:28.758)
When you think about images, processing images, it's fairly compute intensive. So you're going vision first, you're capturing the data. How do you think about the processing power needed? How much of that is on device versus how much of that should go to the cloud? How do you think about that?

Mehul Nariyawala (19:49.2)
That's a great question again. And here we came very much from a user perspective. In our case, we just realized that, look, homes are our sanctuary. Privacy really matters, and a lot of people just don't want to share vision data. So we decided to do the whole thing on device. Now, this wasn't necessarily possible in 2015 or earlier. It's only possible because we now have powerful AI chips, and my co-founder, Navneet, helped bring out the Google Coral TPU, so we knew these AI chips were coming.

So we do the entire processing on device. Now, that doesn't mean it's easy. There is an element of optimization we have to do. It forces you to distill your neural networks to make sure they work on the device, that they work with a limited amount of compute. So it's really a game between here's how much compute we have available, here's the accuracy we want to achieve, and can we fit it. That does take time, but it's doable. The analogy I'd use is that a lot of people have run LLMs on their

M1 chips and computers, and we continue to see amazing work being done there. It's the same exact concept.
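
A toy sketch of that "compute budget versus accuracy" game (editorial illustration; the model names, latencies, and accuracies are entirely made up): given the latency budget the chip leaves per frame, pick the most accurate distilled variant that still fits.

```rust
/// A candidate network after distillation, with its measured cost and quality.
#[derive(Debug)]
struct ModelVariant {
    name: &'static str,
    latency_ms_per_frame: f32, // measured on the target chip
    accuracy: f32,             // validation accuracy, 0.0..1.0
}

/// Pick the most accurate candidate that still fits the per-frame budget.
fn pick_model(candidates: &[ModelVariant], budget_ms: f32) -> Option<&ModelVariant> {
    candidates
        .iter()
        .filter(|m| m.latency_ms_per_frame <= budget_ms)
        .max_by(|a, b| a.accuracy.partial_cmp(&b.accuracy).unwrap())
}

fn main() {
    // Hypothetical distillation ladder: smaller students of a big teacher.
    let candidates = [
        ModelVariant { name: "teacher-full", latency_ms_per_frame: 180.0, accuracy: 0.95 },
        ModelVariant { name: "student-large", latency_ms_per_frame: 45.0, accuracy: 0.93 },
        ModelVariant { name: "student-small", latency_ms_per_frame: 12.0, accuracy: 0.88 },
    ];

    // Say the perception loop leaves ~50 ms per frame for this network.
    match pick_model(&candidates, 50.0) {
        Some(m) => println!("ship {} ({m:?})", m.name),
        None => println!("nothing fits; distill or quantize further"),
    }
}
```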

Prateek Joshi (20:54.39)
And for a floor cleaning robot, you have to move, you have to clean. So the energy requirements could be higher than the average robot. So how do you think about managing the energy requirements and what amount of considerations, both from software and hardware, go into this aspect of the robot design?

Mehul Nariyawala (21:01.168)
Mm -hmm.

Mehul Nariyawala (21:15.984)
Absolutely. Another great question, and that would be a consideration in almost any other robotics application, except floor cleaning, only because the vacuuming itself takes so much energy that the amount of energy we need for all the AI processing is actually still relatively low in comparison. So we had to put a big battery into the robot to make sure that it can clean

Prateek Joshi (21:28.566)
Yeah.

Mehul Nariyawala (21:45.648)
the floors, both vacuuming and mopping, for about two and a half to three hours. And because of that, the compute requirement, or at least the power efficiency requirement for the AI piece of the puzzle, is quite low for us. So we sort of got lucky on that piece of the puzzle.

Prateek Joshi (22:03.542)
And you mentioned privacy, which is very important. People don't want the detailed map of their house sitting somewhere in the cloud. So how do you think about the data that you need to make the product work well versus data that has to stay private? Where does the data sit? How much do you use to make the product better? And what data will you never use?

Mehul Nariyawala (22:17.968)
Mm -hmm.

Mehul Nariyawala (22:21.68)
Mm-hmm.

Mehul Nariyawala (22:31.984)
Yeah, that's another good question. So there are two reasons why we did everything on the device. One is privacy, but the other one, which I want to mention before I forget, is latency: you don't want a robot that says, oops, I'm going to fall down the stairs because compute power is low or the cloud connection is weak. So latency was another reason we did it on the device, because the robot needs to be zippy. But from a privacy perspective,

there is a set of people who would always give you data because they just want to see AI advance. But most people are very, very concerned about privacy. So we do not take any of their data and put it into the cloud; we don't even collect it. And because we don't collect it, we don't have to worry about securing the data or getting hacked. So that's number one: everything happens on the device, and we don't collect the data. Two, how do we get the data to improve it?

The way we've done it is, we have a 50- to 60-person team at the moment and almost everyone is willing to share their data. So we have that data, and then we use simulation to augment it. We've created thousands of home environments of our own based on some of the data we have, and we run simulations and use that as a training set as well. Then, for the long term, one of the things we want to do is, because this

device has this amazing NVIDIA Orin chip, and because it will sit on the dock maybe at least 10 hours a day, can you take the day's worth of information and just do training on its own, without us even getting involved? It doesn't need to be fast. And then it can potentially upload weights and biases to the cloud, and we can see if we can enable some version of federated learning along those lines. So that's how we've always thought about it: once we get inside homes, we can

use users' data without ever extracting any visual information, have the robot tuned for their homes, and then, along the way, learn from it for everyone else as well.
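
A minimal sketch of the federated idea described above as a future direction (editorial illustration with a toy four-weight "model" and invented numbers): each robot takes a training step locally on the dock, and only the resulting weights are averaged in the cloud, never the raw video.

```rust
/// One simplistic gradient step, computed entirely on the robot.
fn local_training_update(global: &[f32], local_gradient: &[f32], lr: f32) -> Vec<f32> {
    global
        .iter()
        .zip(local_gradient)
        .map(|(w, g)| w - lr * g)
        .collect()
}

/// The cloud only ever sees these weight vectors, not the raw video.
fn federated_average(updates: &[Vec<f32>]) -> Vec<f32> {
    let n = updates.len() as f32;
    let dims = updates[0].len();
    (0..dims)
        .map(|i| updates.iter().map(|u| u[i]).sum::<f32>() / n)
        .collect()
}

fn main() {
    let global = vec![0.5, -0.2, 0.1, 0.0];

    // Two homes, two different (made-up) local gradients.
    let home_a = local_training_update(&global, &[0.1, 0.0, -0.2, 0.3], 0.1);
    let home_b = local_training_update(&global, &[-0.1, 0.2, 0.0, 0.1], 0.1);

    let new_global = federated_average(&[home_a, home_b]);
    println!("new global weights: {new_global:?}");
}
```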

Prateek Joshi (24:32.854)
One of the main complaints, if you will, with today's products in this category is the noise. Meaning, on average, these products are fairly noisy, and when they're doing their thing, there isn't much you can do because it's so loud; you just wait it out. So how do you think about the noise aspect of the robot, and what technologies, if any, are you building to reduce the noise?

Mehul Nariyawala (24:40.144)
Mm -hmm.

Mehul Nariyawala (25:01.104)
This is one of my favorite topics, because we were actually really, really worried about noise when we started out. If you have a robot that's constantly cleaning and it's noisy, well, noise itself is a pollution, so you don't want to make the home dirty by having a lot of noise. So we were really worried, and the reason we were worried is that we assumed a vacuum by definition is loud because of the airflow. So if it's loud, then

all these guys must have tried to reduce the vacuum noise and just failed at it. Well, as it turns out, the vacuum industry uses noise as a metaphor for efficacy. Just like we were trained that the louder the car, the faster it must be, we've also been trained that the louder the vacuum, the more suction it must have, so it must be doing a good job cleaning. Turns out noise and vacuuming speed have no impact on efficacy at all. Most of the work is actually done by the brush roll.

The analogy we use is that if you have a dusty car, no matter how fast you drive, it's not going to get clean. You have to nudge the dirt, and if you just slightly nudge it with your finger, it comes off super fast. In the same way, the brush roll actually does quite a bit of the work. But the second thing is that no one had actually tried reducing the noise. It turns out it was just one of those things the industry didn't want to touch. So we started playing around with it, and there are so many different pieces of the puzzle, but one of the simple ones is literally to borrow technology from a different field, in this case

cars and their mufflers. So what we do is, internally we have designed a version of a muffler just for our own vacuuming system that turns the airflow into a very smooth, circular airflow instead of going all over the place, and that by itself reduces noise.

Prateek Joshi (26:43.254)
That's amazing. That's actually a really good insight. There's no reason for it to be noisy, except, as you said, they think people connect the level of noise to how awesome or how well the robot cleans. That's fascinating. I think many people would love to have a quiet robot, but I don't know why the industry thinks noise is a good idea. That's crazy. Also, how...

Mehul Nariyawala (26:57.36)
That's correct. That's correct.

Mehul Nariyawala (27:02.704)
Okay.

Mehul Nariyawala (27:09.84)
Yeah, it's just one of those things where I feel like, you know, Tesla was the first car that came out and said, hey, you don't need to be loud to be fast, and broke that metaphor. I think someone just needs to come in, take that risk, and break the paradigm and the metaphor. Once you do that, I think people will follow. But at the moment, they're just going with whatever the consumer understands.

Prateek Joshi (27:12.854)
Yeah, go ahead.

Prateek Joshi (27:33.878)
Yeah. And how do you collect data, or rather, how do you use the data related to, hey, there's a bug, or hey, it didn't function? Some early users will report that, and it's actually useful data. So how do you think about tracking those bugs, getting that data, and doing over-the-air updates for the robot?

Mehul Nariyawala (27:44.24)
Mm -hmm.

Sure. Sure.

Mehul Nariyawala (27:52.812)
So this is another good question. In this scenario, part of our background at Flutter as well as our background at Nest helps, because both of those were computer vision applications. And what we learned along the way, Prateek, is that customers and users are actually quite helpful if you build trust. And trust comes from asking for permission rather than forgiveness. So we do it a few different ways. If you're just honest with them

and tell them, hey, here's where the bug is and we need this data, we need your help, they're actually quite willing to help. The way we do this is twofold. First, everything in our robot is opt-in, so we do not collect any user data, even telemetry data, without their permission. That's the first piece of the trust. And the second thing is, whenever we encounter bugs, we try to create heuristics around them. So if we know the robot is bumping into things because, you know, the camera view isn't moving but the wheel is moving, we pop up

a little notification in the app saying, hey, there seems to be a bug, there seems to be an issue here; are you okay if we take 30 seconds of data before and after, just to debug? And a lot of people just say yes. At Flutter we started doing this as well, and a lot of people eventually came and said, you know what, we trust you guys, just take our data, can you give me a global opt-in so you don't have to keep asking me again and again? So if you do it the right way and in a systematic way, you can create a customer base who are willing to share a lot of data with you.

And then you just have to be transparent: here's the data we're collecting, here's how we're collecting it, and by the way, all the personally identifiable information is blurred out. So if you do it the right way, customers are very willing to trust you and work with you.
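
A sketch of that opt-in debugging flow (editorial illustration; the heuristic thresholds, field names, and the consent flags are all invented): wheels turning while the camera view barely changes flags a likely bug, and nothing leaves the device unless the user has said yes.

```rust
/// A single telemetry sample used by the stuck-detection heuristic.
struct Telemetry {
    wheel_speed_mm_s: f32, // odometry says we're moving
    visual_motion_px: f32, // how much the camera image actually changed
}

/// "The camera view isn't moving, but the wheel is moving."
fn looks_stuck(t: &Telemetry) -> bool {
    t.wheel_speed_mm_s > 50.0 && t.visual_motion_px < 1.0
}

/// Only capture the debug clip if the user opted in (per event or globally).
fn request_debug_clip(user_said_yes: bool, global_opt_in: bool) -> Option<&'static str> {
    if global_opt_in || user_said_yes {
        Some("capture 30 s before/after the event, blur personal info, upload")
    } else {
        None // nothing leaves the device
    }
}

fn main() {
    let sample = Telemetry { wheel_speed_mm_s: 120.0, visual_motion_px: 0.2 };
    if looks_stuck(&sample) {
        println!("heuristic fired: wheels spinning but scene not moving");
        match request_debug_clip(true, false) {
            Some(action) => println!("user said yes -> {action}"),
            None => println!("no consent -> discard the data"),
        }
    }
}
```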

Prateek Joshi (29:31.094)
Right. I think that's a great way to take a lot of the data and incorporate it into product development. Also, as you think about edge computing and more processing power sitting on the device, can you talk about what you've found to be a good piece of hardware, and obviously on the software side, what can go on the device? What's working well for you in terms of just

compute and processing.

Mehul Nariyawala (30:04.112)
So I think the advancement in AI chips at the moment is quite amazing. Whether it's Qualcomm or Andrel or any of these, they all have amazing chips that you can use. It's just a matter of the software that you're comfortable with on top of that. In our case, we chose to move to NVIDIA recently, because NVIDIA's CUDA software stack is actually quite amazing and it is open source. You can actually

make it work for your device very, very well, and you can run transformers with a lot of their libraries. They have done quite a bit of work on their own. So that's how we are using it. But really, it's a choice of where you can do it. On the consumer side, I'll jump in here and give you my two cents: the challenge, Prateek, is that consumers do not want to pay for software. All the software we use in day-to-day life is almost free.

So if you have cloud processing, and this is a challenge that all IoT devices face, doing processing in the cloud costs money. But consumers just pay you once, upfront, when they purchase the product. So how do you keep paying for that for a long, long time? That's where doing it on the edge device is necessary across the board.
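
A back-of-the-envelope illustration of that economics point, with entirely hypothetical numbers (a fixed one-time margin per unit versus a recurring cloud-inference bill):

```rust
fn main() {
    // Hypothetical: what one unit contributes once, at purchase time.
    let one_time_margin_per_unit = 150.0_f32; // dollars
    // Hypothetical: recurring cost of running vision inference in the cloud.
    let cloud_cost_per_month = 2.5_f32; // dollars per month per unit

    let break_even_months = one_time_margin_per_unit / cloud_cost_per_month;
    println!(
        "cloud route: margin is gone after ~{break_even_months:.0} months of a product meant to last for years"
    );
    println!("edge route: the compute is paid for once, as part of the BOM");
}
```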

Prateek Joshi (31:22.902)
I have one final question before we go to the rapid fire round. There are so many developments and advancements happening, and many people are saying 2024 is the year of robotics. So what technological breakthroughs in robotics are you excited about now, both in what's applicable to you and more broadly for the robotics field?

Mehul Nariyawala (31:48.368)
Great question. So we started in 2017, and we chose robotics as the place to build a company partly because we had seen two trends coming: one, computer vision and AI, which we thought would have a huge impact over the next 10 years, and then AI chips. So I'll go back to those ideas. The amount of compute available, whether it's in the cloud or on the edge device, is skyrocketing, it will continue to rise, and that will enable a lot of robotics applications; we're already seeing that.

The second one is obviously self-supervised learning, generative AI, and the advancement of LLMs, which are driving some of these changes as well. Both of those technologies we are really, really excited about. The third one that I'm quite excited about, which is not as popular, is that I believe the invention of Rust as a language is actually quite amazing, because there are so many challenges in robotics: it's a systems problem built on C++ and memory utilization.

And Rust actually gives you a much better way to squash those bugs and build a systems application. So Rust itself is, I think, a huge advancement, and I wouldn't be surprised if 10 years from now every single robot has Rust as its backend language.
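
A tiny illustration of the kind of bug class being pointed at here (an editorial sketch, not anyone's production code): sharing the latest map between a perception thread and a planning thread. Rust forces the shared state behind Arc<Mutex<...>>; the unsynchronized version that would silently compile and race in C++ simply does not compile here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The latest map, shared by both threads behind a lock.
    let map = Arc::new(Mutex::new(Vec::<(i32, i32)>::new()));

    let writer_map = Arc::clone(&map);
    let perception = thread::spawn(move || {
        for i in 0..5 {
            // Lock, update, unlock (the guard drops at the end of the statement).
            writer_map.lock().unwrap().push((i, i));
        }
    });

    let reader_map = Arc::clone(&map);
    let planning = thread::spawn(move || {
        // Take a consistent snapshot for planning.
        let snapshot = reader_map.lock().unwrap().clone();
        println!("planner sees {} cells so far", snapshot.len());
    });

    perception.join().unwrap();
    planning.join().unwrap();
    println!("final map: {:?}", map.lock().unwrap());
}
```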

Prateek Joshi (33:04.822)
That's actually very interesting. That's a good and interesting connection between Rust and the future of robotics. Perfect. With that, we're at the rapid fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready? All right. Question number one: what's your favorite book?

Mehul Nariyawala (33:17.488)
Sounds good.

Mehul Nariyawala (33:23.024)
Sounds good, let's do it.

Mehul Nariyawala (33:27.952)
Okay, this one I always struggle with because I have a bunch of favorite books, so I'll answer in a few different ways. One of my favorite science fiction books is The Martian. I love that author just because of the sheer audacity of the adventure and the way he solves the problem; it's absolutely amazing, and with humor. For politics or biography, I really love Robert Caro's Lyndon B. Johnson series. Those four books are favorites, not only because

it's the story of Lyndon B. Johnson, but because it really gives you an understanding of how the United States government and the Senate work. For entrepreneurship, I love Shoe Dog. Even though I'm in tech, I really love the story of Nike because it's a company started in Portland, where venture capital wasn't around, and they built for 20 years before even renaming themselves Nike. What most people don't realize is that for 20 years it was, if I remember correctly, known as Blue Ribbon Sports. That's fascinating to me. And then

the last one that I absolutely love is The Making of the Atomic Bomb. I don't have a strong opinion on whether the bomb was a good invention or not, but the sheer amount of innovation that had to take place from 1941 to 1945 to make it happen is just mind-boggling. The amount of effort, it's a great example of what humanity is capable of doing

when it sets its mind to it and when you take the politics and misaligned incentives out of the way.

Prateek Joshi (34:59.286)
Amazing. We love books on this podcast. So thank you for sharing the list. That's an amazing set of books. All right, next question. What has been an important but overlooked AI trend in the last 12 months?

Mehul Nariyawala (35:03.28)
Sure.

Mehul Nariyawala (35:12.464)
I think I'll specifically stick to robotics, but in robotics, I would actually say, to piggyback on the earlier answer, that rust, use of rust is an overlooked trend. I think I've since met about five companies, five robotics companies, who are slowly, slowly switching over it. And I think that trend is just going to skyrocket as we move forward.

Prateek Joshi (35:34.23)
What's the one thing about robotics that most people don't get?

Mehul Nariyawala (35:39.629)
That's a great question. I think building any robot is hard. Building even a robotics demo is really, really hard. And I'm so excited about some of the demos we've seen over the last few years and some of the advancements we're seeing in robotics. At the same time, turning that demo into a shippable product is probably a thousand times harder. And that's the piece of the puzzle

that many people do not understand, and it's a thousand times harder for two reasons. One is, as I mentioned, robotics is a systems problem. There are many, many disparate systems that have to work together, and that by itself is tough. The second thing is that the way we use generative AI or any AI application today, we ask it for an answer, and if we don't like it, we just tell it to regenerate. We don't completely delegate the task; they're sort of helpers, copilots.

But with robotics, we want to delegate the task entirely. So the bar for accuracy, the bar for completion, has to be really, really high. You're not going to tell the robot, you didn't clean it, go clean for the next 30 minutes again, three, four, five times, because it just takes too long and you don't have the patience. So the bar for robotics is much higher than people realize, and just shipping the product is going to be far more challenging.

Prateek Joshi (37:01.366)
What separates a great AI product from a merely good one?

Mehul Nariyawala (37:06.896)
I think, and again I'll answer the question in the context of robotics, but let me take a step back. For any product, I don't think there is any such thing as a great product or a good product. I've come to the conclusion that there is only a simple product and a complex product. If it's simple, by definition it's great: it's intuitive, it's easy to use. Simplicity means it's intuitive, it's low latency, it just works. Complex means you have to use your brain.

Prateek Joshi (37:25.706)
Right.

Mehul Nariyawala (37:36.72)
So to me, any product that is simple, whether it's robotics, AI, or anything else, is itself a great product. That said, if you just ship a robotics product, I would call it a great product at the moment, because there are so few amazing robotics products out there in the world anyway.

Prateek Joshi (37:53.494)
Right, that's actually a great framing. All right, next question. What have you changed your mind on recently?

Mehul Nariyawala (38:01.008)
Another good question. So we've been working on this product for about six years now, and for the first, I want to say, five years, we didn't talk about it at all. Only since last November have we started talking about it. In hindsight, if I could go and do it again, I would build in public rather than keeping quiet. It wasn't that we were trying to be secretive, it's just that we didn't talk about what we were doing. And the reason I would do it in public is

because of social media, because of the internet, the community out there is so amazing, so supportive. Since we've been sharing this information publicly, the amount of encouragement I've gotten, that our team gets, is just mind-blowing. And I feel like we missed that advantage by staying quiet for so long. So if I could do it again, I would do it publicly, and I would encourage everyone else to do it that way as well.

Prateek Joshi (38:54.358)
Yeah, no, that's actually a brilliant observation, because more and more I'm seeing comms becoming part of the vertically integrated thing a company does, in addition to building and shipping the product. They want tight control over what we tell, how we tell it, and how frequently we tell it. I'm seeing really, really good builders embracing it, including you. So that's actually brilliant. Also, maybe a quick part B to that question then.

Mehul Nariyawala (39:03.824)
Yes. Yes. Yes.

Prateek Joshi (39:21.974)
Many founders don't think about it, and even if they think about it, they're not great at it because they haven't done it before. So for somebody who wants to do it but hasn't started yet, what's your advice? What's the one thing you'd tell them?

Mehul Nariyawala (39:36.848)
Yeah, I would just tell them what I did, which is just start doing it and you'll learn along the way. You learn based on the feedback from people, because people will give you feedback. It's amazing how many DMs I get on Twitter, on X, from people saying, hey, Mehul, next time you do it, do it this way. So when you put yourself out there and try to do something, people sort of come out of the woodwork to help you. And that was the part that was really exciting.

So you'll learn along the way, and like anything else in startups, it's iteration. Just start iterating, and the earlier you start, the more you can refine by the time you're actually ready to launch and talk to customers.

Prateek Joshi (40:14.55)
All right, next question. What's your wildest AI prediction for the next 12 months?

Mehul Nariyawala (40:22.32)
I don't know if it's the wildest, and I don't know if it's limited to the next 12 months, but when we started out, we actually thought about doing a lot of AI for consumers, and we wanted to stay on the consumer side of things. One of the challenges we always felt, and this was true for Flutter as well, is that consumer AI will get commoditized. And because it will get commoditized, it will almost always become an open source or free resource. So then how do you actually build a substantial business on top of it? So to me, a lot of

companies, unless they embrace some version of enterprise, are going to struggle to build a consumer AI product, and finding the right business model is going to be a struggle. What we see right now in AI is a lot of value creation. What we don't know is how the value will get extracted, and we hear that consistently. I think that challenge, that problem, will become more acute a year from now than it is even today.

Prateek Joshi (41:20.534)
All right, final question. What's your number one advice to founders who are starting out today?

Mehul Nariyawala (41:27.408)
Solve a real problem. This sounds like very, very simple advice, but I heard an interview with Kevin Systrom or someone, and he was asked what's the number one piece of advice you give that founders ignore, and his answer was: solve a real problem. And this is where, you know, Y Combinator's motto, make something people want, is really, really apt and amazing and so clever. I would go one step further and say make something people need, not just want, because if it solves a real problem, it

will get there. And specifically in the context of robotics, internally we talk about this all the time: people do not want robots, they want solutions to their problems. Ultimately it has to do a job. So our job is to not only make a cool robot but a useful robot. So we tend to say: make it useful before cool, or useful first, cool second.

Prateek Joshi (42:17.398)
That's amazing. Useful first, cool second. That's a brilliant way to end the episode. Mehul, this has been a brilliant episode. Obviously, all the accumulated years of wisdom show in the way you talk and the way you explain things. So thank you so much for coming on the show and sharing your insights.

Mehul Nariyawala (42:36.272)
Thank you so much for having me and amazing questions. Really appreciate it.