AI's Role in Energy Regulation and Compliance with Yuval Lubowich

Your AI Injection

Is AI an ally or adversary? Get Your AI Injection and learn how to transform your business by responsibly injecting artificial intelligence into your projects. Our host Deep Dhillon, long term AI practitioner and founder of Xyonix.com, interviews successful AI practitioners and domain experts to better understand how AI is affecting the world. AI has been described as a morally agnostic tool that can be used to make the world better, or harm it irrevocably. Join us as we discuss the ethics of AI, including both its astounding promise and sizable societal challenges. We dig in deep and discuss state of the art techniques with a particular focus on how these capabilities are used to transform organizations, making them more efficient, impactful, and successful. Need help injecting AI into your business? Reach out to us @ www.xyonix.com.

March 05, 2024 • Deep • Season 3 • Episode 12

In this episode of Your AI Injection, host Deep Dhillon chats with HData's co-founder and CTO, Yuval Lubowich, about AI's transformative role in energy sector regulation. HData leads in refining regulatory data management for energy companies, streamlining processes from filing to analysis. Yuval unpacks the challenges of navigating complex regulations and the shift towards efficient data handling through standardized formats like XBRL. The discussion highlights AI's enhancement of data analysis, aiding in rate case justifications and operational insights. Yuval also envisions AI broadening access to regulatory information, potentially reshaping industry transparency and decision-making.

Learn more about Yuval here: https://www.linkedin.com/in/yuvallubowich/
and HData here: https://www.linkedin.com/company/hdata/

Deep: Hello, I'm Deep Dhillon, your host, and joining us today is Yuval Lubowich, the co-founder and CTO of HData. Yuval is at the forefront of transforming the energy sector with AI technology that helps streamline complex regulatory data analysis, improving accessibility and efficiency, and aiding in compliance and strategic decisions.

We'll explore how Yuval's AI system is not just simplifying, but significantly transforming our interactions with energy data. Thanks so much for coming on. Maybe let's start: can you share the moment when you realized AI could fundamentally change the way we handle regulatory data in the energy sector? And maybe also set a little bit of context for us.

Like, what does your company do at HData? And what is the regulatory environment? And then let's talk a little bit about the AI.

Yuval: Yeah, absolutely. So first of all, thank you for having me. I'm excited. Let's start with HData and what we do. We operate in a segment of the industry called regulation technology, or reg tech, focusing mainly on energy companies, and regulated energy companies in particular.

We are transforming how the entire industry is using AI and automation to make it easy for everyone to file, explore, analyze, and leverage regulatory data. So that's us in 60 seconds.

Deep: Well, maybe we start by picking a target user of yours, like your core target user, and tell me what they used to do before your solution came on board.

And feel free to characterize the pain they would have felt, I'm assuming.

Yuval: Absolutely. So this entire industry is based on huge amounts of regulation, huge amounts of data and files being sent back and forth between these energy companies, which might be utilities or gas pipelines or water, private water companies and so on and so forth.

And on the other side, the regulators, who are managing and making sure that this industry operates as it should. And that means that companies cannot just go and jack up the prices. Everything that they do is regulated. They have to file quarterly reports, annual reports, sometimes even monthly reports to the government.

There is a special federal agency, the FERC, the Federal Energy Regulatory Commission, that manages and regulates these companies. And up until, you know, HData came along, and up until this revolution, a lot of these reports were done manually, meaning sometimes even paper documents were sent to these regulators. Several years ago, in 2015, some changes started to occur, and the FERC decided to move to a new way of doing things using a standard, new for them, called XBRL, which allowed the industry to standardize the way these reports look and how they're being filed.

Fast forward several years: we and other companies are helping the industry file their reports in a structured way, which is great. However, we took it several steps further. Aside from helping the companies file in this new standard way, we also ingest all of the data, and because everything is standardized, we are able to analyze, slice, and dice the data and start doing things that beforehand a team of analysts, sometimes lawyers, sometimes interns, across the entire industry had to do manually: grab these numbers, plot them in Excel or put them in a Word document, and create their analysis based on that.

So it was a manual process. It was very, very time consuming, error prone.

Deep: Can you maybe help me understand? Let's take a really specific scenario, because I don't have a good picture in my head of what one of these reports looks like. So maybe pick one of your companies (it can be a fake company, but a utility company or a gas pipeline or something). What do they have to file once a month?

What does the government care about that they're doing, what is it that they're filing, and how are they proving that they're, you know, maybe not violating something?

Yuval: Sure. So they're filing things like operational data. They're filing information about their fuel usage and power plant capacity, and how much they're utilizing their capacity.

They're filing additional operational maintenance expense numbers. These reports, generally speaking, contain hundreds of pages and thousands, if not tens of thousands, of data points: revenue, expenses, things of that nature. Now...

Deep: But before you go on, is the point here that the government is ultimately, I'm assuming, trying to protect people's pocketbooks on the other side, you know, individual consumers?

And so for something like electricity, they're trying to get at what the true costs are that the utility company has in generating that kilowatt hour of power that makes it into their house. And if you have to raise rates, you have to prove that your costs have gone up.

Is that basically what we're talking about: everything that you would need to know in order to understand true costs for one of these companies?

Yuval: Ultimately, it boils down to that. What you've just described is referred to in the industry as a rate case. These companies cannot raise the rates, even by one cent, to us consumers without getting approval from a regulator.

So these rate cases are done every now and again, and this is where the regulator requires that the energy company or the utility prove why they want to raise rates. The annual or quarterly reports are things that are happening regardless of the rate cases. Rate cases usually are based on these reports, but not just on them: there is a lot of disclosure information that regulators require from these utilities in order to make a decision.

Deep: But some of the other issues that the government probably cares about are maybe safety concerns. Like, I don't know, if it's a gas pipeline, stuff that might be leaking out; if it's water, maybe there's seepage, or maybe the standards of the pipes have gone down or something. So it's probably around safety, I'm guessing.

And then around these kinds of costs. Are there other big, broad categories that the government cares about with these companies?

Yuval: So the government cares about a lot of things, for good or bad. The FERC, the Federal Energy Regulatory Commission, is not always looking at safety.

What they're mostly focused on is financials and operational data. The EIA, which is another federal agency, is the one that is looking at other things that might relate to emissions, safety, outages, things of that nature. There are multiple agencies at play here at the same time.

Companies have to file different reports to different agencies at different cadences in order to complete the picture for the government and for the state regulators.

Deep: Got it. So going back to the reports standardization that I think you were starting to describe. So I'm assuming and correct me if I'm wrong.

I don't know anything about this area, but I'm going to guess that somebody in the regulatory agency sat down and said, okay, let's define what a power company needs to report, and then let's talk about the different categories of stuff. Maybe there's the energy that they generate, how they generate it, how much in the last X unit of time. All that stuff gets defined, in my head, in a standardized schema of some sort.

And then maybe they're putting it into a report, but ultimately the raw data could be piped somewhere. Then maybe there's a bunch of softer fields, like natural language fields, where they have to use words to characterize things, something like that. And then they put out these standards across the different areas that they're regulating; maybe for the water companies it's different than for the gas companies.

And you're facilitating, making it easier to basically populate these reports. Is that so?

Yuval: First of all, you're right in the way you described it. All of these reports have been defined in taxonomies.

This is a well-structured and well-defined report, which is somewhat similar, but not exactly the same, between electric utilities, gas pipelines, and oil pipelines, as an example. Now, these reports contain, for the most part, numbers, but they also contain text. So there's a lot of information that users can free type, and footnotes and things of that nature.

So it's not fully structured, but for the most part it is all around numbers. Now, these reports are not the only thing that these companies report on. There are also a lot of straight-up PDF documents, Word documents, or just, you know, emails and text files that are being sent between the regulated companies and the regulators as part of what we consider rate cases. And the way the government is looking at these things, they're collecting them in dockets, which are collections of information revolving around a specific topic.

If you're asking permission to build a new nuclear power plant, like the one which just came online a few months ago in Georgia, where I'm from, the sheer amount of information is not just numbers and financial numbers, but also surveys and, you know, land information and rights and a lot of other things. All of these things are unstructured, and they're reported to the government and to the regulators.

And this may tie to some of the questions that you asked earlier: how is our AI, and generally speaking AI, able to change the equation? It is because the generative AI solutions that we are using, and that others of course are well aware of, are able to crunch all of these documents very, very quickly and derive the information from them. Now, because we're dealing with regulation technology and there are a lot of legal implications, it is not as easy as feeding a document to GPT and asking a question. There are quite a few other things that need to happen in between in order for people to actually trust what the system is generating and the answers that they see.

Deep: So your target users are the people in these utility and other companies that are responsible for logging into some system, maybe that's yours or maybe that's the government's, where they actually fill out this data. And for the evidentiary information, the supporting information, the surveys, emails, all that stuff, they need to be able to navigate that quickly, search through it, but, you know, in a more modern sense, reason against it and formulate arguments and put them in these report fields, with, I'm assuming, reading between the lines, citations and such to the original source content. Is that kind of what we're talking about?

Yuval: That's exactly what we're talking about. The only minor correction here is that what we do at HData is facilitate the filing of the structured data. Unstructured filing is done using other means and other technologies.

This is for the submission. The analysis takes care of both the structured and the unstructured. Today, the unstructured reports that are being sent are usually leveraging modern technologies such as email, and sometimes somebody scanning a document and sending it via mail. This is the next evolutionary step of the transformation that the regulated energy segment is undergoing.

Deep: So your clients interact with your system, but ultimately your system files it with FERC, or whoever the regulatory body is.

Yuval: FERC. Yes, that is correct.

Deep: Uh-huh. And they have whatever system supports that. And maybe that's easy to automate, but my guess is probably not; that interface is probably a pain point for you guys.

Yuval: That interface is relatively straightforward, because at the end of the day our systems generate the information and the files the way the FERC is expecting them, in this XBRL format, and we send it via API. The rest is relatively easy: our machine connects to their machine, and we send the raw XBRL after it passes validation on our end.

And of course, our customers are using our platform in order to input the data, make sure that everything is correct, validate it, and then they file it. So the filing stage of the process is the least painful one.

Deep: Need help with computer vision, natural language processing, automated content creation, conversational understanding, time series forecasting, customer behavior analytics?

Reach out to us at xyonix.com. That's x-y-o-n-i-x dot com. Maybe we can help.

Deep: Okay, so I like to play this game where I try to guess what you guys are doing based on what I know so far. So my guess is that behind the scenes you have to take all this documentation that's coming in in these various formats: PDFs, emails, whatever.

You have to preprocess it, because I'm guessing a lot of these pieces can be fairly standardized, but there's probably a decent penalty for not parsing it well. So then you take this information, and now you have to have an efficient search and retrieval system on it, maybe a standard keyword search system. And then you also need to take whatever more understood form of it and pull it into an LLM of some sort, OpenAI, or maybe you have your own Llama 2 instances or whatever.

And then, in that process, you have to transform it into maybe question-answer pairs or something, where they can ask more reasoning-style questions against it, ultimately getting to your answers. But my guess is that you also have a lot of repeat templates or patterns that they run into, and maybe you don't have to, but it's helpful if you can map those for your users in a way that maybe you pre-populate the queries, maybe even the questions that they need to ask. Am I off on Mars, or am I kind of in the ballpark?

Yuval: I'm smiling because it's as if you've looked at, you know, our backend solution. So you are right. The first thing is that the system automatically monitors a lot of data sources, and we ingest all of this data into our data lakes.

Now, each and every piece of content that is ingested into the system undergoes one of two things. If it's a structured, well-formed report, we store it in a structured way in our relational databases and in our data lakes. If it's an unstructured piece of content, let's call it a PDF file that somebody sent to a regulator or vice versa, we do a lot of pre-processing, but essentially what we're doing is pre-indexing that particular piece of content. We're creating the relevant embeddings in a way that is relevant to our sector, and we're storing them in a vector database.

When the user is asking a question of our AI solution, which we fondly call the librarian because it resides within the library, the system does a bit of pre-processing and prompt engineering behind the scenes to make sure that what the user is asking is something that the system can understand. We're pulling back the relevant chunks, which are the relevant embeddings that we believe will contain the answer, and only those are sent to the LLM.

Now, the way I'm looking at LLMs today: we don't do specific training on the LLMs. We just optimize our execution pipelines and the engineering that is wrapped around the LLMs. As far as I'm concerned right now, I'm treating LLMs almost like a next-generation programming language.

I mean, in our platform, I don't care if we're sending a query to GPT-4, GPT-3.5, or any other model out there. We are agnostic. We've figured out which LLM produces better results on certain types of questions, so we do send each question to the relevant model, but for all we care, it is just that.

The defensive engineering is done before we send the questions to the LLM, which allows us to essentially eliminate hallucinations. It allows us to create citations, and we're opening up this black box. So when you're asking a question as a lawyer or somebody that deals with the regulation, we not only show you the answer, we can take you to the exact locations within the one or more documents that the answer was sourced from.
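A minimal sketch of the chunk, retrieve, and cite flow described in this answer, not HData's actual implementation. It assumes a toy bag-of-words vector in place of a real embedding model and an in-memory list in place of a vector database; the names `Chunk`, `chunk_document`, `embed`, `retrieve`, and `build_prompt`, and the sample documents, are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str    # which source document the chunk came from
    position: int  # index of the chunk inside that document
    text: str


def chunk_document(doc_id: str, text: str, max_words: int = 40) -> list[Chunk]:
    """Split a document into small, citable chunks of at most max_words words."""
    words = text.split()
    return [
        Chunk(doc_id, i // max_words, " ".join(words[i:i + max_words]))
        for i in range(0, len(words), max_words)
    ]


def embed(text: str) -> Counter:
    """Assumption: a bag-of-words count vector stands in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[Chunk], k: int = 2) -> list[tuple[float, Chunk]]:
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, hits: list[tuple[float, Chunk]]) -> str:
    """Build a prompt that contains only retrieved chunks, each tagged with its source."""
    sources = "\n".join(f"[{c.doc_id}#{c.position}] {c.text}" for _, c in hits)
    return (
        "Answer using only the sources below and cite them by tag.\n"
        f"{sources}\n\nQuestion: {question}"
    )


# Hypothetical sample documents, for illustration only.
docs = {
    "rate-case-A": "The utility requests a rate increase because fuel costs rose sharply.",
    "docket-B": "The pipeline operator filed its annual maintenance expense report.",
}
chunks = [c for doc_id, text in docs.items() for c in chunk_document(doc_id, text)]
question = "Why does the utility want a rate increase?"
hits = retrieve(question, chunks, k=1)
print(build_prompt(question, hits))
```

Because every chunk keeps its `doc_id` and `position`, whatever the model answers can be traced back to a location in a source document, which is the citation behavior described above.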

Deep: Yeah, because I imagine you're doing a very granular indexing of the document. You probably have a sentence-level embedding lookup to get down to a handful of things that you can send to the LLM so that it can tailor its answers. So one question I have: you've got your reference materials.

One of the problems that we see a lot is that there's a lot of advantage to leveraging the global knowledge in the LLMs, which will lead to hallucinations but will also let you tap into deeper questions and answers about stuff. So I imagine, take your case of the nuclear plant that they're looking at.

Maybe they're in there and interacting with your system specifically around the materials they have, but they might have a more general nuclear-plant-related question. And you have to now decide: is it okay to go to the LLM's general knowledge to answer this, or do I only stay in my strict sandbox?

And I'm curious, do you ever allow that breach? If not, how do you handle it? By indexing public documents into your safe world?

Yuval: Yeah. So the answer is we do not allow that. In fact, one of the things that we emphasize to our customers and potential customers is that the system will only answer questions based on the data that it contains.

We are not allowing it to go to the internet and source additional information. It is far more important for the system to generate accurate answers without hallucinations than to make it smarter or, you know, richer in terms of its knowledge base. We have a huge amount of data within the platform: 65 million unique data points and close to a hundred thousand unique documents that are updated every 10 minutes.

And we're adding more and more. There is enough data in the platform to answer the questions that people are asking within our use case. Now, customers can upload their own documents into the platform, which is fine, and we treat them as part of the sandbox, just in a more secure way, because they're the customer's documents and not everyone else's.

However, the system was trained to say, "I don't know, I have no information about this subject." That's perfectly fine. If a lawyer is asking a question about the regulation and the system makes up a fact, that is far worse than saying, "I don't know, go find it elsewhere." We focused a lot of energy on making sure that if you're getting an answer, it's an accurate answer, and it's based on the data that is in front of our language model and AI solution.

Deep: Yeah, that makes a lot of sense in the world you're in. The cost of even a subtle hallucination can be significant, I imagine. And so you probably want to really dial down the creativity controls, you know, the temperature and the other parameters in the LLM, so that you're really careful that it only reasons from the vetted source materials that you just gave it.
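The "say I don't know" behavior and the low-temperature setting discussed here can be sketched as a gate in front of the model call. This is an illustration under stated assumptions, not the system described by the guest: the relevance score is a toy token overlap, the `min_score` threshold is arbitrary, and `generate` is a hypothetical wrapper around whichever LLM is used, taking a temperature as its third argument.

```python
import re
from typing import Callable

REFUSAL = "I don't know. I have no information about this subject."


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(question: str, passage: str) -> float:
    """Toy relevance score: Jaccard overlap between question and passage tokens."""
    q, p = tokens(question), tokens(passage)
    return len(q & p) / len(q | p) if q | p else 0.0


def grounded_answer(
    question: str,
    passages: list[str],
    generate: Callable[[str, list[str], float], str],
    min_score: float = 0.2,
) -> str:
    """Call the model only when at least one vetted passage is relevant enough."""
    relevant = [p for p in passages if overlap(question, p) >= min_score]
    if not relevant:
        # Refuse instead of letting the model fall back on general knowledge.
        return REFUSAL
    # Temperature 0.0 asks the (hypothetical) model wrapper for its least creative output.
    return generate(question, relevant, 0.0)


def fake_generate(question: str, context: list[str], temperature: float) -> str:
    """Stand-in for a real LLM call; it simply echoes the first context passage."""
    return f"Based on the filed documents: {context[0]}"


passages = ["The rate case was filed in March by the electric utility."]
print(grounded_answer("When was the rate case filed?", passages, fake_generate))
print(grounded_answer("How tall is a nuclear cooling tower?", passages, fake_generate))
```

The second question has no support in the passages, so the gate returns the refusal text without ever calling the model.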

Yuval: And maybe just an example that happened last week. We went on site to a potential customer, and essentially it was a bake-off between their homegrown solution and our solution. Again, I'm not trying to plug anything HData-related here, but it does speak to the differences.

They were very excited to share that they are now using GPT-4 Turbo with the huge context window and a lot of tokens, and that was great. They sent a bunch of documents into the system through a simple user interface, and they asked a few questions about a certain rate case and, more importantly, what a certain expert witness said in relation to that rate case. And the system gave a very, very beautiful answer. Unfortunately, in the room we had a bunch of people that knew that particular person.

They sat in on his testimony, and one of them said, "The system is now citing two answers that that expert gave. One of them relates to a completely different rate case that happened several years ago, and it has no bearing on the question that we're now asking."

So that's a very, very good example of a hallucination and the critical impact that it might have on the company if they just go with a plain, naive approach of "let's throw all of the data at GPT or any other language model and hope that the answers will be accurate." Because of the engineering approach that I just mentioned, we cut these documents into very, very small chunks, as we call them.

We only answer based on the data that is in front of us. Our system generated a beautiful answer citing only the relevant rate case, showing exactly where the information came from. There was no playing around with such a black-and-white outcome. One is hallucinating, and the lawyer is saying this is a bad answer.

And the other one is giving an accurate answer. So different engineering approaches to the same concept can yield very, very dramatic results.

Deep: Yeah, I think that makes a lot of sense in your case. You can always choose to go back to the source materials and pull them into your initial lookup for the reference materials if you need to.

So that approach makes a lot of sense. Are there cases outside of the text arena where you're helping to populate these regulatory forms? Like, are you doing any imagery-based reasoning where, I don't know, they have a diagram of a gas pipe layout or a diagram of a utility company or something, and you're able to reason off of it?

Or are you, are you just still evaluating those capabilities within the LLM?

Yuval: So currently we do not, uh, we're focusing on two areas. One is the text approach, which we just discussed. And the other one is the blending of structured and unstructured. And I'll give you an example of where we're heading because we have this, um, structured data store at our disposal.

Now we're working on teaching the system to be able to answer things like: look at the return on investment for company X, plotted back five years, and compare it on a graph to their peer group based on the Eastern Seaboard. Such a query is challenging. The understanding that it requires is challenging, and the blending of both structured and unstructured is not trivial.

If you want to prevent hallucinations, if you want the system to understand what's going on, and if you think about the steps that such a system takes, there are a lot of small intermediate steps that are important. Like, when I say my peer group, or the peer group of a certain company, what does that mean?

How do we find the peer group? How do we query the SQL database and make sure that the SQL is generated correctly in order to bring us the numbers, so that the system can then plot the graph? All of these things, at least in the regulated energy segment, are science fiction. Something that might take a user 20 seconds to type, and then the system another 20 seconds to generate, would take days before, because you had to manually pull the numbers, you had to understand which companies are part of the peer group and why, and then you had to put it in Excel and create a graph. That, that's easy.
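To make the structured half of that example concrete, here is a sketch of the kind of SQL such a question would have to resolve to. Everything in it is an assumption for illustration: the `metrics` table, its columns, the sample figures, and the simplification that a peer group is "every other company in the same region". The natural-language-to-SQL step the guest describes is not shown; the point is only the target query, with values bound as parameters rather than pasted into the SQL text.

```python
import sqlite3

# Hypothetical schema and figures, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (company TEXT, region TEXT, year INTEGER, roi REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?)",
    [
        ("Company X", "east", 2022, 8.0), ("Company X", "east", 2023, 9.0),
        ("Peer A", "east", 2022, 6.0), ("Peer A", "east", 2023, 7.0),
        ("Peer B", "east", 2022, 10.0), ("Peer B", "east", 2023, 9.0),
        ("Far West Co", "west", 2022, 20.0), ("Far West Co", "west", 2023, 20.0),
    ],
)


def roi_vs_peers(conn: sqlite3.Connection, company: str, region: str, first_year: int):
    """Return (year, company ROI, average ROI of same-region peers) per year."""
    query = """
        SELECT m.year,
               m.roi AS company_roi,
               (SELECT AVG(p.roi)
                  FROM metrics p
                 WHERE p.region = ?
                   AND p.company <> m.company
                   AND p.year = m.year) AS peer_avg_roi
          FROM metrics m
         WHERE m.company = ? AND m.year >= ?
         ORDER BY m.year
    """
    # Placeholders are bound in the order they appear in the SQL text.
    return conn.execute(query, (region, company, first_year)).fetchall()


series = roi_vs_peers(conn, "Company X", "east", 2022)
for year, company_roi, peer_avg_roi in series:
    print(year, company_roi, peer_avg_roi)
```

The resulting rows are exactly the two series a plotting step would need: the company's own figure and the peer average for each year.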

Deep: So, this particular feature: is this somebody in one of the energy companies that is using it, or is this on the regulatory side, where somebody is trying to compare companies to one another?

Yuval: That's actually on both sides. I mean, it's an interesting, uh, ecosystem that we live in because both sides, both the regulated and the regulators, they're trying to be more intelligent about the decisions that they're making.

I wouldn't call it an arms race, but they're both trying to be as educated as they can about the other side. And, you know, when you're asking to increase your rates, you have to justify it. Maybe you can justify it by saying these are my peers, and my peers, um, you know, because of reasons A, B, and C, they charge more, right?

So I deserve to charge more, and vice versa, right? If you're asking and the regulator is saying, but all of your peers that make roughly the same, are located in the same place, have the same population density, and so on and so forth. They charge less. Why should you charge more? All of these things and this analysis that's happening, they affect all of us on a daily basis.

We're just, you know, as consumers, we're unaware of what's happening. But this is constantly happening.

Deep: That makes a lot of sense, right? Like, um, benchmarking companies against peers gives you a place to kind of stand when you're trying to figure out why their energy output is accelerating or decelerating at a greater rate.

I don't know, maybe it's a coal plant and it's decelerating, and relative to solar plants it makes sense, but relative to coal plants maybe it doesn't. And suddenly it becomes an issue, because you already have a pretty robust view of the world. You could use your LLM world to help determine something like peer companies, but you could also just have a knowledge base and curate that in a more old-school way, you know, with more classical ML techniques.

Do you guys find that you're leaning heavily on your global knowledge and mining it yourselves, not only for your clients, but for features like this?

Yuval: Definitely. This quote-unquote simple task of figuring out which companies my peer group consists of was very time consuming in the old world.

It took a lot of time because, if you think about it, and let's even take a step back, an annual report that the company files to the government might contain 20,000 unique data points. And these data points are similar across electric utilities. How do I, as a person, as a regulator, or even as somebody who works at these electric utilities, figure out which data points I care more about, to then do a cluster analysis and say, okay, these companies are my peers, versus those that you might think are my peers but are not?

And it's not necessarily about competitors; it's actually peers, right? So is it just revenue? But revenue is not enough. Is it the geolocation? If you're up in the mountains versus, let's call it, in Texas, where there's a lot of sun, your economics change.

If you are servicing a very densely populated area, then you have to spend a lot less on power lines and transmission lines. And vice versa: if you're servicing a rural area, you're spending a lot more on transmission lines. It is both a science and an art to figure out who my peer groups are.

And of course, as you might imagine, companies tend to choose the peers that they want to be measured against, and the regulators do exactly the same. In the old world, this process is usually done every five years, because it is such a lengthy and complicated process. But imagine that you had a magical black box to which you can describe some attributes of your peers, and the system will do it for you.

And yeah, it might take a minute. And you might create multiple peer groups, right? For the sake of reason X, this is my peer group, and for reason Y, this is my peer group. All of these things are solutions that AI can help with.
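One simple way to picture "describe some attributes and get a peer group" is a nearest-neighbor search over standardized attributes. This is a stand-in for the cluster analysis mentioned above, not the guest's method; the company names, the two features, and all values are hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical companies and attribute values, for illustration only.
companies = {
    "Utility A": {"revenue_musd": 500, "customers_per_sq_mile": 900},
    "Utility B": {"revenue_musd": 520, "customers_per_sq_mile": 850},
    "Utility C": {"revenue_musd": 470, "customers_per_sq_mile": 40},
    "Utility D": {"revenue_musd": 2500, "customers_per_sq_mile": 880},
}


def standardize(companies: dict, features: list[str]) -> dict[str, list[float]]:
    """Convert each chosen feature to z-scores so no single unit dominates the distance."""
    stats = {}
    for f in features:
        values = [c[f] for c in companies.values()]
        stats[f] = (mean(values), pstdev(values))
    return {
        name: [
            (c[f] - stats[f][0]) / stats[f][1] if stats[f][1] else 0.0
            for f in features
        ]
        for name, c in companies.items()
    }


def peer_group(target: str, companies: dict, features: list[str], size: int) -> list[str]:
    """Return the `size` companies closest to `target` on the chosen features."""
    z = standardize(companies, features)
    distances = sorted(
        (math.dist(z[target], vec), name) for name, vec in z.items() if name != target
    )
    return [name for _, name in distances[:size]]


# Different reasons for comparison give different peer groups.
print(peer_group("Utility A", companies, ["revenue_musd", "customers_per_sq_mile"], 1))
print(peer_group("Utility A", companies, ["customers_per_sq_mile"], 2))
```

Changing the feature list changes the answer, which mirrors the point that a company may have one peer group for reason X and another for reason Y.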

Deep: Have data? Have a hypothesis on some high value insights that, if extracted automatically, could transform your business?

Not sure how to proceed? Bounce your ideas off one of our data scientists with a free consult. Reach out at xyonix.com. You'll talk to an expert, not a salesperson.

Deep: It feels like, if I were in one of your utility or energy companies, you certainly present a very compelling case for why I need to use your platform for regulatory submissions. But you're also alluding to a compelling case for very different scenarios where I'm just trying to understand the industry better.

I may be outside of the regulatory submission scenario, but you have a pretty large knowledge base, a large corpus of valuable content. I might simply be performing analysis to figure out, okay, I built a solar plant in this part of Texas, and I want to maybe do the economic analysis for this part of California, or something like that.

Is that a case that you're looking at and that you're trying to support more of?

Yuval: That's absolutely true. And when we think about what we're doing today and where we're heading, we're building an ecosystem. It is not just the regulated energy companies and the regulators. It's the consultants that care about this segment, the big four and others.

It's banks and investment firms that want to make decisions based on data. It's law firms that work in this space. This data, which up until recently was very, very hard to come by, is now at everybody's fingertips if they know how to get it. And so it's not just about these regulators and regulated companies.

Deep: Yeah, I mean, hedge funds, like everybody, and there's a lot of interest in this kind of energy sector data. I'm imagining though that there's a lot of questions here, right? Like, when an energy company puts up their private documents, tell me if I'm wrong, but I'm guessing they're not giving you permission to give that to hedge fund traders to go off and use it, right?

Yuval: Absolutely.

Deep: And so there's like a purpose for it, but the public data, of course, you can kind of do whatever you want with. And I'm curious, I'm guessing because these are utilities that a lot of this data is actually forced to be public. Is that correct too?

Yuval: That is correct.

I'll say this. When you're working on your own internal numbers, of course, they're private and proprietary. When you're crafting the report to the government, it's still private. As soon as you click on the submit button, it becomes public domain. Everybody has access to it. If you know where to look, everybody can read the report that you just filed, which a second ago was private and now is public.

So we grab any and all data that is submitted to the government and to the relevant regulators and relevant federal agencies. We ingest it into the platform and make it available. But there's also a portion of the data that is private. You've not submitted it. Maybe it's a communication or you're preparing something or, you know, preparing testimony.

We do have a portion of the platform, which we call the private catalog, still within our library module, in which you upload your own documents and they are made available only to you. So even we at HData have no visibility into what's going on there. We're not, of course, using this data to train the models or fine-tune them.

You get to interact with the model as it was built and that's it. We have zero visibility into what's going on there, but our customers can still leverage our generative AI solutions in order to derive insights from these documents. One might think that if they're creating a document and they still haven't filed it, it's fresh in their heads, but some documents are decades old.

And they're still being looked at. I'll give an example. Our expert said something five years ago in a certain context and now he's about to testify again. Let's make sure that he is consistent. Right? These sorts of questions are super sensitive. And they are, of course, not made available to anyone else, and nobody can see them.

Deep: Do you find that you have to kind of focus on one customer, maybe the utility, the energy companies? Or do you find that this message of, like, we can build you all these features and your private data stays private, but the second you publish it, we're also operating on the other side of the fence and maybe building features for regulators or whoever.

Do you find that that is a message that's easy for your core users to receive, or do you find them kind of wanting you to just totally be on their side of the fence only?

Yuval: Yeah, up until a week ago, I would have answered they don't care. They know that we're also selling to regulators and we're playing both sides of the aisle.

Actually, as I said, it's an ecosystem. So there are multiple sides to this aisle, not just the regulated and the regulators. Last week, I heard the question from a potential customer, asking almost jokingly, but I'm sure that they meant it: what would it take not to sell this thing to the regulators in our state?

Now, of course, we're not doing that, but, uh, this was the first time I had such a question.

Deep: I mean, it comes up with startups, right? In the startup arena, and just companies in general, until you're a massive company and you have lots of very different users and stakeholders, I imagine this could be a challenge that you have to kind of navigate. And it may be one where you decide we're just going to totally stay on this side of the fence for the next couple of years until we have an unambiguous understanding of our clients' sensitive spots.

Yuval: Our mission statement, when we started HData, was to help facilitate this digital transformation for the entire industry. That means not just the regulated companies. It's the entire industry, right? We're building an ecosystem. So yes, of course, the regulated companies are super, super important. They are core to our activity. However, regulators, lawyers, consultants, private equity firms, all of them work and consume data in this industry, and we want to service all of them.

Deep: One of the questions I want to ask you is, you know, a lot of our listeners are product managers or technologists that are looking at just a blizzard of activity by OpenAI, and it seems to me like your business is like a basis function. It's like a really fundamental template of what one can build here.

It's like, you have a vast need to digest a large repository of publicly accessible information, but in a way that significantly transcends simple keyword search. Not that it's really that simple for those of us who spent decades building search engines, but it now seems simplistic, comparatively speaking.

At the same time, there's also this great need to have that reasoning available. It's also tailored and customized enough that you're not right in the crosshairs of the big elephants in the arena. As maybe not a small company, but a smaller company, nobody wants to be in the short path of what Google's about to destroy or what Amazon's about to destroy, or in this case, OpenAI.

So going deep and specializing is clearly a benefit. I imagine a lot of the benefit you bring to the table is in these really granular cases where you're seeing repeat scenarios where somebody has to fill out this template in a particular way, so that you can start optimizing some prompts behind the scenes to try to make that process easier. Maybe not even just prompts, maybe some regular old classical ML or whatever.

What advice do you have for people that are looking for the ingredients that they can build on top of this LLM world, and what are the potential pitfalls that you would recommend people really think through?

Yuval: So, I kind of alluded to it earlier when I said I consider these LLMs the next-generation programming language. You don't care. You've never asked anyone, I'm guessing, "Did you use Java 1.6 or 1.7 for this application that I love and use?" You don't care if it was built using Java or C or C#.

You care about the functionality and the tools that are now put in front of you and that you can use to solve your own business problems. And that's how I look at what we're doing today. We have no intention of building an LLM from the ground up. There are giants that do that, and they are doing it very, very well.

However, they don't have the in-depth industry understanding and knowledge and subject matter expertise to go and build a solution that is based around these LLMs and caters to the specific needs that customers have. A standard off-the-shelf ChatGPT or Anthropic or Gemini will not allow users to manage their entire rate case across the industry from their user interface.

Nobody would ever even consider doing that. But if you go and build a solution that does that, and oh yes, by the way, it does use an LLM underneath, that's great. And so I applaud the rate of advancements and new features that OpenAI and Google and everyone else are putting out into the market, because it helps us.

But at the end of the day, we're building an application that leverages these capabilities. So that is the main difference.

Deep: Yeah, I mean, maybe stated another way, I think what you're saying is: go where the customer problem is and go where you have access to data in a unique way that's not in the direct path of a generalized solution.

Yuval: I mean, that's the product management answer, right? Go build a solution that customers need and answer the pain point that they have. Build it while speaking with these customers or potential customers. Don't build it in the dungeon, where you invent the problem, then invent the solution, and then try and sell it.

And the tools that you have at your disposal, they will change. Today it's GPT-4, tomorrow it's going to be Gemini, or GPT-5, or whatever. That's fine, but it's a tool that you use, and these tools are built in a very, very generic way. They are not solving specific business pains. Now, if your product is what I call a feature, right?

If your product is let's upload the file into OpenAI and uh, run a summarization, uh, prompt, then great. You. You screwed out and make French. Yeah, but that's not, uh, what we're talking about. We're talking about building a solution, not a feature.

Deep: Yeah. I think that's well put.

Kind of on a related note, how do you think about your dependencies on some of these providers? So, for example, I mean, everybody knows about the utter chaos unleashed three or four weeks ago by, you know, OpenAI. How do you process that? So, like, there's particular companies with varying levels of credibility and efficacy, frankly, right?

You know, I think most people wouldn't be working with GPT-4 if it wasn't so much better than everything else, until maybe a week ago. I haven't gotten a chance to dig in super deep with Google's latest release, but I spent a fair amount of time and it seems like it's the first thing I've seen in a long time that's actually a contender with GPT-4.

But even with the service level, and there's other ones, there's AI21, you mentioned a number. It seems like there's a shift going on from a world maybe a year and a half ago, where startups and young technology companies were, generally speaking, trying never to be dependent on external services.

They would take open source, build it into their own Docker containers, manage it, et cetera. And all their investors wanted them to do so. These models are expensive to build. They're big, really big. Facebook has offered quite a compelling open source alternative, but Llama 2 is nowhere near GPT-4's level of efficacy.

And for your case, while it is specialized in the sense that you've got all this specialized regulatory data, when you go in with your, you know, 10 or 20, sort of reduced set and you need to kind of reason against it, at that point you might get good enough efficacy with Llama 2, but you might have some fancier, more heavy-stack reasoning scenarios where you need to go to other models.

So I'm curious, how do you parse that landscape? How much do you care about being dependent on a big GPT-4-like OpenAI model? How much do you care about pulling stuff in house and running your own Llama 2 instances and taking it over? And how much of it is latency driven? How much of it is cost driven?

How much of it is efficacy driven? And yeah, maybe talk to us about that.

Yuval: Yeah. So the way we've engineered our platform allows us to change the language model with two configuration lines. So first of all, we're not dependent on anything. We do like to work with OpenAI because they do produce interesting results, but we're not, by the way, always using GPT-4.

We have some scenarios where the system will automatically decide to go to GPT-3.5 because it's good enough. Now, whenever we're testing a new capability, we actually have an internal set of tools that allows us to evaluate different technologies, and that might be: let's evaluate GPT-3.5 Turbo, let's go with Llama 2, let's look at Anthropic, and now we can say Gemini.

Let's look at these six, and we run the same set of queries, the same set of prompts, against the same set of documents, and we see the results. And then we have subject matter experts that know what the result should look like, and we make a determination, and we score each one of these results that we're getting, and of course the follow-up questions and the follow-up answers, and we know what we need to integrate with first, but we have backup plans.
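The comparison Yuval describes (same prompts, same documents, several models, expert scoring) can be sketched roughly as follows. This is an illustration under stated assumptions, not HData's internal tooling: `evaluate`, `rank`, and the type aliases are hypothetical names, each model is represented as a plain callable, and the `score` callback stands in for a subject matter expert's judgment.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical signatures: a model maps (prompt, document) to an answer, and a
# scorer maps (prompt, document, answer) to a number. In the workflow described
# above, the scorer would be a human expert rather than a function.
ModelFn = Callable[[str, str], str]
ScoreFn = Callable[[str, str, str], float]


def evaluate(models: Dict[str, ModelFn], prompts: List[str],
             documents: List[str], score: ScoreFn) -> Dict[str, float]:
    """Run every model on the same prompt/document grid and average its scores."""
    results = {}
    for name, ask in models.items():
        scores = [score(p, d, ask(p, d)) for p in prompts for d in documents]
        results[name] = mean(scores)
    return results


def rank(results: Dict[str, float]) -> List[str]:
    """Order model names from best to worst average score."""
    return sorted(results, key=results.get, reverse=True)
```

The first name returned by `rank` is the candidate to integrate with first, and the remaining names are natural backup options.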

So if the OpenAI implosion had occurred for us, it would have meant a configuration change. And we're, you know, up and running again, no harm, no foul. That, to me, is the way to look at what's happening here. I am not in the business of running my own data center and managing our own LLMs.
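One way to get the "it would mean a configuration change" property is to keep provider choice in configuration and route every call through a single function. The sketch below is an assumption about how such a setup could look, not a description of HData's platform: the provider names, the `CONFIG` keys, and the automatic fallback are all hypothetical, and the two stand-in functions take the place of real client calls.

```python
from typing import Callable, Dict


def _primary(prompt: str) -> str:
    # Stand-in for a hosted model client; here it simulates an outage.
    raise ConnectionError("provider unavailable")


def _backup(prompt: str) -> str:
    # Stand-in for an alternative model client.
    return f"[backup] {prompt}"


# Registry of available providers (names are placeholders).
PROVIDERS: Dict[str, Callable[[str], str]] = {
    "primary": _primary,
    "backup": _backup,
}

# The only two lines that would change when switching models.
CONFIG = {
    "llm_provider": "primary",
    "llm_fallbacks": ["backup"],
}


def complete(prompt: str, config: dict = CONFIG,
             providers: Dict[str, Callable[[str], str]] = PROVIDERS) -> str:
    """Try the configured provider first, then each fallback in order."""
    order = [config["llm_provider"], *config["llm_fallbacks"]]
    last_error = None
    for name in order:
        try:
            return providers[name](prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            last_error = exc
    raise RuntimeError("all configured providers failed") from last_error
```

Because application code only ever calls `complete`, swapping the default model is an edit to `CONFIG` rather than a code change.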

If everything else fails, okay, maybe, but there are enough options out there not to seriously consider that, at least at our stage and where we are today.

Deep: Yeah, I think that's a very prudent approach, right? It seems like the general structure around these things is fairly common: in goes the prompt, out comes some response.

And I think that's reasonable. One thing that you said was around assessing efficacy. I'm curious how you think about efficacy assessment. It's an area where the academic literature is really pretty primitive relative to where we are. Even simple approaches, or what seems simple now but wasn't a year and a half ago, like sentence embedding distance from, let's call it, a perfect answer to whatever answer the LLM is providing, have all kinds of problems.

Do you find yourself having to use LLMs themselves to assist in the efficacy assessment for your other LLM prompts? And how do you think about it? Are you still approaching it in the more machine learning sense, where you're trying to get down to maybe a similarity metric between a perfect answer and the bot's answer?

And how do you even define that when there are oftentimes multiple perfect answers and, you know, maybe talk to us a little bit about how you guys think about efficacy assessment.

Yuval: Yeah, so, first of all, I would say that one rarely sees a perfect answer. That's just the nature of the beast, at least in our world.

Um, if you take two humans, two lawyers, and ask them the same question, you will get varying degrees of the same answer, right? It's not exactly the same. And so, we use a lot of subject matter expertise to evaluate what is put in front of us. We, um, use a lot of people to really understand the value and, uh, the quality of the answers.

This is not something that is done using machine learning or other LLMs for the unstructured content. Now, when we deal with questions that look at numbers, then, of course, it's much, much easier. We have our databases and we know how to access them, and we know how to access them automatically.

So if you're asking a question like what was the return on investment or how much net revenue or loss did the company X have last year, that we know how to get to automatically and we can verify very quickly whether this was an accurate answer or some sort of hallucination. So it's not a fully automated process.
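For numeric questions, the check Yuval describes amounts to comparing a number in the model's answer against a trusted figure from a database. A minimal sketch, assuming a hypothetical in-memory `FINANCIALS` table with a made-up value, a simple regular expression for extracting numbers, and a 1% tolerance (none of which come from the episode):

```python
import re
from typing import List

# Made-up reference data standing in for a real financial database.
FINANCIALS = {
    ("Company X", 2023, "net_revenue"): 1_250_000.0,
}


def extract_numbers(text: str) -> List[float]:
    """Pull plain numbers such as 1,250,000 or 12.5 out of free text."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return [float(m.replace(",", "")) for m in matches]


def is_grounded(answer: str, company: str, year: int, metric: str,
                rel_tol: float = 0.01) -> bool:
    """True if any number in the answer matches the stored value within tolerance."""
    expected = FINANCIALS[(company, year, metric)]
    return any(abs(value - expected) <= rel_tol * abs(expected)
               for value in extract_numbers(answer))
```

An answer that fails this check is a candidate hallucination. The sketch ignores units and scale words such as "million", which a real checker would have to normalize.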

What we've found out, because we started putting these systems in front of, first of all, potential customers and then customers, is that they provide feedback on the quality of the answer. We've built a built-in mechanism in the portion of the system that looks at the private and public data.

They have the ability to provide feedback, and we've seen that people do like to provide feedback, especially when it's bad, but they still provide feedback. So: this is not an accurate answer, or this is not how I would have answered it, and this is why. And we take this information and then we do a lot of automatic prompt engineering, learning from the feedback that we're getting.

So we're not really updating the language model, but we do a lot of prompt engineering. And this is where we've seen the magic happen: if you ask the right question, or the question in the right way, you'll get amazing results.

Deep: It's an, it's an interesting scenario, even with, even with prompt tweaking.

One of the things that we find a lot is, um, when you're releasing a new model for something, and in this case, I don't mean a lower level LLM model, but like, you know, for whatever it is you're trying to do, whether it's, I don't know, a dialogue summarization or, or, you know, whatever you're trying to generate.

It's different now than it was in the pre-generative-model world. Back then, you had these very clear precision and recall metrics, and it was very straightforward to get down to a number. And if your number's higher by a little bit, you go for it. Whereas now, I would say it's challenging, but not impossible, to get to a number and release right away.

But the criterion has to be considered more. And the cost of using humans can be substantial, right? Even if it just takes a while, right? You've got a bunch of engineers that tweaked a bunch of prompts. Now you've got to release. You want to feel confident that you're definitely better than before.

So you want to have, you know, a few thousand exemplar answers. Agreed that they're not perfect answers, but you could create a category where it's an acceptable answer and an unacceptable answer, and you could look at distances to that. So we've seen a lot of movement and work in that arena, and we've been doing a lot of it ourselves, and also kind of taking more of a rubric approach.

So basically saying you have this answer. Let's come up with a number of dimensions that we're going to provide a score on. So like, how empathetic is this answer? How concise is this answer? That kind of thing. And then you get a score around that.

And then for a given application, you can sort of say, well, conciseness really matters, preciseness matters, and then you can kind of get to a score. So that's something that we've seen a lot of. So thanks so much for coming on. I feel like I've just learned a ton about your space.
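The rubric approach Deep outlines reduces to a weighted average over per-dimension scores. The sketch below assumes each dimension has already been scored on a common scale (by a person or by another model, which the conversation leaves open). The function name, the dimension names, and the weights are illustrative only.

```python
from typing import Dict


def rubric_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-dimension scores into one number using application weights."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for dimensions: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(scores[dim] * w for dim, w in weights.items())
    return weighted / total_weight


# Example: an application where conciseness and preciseness matter
# and empathy does not (weights are assumptions for illustration).
answer_scores = {"concise": 4, "precise": 5, "empathetic": 2}
app_weights = {"concise": 2, "precise": 2, "empathetic": 0}
overall = rubric_score(answer_scores, app_weights)
```

Changing `app_weights` per application gives a single comparable number for deciding whether a new prompt or model is better than the previous one.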

I'll end with one final question. I like to ask this a lot. If we fast forward five years out and all of your dreams with HData come true and you're able to realize everything that you want, what does the world look like? Maybe answer from the vantage of the person buying electricity at their house, maybe from your core customer who's filing these reports, and anyone else that you think would be affected.

Yuval: Yeah, I think that visibility is key, and by that I mean, the more power we put in front of people and allow them to understand decision making, the better off we all are. If I, as a consumer, understand why my price went up (it never goes down, but why it went up) and there's a valid answer to that, I might be less upset than if, you know, somebody just increased the rates.

So I hope that we'll be able to get to a point where these sorts of generic understandings are put in front of everyone, not just the experts like the one working at the regulatory office in the utility or the regulators themselves. When I think about an ecosystem, one of my consumers should be you and me going and asking, tell me why did Georgia Power increase the rates last year versus what it was two years ago?

And the system might generate something, right? I'll be able to read it and understand, okay, we had a bunch of storms, there was a fire, maybe a nuclear power plant came online, and it was only a decade behind schedule, now I understand, okay, I may not like it, but I understand why. Uh, things of that nature, I think, uh, will go a long way.

Visibility usually helps maintain reasonable prices. This today is the job of the regulators, but the more we distribute this power to average Joes like me and you, I think the better off we all are.

Deep: Awesome. Well, thanks so much for coming on the show. It's been great having you. That's all for this episode of Your AI Injection.

As always, thank you so much for tuning in. If you enjoyed this episode and want to know more about AI, you can check out our recent articles by googling Innovation Strategies 2024 or by going to xyonix.com/articles. Also, please feel free to tell your friends about us, give us a review, and check out our past episodes at podcast.xyonix.com. That's podcast.x-y-o-n-i-x.com.

People on this episode

Deep Dhillon

Host