Leveraging AI

98 | Making Data-Driven Decisions With AI, A Deep Dive into Data Collection and Analysis with Tianyu Xu

June 18, 2024 | Isar Meitis, Tianyu Xu | Season 1, Episode 98

Tired of tedious research processes slowing down your business decisions?

We all know the struggle: research is crucial, but time-consuming. Often, it gets sidelined because of our busy schedules. But what if you could leverage AI to transform your research process, making it faster and more efficient?

In this episode of Leveraging AI, Isar Meitis is joined by Tianyu Xu, founder of the AI consulting agency TYAI and a seasoned market research professional with experience at Twitter, who shares his insights on how AI can revolutionize your approach to research.

Join us for a fascinating discussion on practical use cases, the best tools, and the exact prompts you need to supercharge your research capabilities.

In this session, you'll discover:

  • Why AI is a game-changer for research and data analysis.
  • How to use AI tools to handle both primary and secondary research efficiently.
  • The concept of Retrieval Augmented Generation (RAG) and how it ensures data accuracy.
  • Practical steps to analyze and visualize large datasets using AI.
  • The benefits and potential pitfalls of using AI for market research.

Tianyu Xu is the founder of TYAI, an AI consulting agency based in Singapore, serving clients globally. With extensive experience in market research and AI applications, Tianyu is passionate about helping businesses leverage AI to gain actionable insights from their data.

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

Transcript

Isar:

Hello and welcome to Leveraging AI, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and we're going to talk about a unique topic today. The reality is that almost everything we do at work, whether it's marketing, finance, sales, or strategy, requires research. And most of us don't do a lot of research, or at least not enough, to be as successful as we could be. If you do proper research, you have a lot more relevant data for making better decisions in whatever it is you're trying to do in your business. So why don't we do more research? Because it's a lot of work, and most of us have ten other things to do. So we either skip it completely or we just don't invest enough in it. But with today's AI capabilities, you can do multiple aspects of the research, from data collection, whether it's primary or secondary research, to data analysis, whether qualitative or quantitative, all of which used to take a lot of time, significantly faster, and then benefit from it in whatever you're doing at work. This is a very important aspect of almost any business job that most of us don't have the skills for. But our guest today, Tianyu Xu, is an expert on exactly that. He's actually an expert on a lot of other things too. He has an AI consulting agency called TYAI. He lives in Singapore but supports people all over the world, which is how we met: through LinkedIn, where we started talking and chatting about different things in AI. That's when I learned he's a research expert, and he is one because he did market research for several years at larger companies, including Twitter. So that's his story.
His real background is doing market research across different aspects of professional business, and now he's taking AI and applying it on top of that, which I find extremely compelling. So this is what we're going to dive into today. It's going to be fascinating: we're going to walk through actual use cases and show you the exact tools, the prompts, everything you need to know to do better research in your company. I'm really excited to welcome Tianyu to the show. Tianyu, welcome to Leveraging AI.

Tianyu:

Thank you so much, Isar. It's a pleasure to meet everyone here.

Isar:

Awesome, let's dive in. Listen, I'm personally really excited about this because I'm a data freak. I try to run all my businesses based on data, and knowing how to do better research faster is a gold mine. So the stage is yours.

Tianyu:

Yeah, definitely. When I first used ChatGPT in December 2022, the very first use case I tried was research. It was quite disappointing at the time, because that's when I first learned the concept of hallucinations. When you ask ChatGPT to come up with industry insights and statistics, you get them in seconds. But at that time ChatGPT could not connect to the internet, so it just fabricated all the data and all the insights. So I came to the conclusion that, okay, maybe you can't use it for research. Then my view of ChatGPT's abilities kept changing over the past 18 months, and fast forward to today, I think the generative AI tools are more than ready to handle any type of research task. Number one, people are more aware of these issues, so when we write prompts, we know what kind of prompts to craft so that the AI tools are less likely to come up with hallucinations; people are more skillful in handling the tools. The second reason is that large language models themselves do not produce knowledge, but they can connect to a knowledge base. That's the RAG approach, and most of the LLMs nowadays can also connect to the internet, so you can pull data from the internet, like when you search on Perplexity, which gives more credibility to the data and the insights you get from generative AI tools. So I think this is actually the best moment for everyone to dive into research tasks. Whether you are a research professional or just a professional who needs to do some daily research, now is the best time to dive in.

Isar:

I agree. I will highlight two things for people who are new to the show, two things Tianyu said that you may not be familiar with. One is RAG, which is Retrieval Augmented Generation. It's a mouthful, but what it means is that you can give a large language model access to your data: documents, folders, whatever data you want. It then knows how to answer based on that data. This is the holy grail, because it learns only from your documentation, so it doesn't make stuff up or pull random information off the internet that may not be what you're looking for. So that's number one. The other thing he mentioned is Perplexity. Perplexity is what you would get if Google and ChatGPT had a baby: a large language model that is connected to the internet and can give you real answers instead of search results. It's what Google tried to do in the last few weeks, and by the way, they've now dialed it back almost to zero because it wasn't great, but Perplexity got it right. It's an amazing tool. I use it every day, probably more than I use Google at this point, and it's a tool that can help you do research. I'm sure we'll talk about it later in this episode.
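The RAG mechanism Isar describes can be sketched in a few lines of Python. This is a toy illustration only, not how any real product implements it: the three-item `DOCUMENTS` list and the bag-of-words `similarity` function are stand-ins for a real document store and an embedding model.

```python
from collections import Counter
import math

# Hypothetical knowledge base; in a real RAG system these would be
# chunks of your own documents, indexed with embeddings.
DOCUMENTS = [
    "Resale flat prices in Singapore are published as open public data.",
    "Retrieval Augmented Generation grounds a language model in your own documents.",
    "Perplexity combines web search results with a large language model.",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (a stand-in for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    return sorted(DOCUMENTS, key=lambda d: similarity(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Stuff the retrieved context into the prompt so the model answers from it."""
    context = "\n".join(retrieve(query))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What does Retrieval Augmented Generation do?"))
```

The point of the sketch is the shape of the pipeline: retrieve your own documents first, then have the model answer from that context rather than from memory.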

Tianyu:

Yeah, definitely. And I would like to elaborate a bit more on the possibilities, and the risks as well. For the possibilities, we know the tools are ready. To be specific, they are more than ready for you to convert anything from raw data to visualization to insights. Before, you might have needed advanced Excel skills, or Python, or SQL to query the database. But now, as long as you have the raw data, you can use ChatGPT to analyze it and get the insights within a few minutes. It's the same with Perplexity. Basically, what Perplexity does is understand your query and then separate it into multiple queries, multiple Google searches. Imagine you could do 100 Google searches in one go. Then, depending on which AI model you connect Perplexity with, it summarizes all 100 search results into an article, into a report, ready for you. That's the best thing about using generative AI for market research at the moment.
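The query fan-out Tianyu describes can be sketched as a loop. Everything here is a hypothetical stand-in, not Perplexity's real API: `decompose` fakes what an LLM would do when splitting a question, and `web_search` is a stub search engine.

```python
# Hypothetical stand-ins: a real system would call a search API and an LLM.
def decompose(question: str) -> list[str]:
    """A real tool asks the LLM to split the question; here we fake three angles."""
    return [f"{question} statistics", f"{question} trends", f"{question} forecast"]

def web_search(query: str) -> list[str]:
    """Stub search engine; returns canned snippets for illustration."""
    return [f"snippet about {query} #{i}" for i in range(1, 4)]

def research(question: str) -> str:
    """Fan out into many searches, then gather everything for one summarization step."""
    snippets = [s for q in decompose(question) for s in web_search(q)]
    # A real system would now ask the LLM: "summarize these snippets into a report,
    # with a citation back to each source".
    return f"{len(snippets)} snippets collected for summarization"

print(research("Singapore property market"))
```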

Isar:

Yeah. And I'll say one more thing. When it gives you that summary, that article about the hundred search results it read, it also gives you citations for each source, with actual links: this is where I found this information, this is where I found that information. So you can check two things. A, that these are respectable sources and not, say, Jane's blog (okay, maybe Jane is awesome, but I don't know), so you can see the information comes from reputable sources, and you can actually click the link, see the actual source, and get more information. And B, that it makes sense. So you can dramatically reduce the chances that the data you're using is garbage or hallucinations, simply because you can go back to the source and check it.

Tianyu:

Yeah, absolutely. And I remember a meme I saw last year. It said that the job of market research used to be: you spend 80% of your time doing primary research and analyzing data, and 20% of your time on report writing and final checks. After ChatGPT, you spend 20% of your time doing the research, collecting data, and writing the report, and 80% of your time on verification and fact-checking. That was the challenge last year. Only recently, because of Perplexity and because of the improved GPT-4o on ChatGPT, do we spend less time on fact-checking. It's still needed, of course, but we spend less time on these kinds of tasks.

Isar:

Awesome. So let's dive into the practical aspects of this: what kinds of use cases are possible, which tools to use, and how you do it.

Tianyu:

I can give you one example. There are a lot of use cases, but I'll focus on one: writing a report analyzing the property market in Singapore. I already have the raw data downloaded, because in Singapore a lot of public data is readily available, and the quality is among the best: you can just download the data from the website, plug it into ChatGPT, and visualize and analyze everything immediately. You don't need to clean the data, because it's already well structured and clean. So today I'd like to show you how we convert hundreds of thousands of real estate transactions into charts, then into insights, and then into a report, in a few minutes.

Isar:

Fantastic. And as Tianyu is pulling up the data on the screen, for those of you listening, we're going to describe everything we're looking at. If you want to see it as well, go to the Multiplai YouTube channel; that's "multiply" spelled with an AI at the end. But you can stay with us if you're driving, or walking your dog, or doing the dishes and can't watch YouTube right now; we'll describe everything on the screen. I'll also generalize one thing: the example Tianyu has is, obviously, public data he can share, but you have data like this from multiple sources in your business. It could be your financial reports, your sales data, your customer service data. We all have tons of this data that we usually try to work in Excel on our own, but sometimes that's very hard because it's a huge amount of data. One thing I'll say before we get started (and Tianyu, if this is something you wanted to say, I apologize for stealing it): as humans, we usually format data to be easy for us to read. We add gaps, like a blank column between one month and the next, two blank columns between years, and blank rows between departments. That completely confuses the AI when it tries to analyze the data; it tries to fix the layout, and sometimes it ruins the data while doing so. So before you upload the file, to ChatGPT in this case, but it's exactly the same if you do it with Gemini, remove all the human formatting: no spaces, no gaps, no fancy stuff, just the raw data with the right headers and the right columns in place. That's going to make it a lot easier.
And the other thing I've learned, and this is actually more of a question for you, is that when I load CSV files versus XLS files, I get better results. That's just coming from me; I'm not 100 percent sure, and I'm sure you've done this more than I have, but I'm wondering if you've seen the same thing.
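The cleanup Isar recommends, stripping the "human" spacer rows and columns before uploading, can also be done programmatically. A minimal sketch using only Python's standard library; `RAW` is made-up sample data with one blank spacer row and one blank spacer column:

```python
import csv
import io

# Example of "human-formatted" data: a blank spacer column and a blank spacer row.
RAW = """month,,resale_price
2024-01,,500000

2024-02,,520000
"""

def clean_rows(text: str) -> list[list[str]]:
    """Drop fully blank rows and fully blank columns before uploading to an AI tool."""
    rows = [r for r in csv.reader(io.StringIO(text)) if any(cell.strip() for cell in r)]
    keep = [i for i in range(len(rows[0])) if any(r[i].strip() for r in rows)]
    return [[r[i] for i in keep] for r in rows]

for row in clean_rows(RAW):
    print(row)
```

After cleaning, only the header and the two data rows remain, with the empty column removed; that is the shape the analysis tools handle best.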

Tianyu:

Oh, on the point about CSV files: generally speaking, CSV files are smaller, and if you analyze data with Python, it's more efficient to analyze a CSV than an Excel file. I think that's because of Python, not because of ChatGPT: with Python, it's easier to analyze a CSV file.

Isar:

And again, for those of you who don't know, Python is a programming language that has been around for a very long time and is used mostly, though not only, for data analysis by data scientists. Both ChatGPT and Gemini now write Python code when you give them these kinds of tasks. That's why this is critical, and how they can do all these things: it's not ChatGPT trying to figure it out on its own; it's literally writing a piece of software to do it.

Tianyu:

Yeah, absolutely. And remember, when ChatGPT first came out, there were hallucinations and there were problems with math. Later on it got smarter: instead of answering a math question directly, ChatGPT just uses Python to do the calculation and then tells you the accurate result. So in that sense, we are not using ChatGPT to analyze the CSV file. We are using ChatGPT to instruct Python to analyze the CSV file, return the results, have ChatGPT analyze them again, and then deliver the insights to us. That's the whole process.

Isar:

Okay, so let's dive in. We have ChatGPT-4o open. What do we do next?

Tianyu:

4o is the default. I'm not sure about your experience, but in mine, it's faster but not as smart as GPT-4. It's been quite lazy these days, and it's easier to get Python errors with it than with GPT-4. But given our time today, we'll just do it on GPT-4o, and I will break the long prompt into smaller requests. In many cases you can use one super long prompt in ChatGPT. For example, I just uploaded this file: resale flat prices of public housing in Singapore from 2017 to 2024, with every single transaction. There are about 200,000 transactions, very detailed public data that is impossible to manipulate in Excel, but we have it here. Normally, I would write one long prompt asking ChatGPT to clean the data, identify the structure, describe the data, create the charts I want, and write an executive summary. But a long prompt like that can cause problems, especially with GPT-4o, which tends to jump to conclusions very quickly, so it often ends up with errors or hallucinations. That's why today I will break the long prompt into steps. In each step, I will do only one thing.

Isar:

Yeah, I'll add something to that. In general, the best practice is to break this down into steps instead of giving it one really long prompt, for two reasons. One, as you mentioned, it tends to be more accurate, and by the way, not just in data analysis but in everything you do. The other is that it gives you more control, because you can see in step one whether it's doing what you intended and really understood the task before it continues to steps two and three. And if it didn't do step one correctly, you can go back and correct it, and only then continue to step two. So you get better results and more control over the process, to get exactly what you want. This is recommended in general, even by the companies themselves: if you go to the best-practice guides from Claude, or OpenAI, or Google, they all say the same thing. If it's a really complex process, break it down into a step-by-step set of instructions.

Tianyu:

Yeah, absolutely. So right now, I'll just ask it to describe the file and give me a data sample. Let's see what's inside this CSV file. It jumps into analyzing, and you can see how it's writing the Python code. It's pulling the CSV file, which was uploaded to the sandbox inside ChatGPT. By the way, the file in the sandbox will be deleted after three hours, so it's not stored there forever. And then we can see the results.
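This "describe the file and give me a data sample" step roughly corresponds to code like the following. It's an illustrative standard-library sketch; `CSV_DATA` is an invented two-row stand-in for the real file:

```python
import csv
import io

# A tiny made-up stand-in for the uploaded CSV.
CSV_DATA = """month,town,flat_type,resale_price
2017-01,ANG MO KIO,3 ROOM,400000
2017-01,BEDOK,4 ROOM,420000
"""

def describe(text: str, n: int = 5) -> dict:
    """Mimic the 'describe the file and give me a sample' step."""
    rows = list(csv.reader(io.StringIO(text)))
    return {"columns": rows[0], "row_count": len(rows) - 1, "sample": rows[1:1 + n]}

info = describe(CSV_DATA)
print(info["columns"])
print(info["row_count"], "rows")
```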

Isar:

I'll say two things about that last sentence, sorry to stop you. The fact that these files get deleted is both good and bad. It's good because they don't save your data, which is awesome; you can upload data that's more sensitive than other things you might put into ChatGPT. It's bad because if you come back to the conversation later, it doesn't have the data anymore, and it asks you to re-upload it, and sometimes that works and sometimes it doesn't. So if you want to do one of these analyses, stay with it and finish it, rather than saying, "I'll come back in the afternoon after I'm done with my meetings," which happens to me a lot with regular chats. In these kinds of chats it will tell you, "I don't have the data anymore, please re-upload the file," and then sometimes it just doesn't get it right. I don't know why; that's just the nature of it.

Tianyu:

I can totally relate to that.

Isar:

Yeah.

Tianyu:

The biggest problem is that the context window is not long enough. So it doesn't capture the entire conversation.

Isar:

Yeah.

Tianyu:

That's why, with conversations longer than maybe ten messages (let's assume it only captures the latest ten messages), it forgets the file name, which library was used, and what was done before, and you just end up in the same loop, with the same error, again and again.

Isar:

Yeah.

Tianyu:

Yeah, so I agree with you. If you want to get a job done, just get it done within one chat.

Isar:

Yeah.

Tianyu:

Yeah. So right now we have a preview of the CSV file. It contains the month; the town; the flat type (two-room, three-room, and so on); the block number; the street name; the storey range (10 to 12, or 1 to 3); the floor area in square meters; the flat model; the lease commencement date; the remaining lease; and the resale price. Basically, this is every single resale flat transaction in Singapore from the past seven years. ChatGPT also returns the first five rows of the dataset as a plain table. And then we can start the analysis. For example, I want to see the overall trend:

Isar:

Show the overall trend of the resale price over the whole period, as a line chart, for example.

Tianyu:

These kinds of requests are very simple, so it will just create a line chart showing the average resale price over the past seven years.
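For a request like "show the overall trend," the tool typically generates code that groups transactions by period and averages the price before plotting. A standard-library sketch of that aggregation (plotting omitted); `CSV_DATA` is a four-row invented stand-in for the roughly 200,000-row file:

```python
import csv
import io
from collections import defaultdict

# A tiny made-up stand-in for the real file.
CSV_DATA = """month,town,resale_price
2017-01,ANG MO KIO,400000
2017-02,BEDOK,420000
2018-01,ANG MO KIO,430000
2018-03,BEDOK,450000
"""

# The kind of code the tool writes for "show the overall trend":
prices_by_year = defaultdict(list)
for row in csv.DictReader(io.StringIO(CSV_DATA)):
    prices_by_year[row["month"][:4]].append(int(row["resale_price"]))

trend = {year: sum(p) / len(p) for year, p in sorted(prices_by_year.items())}
print(trend)  # → {'2017': 410000.0, '2018': 440000.0}
```

The same pattern scales to the full dataset: one pass over the rows, one dictionary of running aggregates, then a chart of the result.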

Isar:

Over 200,000 rows of data. Again, you cannot do that in Excel; it's just not going to work.

Tianyu:

No, that's impossible. Yeah. One thing I like about the latest ChatGPT update is the interactive chart. If you notice, there's a button to expand the chart. When we expand it, we land on a dedicated page for the chart (I don't know whether they have a name for it): on the left-hand side we see the chart in full screen, and on the right-hand side we can continue the conversation. And the chart is adjustable: we can change the color of the lines, and I think we can even change the type of the chart.

Isar:

Yeah. So again, for those of you who are not watching: instead of the regular ChatGPT screen, when you expand a chart, about three-quarters of the screen is the chart, and a sidebar on the right continues the chat, where you can keep manipulating, cutting, and slicing the data however you want. That's extremely helpful. But you can also click on different sections of the chart itself, because it's fully interactive, and change things directly. And that's magic. This is probably the easiest user interface for editing charts that I know of; it's even easier than doing it in Excel or Google Sheets.

Tianyu:

Yeah. I forgot to mention: as soon as you upload the CSV file to ChatGPT, the file itself becomes an interactive document. You can click the file name to expand the table, and you'll see the entire table. Can you imagine? 200,000 rows of data, all in one table.

Isar:

Yeah. And I'll say something cool about this. When Tianyu says the table is interactive, it really is: you can click and select several rows of data, just like you do in Excel, and say, "I want you to create a chart from this," and it will. It's like working in Excel, only you don't have to know the functions or how to build tables; you just ask for what you want. And you don't have to use the whole table. Just like in Google Sheets or Excel, you can be selective: select columns, select specific cells, and ask it to create whatever you want. It's amazing.

Tianyu:

Yes. For example, I just selected two columns: one is the lease commencement date, the other is the resale price, and I'm asking ChatGPT to show a histogram of resale prices. So it essentially took just those two columns and created the histogram.
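Behind a histogram like this is a simple binning computation. A standard-library sketch, with a handful of invented prices and an assumed S$100,000 bucket width:

```python
from collections import Counter

# Invented resale prices for illustration (Singapore dollars).
prices = [350_000, 420_000, 455_000, 510_000, 610_000, 640_000, 700_000]
BIN = 100_000  # bucket width; a real tool picks this automatically

# Bucket each price: the core computation behind the histogram ChatGPT draws.
histogram = Counter((p // BIN) * BIN for p in prices)
for lo in sorted(histogram):
    print(f"{lo:>7,}-{lo + BIN - 1:>7,}: {'#' * histogram[lo]}")
```

Each line of output is one price band with a bar proportional to how many transactions fall in it, which is exactly what the rendered chart shows graphically.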

Isar:

And one more thing I want to say, because you and I have been geeking out about this for a week now. One of the amazing things is that as humans who are not data scientists or research experts, we know three kinds of graphs: a line graph, a bar chart, and a pie chart. That's probably 90 percent of what we use. The reality is there are 20 different kinds of charts we don't know. And what I sometimes do now, because this tool is so amazing, is say: okay, this is the data I have; what would be the best graphical representation to show this? And it comes back with a scatter plot, a trend chart, things I don't even know the names of, and it will generate them. And I'm like, oh my God, this is amazing; I didn't even know this existed. So if you don't know the best way to visualize your data, but you know what you're trying to show, you can literally ask, and it will give you one or two or three options, and then you can ask it to make them, and it will. So the histogram, for those of you just listening to the podcast, is an amazing graphical representation of what we're looking at, one I wouldn't have thought of.

Tianyu:

Yeah, let's follow what you said. Assume we don't know what we want. We can just ask ChatGPT. Maybe: act as a data expert, a data analyst, and identify the top three insights

Isar:

from this data set, and generate the three best charts that represent the insights.

Tianyu:

Be creative. Yeah. If we don't know what to look for, we can just ask ChatGPT to give us the top-line insights from the data set and generate charts based on those insights. It's much easier this way. I tried the same thing on Gemini, but Gemini is not as smart as ChatGPT; you have to tell Gemini exactly what you want, otherwise you get nothing. All right, so we have something very interesting here. First, we have the average resale price by flat type: you can see that, obviously, the number of rooms correlates with the resale price. Then we have the resale price trends in different towns as a line chart: you can see that some towns are clearly going through higher growth than the others.

Isar:

I want to pause for just one second on this particular chart. Again, you need to remember this is a machine. In this particular case, comparing different towns, there are 50 of them on the chart: think about 50 lines on the same chart; you can barely see anything. But the cool thing is you can now take that chart and tell it: show me just the three towns with the fastest growth. And it will pick, out of the whole list, the three towns with the fastest growth and draw a completely different chart, just because you asked for what you want. You don't need to be a data scientist, and you don't need to figure out how to even decide which one has the fastest growth, which would mean going through each and every one, year by year. You just ask the question, and it does it. And Tianyu, as I was saying, it did it, and we're about to get the updated chart.

Tianyu:

Yes. Okay, I think there's still something we forgot to do at the beginning: you still need to instruct ChatGPT on which metric to use for the measurement. Right now it's taking the average resale price of the entire apartment as the measurement, but in most cases in real estate, we talk about price per square meter or price per square foot. So if, in your market, price per square foot is the right measurement, we can ask ChatGPT to recalculate price per square foot and create new charts based on that. For example, let's say:

Isar:

Show me the top five towns by average price per square foot, in a line chart.

Tianyu:

And here I want to show you that it's going to make a mistake.

Isar:

So you've tried this before and you know, or you have a crystal ball. One of the two; both would be cool.

Tianyu:

Yeah, I want to show you the mistake, because sometimes, if you're not specific in your prompt, it just assumes the metrics. For example, one of the columns is the floor area in square meters, and then we have another, which is the price per square meter.

Isar:

Okay.

Tianyu:

Oh, okay. Sorry, I think this time it got it right. Last time, when I tried the same prompt, it forgot to convert square meters to square feet. But now, according to the Python code, it is converting square meters to square feet by multiplying by 10.76. So that's all right, and we got a graph. The graph is correct, but the Y-axis still has a problem, as we can see: it's not calculated properly, because the scale is the scale of the whole price, not the scale of the price per square foot.

Isar:

We can send it a reminder: price per square foot equals the price divided by the total area in square feet. And just to explain what we're doing right now: sometimes it will get it wrong. That's not a hallucination; it's just a misunderstanding of how to calculate the data. You can go back and explain it, and explaining can be just telling it, like Tianyu just did: okay, this is how you calculate price per square foot. But sometimes you can also upload an example: here's an example of how to do this right. And then it goes, "oh, now I get it," and does the calculation. It's still probably faster than doing it yourself in Excel, especially with this many rows, where it's just impossible.
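The correction being discussed, price per square foot as price divided by the area converted from square meters, is one line of arithmetic. A minimal sketch, using the standard conversion factor of roughly 10.7639 square feet per square meter and an invented example flat:

```python
SQM_TO_SQFT = 10.7639  # 1 square meter ≈ 10.7639 square feet

def price_per_sqft(resale_price: float, floor_area_sqm: float) -> float:
    """Price per square foot = total price / (area in sqm converted to sqft)."""
    return resale_price / (floor_area_sqm * SQM_TO_SQFT)

# Invented example: a 90 sqm flat sold for S$500,000.
print(round(price_per_sqft(500_000, 90), 2))  # → 516.13
```

The mistake Tianyu demonstrates is skipping the conversion, which inflates the result by a factor of about 10.76; spelling the formula out in the prompt prevents it.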

Tianyu:

Yeah, absolutely. So here, in just ten seconds, we have the correct charts. You can see, for example, that the price per square foot in the Central Area in 2017 was about 700 Singapore dollars, and it moved up to about 900 Singapore dollars in 2024. So these are the top five towns in terms of price per square foot.
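The "top five towns by average price per square foot" step boils down to a group-by and a sort. A standard-library sketch; the `SALES` rows are invented (each a single sale at 1,000 square feet for simplicity), not real Singapore figures:

```python
from collections import defaultdict

# (town, resale_price, floor_area_sqft): illustrative rows, not real data.
SALES = [
    ("CENTRAL AREA", 900_000, 1000), ("QUEENSTOWN", 800_000, 1000),
    ("BUKIT MERAH", 750_000, 1000), ("BISHAN", 700_000, 1000),
    ("KALLANG", 650_000, 1000), ("BEDOK", 500_000, 1000),
]

totals = defaultdict(lambda: [0.0, 0])  # town -> [sum of psf, sale count]
for town, price, sqft in SALES:
    totals[town][0] += price / sqft
    totals[town][1] += 1

avg_psf = {town: s / n for town, (s, n) in totals.items()}
top5 = sorted(avg_psf, key=avg_psf.get, reverse=True)[:5]
print(top5)
```

On the real file, the same aggregation runs per town per year to produce the five lines in the chart.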

Isar:

Yeah. So I want to pause for just one second to do a quick summary for those of you who may have lost us in the details, because I think this is maybe one of the most magical capabilities we've gotten since ChatGPT came out back in November of '22. You can take a huge dataset, as long as it's organized properly, and save it as a CSV file. And this doesn't have to come from the internet or from your database. It could be from any of the platforms you're using. If you're using a marketing tool, there's a way to export the data. If you're using an ERP, there's a way to export the data. If you're using Salesforce or a similar CRM, there's a way to export the data. So you can take any data source that you have and export it as CSV, and 99 percent of them have the ability to export the data as CSV. Upload the file, literally drag it into the prompt line in ChatGPT, and start asking questions about the data. And the cool thing is that the new user interface allows you to dive in and look at the table or the charts it's creating, and then manipulate them, either by clicking on the right things in the charts or in the tables, or literally by asking. Now you can have a conversation with it about the best way to display the data, or about the best insights. Let's say you don't know what you're looking for: ask, what are the most interesting insights you can find in this about X? It can even be more generic than that. But let's say you want to find something specific in your data. Take the ERP as an example: what are the pieces of hardware that we're always behind on ordering, which then delays our work? And it will find it for you. Then: okay, out of these, which vendors do we have the biggest problem with as far as delayed shipping? And it will find it for you. And this could be years of data, hundreds of thousands of rows of data. It's literally impossible to do in any other way.
And so you can do this now on literally everything in your business. And the really cool thing, to take this back to the real world: if you are a data scientist, this is obviously magic, but you could have done this on your own. You could most likely write the Python code and figure it out. But if you are a manager, or any operator who needs the data, historically what you did is you went to your data scientist and sent them an email: hey, I need this data, please get it to me by the end of the week. Because it would take time, and they would have other work to do. And then you'd wait three or four days, and you'd get the data, and sometimes it's not exactly what you wanted. Or you have a follow-up question, just like we just did: okay, what are the top three out of all the cities? So now you have to send another email and wait another three days, and a week has passed until you get the answer. Now you can literally drag the data in, ask two or three questions, and in the time it would have taken you to write the email explaining to your data scientist what you want, you have the answers. This is nothing short of magic when it comes to making the right decisions based on data, versus based on hunches or ideas or historical trends. You can get real-time data analyzed very quickly, on your own, regardless of what you do in your organization, and it's pure gold.
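The ERP question Isar describes, which vendors are worst on delayed shipping, is the kind of thing ChatGPT's code interpreter answers with a few lines of pandas. This sketch shows what that looks like under the hood; the column names and the vendor data are invented for illustration, and a real export would have far more rows:

```python
import pandas as pd

# Invented ERP export -- your columns will differ
orders = pd.DataFrame({
    "vendor": ["Acme", "Acme", "Globex", "Globex", "Initech"],
    "promised_date": pd.to_datetime(
        ["2024-01-05", "2024-02-01", "2024-01-10", "2024-03-01", "2024-02-15"]),
    "delivered_date": pd.to_datetime(
        ["2024-01-12", "2024-02-03", "2024-01-10", "2024-03-20", "2024-02-14"]),
})

# Positive = late, zero or negative = on time or early
orders["delay_days"] = (
    orders["delivered_date"] - orders["promised_date"]).dt.days

# Vendors ranked by average shipping delay, worst first
worst = (orders.groupby("vendor")["delay_days"]
               .mean()
               .sort_values(ascending=False))
print(worst)
```

The point of the episode is that you never have to write this yourself: you describe the question in plain language, and this is roughly the code that gets generated and run on your uploaded file.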

Tianyu:

Yeah, absolutely. And this just reminded me of my time as an analyst. I was in meetings, and after my presentation about data, about some market insights, I would always get feedback: can we do a deep dive into another market, the Indonesia market? Can we add something else to the chart? Can we do another study? Can we compare the two studies? And of course, everything is possible, as long as we have the data and we have the time. But my response at that time was: okay, now I would have to estimate how much work and how much time it's going to take to complete. Nothing could be done during the meeting itself, right? But now you go to a meeting and you have ChatGPT with you. You have all the raw data, and maybe some new data sources from the internet, and during the meeting itself, while you're discussing, just like you and me while we are talking, you can already have all the data analysis done. It may not be perfect, but at least you have the top-line insights and you know where to dig further. So that's the efficiency that ChatGPT is bringing us.

Isar:

Absolutely. I will add two things, because one of them you just mentioned and the other just came to my mind. The thing you just mentioned is that you can add data from other sources. Let's say you're having a meeting and you already have some of the data, and someone says, oh, but what if we compare this? So let's take your example: we have the data on basically every real estate transaction that happened in Singapore. But let's say we want to correlate that to the number of people who live in Singapore. I'm sure there's another government agency where you can find that data and download it. Now these are completely independent sets of data, but they both have a timeline. They both have years and months, and maybe even towns. So you can upload these two files separately, and it will know how to combine them. You can ask questions like: what is the ratio between the number of people moving in and out of a town and the increase or decrease in the price per square foot of the flats and apartments in that town? And it will do it. So you're not even limited to just one file. As long as there's one common column, whatever parameter it is, you can bring the data in and start combining it. And again, that by itself is just magic, because in theory, if it's a small dataset, you can do it in Excel with VLOOKUPs and pivot tables. But it's very complicated, and you'd need to be somewhat of an Excel whiz to know how to do those kinds of things. Here you just literally ask the questions, and it will know how to do that. And you can do this on a huge amount of data. So going back to what you said: if somebody has an idea, oh, but what if we compare this to that? Just do it right there on the spot.
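The two-file combination Isar describes comes down to a join on the one shared column. A minimal pandas sketch of the idea; the price and population figures below are invented placeholders, not real Singapore statistics:

```python
import pandas as pd

# Two independent exports that happen to share a "year" column
prices = pd.DataFrame({
    "year": [2017, 2024],
    "avg_price_per_sqft_sgd": [700, 900],
})
population = pd.DataFrame({
    "year": [2017, 2024],
    "residents_thousands": [5000, 5500],  # placeholder values
})

# One common column is all you need to combine the datasets
merged = prices.merge(population, on="year", how="inner")

# Cross-dataset questions now become one-liners, e.g. comparing
# price growth to population growth over the same period
price_growth = merged["avg_price_per_sqft_sgd"].pct_change().iloc[-1]
pop_growth = merged["residents_thousands"].pct_change().iloc[-1]
print(f"price {price_growth:+.0%}, population {pop_growth:+.0%}")
```

The same merge works on towns or months instead of years; the only requirement, as Isar says, is that both files carry that one common column.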
The other thing I will say, and I think this is critical for us as companies and as individuals, even if you're just one person, a solopreneur: because this capability now exists, we have to start saving our data in a different way. I told you in the beginning that to make this work, the data needs to be as clean as possible. No gaps, no different layers, no weird connectivity, no four tables in one tab just because it's easy for you to look at them all together. That just confuses it. To be able to do this analysis in the most effective way, you want to save the data as flat, clean, and simple as possible, which is not the way we're used to. We're used to creating these weird combinations of things because they're easy for us to look at. Now you can do both. I literally had a session about this a week and a half ago with a company in California. I was there at their offices, showing them how to do this with the huge amount of data they have. And what I told them: let's create a mirror table. Keep the stuff you're used to, because you're used to it. You look at it every week, you have an automation that turns it into reports, all that kind of stuff. That's your dashboard; you already have it. Don't break it. But create another tab that basically mirrors what's on the first tab, without the spaces, without the gaps, without all the bold formatting, without four tables in the same sheet. Just the data: each cell equals the corresponding cell in the original. It's not a new table; it's just mirroring the original table without all the fancy stuff you've added. Then use that simple table as the source of data for ChatGPT. And then you can enjoy both worlds: the old-school way you're used to doing things, but you can also upload the data and ask questions you don't have answers to right now.
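The "mirror tab" advice amounts to flattening a dashboard-style sheet into one gap-free table. This is a small pandas sketch of that cleanup, with the messy layout reconstructed in memory; the column names and values are invented, and in practice you would read the real sheet with `pd.read_excel` instead:

```python
import pandas as pd

# A dashboard-style sheet: a merged-cell gap in "region" and a blank
# spacer row in the middle -- the formatting that confuses the model
raw = pd.DataFrame({
    "region": ["North", None, None, "South", None],
    "month": ["Jan", "Feb", None, "Jan", "Feb"],
    "sales": [100, 120, None, 80, 95],
})

flat = (raw
        .dropna(how="all")                             # drop spacer rows
        .assign(region=lambda d: d["region"].ffill())  # fill merged-cell gaps
        .reset_index(drop=True))

# flat.to_csv("mirror.csv", index=False)  # the gap-free file you upload
print(flat)
```

The original dashboard stays untouched; this flat copy is the one you hand to ChatGPT.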
And that combination matters, because for all the stuff you already have, like an Excel that knows how to calculate this and that number you need every day, awesome, don't break it. There's no reason to redo that here, because you already have the solution. But for the stuff you don't have the answer for, which could be additional questions on the same data, or combining it with new data, or a new dataset you didn't have before, use ChatGPT. Anything else you want to add? I think this was absolutely amazing.

Tianyu:

So for ChatGPT, I just want to mention the limits: you can upload up to 10 files of different types, and the total size should be within 512 megabytes. This limit changes all the time, but I think the most recent figure is 512 megabytes. So I think this is good enough for most light data analysis exercises, not for data science, but quite good for most white-collar workers, I would say. Another thing is that you can actually use ChatGPT to format the data. Just as in your example, the client might have their own way of building their Excel dashboards, and you can ask ChatGPT to convert that dashboard into the destination format. As long as you provide clear instructions and an example of the destination format, it will be quite easy. You can automate the entire process. So that's another benefit.

Isar:

I agree. Tianyu, this was amazing. This is such an important topic, and I don't think, I know, because I talk to a lot of business people, that the vast majority of people do not know this capability exists. It's relatively new, right? They added it two weeks ago, or maybe three. And by the way, the weird thing is they didn't announce it. When they did the big announcement, the day before the Google event, they announced a lot of other stuff: the vision thing they didn't release yet, the audio thing they didn't release yet, and a lot of other cool stuff they didn't release yet. And then they released this two days later without telling anybody. And yet it's

Tianyu:

There wasn't even a "by the way, we also have this, it's now available for you."

Isar:

I want to really thank you. This was an amazing session. If people want to follow you, work with you, learn from you, what are the best ways to do that?

Tianyu:

Yeah, thank you so much. The best way to reach me is LinkedIn, because I am on LinkedIn every day. It's more efficient than sending me emails.

Isar:

Yeah. Awesome. Perfect. Thank you so much. This was really great. I'm sure we'll do something like this again in the future, maybe on other topics. I really appreciate you taking the time and joining us today.

Tianyu:

Fantastic. Thank you so much, Isar.