We Love Ugly Data! The Deep Analysis Podcast

We Love Ugly Data! The Deep Analysis Podcast - Episode 5 July 23

Alan Pelz-Sharpe


The show notes for this episode are available below if you want to follow the conversation. As ever, it's three topics in 30 minutes, this time with Matt and Dan; skip to the topics you're most interested in, or maybe sit back and enjoy the whole thing with a beverage of your choice.


Topic 1: What are the “4 Waves of IDP”?

Dan recently posted an Analyst Note entitled "The Fourth Wave of IDP is Here," and here he talks Matt through those 4 waves, from the original OCR of the 1960s and Forms and Templates of the 1980s and 1990s, through Machine Learning to the LLMs (or GPTs if you prefer) that have dominated release cycles this year.

Topic 2: LLMs, grounding, and data residency

Matt wonders whether, among the plethora of big AI announcements from the largest software vendors, much of the commentary has overlooked some of the biggest challenges in implementing them. Using examples from Microsoft and Salesforce (both have talked and published extensively on the subject), Matt points out that using your own data to help "ground" requests sent to LLMs like OpenAI's GPTs creates a series of challenges around data residency, regional legal frameworks, and plain "my brain hurts thinking about it" allied issues.

Topic 3: Documents, content, files, records, semi-structured or unstructured data: do labels matter anymore?

Dan discusses his recent Analyst Note, "Documents, content, files, records, semi-structured or unstructured data: do labels really matter anymore?" and ponders why we call things the names we do. Matt wonders whether it's actually all the fault of Industry Analysts in the first place.
