Informonster Podcast
Episode 29: Data Quality in Healthcare
It’s time to take a closer look at healthcare data quality. This episode is the beginning of a series related to the challenges that all healthcare professionals face in gathering, organizing, and using patient data. Join the conversation as Charlie Harp shares how we got to this place as an industry and a solution for measuring the quality of patient information. Listen now to learn about the Patient Information Quality Improvement (PIQI) framework and how it can be used to help healthcare professionals make better informed decisions.
Contact Clinical Architecture
• Tweet us at @ClinicalArch
• Follow us on LinkedIn and Facebook
• Email us at informonster@clinicalarchitecture.com
Thanks for listening!
Hi, I am Charlie Harp and this is the Informonster Podcast. And today on the Informonster Podcast, we're going to talk about, surprise, data quality. In this data quality session, we're going to talk about the problem we have with data quality in healthcare and kind of set up a future podcast where I'm going to talk about something that I call the Patient Information Quality Improvement Framework. What I thought was, before I do that, I should probably lay out the reasons why we need something like the Patient Information Quality Improvement Framework. So we're going to start out by talking about data quality. Now, you should know before you go any further that I am a healthcare data quality zealot. If you've spoken to me, if you've listened to my podcast, if you've seen me talk at a conference, you know that this whole concept of healthcare data quality is something that I am unapologetically vehement about, and I feel that way because healthcare data quality is a huge roadblock to us as an industry evolving to the point where we can actually make more of a difference.
So let's talk about why that is. If you look at the current situation we have in healthcare, we have a couple of different things that are happening that are creating kind of a perfect storm for us in terms of impacting our ability to really take care of people the way they should be taken care of. The first factor that we're dealing with is the explosion of the amount of data that we produce in healthcare. It's been estimated that healthcare produces 30% of the world's data volume, 30%, a single patient generates about 80 megabytes of data per year, and a single hospital creates about 137 terabytes per day. Now granted, this is a mixture of structured data, unstructured data, image data, audio data, video data and the bottom line is there's a lot of data and it's going to continue to increase the more devices and the more things we put into practice, the more venues like telehealth that we start to leverage, the amount of data that we create in healthcare is just going to continue to increase as time goes by.
That's factor number one. Factor number two: in 1950, it was estimated that medical knowledge doubled every 50 years. By 1980, the estimate was that medical knowledge doubled every seven years. By 2010, medical knowledge doubled every three and a half years. And in 2020, it was estimated that medical knowledge, what is being written and understood about medicine, doubles every 73 days. That is a remarkable figure if you think about it. Everything we know about healthcare doubles every 73 days, so it's impossible for a provider in healthcare to keep up with everything that's happening. So that's factor number two: our knowledge in healthcare doubling every 73 days. The next factor that affects this is primary care shortages. 30% of people don't have a care provider because of the shortage, and the National Association of Community Health Centers estimated that by 2034, we'll be short as many as 124,000 physicians.
The fourth factor is that the people who are looking after patients don't have enough time; they experience significant time famine. As a result, in general, the quality of care in this country decreases as time goes by, because we don't have enough providers, we don't have enough people, and physicians are burning out and leaving the workforce in record numbers. That is a trend that is likely to continue until we find ways to relieve some of the pressure, increase the number of people entering that space, or come up with another way to help. The fifth factor is that with things like health information exchanges and TEFCA, we've also put an additional burden on the system where we're saying we want you to share data with each other. And it would be easy if the sharing of data were seamless, "easy" in quotation marks, but in reality, the sharing of that data is not easy.
The sharing of data puts an additional time burden on providers, who have to reconcile this explosion of data. And so the sharing of data, which you would think might be helpful, actually isn't helpful today. All it does is increase the size of the information tsunami that providers and other caregivers have to reckon with when they're trying to figure out what to do about a patient. And the last factor I'm going to talk about right now is the concept of us as an industry shifting from fee-for-service to value-based care. The challenge with value-based care is you want to make sure that the providers caring for patients are being compensated for the patients they're caring for, if they're delivering value. Well, how do you know if you're delivering value? Information, the information tells you. So we have a couple of things. Let me go back and just talk through them one more time.
We have an explosion of data: a single patient, 80 megabytes per year. We have a doubling of medical knowledge; a provider, no matter how much they study, is going to be behind, because there's no way for them to keep up with everything we're learning and understanding about healthcare. They don't have enough time; the patient-to-provider ratio is a problem, and it affects people in underserved communities, but it also affects people who are not in underserved communities, because that ratio, and the amount of time a provider has to actually think about what's going on with the patient, continues to go down. The sharing of data, intended to complete the picture and make things better, doesn't really do that today. And now we need the data, because we're not just sending an invoice based upon the number of procedures; the intent is that we're sending an invoice based upon the quality of care we provide to our patients, and the only way to determine that is by looking at the information to see where they are today versus where they were yesterday.
So what we have is a perfect storm, where we have all these problems and it's very difficult for human beings to manage them. So the answer to this problem is technology. The thought process is that with massive data, knowledge increasing, automation, and the sharing of data, you would think the answer is healthcare technology. More specifically, software that can organize, understand, and analyze the data, augment a provider's ability to advise and care for their patients in an optimal way, and help patients understand their situation better so they can make informed decisions too. But that hasn't happened, which begs the question: why hasn't that happened? In fact, when you look at the last 10 years, why have we had some of these spectacular failures from people trying to come in and fix healthcare with technology?
The answer, dear listener, is data quality. Despite our best efforts over the last decade to do a better job of improving the usability of data in healthcare, whether it was meaningful use, CPOE, or even as recently as the Cures Act, the way we collect and leverage data in healthcare today is still a byproduct of the process of providing care, and we might not be able to change that. One of the things that I always tell people is that healthcare as an industry is disruption resistant. It's very difficult for our industry to completely change direction, and there are a lot of good reasons for that. Some of them are just the complexity of what we do in healthcare; some of it is the critical nature of what we do in healthcare.
At the end of the day, when you think about how what we do impacts human lives, it makes you very risk averse about trying something that's tricky. And after so many expensive failures that we've seen happen in the industry, it's also something that can be very intimidating to take on. I'm a big believer in innovation and doing things, but in my last 35 years of working in this industry, one of the things I've learned is that the only way to truly change healthcare and improve it is through pragmatic, incremental steps. If you try to come in and make some big sweeping change in healthcare, your chances of success are pretty slim. If, however, you come in and make incremental changes that allow the system to grapple with those changes, and allow people to implement those changes in a way that doesn't put a patient at risk or their own job at risk, your chances of success, even if it's not an overnight success, are much better.
Take my own company, Clinical Architecture. We've been trying to improve data quality in healthcare for 16 years, and we've made progress. We've seen some of our clients accomplish amazing things, but it's been a journey. It's not the kind of thing where, overnight, oh my god, everything is fixed. And I think that's just the nature of the beast in healthcare. So if data quality is the issue, why is it the issue? Well, the first reason is that when you look at data in healthcare, you have what I'm going to call structured data, which is coded, discrete data. You have unstructured data, which is information in a clinical note, where someone has written a document intended for a human to read. And then you have digital data, like images and audio files and videos, which are also intended for humans to review. As we've tried to take advantage of the data we produce, using software to help with some of the heavy lifting in healthcare, the challenge is that software prefers structured data.
Software likes codes; it likes concrete meanings and descriptions and taxonomies and ontologies. And if you can provide data to software as discrete structured data, and it's good data, software can do a pretty good job with it. If you provide it with unstructured data, images, audio files, faxes, things like that, for software to be able to consume that, you have to turn it into structured data, or you have to teach the software to turn it into structured data. And that's why we look at solutions like NLP, OCR, speech-to-text, and language models to try to grapple with that information and turn it into something the software can deal with. The issue that we have fundamentally is that the structured data is not complete and not always correct, and the unstructured data has to be interpreted in some way, whether you're asking a computer to interpret it or relying on a human to interpret it, to put it into structured data.
There's a certain amount of uncalibrated uncertainty, regardless of which approach you take. When we try to use analytics or machine learning or large language models or even decision support, we have to understand that the data we're feeding into those mechanisms is questionable, and it doesn't take a lot of failure for us to lose confidence in a technology because it doesn't deliver the solution we hoped it would. So in healthcare, we have these machines that, if we deliver data into them, should provide some value to us, but they're not working the way we expect. So it's logical to conclude that the reason these machines are not working the way we expect is because the information we're feeding them is not up to the task. The very next question we need to ask is: why and how is it not up to the task?
Where are the failures occurring? What is the nature of those failures? That's what got me to sit down, look at the problem, and ask: is there a way we can put together a ubiquitous framework that allows us to evaluate the quality of, in this case, patient information in a way we can all commonly agree on? One that can also provide insight into the nature of the issues, so we can go back to the root cause and correct it, and then incrementally, over time, improve the quality of the data we're feeding into these systems, or at least calibrate the uncertainty we see and understand it, because if you care about something, you measure it. In the next couple of podcasts, I'm going to be talking about the Patient Information Quality Improvement, or "PIQI," framework: its different components, how it works, what its intent is, and how it can be leveraged to understand and gate the quality of information that you're feeding into these machines that you're relying on to make good decisions. I hope you tune in and check it out, and I'm always happy to address any questions or comments, so feel free to get in touch; I'm always excited to hear ideas for other podcast topics in the future. And with that, I'm going to go ahead and wrap it up for today. I am Charlie Harp, and this is the Informonster Podcast.
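As a footnote to the episode: one way to picture the "measure it, then gate it" idea Charlie describes is a toy sketch in code. This is a minimal illustration only; the field names, checks, and threshold below are hypothetical examples invented for this sketch, not part of the PIQI framework or any real standard.

```python
from datetime import date

# Toy sketch: score a coded patient record on a few basic checks and
# "gate" records that fall below a threshold before they reach a
# downstream system. All names and values here are hypothetical.

REQUIRED_FIELDS = ["patient_id", "birth_date", "code", "code_system"]
KNOWN_CODE_SYSTEMS = {"SNOMED-CT", "LOINC", "RxNorm", "ICD-10-CM"}

def quality_score(record: dict) -> float:
    """Fraction of basic checks the record passes (0.0 to 1.0)."""
    checks = [
        # Completeness: every required field is present and non-empty.
        all(record.get(field) for field in REQUIRED_FIELDS),
        # Conformance: the code system is one we recognize.
        record.get("code_system") in KNOWN_CODE_SYSTEMS,
        # Plausibility: the birth date is not in the future
        # (ISO-8601 date strings compare correctly as plain text).
        record.get("birth_date", "") <= date.today().isoformat(),
    ]
    return sum(checks) / len(checks)

def gate(records: list[dict], threshold: float = 0.9) -> list[dict]:
    """Keep only records whose quality score meets the threshold."""
    return [r for r in records if quality_score(r) >= threshold]
```

The point of the sketch is the shape of the idea, measure first and then decide what to let through, not the specific checks, which a real framework would define far more rigorously and, ideally, in a way everyone can agree on.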