Podcast

Episode 11: CHES 2025 Roundtable – The Health Data Wish List: What Economists Need to Fix US Health Policy

December 29, 2025

In this episode of On Background, host Stephen Parente leads a conversation around the challenges and opportunities in health economic analysis, particularly focusing on the missing data that hinders optimal policy-making. The participants discuss the difficulties in accessing health data, the need for standardization, and the importance of linking health data with other social determinants. They also explore future directions for improving data transparency and the role of technology in facilitating better access to health information.

speaker-0 (00:00)
Welcome to our Caribbean Health Economics Symposium roundtable. The topic that is up for conversation is: what are, or is, I guess, all right, what are the missing data for optimal health economic and policy analysis? Assembled here are a variety of folks who have done lots of empirical work. A little preface to this is that policymakers

use whatever data they can, or are informed by work that probably many economists in this room provide. But we all know the data is not always optimal, and we make a lot of assumptions to try to work the best we can. There’s always an opportunity for investment in new data, whether it’s nudging private sources or actually creating new public sources. And this is sort of a forum that the holiday season has given us; for those who don’t realize, we’re recording this days before

the festive season. By that I mean Festivus and other things. The floor is open. Robert Town. Too damn hot.

speaker-2 (00:55)
So the price of data is too high: it’s too hard to get, it’s too expensive. And I think my wish list is that we find ways to make the price of access lower. And to that point, the state of Texas is coming online with its all-payer claims data, which is, I think, two thirds of the ERISA enrollees in Texas.

speaker-0 (01:13)
Conclude.

So.

speaker-2 (01:21)
The price is zero.

speaker-3 (01:22)
Even for people outside Texas?

speaker-4 (01:24)
patients and providers over time.

speaker-0 (01:26)
I don’t know

speaker-2 (01:27)
about NPIs. I suspect NPIs are in there, but…

speaker-0 (01:30)
What?

speaker-3 (01:31)
Wait, so how are they funding it then?

speaker-0 (01:32)
Santa.

speaker-2 (01:33)
So this was mandated by the Texas legislature. They’re providing zero funding. So basically UT Houston, which is housing the data, is self-funding it.

I think they’re probably going to try to make some money on the back end by providing consulting services to access the data.

speaker-3 (01:51)
Because that’s sustainable, then.

So actually I had a point related to this, which is, I think a lot of us have been able to work with Medicare data, and it’s also expensive and difficult to get, but at least feasible. And I think that as a result, a lot of the work has focused on Medicare patients. Now, to some extent that makes sense because they’re vulnerable in somewhat unique ways.

But I think we have this blind spot. We don’t know what’s happening with a lot of the other populations. I was just talking about Medicaid. And so I wonder if there’s a way to take the state all-payer claims database model and somehow go national with it. I know there are these random samples of hospital discharge data, but it’s just hospitals, and it has a lot of limitations. I think that would be my wish list.

speaker-0 (02:39)
Yeah, that has been discussed, by asking AHRQ to make that actually kind of a joint project. The challenge is that if there is a model like that, it’s the HCUP data. But, you know, think of the arc of time it took to get HCUP, and it’s still not entirely the whole country. I mean, it’s taken 30 years to go from, like, the SPARCS database, which, like,

you and I share an advisor with from the past, Bob Berg. Rochester got the first SPARCS database sample in 1978 or something like that. It took a long time to get to that point. I just hope that we can get there faster than 2048. There’s another idea I can suggest in the talk. Other thoughts?

speaker-5 (03:25)
We’ll start with the low-hanging fruit, and then, you said it’s a Christmas wish list.

speaker-0 (03:29)
I’ll go. Yeah, yeah. I mean, Texas already provided half the Christmas. I know, it’s just kind of amazing, because, like, Santa can easily, you know, set down that sleigh in West Texas.

speaker-5 (03:39)
Kind of looking at other populations: Medicaid claims data exists. It’s just not very good. People refrain from using it because it’s really messy. It’s kind of the same format, effectively, as the Medicare claims data. So if there is a way to standardize it, just make it a little bit easier to use, I think that’s huge. That’s the low-hanging fruit. There are just a lot of people who want to use it. They just don’t know how, and it’s really difficult

to harmonize it across years and across states. So some investment there, I think, is gonna yield a huge return in terms of research on the Medicaid population. Now for the ambitious part, partly inspired by what we’ve been talking about: there is no way you can talk about drug pricing or innovation without information on rebates, as has been made clear. So to what extent can we get at least any information

on rebates? It doesn’t have to be super specific. Anything, even in aggregate, would be very, very useful. You know, CBO might chime in here, but yeah, that would be ideal. And it’s Christmas.

speaker-0 (04:46)
Yeah, it’s almost like quoting, you know, Love Actually: you get the cards, and because it’s Christmas, data, please.

speaker-4 (04:56)
I have one related to the price, but maybe a different kind of price: I think the federal government in particular puts far too much of a burden on universities for hosting. And Medicare is now moving to this virtual enclave system, which is more expensive than an already expensive system.

speaker-0 (05:07)
to data.

speaker-4 (05:15)
And so for economists who have NBER affiliations, who can tap into some of this grandfathered stuff with Medicare claims data, or who have already established access, I think that data is still very accessible. But for many of us who don’t have that resource, it was already hard to get without very deep pockets, and it’s just getting harder. Somebody tried to transfer a Medicare DUA to me at Penn State.

And the university said that the DUA was so high risk for them that they would rather just destroy the data, which they did. That was, you know, a $250,000 data set that they just burned because they thought that the Medicare requirements for the protection of that data were infeasible for them to match.

speaker-6 (06:02)
That is true. The darkness is clearing out all of us.

speaker-4 (06:05)
And we know that it’s not a legal issue, because states will give you this same information on different people for a much lower burden. And, you know, name your favorite data scandal: which one of them came from a researcher losing claims data?

speaker-3 (06:20)
Dot. Dot. Dot.

speaker-0 (06:24)
Done.

Yeah.

speaker-3 (06:25)
When did you see We’re to VRDC. Ah, OK.

speaker-4 (06:28)
and how many hundreds of thousands of dollars a year probably are in.

speaker-0 (06:31)
maintaining that data.

since the

speaker-6 (06:33)
dollars

a year and the requirements. Yeah, but not to be Krampus, there is good news, and that is that there are Medicaid user groups. And I remember going to one and I said, my name is John. I am a Medicaid user.

speaker-0 (06:36)
Thank

speaker-4 (06:37)
So.

speaker-6 (06:48)
Uh, and Lindsay Leningford who’s French always, I remember her saying, once you understand the Iowa Medicaid program, you understand the Iowa Medicaid program. And that’s true, but they are kind of trying to build that out. And I think the key is that it’s just sort of informal: oh, this is how we did it. And, like, I, you know, remember just things like

speaker-0 (06:51)
Peace.

Right.

speaker-5 (07:03)
Yeah, it was.

speaker-6 (07:15)
we couldn’t get this to line up with this, and then somebody from Dartmouth said, are you on the fiscal year or the calendar year? And it’s like, oh, okay, right. So those are the kinds of things that people are sort of working out. It would be nice if there were a central

speaker-5 (07:29)
Formalize it.

speaker-6 (07:30)
So, like, RAND takes the HRS data and makes it usable. Wouldn’t it be nice if they, or Accumator or somebody, did the same for Medicaid payment? But it is, a lot of it is just

speaker-0 (07:38)
Yeah.

speaker-4 (07:42)
or not.

speaker-0 (07:43)
Yes.

speaker-6 (07:43)
I don’t know if it’s any better.

speaker-0 (07:45)
I’ll make a comment about it. I want to make sure everybody who wants to can chime in. Any more Christmas lists? You guys are deep in the data all the time.

speaker-4 (07:45)
Thanks for

speaker-5 (07:56)
Well, I think for our own selfish needs, we would be very interested in learning more about the health information technology that different providers use beyond just hospitals. So skilled nursing facilities, for example, and also the costs of those technologies. So what prices are they?

speaker-0 (08:11)
Go

speaker-3 (08:12)
So in the spirit of the wish list, raising the ambition to the next level: I have found that we run up against this limit where healthcare data lives in a silo that only allows you to look at healthcare. But there are a lot of questions that you might have about how health affects education, how health affects family interactions.

You know, there’s a whole rich range of questions. Think of it like, you know, the Scandinavian sort of data environment. I think it would also be very important and beneficial if we could push in that direction. And maybe you can start at a state level and, you know, see how that works, but being able to link and think about how things affect the household rather than an individual patient,

how it affects their education, how education affects health, all of those things, crime. I’m working on a project right now where we’re using the census data environment, where Census has somehow recreated some of that, but it’s not perfect. And it’s incredibly difficult to use census data, not to mention government shutdowns. So I think that’s another thing that, you know, if we had something

Not exactly like the Scandinavian system, something that’s at least more than one dimension. I think we could answer a lot of rich questions.

speaker-4 (09:33)
Maybe one version of that would be

speaker-2 (09:34)
B.

speaker-0 (09:35)
and

speaker-4 (09:35)
the National Health Interview Survey, which has... They also have a project there now, and they also make it incredibly hard to use relative to lots of other census data. The CDC just has different and larger requirements for researchers that I think are not angled towards economists, really. They’re angled towards

someone who is writing a Health Affairs paper and sends them: here’s the table that I want, with the mean of these three values, and that’s what I’m gonna compute. So it’s hard to get access to, but I think if they, yeah, bring the survey more often, with more information, it’ll be nice.

speaker-2 (10:09)
I think a starting point for that might be states linking Medicaid with education. In Texas they had it.

speaker-0 (10:14)
Okay, thanks for listening.

speaker-4 (10:17)
Yeah.

speaker-0 (10:17)
Yeah.

speaker-5 (10:18)
So there are some states that are.

speaker-3 (10:20)
But you know, my understanding is that all of those stories of how people got access to that data were like, it was my uncle, or I knew somebody in the government. You know, it’s not a standardized formal process.

speaker-4 (10:31)
And because of.

speaker-5 (10:31)
health.

speaker-6 (10:33)
all to research.

speaker-3 (10:35)
Yes, they were at the level, mean, you know, because employer sponsored insurance and they were in that huge lane, we got started.

speaker-0 (10:41)
That’s a really good point. So for those of you who don’t know, I’m actually closing out my term as past president of ASHEcon, a three-year presidential cycle. Yeah, but I’m not sure I’m allowed to talk about that. When you’re going from president to past president, you get to give a speech. And so my speech in Tennessee, some of you might have heard it, was like, you know, crossing the quality chasm of data, which is why I mentioned this question here. Again,

one thing I’ve been trying to do in my work in government is to try to liberate as much stuff as possible. So, a few things that may be, I don’t know, stocking stuffers, shall we say. I don’t think they’re going to be big Santa gifts, but the stuff that has happened with the transparency data that Connor talked about, I think it’s still valuable. There’s another shoe that’s going to drop, which is the pharmaceuticals: the pharmacy data through PBMs is expected to basically come out from another federal rule.

And so that is supposed to potentially capture maybe the missing link on the rebates, maybe. And so I’m somewhat conflicted on that. I’m actually truly conflicted, so I can’t say more, but keep an eye out for that. The other thing that I did while there, because it was part of the original price transparency rules, was to create synthetic data that actually was all-payer. And actually, Connor,

helped me do this for my course in analytics when he was my TA, way back when we used that as a prototype. So what the data actually was was a representative sample with basically MEPS weights attached to it for Medicare (again, it’s Medicare fee-for-service), ESI, Medicaid and the individual insurance market. So basically the entire health economy, scaled up. And so we had that built for a course, because I was bored one time waiting for confirmation,

and gave it essentially to AHRQ and said, if I could do this, so can you. And the thought was, you know, use data coming in from the exchange information for the marketplace, for at least the risk adjustment information. T-MSIS is essentially the Medicaid tape-to-tape files, reborn. Get the Fed’s OPM data to substitute for the ESI and properly weight it,

and bring TRICARE in too, because they’re basically all private contracts that are pretty robust. And then Medicare Advantage, same thing: use the switch system that comes in for the data there. And then fee-for-service Medicare is pretty obvious. It did actually get built. AHRQ did in the end build a prototype of it. And actually Eric’s group helped facilitate that money when he was in office as deputy secretary. They did do a, they,

the vendor did an okay job, but they could have done it a lot better. But it showed that it could be done. So I think that’s one thing. That’s on my big list, not a stocking stuffer but a big gift: can we do that again? Because to your point, the price is just high for this stuff. And there’s a whole set of people that just have to learn how to use this stuff and figure out what new metrics they can make, and run it all out

that way. And the last one, an even bigger stretch, is to compel the EHR vendors to do the same thing. I don’t know if there’s a way to do this easily or if it requires new legislation, but it should be bipartisan. HITECH put billions of dollars into this thing, and the thought is, where’s the return on investment? You know, you folks actually have a paper that’s sort of showing some of that return on investment, but there’s a public good aspect of this.

And so, you know, keep your monopoly control, but if you’re going to have a regulated monopoly, then there has to be some public good that comes back out of it. De-identify part of it. Because one of my concerns, you know, looking at the health services research community, is that it took them the better part of 15 years, once they had access to claims data, to make anything useful from it, something that actually counts as an outcome metric that you guys all have in your papers. And, like, none of us work with EHR data to

make those metrics. It’s far more robust, obviously, because of what is recorded. That’s it with my wish list. So we hope government officials, if you watch this thing, take our Christmas wishes to heart, and for whatever comes in your festive season. Any last words perhaps, Bob, from you at all? Is there a wise aside or anything like that? No. Okay. If nothing else,

what this recording may do is be a time capsule, so that when we come back here in 2048, when they finally have all-payer databases for all 54 states, then we’ll see how much we got right.

speaker-2 (15:07)
data for the virtual us that will be projected into these seats.

speaker-0 (15:12)
will

be. That’s right, VR Bob. It’s true, we can all have our avatars come collect and tell us stories how it went in the end. Anyway, with that, for the good of the order, we close this seminar. Thank you all.

you making that over?

speaker-6 (15:28)
something.
