Transcript

This transcript was generated automatically and may contain errors.

Hey there, welcome to the Posit Data Science Hangout. I'm Libby Herron, and this is a recording of our weekly community call that happens every Thursday at 12pm US Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.

I think we can go ahead and introduce our featured leader for today, Nikolay Braykov, Manager of Advanced Analytics and Outcomes at Children's Healthcare of Atlanta. Nikolay, it's so great to have you here today. Could you introduce yourself? Tell us a little bit about what you do, maybe a little bit about Children's Healthcare of Atlanta and something you like to do for fun.

Hey everyone, good afternoon. Thanks for the opportunity to be here. So my name is Nikolay. I manage our data science team here at Children's Healthcare of Atlanta. We are one of the largest pediatric health systems in the country, the largest in Georgia and one of the largest in the Southeast. My team is part of a bigger data and analytics group, or we sort of sit alongside data and analytics under the IT part of the organization. We're officially called the Advanced Analytics Team, and we take on essentially any kind of data science question that goes beyond reporting.

And most of our questions or problems revolve around healthcare quality or clinical and operational questions. We're not a research team. We're very focused on applied data science. And I'd say there are three main pillars of what we do. The traditional advanced analytics and data science work is, as I said, anything that our reporting teams can't handle: something that requires more advanced data visualization, or data wrangling from different sources. And then eventually you might have some hypothesis testing or some kind of statistical analysis that needs to get done. So that's where we come in.

The other pillar is machine learning and the evaluation and also the development of custom predictive models. Most of those are around clinical decision support. And so we partner closely with clinical informaticists. Those are the folks that can build things in our electronic health record. And the third pillar, this is a relatively new area, but we also develop custom solutions that might be powered by LLMs. So if there are any applications that sort of lean on large language models and do, for example, NLP kind of work, we would develop and support these.

So I love being outside. Atlanta is a very green city. So I live on the bike trail. I love going on long bike rides and I've been trying to get into pickleball. I'm still new to it, but it's addictive. There's no other way of saying it. It's easy to get into it and it's such a wonderful community. So that's been kind of my newer hobby. But yeah, I like to stay active and enjoy the outdoors.

Tools and team overview

So we have three data scientists and one machine learning engineer. We actually have an opening for another machine learning engineer, so if folks are interested, get in touch with me.

As far as the tools we use and what we do, so a lot of the origin of the team was more in biostats. So we're answering questions around statistical hypothesis testing and data visualization analysis. I think our data scientists were primarily R and R is also my preferred language. So that's in heavy combination with SQL, just because a lot of times we need to get data and transform it. And for some of the workloads that we have, it's easier to do that when we directly interact with databases. So R and SQL for the data analysis, wrangling, visualization parts.

So if we're having a lot of ad hoc questions around something, for example, we might be asked to evaluate if a certain intervention has worked. So we would do some kind of observational study, but think of this more as quality improvement. So it's not exactly your biostats research kind of work, but it's still using a lot of the same tools and packages in R. So we love, for example, using gtsummary and making bivariable tables.

We do a lot of that work in R Markdown and Quarto. So I've been a big fan of R Markdown, and now Quarto, since the start of my career, because it's just so great to have that type of notebook where you can actually create very polished output and make everything reproducible. So I'd say that's the basic building block of a lot of what we do. Whether it's data exploration for some bigger project or a predictive model, there's always a notebook, and that notebook tends to be in Quarto these days.

So we also develop Shiny dashboards for certain use cases where you need to interact with data. So one example of a project that was a Shiny dashboard, we had a question about clinical effectiveness of our behavioral mental health programs. So actually we had to build a tool to support a team of scientists that were looking at clinical effectiveness. Those are like health systems researchers. And it was a fairly large cohort and also a lot of different ways to create outcomes for it, filter by different subpopulations. And because again, this is an observational kind of question or comparing kids who are in our programs versus kids who are not, we had to do some propensity weighting to actually balance those comparisons.
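
A minimal sketch of the propensity weighting idea mentioned above, written in Python for illustration rather than the team's R and Shiny stack. It assumes the propensity scores have already been estimated, and it uses inverse probability of treatment weighting as one common form of propensity weighting; the transcript does not say which variant the team used. The `Patient` class, the ages, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    in_program: bool   # True if the child was enrolled in the program
    propensity: float  # estimated P(in_program | covariates), assumed given
    age: float         # example covariate used to check balance


def iptw_weight(p: Patient) -> float:
    """Inverse probability of treatment weight for one patient."""
    if p.in_program:
        return 1.0 / p.propensity
    return 1.0 / (1.0 - p.propensity)


def mean_age(patients, in_program, weighted):
    """Mean age in one arm, with or without propensity weights."""
    group = [p for p in patients if p.in_program == in_program]
    weights = [iptw_weight(p) if weighted else 1.0 for p in group]
    return sum(w * p.age for w, p in zip(weights, group)) / sum(weights)


# Hypothetical cohort: older kids are more likely to be in the program.
cohort = (
    [Patient(True, 0.75, 15.0)] * 3 + [Patient(False, 0.75, 15.0)]
    + [Patient(True, 0.25, 9.0)] + [Patient(False, 0.25, 9.0)] * 3
)

for arm in (True, False):
    print(arm, mean_age(cohort, arm, weighted=False), mean_age(cohort, arm, weighted=True))
```

In this toy cohort the unweighted mean ages differ between arms (13.5 versus 10.5), while the weighted means agree, which is the kind of balance check one would look at before comparing outcomes.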

So we built a Shiny dashboard that can do a lot of that analysis kind of end to end on the fly, like get the data out of our clinical data warehouses and process that. So that, for example, is a script that runs on a schedule and publishes a pin on Posit Connect. Then on top of that, there's a Shiny dashboard that does the cohort visualization, and then also runs these analyses on the fly so they can get their answers more quickly and then download output that they can use for their research.

Again, we wouldn't use Shiny for any old dashboard. There's teams that can handle that, like there's teams that develop in your traditional BI tools like Qlik and actually now Power BI. But yeah, Shiny is there for these more specialized uses.

And then, so that's all the R side of the house. For Python, I think most of our ML work, especially when it comes to the deployment evaluation of models, happens in Python. And then yeah, we used to run Jupyter notebooks, but I think now with Positron and with Quarto, I think it's actually become easier to achieve the same kind of very rich interactive output that you'd get from the things before we would do in R.

And yeah, finally, I think for creating some applications that are more Python-based, we've been dabbling more in Streamlit for anything that's like a chatbot. I'm actually interested in trying more of that in Shiny, but yeah, because that was a Python-based kind of product, we opted for Streamlit. So yeah, I mean, to recap, like a lot of data visualization, a lot of notebooks, everything is source controlled. Things that need to go to stakeholders go on Connect. Pins are wonderful. I rely on them a lot. And then the other key thing is getting the data in from a SQL database connection usually.

Bias and fairness in pediatric ML models

Hi. Apologies for the overly verbose nature of this question, but you're in predictive AI, and particularly in clinical decision-making and that type of sphere. And one of the issues with clinical and medical data is often systemic bias, already in adults, and now you're also working with pediatric patients. And when AI and ML tools are trained on biased data, they often recapitulate that same bias. So given that you're in the clinical decision-making sphere with predictive AI, how are you accounting for that, or can you effectively account for that, and so forth?

Yeah, that's a great question. We talk about fairness and bias and kind of responsible AI. There are certain things we can do to at least measure or be aware of whether our models perform adequately in all the different subgroups that we want to compare, like some disadvantaged groups. So the one thing we do whenever we have a model is, when we run our evaluations, we do it by different demographic groups just to make sure there's that parity in model performance. As far as quantifying bias and doing something about it at the root cause level, we had some projects that were more focused on health equity and at least surfacing that there are inequities, using some more rigorous approach to measure that. But I'd say as far as eradicating bias from our data, that's a pretty difficult challenge. And yeah, the best we do so far is just measuring performance by groups.
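
A minimal sketch of evaluating a classifier by demographic group, as described above. Recall (sensitivity) is used here as the example metric; the transcript does not say which metrics the team compares, and the group labels and records are hypothetical.

```python
from collections import defaultdict


def recall_by_group(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 labels.

    Returns {group: recall}, with None for groups that have no positive cases.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    groups = set()
    for group, y_true, y_pred in records:
        groups.add(group)
        if y_true == 1:
            if y_pred == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    result = {}
    for group in sorted(groups):
        positives = true_pos[group] + false_neg[group]
        result[group] = true_pos[group] / positives if positives else None
    return result


# Hypothetical evaluation records: (group, actual outcome, model prediction).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 0, 1),
]

per_group = recall_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, "recall gap:", round(gap, 3))
```

A large gap between groups is a signal to investigate further. It measures a disparity in performance; it does not by itself explain or remove bias in the underlying data.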

Challenges of pediatric versus adult data

Can you speak to the differences between working with data about children versus working with data about adults?

Yeah, certainly. I mean, I think the general difference between pediatrics and adult medicine is that it's more complicated. There can be a lot more nuances. For example, if you're looking at creating certain features for risk prediction, you have to be aware that when some lab is out of range or something is considered to be abnormal, there are going to be a lot more different thresholds, just because kids change as they grow. So more age stratification goes into our population, although it has a smaller age range. So that's one kind of obvious challenge with data.
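
A minimal sketch of an age-stratified "out of range" feature of the kind described above. The age bands and reference ranges are invented purely for illustration and are not clinical values; a real feature would use the lab-specific pediatric reference ranges.

```python
# Hypothetical reference ranges for a made-up lab value.
# Each row: (min_age_years inclusive, max_age_years exclusive, low, high).
# These numbers are invented for illustration only, NOT clinical values.
REFERENCE_RANGES = [
    (0.0, 1.0, 10.0, 20.0),
    (1.0, 12.0, 8.0, 16.0),
    (12.0, 21.0, 6.0, 12.0),
]


def is_out_of_range(age_years: float, value: float) -> bool:
    """Flag a lab value as abnormal using the range for the patient's age band."""
    for min_age, max_age, low, high in REFERENCE_RANGES:
        if min_age <= age_years < max_age:
            return value < low or value > high
    raise ValueError(f"no reference range defined for age {age_years}")


# The same raw value is normal for an infant but abnormal for a teenager.
print(is_out_of_range(0.5, 18.0))   # False
print(is_out_of_range(14.0, 18.0))  # True
```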

And then in general, as far as difficulties in prediction, clinic no-shows, for example, are more challenging in pediatrics because the decision to come to an appointment is not just the child's decision. And we don't really get to see that in our electronic health record: most of our data is about the child, not the parent or the family or their environment. And then clinically, I think pediatrics can be more challenging and complex for some things. So for example, some of the critical things that might happen for somebody that's admitted to a hospital, like deteriorating or decompensating, things like cardiac arrest or sepsis, tend to be harder to recognize and move faster than in adults.

Tools, SAS, and tidymodels

We used to be a SAS shop. I think maybe it's been over five years since there's no more SAS support in the organization. And yeah, I'd say a lot of our, at least two of our data scientists, like they were in public health, they were trained in SAS as well. We have other people in healthcare that are used to working in SAS, but we've been like, you know, we're not supporting it. And for better or worse, I'd say I've always been more R and actually Stata was the one software that I used when I started my career in undergrad. So, yeah, the challenge with enterprise settings is again, like these licenses do become quite costly and it's hard to justify them if you are only going to have like a handful of people using it.

To be fair, for a lot of our predictive models, we're out of the R game. So I've used tidymodels, and I've used caret before that. I think tidymodels is very powerful, actually not just for prediction, but also for statistical inference. If you've got to scale models, tidymodels is great. The problem is there's still kind of a steep learning curve. And then if our deployments are all happening in Python, it's sort of better to make the switch earlier. But I think tidymodels is pretty powerful, and some other folks on my team have used tidymodels as well. I've used H2O a lot as well for machine learning and for getting started on some quicker initial checks.

LLMs, PHI, and the data landscape

Yeah. The LLMs, I'd say that's generally challenging in healthcare. You really can't use a lot of the same technologies you use outside because of concerns over PHI and HIPAA compliance. So, for us, on our Azure subscription, there is a deployment of OpenAI models that we can use that is HIPAA compliant. So, that's enabled us to do a lot. And it's probably been two-ish years since that happened. But, for example, coding assistants like GitHub Copilot are outside of that, so it was a big deal being able to use those. We have to take a lot of precautions to make sure PHI doesn't leak with those. But as far as putting data through LLMs, we have that ability, and that's opened up a lot of these doors to do some of those custom LLM-based applications that I mentioned earlier. And also, yeah, ellmer actually has been a great way to integrate that with R and even Shiny dashboards.

The data warehouses that you mentioned, Clarity and Caboodle, those are basically what Epic, the biggest EHR vendor and the one we use, provides as the place where the data goes for reporting. So, Clarity is not exactly the data warehouse, but it's where more of the raw data for reporting goes. And Caboodle is a little bit more groomed and has a star schema, so it's neater and easier to work with. So, yes, we use both. We also use some of our enterprise data that comes from systems like Workday. So, those are kind of the main big enterprise data sources. And they're beasts, like Epic is just an enormous application. There's literally hundreds of thousands of tables in Clarity. So, being able to navigate those kinds of relational databases is definitely some of the specialization that you get from healthcare data science, at least on the provider side that we're on.

OMOP is basically a standardized data model that's great for creating cohorts. So, Children's does have a lot of our data mapped. And again, the special part there is you have to map your Epic data to use these common data models, otherwise they stop being common. But again, OMOP is more of a reporting tool, or more of a research tool, so we haven't really leaned very heavily on it. A lot of times we need something that has more recent data, and there's already a day lag with how things go into these data warehouses. With OMOP, it's a little bit longer for us as well.

Explainability and stakeholder buy-in

Yeah, thanks for the question. Yes, yes. So, as far as, like, making some of our outputs more accessible, I'd say, broadly, that's what R is so great for because you can create really powerful custom data visualization. So, for our ML models, for example, one thing that we've done for model evaluation is, when a model is developed, there's a lot of, you know, we usually do this in the notebook, but explaining what your model performance really looks like is something that we've used Shiny for, and it's been really powerful to actually be able to show. So, here's a classifier. These are all the features in it. This is how your feature importance looks. So, if we need to select between different models, that really helps kind of get stakeholder buy-in to be able to show from all the different things that we've experimented with, we're able to track our experiments and then show the feature importance and essentially what cohorts and different, how are these different features constructed.

Also, the other part of, for a classifier, for example, is these ML models are not the end thing that a user sees. They're just part of a bigger solution that, in the case of clinical decision support, that goes into the EHR. That's something that, you know, you can think of it as just another test or just another lab result or something, right? So, to actually act on this, we need to build things in the EHR, for example, an alert that uses that prediction. So, to decide when to display an alert, we need to know what threshold does the predicted risk score have to trigger that alert. So, Shiny has been great for showing the different tradeoffs. For example, if we show a clinician a precision and recall curve, we can say, if you choose to implement at this threshold, then your alert is going to fire this many times. And out of all the times that somebody, for example, had sepsis, this is how many you would positively be able to detect. So, we can show them what their sensitivity looks like in their cohort. We can show them what their precision looks like. And that's been really important for making some implementation decisions, right?
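
A minimal sketch of the threshold tradeoff described above: for a chosen risk threshold, how many alerts would fire, what fraction of them would be true cases (precision), and what fraction of true cases would be caught (recall, or sensitivity). The scores and labels are hypothetical, and the function name is an assumption for illustration, not the team's dashboard code.

```python
def alert_tradeoff(scores, labels, threshold):
    """Summarize what alerting at `threshold` would look like on a cohort.

    scores: predicted risk scores in [0, 1]; labels: 1 if the event occurred.
    """
    alerts = [s >= threshold for s in scores]
    true_pos = sum(1 for fired, y in zip(alerts, labels) if fired and y == 1)
    n_alerts = sum(alerts)
    n_events = sum(labels)
    return {
        "alerts": n_alerts,
        "precision": true_pos / n_alerts if n_alerts else None,
        "recall": true_pos / n_events if n_events else None,
    }


# Hypothetical cohort of eight encounters, four of which had the event.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.75):
    print(threshold, alert_tradeoff(scores, labels, threshold))
```

On this toy data, raising the threshold from 0.5 to 0.75 halves the number of alerts (4 to 2) and raises precision (0.75 to 1.0), but recall drops from 0.75 to 0.5.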

So, yeah, I think our, what we call our model evaluation dashboards have been very instrumental in getting buy-in and explaining our models.

Favorite and most impactful projects

I love data visualization. So, I'd say one of my favorites was that behavioral mental health clinical effectiveness project and creating that dashboard that I was describing where we can sort of do this clinical effectiveness analysis quickly. Yeah, just because I had a lot of fun visualizing our cohort. There were a lot of assumptions there on, okay, somebody, you know, clinical data is complex or the hospital and healthcare visit data is complex. So, we had to create some custom visuals to basically show this is how we define our index visit for this cohort. This is how all their other visits look like. This is how they meet or don't meet the definitions. So, creating those custom plots was fun for me.

As far as like other projects, I think I'm excited for some of the new things we're doing with LLMs. So, we have currently a project where we're building a chart review system tool. So, we're using an LLM where people can, people can bring their data, bring their cohort, and we really underuse data in notes. So, so far we've said NLP is out of reach. NLP is too hard. There's quicker things we can do with structured data, but a lot of times there's a lot of rich information in notes that people try to get out in a structured way by doing manual chart reviews and putting things in data collection forms. So, we've been working on a tool that can basically ease that data abstraction burden and give people a first pass at here is information you can conversationally ask from a note to extract some structured data.

So, so far it's, the reason why I'm saying this is one of my favorite projects is people get pretty excited about this. No one likes chart reviews, so making people's lives easier has been, it's been satisfying.

There's another tool that we built. Again, this is more of a Shiny application. In healthcare quality, a lot of times you want to show whether your process is stable. And then if it's not, if there's some kind of special cause variation or some noise in some metric that you're looking at over time, you want to investigate why that's happening. And also, if you do an intervention and you're tracking your outcome measures, you want to be able to show that there was a change that occurred. So looking at that kind of data over time in statistical process control charts is something that's done a lot in healthcare quality. For those of you who are statisticians, I don't know, SPC charts are not really something that we focus on a lot. They're widely used in manufacturing, for example, but they're also used in healthcare quality improvement. So we created an application that helps people derive those charts.

And it's more than just simple data visualization. There's a lot of rules for when a process is considered to be out of control. So when you're measuring your process, you derive what's called a center line, which is the average value for, for example, time to getting antibiotics. If that average goes up or down, there's different rules that are generally accepted for what counts as special cause variation. So yeah, Shiny was a really great way to actually create an application to make these charts and enable people to do their own measurement for improvement for these quality improvement projects. I'd say that's probably one of our most used dashboards.
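
A minimal sketch of two widely used SPC rules on an individuals (XmR) chart, to make the idea of a center line and special cause variation concrete. The specific rules (a point beyond the control limits, and a run of eight consecutive points on one side of the center line) are common conventions chosen here as an assumption; the transcript does not say which rules the team's application implements. The times to antibiotics are hypothetical.

```python
from statistics import mean


def xmr_limits(baseline):
    """Center line and control limits for an individuals (XmR) chart.

    Limits are center +/- 2.66 * average moving range, the usual XmR constant.
    """
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def beyond_limits(values, lower, upper):
    """Indices of points outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def shift_runs(values, center, run_length=8):
    """Indices where a run of `run_length`+ points on one side of center is reached."""
    flagged, run, side = [], 0, 0
    for i, v in enumerate(values):
        current = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if current != 0 and current == side:
            run += 1
        else:
            side = current
            run = 1 if current != 0 else 0
        if run >= run_length:
            flagged.append(i)
    return flagged


# Hypothetical minutes to antibiotics: a stable baseline, then newer data.
baseline = [50, 52, 49, 51, 50, 48, 51, 49]
recent = [51, 52, 51, 53, 52, 51, 52, 53, 60]

center, lower, upper = xmr_limits(baseline)
print(center, round(lower, 1), round(upper, 1))
print(beyond_limits(recent, lower, upper), shift_runs(recent, center))
```

Here the last point falls above the upper limit, and the run rule fires once eight consecutive points sit above the baseline center line. Both are signals to investigate, not proof that a particular change caused the shift.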

Interpretability and black-box models

Yeah, I mean, I think explainability is very important when you're making critical decisions with these models. That being said, there's ways to go beyond logistic regressions and make your models explainable. So there's the more traditional ways of adding explainability to ML models with SHAP plots and things like this. I mean, there's techniques to kind of un-black-box some of the more complex models. Yeah. Even deep learning models. But I think it's also just a culture shift. My suspicion is that with the advancements in AI, there might be more willingness to embrace a black box approach, because even the people who build the LLMs aren't exactly able to trace what the billions of parameters add up to sometimes, right.

So I think what was true a few years ago with explainability and opting for simpler models is going to change a little bit. But yeah, I think the other part is having the kind of governance in place where people can trust that if there is a tool going in, it is being evaluated, and that you have both considered the workflow that people are using, so you're not just thinking that something is useful to predict because you've thought about it in isolation, but you've actually sat down with a clinician and seen what they're trying to accomplish and how this algorithm is going to be helpful to them, and then also having sufficient guardrails around evaluation and monitoring for something once it goes live. For any clinical model, we don't really deploy anything until it's been in this phase of silent evaluation. So something would just have to run silently in the background, and then we would see whether it actually works in production, because things can be tricky to deploy. You might think that it works well when you're training and testing your model, and then, yeah. So moving a little bit more carefully and having these processes in place is critical for buy-in.

Career advice

Yeah, I mean, I didn't think that I would end up in data science or technology in general. I think when I was starting my career in undergrad, I was an econ major. I was interested in research. I was interested in going more into maybe public policy. And then the first job I had kind of changed my mind around this, and I became very interested in public health. And then that led me more to disease ecology, and I was interested in antibiotic resistance. And then that led me to Atlanta, where I went to grad school for disease ecology. And again, still the focus was on public health. But yeah, long story short, you don't really know what you're going to be when you're in school, and you need to be open to your career kind of evolving as you go along. And mentors are very critical in that. So I think the reason why I went down all these paths was because I had some inspiring mentors, both my grad school advisors and the various bosses that I had through my career. So the other advice is, yeah, be open, and also don't burn bridges with these people, because they will be critical for a lot of your next steps.

At Children's as well, I actually came here after grad school. I left, and I worked in other industries and other companies for some time, and then I came back. So again, maintaining those ties with people that you know are part of your growth and your development is going to be rewarding later.

Build versus buy: Epic AI and vendor models

No, great question. And also, that Hangout was awesome. It was awesome. I really love having some of these healthcare discussions, because yeah, I mean, our field is special, but I think all fields are special, but yeah, but healthcare specifically and vendor models, like, we do both. It's often great to start with something that's already available to you. So a lot of these models that are out of the box in Epic are maybe the first pass. In pediatrics, in our case, and also in Matt's case, and in cancer, like, maybe there just aren't enough that are specific to your population. So most of the Epic models tend to be on adults, and maybe they've done some stuff for pediatrics. So we have to build a lot of this on our own. So there's a lot of considerations around, you know, what are the use cases that you want to pick to develop on your own? Because with a small team and constrained resources, there's always infinite demand for some of these things.

But yeah, I'd say we do a good blend of trying something that's maybe going to work, seeing if it does, and then if not, we go and we customize. We also have some research partnerships with institutions like Georgia Tech or Emory. So there's a kind of a third outside, or a third way to get stuff in for us, and that's some of these collaborations we have.

Getting buy-in for Posit Connect

Well, there's no option for free Shiny for us because we're in an enterprise. So to be able to actually deploy some of these things, they really need to be on a kind of vetted application that IS&T supports. So I think just the nature of doing things in an institution of a certain size, you really don't have the option of going with a free open-source version. And of course, we are using a lot of open-source software and packages, but for the critical parts of the scaffolding around those, Posit Connect has been amazing, and all the Posit products are basically the stack that we lean on the most for my team.

And it always takes some kind of proof of concept to show people. So having the open-source tools is great to get it in the door. It was years ago that we built our first Shiny app, and it was for analyzing U.S. News & World Report rankings. Hospitals care a lot about those. So that got leadership attention. That was one where it was like, oh, you know what, we can't really do these in Qlik Sense or in, I think we had SAS Visual Analytics at the time. So yeah, sometimes you have to make the case for it and put the extra time in.

Transitioning from public health to data science

Yeah, that's a great question. I mean, yeah, so many angles to start from, but I think from like kind of like the quantitative skills, like number one, like if you're in epidemiology, there's a lot of kind of rigor around working with data and with numbers that directly can be leveraged in a data science career. Particularly when I started, or sorry, when I moved to, you know, more from research to data science, there really weren't that many programs that would formally train you in, like, and people were getting an MIS degree or like Master's in Information Science and so on. But there was quite a bit of openness to coming from different fields. And I think for public health, the thinking around like, you know, scientific thinking and also biostats and epi kind of quantitative rigor lends itself well to the biggest, I think that's one of the steepest parts of the curve is like learning these tools, or sorry, learning the methods.

As far as like programming, kind of the other big part of data science is like, can you turn your analyses into something actionable and something that is repeatable? Again, I think compared to a decade ago, like using tools like Quarto or Markdown is kind of a basic starting point. And they're great for research, but they also have kind of the basic premise of making sure your code is reproducible. So just scripting to do statistical analysis is one thing, making something truly repeatable, both like for replicating your findings, but also to translate to your other projects is a big part of data science, right?

And then as far as actually transitioning in terms of culture shift, I think healthcare was a great place for me at that point, because there is a lot of understanding. You're still working with people that have been through the biosciences. So I find it easier to talk to people that are clinical if you're coming from public health, because they have a lot of understanding, respect, and similar foundational knowledge.

And then, you know, if you go further, I think the trade off is like, if you go more into technology, for example, once I moved out of healthcare, I went into more like, like more the insurance side of things, or it was more payer, you get kind of a deeper understanding of technology there, because they have different data challenges. So I think that was important, but you lose some of that connection with people around, you know, the way of thinking, the kind of shared mental models around, you know, hypothesis testing, for example, or just the mission and values appreciation as well. But you learn a lot of other things. Always a trade off.

And the other part, I would say, is that LLMs have made it much easier with coding, also to create production-grade code, or maybe not quite production-grade code, but to cross that barrier of "I don't know how to program." So maybe that looks very different now than for someone making that call 10 years ago to go into data science.

Python versus R for model deployment

Yeah. So one of the reasons is that to deploy some of the stuff in Epic, it has to be in Python. There's just no supported R runtime. The other reason is, I think broadly there's a lot of tools out there that maybe offer better support for Python, MLflow being one example. MLflow is a tool to log your experiments and track them. They do have an R API, but the Python one is always a step ahead. So that matters for a lot of the engineering work. Talent is also another consideration. There are just more people in Python to draw from. And some of that hurts me because I personally love doing the data wrangling and analysis and visualization in R. But yeah. I also think you can do things in any language, and you just have to have a team that is enabled to do their best in both, or in whatever works for them. So for us, it happens to be Python for the ML work.

Agentic AI and coding assistants

Short answer: it's been more limited, in part because a lot of our environment, when it comes to the data that's loaded, can be sensitive. So we've been pretty careful with how we use coding assistants. Broadly, though, coding assistants, like code completion and just being able to ask questions in our HIPAA-compliant instances, have been a huge productivity boost. But the agentic coding, where it will just do more things with your code, is something we're just beginning to tap into.