Data for the People!

Taka Ariga on Using AI to Combat Waste, Fraud, and Abuse in Public Programs

Data Foundation Season 1 Episode 7


In the latest episode of Data for the People!, Data Foundation Senior Fellow Taka Ariga discusses the potential of artificial intelligence (AI) to enhance government efforts to prevent fraud and other types of improper payments in public benefit programs.

Ariga was the first Chief Data Scientist at the Government Accountability Office (GAO) where he also directed GAO’s Innovation Lab. He later served as the Chief Data Officer (CDO) and Chief Artificial Intelligence Officer at the U.S. Office of Personnel Management. He is now a Senior Fellow with the Data Foundation and the founder of Sol Imagination, an AI advisory company.

Amanda Cash, the Senior Director of the Data Foundation’s Center for Data Policy, joins the episode as a co-host.

Learn more about the Data Foundation/Deloitte report discussed on the episode, Navigating Transition and Change: 2025 Survey of Federal Chief Data Officers



Want to be part of a national community that promotes policies that enable government data to be high-quality, accessible, and usable? Join our Data Coalition: https://datafoundation.org/pages/join-the-data-coalition

The Data Foundation is a 501(c)(3) nonprofit, nonpartisan think tank. All contributions may be tax deductible. We appreciate all charitable contributions towards fulfilling our mission to make democratic society better for everyone by championing the use of open data and evidence-informed public policy. Donate: https://datafoundation.org/supportus

Follow the Data Foundation on LinkedIn: http://www.linkedin.com/company/datafoundation

SPEAKER_02

Hey listeners, this is JB, host of Data for the People. Just a quick note before we get started. This episode was recorded in late March. We ended up holding it so we could post an episode about climate data around the time of DC Climate Week and Earth Day. Almost everything else in the interview is still 100% timely. But if you're wondering why, in an episode coming out in May, we're talking about the cherry blossoms blooming, that is why. It was, at the time, peak cherry blossom bloom in Washington, D.C. Okay, with that, I hope you enjoy the episode. All right. Welcome back to Data for the People, a podcast from the Data Foundation. I'm your host, JB Wogan, and I'm joined by my Data Foundation colleague, Amanda Cash, who directs our Center for Data Policy. Welcome, Amanda.

SPEAKER_01

Hi, JB. Hi, Taka.

SPEAKER_02

Our guest for this episode, as you just referenced, Amanda, is Taka Ariga, a senior fellow with the Data Foundation and the founder of Sol Imagination, an AI advisory company. Welcome, Taka.

SPEAKER_00

So excited to be with you and Amanda, especially at the cusp of peak bloom for the cherry blossoms, which is one of my favorite seasons of the year.

SPEAKER_02

Yes.

SPEAKER_00

Mine too.

SPEAKER_02

Beautiful time of year. Tough time for those of us with seasonal allergies, but I'll get over it. Taka, as some of our listeners may know, you were appointed to be the first chief data scientist at the Government Accountability Office, or GAO, during the first Trump administration. And you later served as the chief data officer and chief artificial intelligence officer at the U.S. Office of Personnel Management, all the way through the first few months of the second Trump administration. So I thought maybe we would start by asking: after spending so much of your career in the private sector, what was the pitch that convinced you to work in the federal government? And then once you were there, what motivated you to stay for as long as you did?

SPEAKER_00

Yeah, I have to say 2019 was an interesting confluence of both my professional aspiration and also an opportunity that showed up at the door. At the time, I have to say I was getting somewhat disillusioned with being part of the contracting workforce because you usually only work on a piece of the puzzle. And then there is that treadmill of utilization and revenue expectation that never goes away. So when GAO, which unusually engaged an external recruiter, knocked on my door to broaden their candidate pipeline, I was somewhat skeptical, right? Because having worked at an assurance practice, I knew firsthand how skeptical auditors can be. And so it wasn't clear to me when we talk about innovation or at least leading the innovation lab, are we talking about a PowerPoint slogan? Or is this a genuine attempt at moving the needle on how oversight might benefit from the use of emerging technology? And what ultimately sold me was really the commitment from the senior leadership of GAO to say this really is an opportunity for GAO to build its own muscle memory so that we're not doing oversight based on theoretical and research content, but based on the perspective of having, for example, implemented AI solution or worked on autonomous drone or virtual reality kind of technology, and so that we can really infuse that practitioner's point of view to not only developing our own solution to increase the sort of oversight capacity, but also how we might conduct evaluation as we encounter these technology across various agencies. So it was certainly a sort of once-in-a-life opportunity to build a startup organization within a government, right? And I think more importantly, I was given the permission, and by that I mean having the people, the money, and the space, to really use emerging technology as a catalyst to reimagine what oversight can be. 
As you both probably are very keenly aware, government is just layers upon layers of analog processes that we just kind of build on top of each other. And emerging technology can certainly be used to automate a lot of those functions. But I think it's more critical for us to really think about how we can reimagine what those processes should have been now that we have digital capabilities, and GAO offered that sort of permission structure to really think about how audits can be done more effectively, more efficiently. And we were exploring the idea of, you know, how do we augment the independence of an agency like GAO so that we're not always reliant on agency-provided information? So, for example, if you have an aerial drone, you might be able to independently assess forestry management or emergency and disaster preparedness, things like that, that really augment an agency's oversight capacity. So it was an incredible five-year run that started the Innovation Lab with, I have to say, questionable existence, like any startup organization might have, to really become, I think, an authoritative incubator of best practices and solution development, at least within the legislative branch agencies. And throughout my tenure, I had a chance to work on GAO's AI accountability framework, which is slightly different from the NIST framework in that we take a trust-but-verify approach. So tell us how your governance actually drives towards accountability. Show us evidence that you have evaluated potential bias in your AI responses. Don't just tell us that your AI is accountable. And so it was the most fascinating, I think, five years of my professional career. And I, you know, I would have been kicking myself had I not taken that opportunity to really support GAO and then support the broader oversight community. Fast forward to 2024, I had a chance to join OPM, which at the time was embarking on, I think, a once-in-a-generation type of modernization effort.
And the goal was really to become a data-driven, evidence-based champion for the federal workforce. And that was really important to me because as we sort of embark on these digital journeys, I actually think that technology is probably not the most important consideration. It's the people behind those technologies that will end up influencing how we use them and the extent to which we use them, and who can, you know, be an exemplar of what best practices might be. So really, it's about the people. And that particular mission really attracted me to OPM. So that was sort of a heartbreaking decision, to leave GAO and join the Office of Personnel Management.

SPEAKER_02

When you first made the leap into the public sector, did you have a number in your head for how long you would stay? I mean, were you thinking this is a lark? I'll be here for a year, we'll see how it goes, or did you know I'm gonna try and stick this out for five years, three years? And similarly, did you think this is gonna be my career now? I'm going to be in the public sector. I may migrate to other agencies that need similar help, or was the switch to OPM not quite as big of a leap as going from the private sector to GAO to the federal government, but another big risk, another big leap for you?

unknown

Yeah.

SPEAKER_00

I didn't specifically have the concept of tenure in mind. To me, what was important was the level of impact that I am able to deliver. And that was an important test for me professionally, to say, if I were to lead the Innovation Lab, can I actually move the needle beyond some of these initial hypes, whether it's augmented reality, whether it's AI, and really be able to sort of impact how governmental agencies deliver their mission? And that was the part that was most attractive to me. And you know, part of that is certainly, you know, there's a technical issue, there's governance issues, there's an operational issue that you have to be concerned with, but there's also the policy nexus, right, collaboration across different stakeholders, and, you know, there are, I think, very salient questions around how we move from prototypes to scalable solutions, which is a much different consideration. So to me, it was, you know, less about the number of years; I really wanted at that time to see what impact I am able to deliver on behalf of an agency like GAO. And that was more important to me. And I often say, rather than focus on tenure, if an agency doesn't like what I do and how I do it, I am more than happy to be accountable and they can let me go. Right? I'm not the kind of person that necessarily latches onto the status quo and how I might be able to stay for longevity. I'm more interested in how we can think differently, within reason, of course, and how we might be able to move the needle in a way that really, I think, flips the script around how governmental entities are always lagging behind in the adoption of emerging technology. I think AI is a good example where it is the public sector that is really marching forward with an idea of how accountability and innovation can go side by side. Because we're not in the business of monetizing data, right?
We have to provide a level of privacy and security to everything that we do, and governmental entity cannot rely on the best hits of what the internet has to offer in terms of how we do policy decision or operational decision. So that adds a level of care that I think is appropriate. The trick is how do we make sure that governance, that accountability move at the same speed as innovation? And I really think it is the public sector that is driving that conversation.

SPEAKER_01

Well, Taka, it sounds like you were driven in part by mission, impact, and innovation, I'll say, which are three things that I loved about being in the public sector. I know a lot of people don't put innovation together with the public sector, but I think there's a lot of innovation that happens, and people are incredibly mission-driven. And that's why they choose the public sector. Speaking of all of those things, there was recently a government efficiency conference, organized with several partner organizations, and two recurring themes throughout the day were AI applications in government and addressing waste, fraud, and abuse. So I know you've talked about this a lot. And I think we were wondering, if you were to combine the two, how can government use AI to curb waste, fraud, and abuse in government programs? And what do you see as the biggest opportunities, and what obstacles or pitfalls stand in the way of these kinds of innovations?

SPEAKER_00

Yeah, I mean, as we all are keenly aware, fraud, waste, and abuse, and other forms of improper payments have dogged the government at the federal, state, and local level for, I don't know, eternity. Right. But I do think one of the biggest misconceptions around what AI can do in combating fraud, waste, and abuse is the notion that there is "the AI" that we can deploy to solve the fraud issue, right? And if any vendor ever shows up at my doorstep advocating for that kind of automagical thinking, I tend to walk the other way because it doesn't work that way. What I do see is a lot of focused, specific applications of AI across the life cycle of assessment, detection, investigation, and prevention where AI can really make significant impacts, right? So you certainly have significant use cases around identity verification. Validating that you are who you say you are, I think, is an important part of that mitigation strategy. And a lot of the time facial recognition is entirely powered by AI. We're familiar with TSA checkpoints, we're familiar with how we might log into the IRS through ID.me and Login.gov kinds of tools. And I think those kinds of capabilities have played a significant role in mitigating potential identity-related issues. You certainly see a lot of AI in the document review and investigative process. When I was growing up in a professional services firm, document review was the bane of existence for any team, right? Because this is a very manual, curated process to identify an email or piece of documentation that is relevant to your investigation, to prove intent or to prove motive. And even at the time, with technology-assisted review, it was a very clunky process to sample relevant documents and then apply that learning to a set of algorithms to then apply to the rest of the information. But AI has fundamentally transformed not only the process of document review, but the entire business model in the legal and professional services domain.
And that is a wonderful way for us to really cut down on the timeline and the resources required to conduct complex investigations. And you certainly have a lot of discussions around how we use machine learning to identify transactional anomalies. So this is where I think algorithms are really good at highlighting things that don't quite pass the smell test, right? And so I think we're really living in a sort of exciting, what I call, algorithmic renaissance, where AI can potentially really impact how we see the issues of fraud, waste, and abuse, as long as we don't think that there is "the AI" to solve the fraud issue. One of the use cases that I'm most excited about is something that I presented recently at an ACFE conference in Nashville, and that's really how AI might revolutionize the way that we assess for fraud risks. So assessing for fraud risks is something that auditors have to do. PCAOB advocated for it, and the problem is that assessing fraud risks traditionally has not been something that auditors are really good at. And I can certainly understand why, right? In order to assess fraud risks, you're asking auditors to think about bad things that will happen in the future. And if you're a nice person, you generally try not to focus entirely on what might happen in the future that may be catastrophic. And then the notion of what might happen in the future is also an abstract concept. There's no certainty that something will happen. And then when it comes to fraud, waste, and abuse, you do need to anchor yourself on some level of experience, having seen actual fraud happen, so that you can exercise that professional skepticism to say what might happen in the future. So this is a problem, certainly for inexperienced auditors, to think about what some of the fraud risks might be. But as it turns out, it also is an issue that impacts experienced auditors.
So I came across a very interesting research paper that was published back in, I think, March of 2025, in which they conducted an experiment. They took a group of experienced auditors, gave half of them access to Gen AI and half of them not, and asked them to develop fraud risks based on a known fraud case. And the finding was actually very shocking. The group of experienced auditors that had access to Gen AI was able to identify twice as many fraud risks and twice as many highly relevant fraud risks compared to the experienced auditors that identified those risks in a sort of traditional manual way. And they were able to do that work 30% faster compared to the group that didn't have access to generative AI. So I think that finding alone was sort of, you know, shocking in its own right, but they did the same experiment with a group of inexperienced auditors, so basically interns with one month of experience. And what they found was that an inexperienced auditor with access to Gen AI can actually outperform an experienced auditor without Gen AI. Now, what that suggests is that AI can potentially close the experience gap, given that we are going through a cycle of shortage of auditors and accountants. So I think that is also exciting. But they went even further, because a lot of the time when we talk about AI, there's a presumption that we must have a human in the loop. And I definitely understand that particular impulse. But they kind of tested that parameter to say, well, you know, if we allow experienced auditors to use Gen AI, that is more of a curated experience, because they might iterate on different prompts, they might iterate on different kinds of responses. But what if we just let AI loose on its own? What they found out preliminarily was that having AI identify fraud risks on its own was not materially different from, let's say, a curated experience of using AI.
So then that poses the question: well, do we always need a human in the loop on something potentially like fraud risk assessment? Right? I absolutely agree, when it comes to high-consequence stuff like the application of AI in war fighting or a national security issue, you absolutely want to be mindful of the consequences. But when it comes to an activity like fraud risk assessment, maybe there is a sort of balance in which AI can do its work and then humans can focus their energy on, for example, adapting an audit plan based on those emergent fraud risks. So I think this is a very, perhaps, wonky application of how AI can be used in the life cycle of fraud, waste, and abuse. But it's an exciting, I think, capability for auditors and accountants to embrace. Now, I don't think this is an example of how AI might replace all of us. There's always that sort of narrative floating behind the scenes. But I do think this is an example of how our immediate future will be demarcated by accountants and auditors that choose to embrace AI versus those that don't, as with any inflection point for technology.

SPEAKER_02

Yeah, that's a fascinating example. And I hope the authors of that paper have expressed gratitude. I know you also, not only you're describing the findings from that paper here on our podcast, but also if folks go to your LinkedIn account, you have a nice summary there and explain kind of what the implications are, not just what the findings are, but what your takeaways are. I was wondering if, based on your experience either in the federal government or through your consulting work, can you think of any instructive examples where agencies have tried to use AI to fight fraud and have had varying degrees of success? What are the things that have worked and what are the things maybe that didn't work the way they had hoped?

SPEAKER_00

Yeah. I think one of the timely use cases here is how AI might be applied to look for transactional anomalies within healthcare claims data. There are certainly a lot of headlines around benefits programs that are susceptible to fraud one way or the other. And I think this is a topic that is gaining a lot of traction. The problem is healthcare is a complex business, right? And just looking at the claims data, you know, a lot of the time healthcare decisions are very individualized: the kind of prescription drug, the kind of services that you get. And I think perhaps the most challenging part is that fraud inherently is rare. ACFE, I think, has published a statistic to say up to 5% of revenue usually is lost to fraud. Well, on the flip side of that, that means 95% of the transactions usually are okay. So now you're living within this ecosystem where the fraud is inherently rare, but also fraud carries a very specific legal standard. In other words, it's not fraud until a jury in a court of law says it is fraud, and then you go through the legal process. Up till that point, it is possibly improper payment, it is possibly waste, it's possibly abuse. And so, from that perspective, applying something like supervised machine learning techniques is very difficult because, first of all, you don't have the population of true fraud that you can necessarily learn from. And then you have this population of transactions that doesn't quite rise to the standard of what a legal definition of fraud might look like. And nobody is really tracking those gray zones of anomalous transactions. And so we're starting to use something like an unsupervised technique that doesn't necessarily rely on the existence of true fraud to learn from, to say, can we apply unsupervised machine learning techniques to identify anomalies in general? But we also have to be very cautious in that approach because, for example, the cost of a false positive is very high.
You can't just say, oh, I have this fantastical algorithm that identifies a bunch of red flags, and you end up sending investigators everywhere to investigate things that may not turn out to be an issue. And that in and of itself is not a wise use of resources, right? Conversely, on the false negative side, you don't want to end up creating algorithms that don't detect the things they're supposed to be detecting. So, in a fraud, waste, and abuse context, the costs of errors are very high. Because if you're wrong, somebody may not get the benefit that he or she deserves. If you get it wrong, the investigator is now spending a lot of time chasing down, you know, smoke that may not necessarily exist. And so it is very difficult to use machine learning in a way that identifies the kind of transactional anomaly that ultimately leads to some adjudication of fraud, waste, and abuse. But I think a lot of experimentation, a lot of research continues to happen, because this is where the focus is in terms of how we are driving the accountability conversation within the public sector. So I am actually very excited to work with a number of teams on this particular notion, whether it's healthcare, whether it's social benefits, and how we can think of applications of machine learning to highlight the right set of anomalies to really drive that accountability conversation. But we have a long way to go.
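To make the false-positive concern above concrete, here is a minimal arithmetic sketch in Python. It is not taken from the episode. The 5% prevalence treats the "up to 5%" figure mentioned in the conversation as a share of transactions, which is a simplifying assumption, and the recall and false-positive rate are hypothetical values chosen only for illustration.

```python
def flag_precision(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Return the share of flagged transactions that are truly improper.

    prevalence: fraction of all transactions that are truly improper
    recall: fraction of improper transactions the detector flags
    false_positive_rate: fraction of legitimate transactions the detector flags
    """
    true_flags = prevalence * recall
    false_flags = (1.0 - prevalence) * false_positive_rate
    return true_flags / (true_flags + false_flags)


# Hypothetical inputs (assumptions, not measurements from any real program).
precision = flag_precision(prevalence=0.05, recall=0.90, false_positive_rate=0.05)
print(f"Share of flags that point to a real problem: {precision:.1%}")
```

Under these assumed numbers, fewer than half of the flags point to a real problem, even though the detector catches 90% of improper transactions and wrongly flags only 5% of legitimate ones. That is the base-rate effect behind sending investigators after things that turn out not to be an issue.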

SPEAKER_02

Yeah, sounds like some exciting prospects, but also some risks out there we need to be careful of. Hey, you mentioned a couple of acronyms that I wasn't familiar with, and maybe some of our listeners wouldn't be either. So I just want to clarify those. I think you mentioned ACFE twice, right?

SPEAKER_00

Yeah. Association of Certified Fraud Examiners. Similar to the Association of Government Accountants, they're the advocacy group that really talks about fraud prevention, fraud detection, fraud monitoring. I think they're the ones that also came up with the famous fraud triangle that talks about opportunity, incentive, and rationalization as to why people behave the way they do. And it's been an exemplar of how anti-fraud professionals really think about the structure, the methods, and the techniques that we apply to deal with fraud, waste, and abuse. And they have multiple chapters across the country and at the international level. And it's really a community of practitioners that get together and think about what the latest and greatest best practices might be.

SPEAKER_02

Okay, great. And then they hosted the fraud conference that you spoke at in Nashville not too long ago. You also mentioned another acronym, PCAOB.

SPEAKER_00

Yeah, yeah. It's the regulator that is affiliated with SEC that regulates all of the commercial audit firms out there in terms of how they should be conducting audits, the quality of the audit, and they do sort of enforcement action when audits are not uncovering the stuff that they're supposed to be uncovering.

SPEAKER_02

Okay, great. Yeah. PCAOB, Public Company Accounting Oversight Board.

SPEAKER_00

That's right. Yeah. That came out of the Sarbanes-Oxley reform era. Okay, great.

SPEAKER_01

Okay. So sticking with the subject of AI, but moving on a little bit from what you've been talking about, we have to talk briefly about the day you played cello for a crowd of AI and government data nerds. You volunteered to perform at an event with the Data Foundation as part of a series of events on AI and culture. I think we're also going to drop in a clip so listeners can hear you warming up. What ideas were you exploring? And are there any takeaways for tasks that AI can do as well as or better than us as humans? And are there tasks that are still better suited for humans, and in-between scenarios?

SPEAKER_00

You know, to be perfectly honest, when Nick first asked me about that particular event, I was dreading it, right? As someone who studied music at a conservatory level, this is a particular area I try to avoid, to say, look, I spent most of my life practicing my art, and even though I spend my time advocating for the good use of AI, this is not an area that I want to touch because, God forbid, AI becomes better than, you know, what I can do on a personal level. So it was a very uncomfortable topic. But it turned out to be a very interesting exploration. The good news is that after having done that talk, what I discovered is that AI can't really deal with, for example, concepts of intent, interpretation, spontaneity, and even, you know, abstract concepts like how do I establish an emotional connection with an audience, right? At least in the current AI architecture, that is not within the realm of possibility. So, sigh of relief on that particular path. But I do think what I learned from that whole exercise is that what AI is really superb at is giving you the scaffolding for the ideas that you have in your head. And so I experimented a ton to say, well, since you're so good at what you do, can you turn a contrapuntal theme into the style of jazz? Or, with just a couple of prompts, can you create a song in the style of a lullaby but in a minor key? Now, if I were to do any of that stuff, rather than prompting, I would have had to have a keyboard, I would have had to experiment with it, I would have had to know the style of jazz, and I would have had to know the Baroque style of music to even be able to make that transcription. And assuming I can actually do it, that is probably a multi-day, multi-week, if not multi-month exercise to come up with some iteration that is passable. With AI, three seconds. That's the scary part, right? And so it actually understands the kind of tonality, the kind of scale, the kind of cadence required between different styles of music.
It was able to create snippets of a music file very quickly, trying to realize the idea that I had typed into the prompts. Another example of this is, I don't know whether you both recall, but back in, I think, spring of 2025, there was this sort of movement around people creating their own action figure using ChatGPT. And that was everywhere, like Barbie dolls, Ken dolls of individuals. And so I jumped on the bandwagon. I was giving a speech, and it's like, you know what, I'm gonna try this myself. How can I create an action figure? And I gave it some prompts to say, look, the title of my speech is The Confession of a Heretical Technologist. Can you create an action figure that kind of represents that? Now, within, I will say, less than three minutes of iterating prompts here and there, it did a very passable job of an action figure with maybe an accessory of a laptop and a coffee in my hand and a virtual reality goggle as just a sort of added touch, and it kind of resembled an action figure of me. Now, if I had to do that using Adobe Illustrator, assuming I actually know how to use Adobe Illustrator, again, this would have been a multi-week, multi-month effort trying to figure out, well, how do I turn a photo into more of an action figure, skin tone, that kind of thing? How do I get the right light, the right color palette, things like that? At least for me, I don't have the capability to do so. But AI was able to generate a couple of iterations of that in a matter of minutes. So while I'm gratified that AI is not going to replace intentionality, interpretation, or spontaneity anytime soon, it is very productive for me to have a tool that can just synthesize what I have in mind in almost a real-time kind of prototype. And I think that will generate new pathways for creativity. And so it's almost like a symbiotic relationship to say, okay, we're not getting replaced. That's good. That's question number one.
But how can we use AI in a way that is beneficial to the overall artistic endeavor? And that's the part I'm excited about. But I'll say I probably won't be doing that kind of activity anytime soon.

SPEAKER_02

One of the things that stood out to me, Taka, as you're talking, is that there are certain tasks where I know I could use AI to move them faster, but they're tasks that I have a great sense of ownership of and I enjoy doing, and I'd rather take a little extra time to have my personal imprint on them. One of them right now is I still write the blogs that summarize these podcast episodes, even though we could get a decent version from Claude or ChatGPT pretty quickly. And it also reminds me, this isn't exactly an AI example, but last summer there was an experience I had with my son when we were going to the library and we got up to the front to check out his books. I mean the whole stack. And he likes to use the little scanner gun and go book by book to scan the barcode for each one. He likes lining it up and he likes hearing the little beep beep. And the librarian at one point intervened and said, you know, you can just put all of the books on this pad that we have, and it will automatically read all of them at once, and you don't have to do the scanning. And I had to explain, he doesn't want to, he doesn't get to do the beep, beep, beep then. So yeah, it takes like three extra minutes to do it this way, maybe not even, maybe like an extra minute. But that is a case where automation is not better, at least in my son's eyes.

SPEAKER_00

Yeah. I mean, another, I guess, very salient example that I have of where AI is not always better is resume review. Right? I come across all of this AI slop in resume creation, where you already know, one, this is AI-created, and so you're already casting some doubt on the credibility of individual candidates. Can you actually do the work? Or is it just that, based on what ChatGPT said, you might be able to do the work? And it's actually creating more work for me as an evaluator to say, do I even bother to read any of this AI-generated stuff? Because I have no foundation for the credibility of that individual. Now, if somebody can thoughtfully use AI to rewrite a specific narrative and add additional evidence to support their capability, that is a plus. But I have to say those kinds of resumes are few and far between. And now I'm inundated with, you know, AI-created resumes to the point that I almost want to just throw up my hands and say, I can't deal with this anymore. So I think this is an example of where that efficiency is creating a little bit of an unintended consequence, where you're starting to have AI pitted against AI. So, well, how do I use AI to generate a resume? And then how do I use AI to identify an AI-generated resume? And there's just no end in sight.

SPEAKER_02

I want to pivot to something that actually just happened this week, the week that we're recording. The Data Foundation and Deloitte just released a survey of federal chief data officers, a role that you held not too long ago. And so I wanted to ask you about one of the findings. In 2025, which was the survey period, 30% of federal chief data officers reported that they also served as their agency's chief artificial intelligence officer, and that was up from 13% in the same survey the year before. So 13% to 30% in a year-to-year timeframe. Since you had this exact experience of wearing both hats in one agency, what do you make of this trend? I mean, is it a good thing for chief data officers who are so focused on data, data governance, and data quality to be in the driver's seat for agencies' AI strategy and AI use? Or do you think it's asking too much of the same individuals? A little bit of both? How do you think about that finding?

SPEAKER_00

Yeah, I wholeheartedly advocate for the stance that a public sector organization is not AI ready if it is not data ready, right? Because as a governmental entity, we rely so much on non-public information, whether that's draft policy, whether that's personnel evaluation, whether that's, you know, identifying workforce data or other transactional data. We're not in the business of monetizing information. So not only are we thinking about how we build that context in the data, but we have to think about data loss. We have to think about privacy, we have to think about a bunch of other considerations that are not entirely compatible with the best hits of what the Reddits, the Wikipedias, and the interwebs can offer out of the box. So I think I mentioned at the beginning of this podcast that technology is probably, to me, more of a secondary or tertiary consideration when it comes to public sector use of AI. I will put data on top of that. Now, you know, I understand why certain positions are double-hatted, maybe for budgetary reasons, maybe for headcount reasons. To me, I think it's less about how many hats you wear, and more about whether the CDOs are actually empowered to shift that focus from technology to data, right? I'll use the example of AI.gov, where GSA has created an exemplar chatbot that all agencies can adopt, which is, I think, a great thing, right? I do believe the federal government should be one buyer as opposed to thousands of different buyers. I think that's a great thing. But if you think about it, well, one chatbot template doesn't mean it works as well for IRS versus GSA versus the Department of Transportation or the Department of Homeland Security. And this is where the CDOs come in to say what are the right guardrails, what are the right use cases that need to be in place so that we are not violating privacy or civil liberty or just the sovereignty of governmental data, whether intentionally or unintentionally.
And so I think this is where the CDOs take a really profound role in impacting whether an AI adoption journey is successful or not. Right. And you know, I think by now enough organizations have played with AI to know that this is not just shoving all your agency data into an LLM and expecting some sort of clarity. As a CDO, you now have, I think, new considerations that may not have existed a couple of years ago. So, for example, context engineering, that's a thing, right? Data loss prevention, that is increasingly becoming a thing, because there's that inherent conflict: all of the foundational LLMs out there have a voracious appetite to vacuum up as much data as possible, whereas as a governmental entity, we have a need to protect that data so that it doesn't end up being used for monetization purposes or some other inappropriate purposes. And so there's that inherent conflict that I think CDOs are well suited to manage, to make sure that, first of all, for example, we're not uploading, I don't know, draft policy into ChatGPT, even if it's a paid and protected version of ChatGPT. Because once that data leaves the boundary of your organization, it's gone. And do you really want to rest the reputation of your agency on a set of terms and conditions in a contract? Personally, I wouldn't and I didn't. So we worked with the CIO shop to make sure that the technology boundary is such that we create a safe space within the agency for folks to experiment, to explore, but the boundary is hardened in such a way that there is no possibility of leakage of that information, even if it's an unintentional use. So it is, I think, a cross-collaboration, it's a partnership to address the, I think, increasingly horizontal nature of the risks that AI poses. So it's no longer just about data silos. You know, we often talk about data silos in a very vertically oriented way. That is still a challenge for sure.
But we need to think about how we then transcend those silos to deal with context engineering, to deal with data loss, to deal with interoperability of various data assets in a privacy-protected, sort of post-quantum encryption type of consideration. So, how do we do it appropriately? And I think that's part of the secret sauce of governance, to say, how do you do that at the speed of innovation so that we don't end up in a position where we have to say, oops.

SPEAKER_01

Taka, I just want to say there are two things I'm gonna take away from this podcast today that you've said. One is auto-magical. I love that. The second is if you're not data ready, you're not AI ready, which I feel like should be on t-shirts everywhere. So continuing on the theme of data readiness and data governance, especially in the public sector, because you're a recognized expert: Are there any questions you wish that people were asking about data governance that get lost in the current discourse, which I feel is probably really easy to do? And what should listeners be thinking about when it comes to data governance in their own agencies or organizations?

SPEAKER_00

Yeah, when it comes to data governance, the boring questions are probably some of the most important questions. Yeah. Right. And amazingly, when I talk with data engineers, data scientists, and even data executives, one of the most difficult questions for them to answer is, how do you empirically know how reliable your data is? You know, one of the benefits of having gone through GAO in my tenure is that at GAO, you are really instilled with the concept that unless the evidence and the data are reliable, the audit does not move forward. So if that's an expectation of oversight, why shouldn't we have the same expectation of algorithms? And there are certainly techniques that could be applied to identify whether your data is complete, is accurate, is timely, is unique, right? There are many, many different dimensions, but I don't see enough of that focus to say, well, how do we know the data is reliable? Otherwise, for all of the downstream issues around context engineering or even inventory of that data, there is no ground for trust or for confidence. And one of the reasons GAO as an institution never relies on data being posted on USAspending.gov or data.gov or any other government repository is that we just don't have that grounding of data reliability. We don't have the necessary metadata to say when was this thing pulled, how was it pulled, was it pulled in an appropriate way, who filtered what along the way. And so we still go through the exercise of making very bespoke data requests that often may seem duplicative of data sets that have already been posted on a publicly available site. But that's the reason that we are so careful around the question of data reliability. And I think that is one concept that should continue to permeate across all the agencies, to say, yeah, technology is great, LLMs are great, MCPs are great, there are a lot of different changes in architecture, there's a new model that comes out every other day.
But the boring question of how reliable your data is will have so much impact in terms of the quality of the output, the value that you end up generating. I mentioned, you know, the cost of false positives and false negatives. That largely has to do with the quality of the information that you're assessing. Context engineering is no longer just about retrieval-augmented generation. There are a lot of evolving techniques that are being adopted, whether it's knowledge graphs or some other techniques, and those are all new dimensions of data governance beyond just mapping out the metadata. So I do think, you know, version 1.0 of the Evidence Act really made significant progress on accessibility of governmental data. I do think the next chapter of that is really thinking about high-quality data and also well-governed data, in such a way that we can drive increasing confidence and trust in what gets generated from government use of AI. You know, a dear friend of mine once told me AI adoption will only happen at the speed of confidence, not at the speed of innovation. And I think that is true, right? We have to have a certain amount of trust behind what the chatbots are telling us, whether that's public facing or internal facing, and you can only achieve that through really robust data governance, by answering some of these boring questions around how reliable your data is.

SPEAKER_01

Yeah, I just want to say the old adage, garbage in, garbage out, still holds up for AI and data quality. And I'm just not sure people are thinking about it enough as they're implementing AI in whatever organization they're in.

SPEAKER_00

So yeah, somebody once jokingly told me that, you know, we spend all this effort training the brightest data scientists out there so that they can create the most compelling clickbait for Silicon Valley. And there's some truth in that, right? Like, you know, social media companies have really, I think, almost perfected the use of algorithms, whether it's TikTok, Instagram, or whatnot, to make sure that you continue to stay engaged, that you are interacting on the platform. But that's, I think, different from trying to achieve the kind of mission the public sector has. And so, how do we build that muscle memory? How do we insist on that level of governance? And I go back to this notion that, as a public sector entity, we don't get to say oops too often. And if we have to say oops, something has gone wrong. So there is, I think, an appropriate level of bureaucracy that exists to protect privacy, to protect civil liberty. The trick is, how do we optimize those processes, those reviews, in such a way that those governance structures can move at the speed of innovation?

SPEAKER_02

It goes back to the phrase that you called out earlier, Amanda, about auto-magical thinking, that it really does feel like magic. And one doesn't think, unless it's really presented to you, about what's going into this: what's the data that's being used, how high quality is the data? But I also mentioned I loved the phrase algorithmic renaissance. And I do think that we need to come up with a Data Foundation t-shirt that says if you're not data ready, you're not AI ready. I think this is a great place to end today's conversation. Taka, thank you so much for joining us today. We're so grateful to have you as one of our senior fellows at the Data Foundation.

SPEAKER_00

Yeah, thanks for having me. It's been great.

SPEAKER_02

Thanks for listening to Data for the People, a podcast of the Data Foundation. You can learn more about our guest, Taka Ariga, by visiting his website, soulimagination.ai. A link to the website is in our show notes. Taka is a senior fellow at the Data Foundation, specifically with the Center for Data Policy led by Amanda Cash, who you heard co-hosting with me today on this episode. A link to the Center for Data Policy's website is also in our show notes. We discussed a recent survey of federal chief data officers from the Data Foundation and Deloitte. A link to that survey is in our show notes. If you liked this show, please subscribe wherever you listen to podcasts. To learn more about the Data Foundation, go to datafoundation.org.
