Making Artificial Intelligence Truly Trustworthy

Jason D'Cruz, associate professor of Philosophy (Photo by Brian Busher)

ALBANY, N.Y. (Aug. 26, 2021) — Technical advancements in artificial intelligence (AI) and machine learning over the last decade have outstripped our ability to understand and critically evaluate their social and ethical significance, says Jason D’Cruz, associate professor in Philosophy and an expert in ethics and moral psychology.

Distrust of AI systems is a major obstacle to their adoption, said D’Cruz, explaining that the challenge is deeper than merely eliciting human willingness to trust AI. The challenge is to create systems that are worthy of trust. What structural and design features of an AI system make it trustworthy or untrustworthy?

Seeking an answer, D’Cruz, working with PhD candidate Saleh Afroogh, has set out to develop the foundations of a trustworthiness assessment framework for AI-driven systems. “To create trustworthy systems, we need implementations that empower people rather than manipulate them, tools that work alongside them rather than against them,” he said.

He referenced Stuart Russell, a pioneering AI researcher, who put the problem like this: “We say that machines are intelligent to the extent that their actions can be expected to achieve their objectives, but we have no reliable way to make sure that their objectives are the same as our objectives.”

Researcher Saleh Afroogh, PhD candidate in Philosophy

D’Cruz and Afroogh’s project, “Trustworthy AI from a User Perspective,” received $100,000 in one-year funding from the SUNY-IBM AI Collaborative Research Alliance, beginning in fall 2021, with the possibility of renewal. “Our project,” said D’Cruz, “is to lay some of the foundations for artificial intelligence that realizes human values.”

Although the project’s primary aim is theoretical, D’Cruz said, “our ambition is also that, in collaborating with AI and ML designers and engineers, including specialists in human-computer interaction at IBM, our work will have a downstream impact on the design and implementation of AI systems.”

D’Cruz said that developers face several challenges in creating AI that is worthy of human trust. “We will be investigating two of these. The first is the ‘black box’ problem: How can we rationally trust the outputs of a system whose processes we do not fully know or understand?

“The second problem involves over-confidence in AI systems: How can we design AI systems that, when given a task, can reliably tell you if they are able to do it and also explain why?”

The researchers’ goal, through their collaborative work with developers and scientists at IBM, is to develop AI applications that are “richly trustworthy,” a concept they borrow from Australian philosopher Karen Jones.

“A richly trustworthy AI system must be able to continuously monitor and assess its own dynamic functioning,” said D’Cruz. “It must then generate signals that are legible to both human agents and also other AI systems about what it can be trusted to do.

“Rich trustworthiness is a step beyond ‘comprehensibility’ in that the AI system is tasked with producing accurate and interpretable signals about exactly which human dependencies it can and cannot respond to.”