Similarly, New York City is considering Int 1894, a law that would introduce mandatory audits of “automated employment decision tools,” defined as “any system whose function is governed by statistical theory, or systems whose parameters are defined by such systems.” Notably, both bills mandate audits but provide only high-level guidelines on what an audit is.
As decision-makers in both government and industry create standards for algorithmic audits, disagreements about what counts as an algorithm are likely. Rather than trying to agree on a common definition of “algorithm” or a particular universal auditing technique, we suggest evaluating automated systems primarily based on their impact. By focusing on outcome rather than input, we avoid needless debates over technical complexity. What matters is the potential for harm, regardless of whether we’re discussing an algebraic formula or a deep neural network.
Impact is a critical assessment factor in other fields. It’s built into the classic DREAD framework in cybersecurity, which was first popularized by Microsoft in the early 2000s and is still used at some corporations. The “A” in DREAD asks threat assessors to quantify “affected users” by asking how many people would suffer the impact of an identified vulnerability. Impact assessments are also common in human rights and sustainability analyses, and we’ve seen some early developers of AI impact assessments create similar rubrics. For example, Canada’s Algorithmic Impact Assessment provides a score based on qualitative questions such as “Are clients in this line of business particularly vulnerable? (yes or no).”
There are certainly difficulties to introducing a loosely defined term such as “impact” into any assessment. The DREAD framework was later supplemented or replaced by STRIDE, in part because of challenges with reconciling different beliefs about what threat modeling entails. Microsoft stopped using DREAD in 2008.
In the AI field, conferences and journals have already introduced impact statements with varying degrees of success and controversy. It’s far from foolproof: impact assessments that are purely formulaic can easily be gamed, while an overly vague definition can lead to arbitrary or impossibly lengthy assessments.
Still, it’s an important step forward. The term “algorithm,” however defined, shouldn’t be a shield to absolve the humans who designed and deployed any system of responsibility for the consequences of its use. This is why the public is increasingly demanding algorithmic accountability—and the concept of impact offers a useful common ground for different groups working to meet that demand.
Kristian Lum is an assistant research professor in the Computer and Information Science Department at the University of Pennsylvania.
Rumman Chowdhury is the director of the Machine Ethics, Transparency, and Accountability (META) team at Twitter. She was previously the CEO and founder of Parity, an algorithmic audit platform, and global lead for responsible AI at Accenture.