ethics of AI: a translation task

“To effectively contend with questions of fairness, the machine learning community cannot reduce fairness to a technical question. Instead, it must increasingly and explicitly adopt an agenda of broad institutional change and a stance on the political impacts of the technology itself.”

Does this statement seem obvious to you? It seemed obvious to me when I first read it – it totally fits in with the FAT*-style narrative that artificial intelligence simply amplifies the systemic biases of society and the STS-style argument that, ahem, artifacts do in fact have politics.

But, oh boy, this statement was the center of quite a bit of controversy at this past year’s International Conference on Machine Learning (ICML). In fact, it was the subject of a hosted debate during the conference, with two sides: no, we cannot reduce fairness to a technical question, and yes – we can and should reduce fairness to a technical question!

And while you might (maybe strongly) disagree with one side or another, it’s important to note that this debate got a few hundred machine learning researchers and practitioners in a room to contend with a big ethical dilemma.

From Kate Crawford’s “The Trouble with Bias” talk at NIPS 2017.

But what you might also note is that this kind of ethics discussion – one that centers on causal inference, statistical measures, and impossibility theorems – feels very different from the design-oriented discussions we’ve been having in class. Furthermore, if you go talk to actual moral philosophers interested in AI (as I have had to do for my thesis), you’ll find that the discussion differs even more – and you’ll find yourself wondering whether moral reasoning is useful, and if so, how useful (and can you prove it?).

My point here is this: lots of people are worried about the ethical dilemmas of AI (see my ecosystem map below), but we (people in different academic fields and industries, citizens, governments, the news media) all seem to be speaking very different languages when it comes to identifying problems and solutions. And while some fields are making great efforts to talk to other fields (for example, FAT* has an entire workshop devoted to “translation” this year), I can’t tell you the number of times I’ve talked to influential machine learning researchers who have asked, “What’s STS?” Or, if you read the literature on AI ethics education, almost all of the pedagogy is rooted in teaching “what is [insert your favorite ethical theory here] and how can you use it to justify your design choices?” It’s really hard to know what we don’t know, and I strongly believe these communities have more tools at their disposal than they think.

So, there are a few questions I’d like to explore in this space:

  • What does “ethics of AI” mean to different fields (such as AI/ML, STS, philosophy, anthropology, human-computer interaction, psychology, etc.)? What do these different fields believe the role of ethics should (or can) be?
  • What expectations do different fields have when we talk about “designing ethical AI”? What do they think is feasible?
  • Can we find ways to more easily translate these definitions, tools, and expectations between fields?
  • How do these concerns align with public concerns and expectations around AI? Can we translate from the communities who actively work in these areas to the public at large, and back?

let’s do some ai+ethics

As many of you know, my thesis project is to develop an AI+Ethics curriculum for middle school students. The goals of the curriculum are three-fold: (1) to enable students to see artificial intelligence systems as artifacts with politics, (2) to enable students to see technology as manipulable, and (3) to empower students to design AI with ethics in mind.

I’ll actually be in the classroom later this week, and this is the set of values I hope to establish in our community.

These first three values come as suggestions from Jaleesa in LLK and her experience in the classroom, so credit and thanks go to her.

  1. This is a brave space: We acknowledge that designing better AI systems is a hard thing to do, and even harder than that is discussing the ethics of systems that affect almost every aspect of our lives. However, despite the difficulty and complexity of the problem before us, we will do our best to contribute to a solution. We acknowledge that speaking up in front of a group of people can be scary, and that it is very brave to share your ideas and risk vulnerability. We acknowledge that learning something brand new means we also risk failure, and that learning takes time.
  2. We trust the experiences of others: We accept others as they are and trust in their experiences when they share them with us. We recognize that to deny the experiences of others is to make them less human, and we refuse to do so. We recognize the bravery in sharing about our lives, and we encourage an environment in which everyone feels included.
  3. We assume positive intent: We recognize that we will be discussing difficult topics, and that sometimes our peers will say or do something that makes us feel small, upset, or offended. When that happens, we will assume that person meant well in their words or actions and respectfully notify them how their words or actions made us feel so that they may learn how to better communicate in the future.
  4. We value diversity, inclusion, and collaboration: We recognize that our biggest asset in learning about and designing better technology is each other’s diverse experiences. We welcome those who are different from us. We recognize that no best solution comes from one person, and therefore we value collaborating with others on a team and making everyone on our team feel safe, valued, and comfortable.
  5. We are humble and curious: We accept that the problem set before us is challenging and that no single person, regardless of how smart or experienced, will be able to solve it. We acknowledge that to do our best we must be committed to learning about new things, people, and ideas. We acknowledge that asking questions is important and that no question is “dumb” or “a waste of time,” and we encourage our peers to be curious by asking questions.

And here are the details:

Who:

Middle schoolers, teachers, school district administrators, representatives from local AI startups (software engineers, C-level executives, education outreach officials).

Of course, middle school students are the target age group for my curriculum, so it makes sense that middle schoolers will be there. However, we are keeping an open-door policy. Many local AI startups are interested in this ethics curriculum and will be invited to participate in activities alongside the students, or to shadow the class (whichever they feel most comfortable with). We want to show the students that you can learn to design better technology at any age and at any point in your career. Teachers and school district administrators have also been invited to observe, but they will also be welcome to participate, because designing better AI should be an open, democratic, and inclusive process.

Where:

At a middle school in Pittsburgh, PA during the school day. As some of you might know, many curriculum pilots take place during after-school workshops or summer camps. There is at least one good reason for this: school time is precious. However, the problem with pilots taking place outside of school hours is that many students systematically do not gain access to these new, high-tech, cutting-edge curricula. Additionally, an unfortunate side effect is that many of the resulting scientific studies are based on small numbers of participants. Thus, I feel so lucky that I will be able to visit this middle school and work with students during their normally scheduled library period. Since I will be returning to the school each quarter, I will be able to offer this curriculum to every child in the district.

What:

There will be a few activities for students to engage in. These activities, broadly speaking, serve one of three goals: (1) teaching students the fundamentals of artificial intelligence (e.g. what is training data? what is a learning algorithm? how do these two items affect the success of the system?), (2) teaching students that AI systems have ethical import, and (3) teaching students the fundamentals of value-sensitive design and giving students practice in making hard, ethical design decisions.
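To make goal (1) concrete, here is a minimal toy sketch of the core idea – that the same learning algorithm can give very different answers depending on the training data it sees. This is my own illustration, not an activity from the curriculum itself; it assumes scikit-learn and an invented cats-vs-dogs dataset.

```python
# A toy illustration (my example, not curriculum code): the same learning
# algorithm, trained on different data, makes different predictions.
# Assumes scikit-learn is installed; the cats-vs-dogs dataset is invented.
from sklearn.tree import DecisionTreeClassifier

# One feature per animal: weight in kilograms. Label 0 = cat, 1 = dog.
representative_weights = [[4], [5], [20], [30]]   # cats plus a range of dogs
representative_labels = [0, 0, 1, 1]

# A skewed dataset in which the only dogs anyone collected were chihuahuas.
skewed_weights = [[2], [3], [4], [5]]
skewed_labels = [1, 1, 0, 0]

labrador = [[25]]  # a 25 kg dog, unlike anything in the skewed dataset

model_a = DecisionTreeClassifier().fit(representative_weights, representative_labels)
model_b = DecisionTreeClassifier().fit(skewed_weights, skewed_labels)

print(model_a.predict(labrador))  # [1] -> "dog"
print(model_b.predict(labrador))  # [0] -> "cat": the data, not the algorithm, is at fault
```

The point of a demo like this is that model_b isn’t “broken” – it learned exactly what its skewed training data taught it, which is the intuition students need before goals (2) and (3) make sense.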

In addition to these activities, however, we will also include community-building activities, such as fun icebreakers, small-group and whole-class discussions, and role-playing/modeling behavior that reflects our core values listed above.

an app in retrospect

This is the Strom Thurmond Fitness Center, the $38 million crown jewel of the University of South Carolina, my alma mater. (Go Gamecocks! Beat Vandy!)

You can’t see it from this picture, but “The Strom” sits at the top of what we at home would call a “big-ass hill.” At the bottom of the hill lie two important tourist attractions for the University of South Carolina: Greek housing and lots of parking garages.

Every day thousands of students walk by the Strom, and in doing so, they also walk by this sign:

“Blossom Street School, at the corner of what was then Blossom & Gates (now Park) Streets, was built in 1898 as the first public school in Columbia south of Senate Street. A frame building, it was originally a school for white children. After it burned in 1915, a brick school was built here the next year. Blossom Street became a school for black children in Ward One in 1929 and was renamed Celia Dial Saxon School in 1930.” Photo taken from: http://wardone.wixsite.com/wardone/education?lightbox=imagezle

This is South Carolina Historical Marker 40–149. It honors the Celia Dial Saxon School, which opened in 1929 to the children of an African American community called Ward One. Most students don’t know this sign exists.

Most students don’t know the Celia Dial Saxon School ever existed at all, and they definitely don’t know that it was closed in 1968 and demolished in 1974. Most students don’t even know that it wasn’t until 2003 that construction on the Strom began.

See, the ugly truth about the Strom Thurmond Fitness Center (ugh, well, you know, besides the name) is that it sits on stolen property. And so does the University of South Carolina Koger Center (performing arts), the Carolina Coliseum (basketball), and the Darla Moore School of Business (exactly what it sounds like).

In the 1950s, 60s, and 70s, “urban renewal” struck Columbia, South Carolina. Before the 50s, you could divide the city of Columbia into three regions: the west, which consisted of a black community called Ward One; the middle, which was home to the University of South Carolina; and the east, home to many white families of university faculty and staff.

But in the 1950s, 60s, and 70s, the residents to the west of USC – the black community – were told to leave. The district of Ward One had been deemed “blighted” by the state. Residents were told their homes, schools, churches, and businesses would be demolished for their own good, for their own safety. The community, however, was never rebuilt. Instead, the land was reabsorbed by the University of South Carolina, a campus that the residents of the Ward One community had never been allowed to set foot on. Many Ward One residents never returned to Columbia after 1975.

In 2015, I joined a class at the University of South Carolina called “Critical Interactives: Ward One.” Half of the class was made up of media artists, journalists, and historians. The other half consisted of people like me: computer science majors looking to learn a brand new iOS programming language called Swift.

The goal of the project seemed simple enough: build an interactive app ~experience~ that conveys the history of the Ward One community. We kept saying “a history museum, but on your phone.” We wanted to allow USC students and the broader community to know more about the history of their favorite places, and particularly, the role race had played in the formation of the university. The university had just installed a sign at its front entrance acknowledging that the “beautiful part of campus” had been built by slaves, and we were hopeful to ride the coattails of our all-white, mostly male, grin-and-bear-it-til-it’s-not-awkward-anymore administration.

We were balancing interests: what the university administration would be willing to support, what students would interact with, and what the living residents of Ward One and their children wanted.

Together we thought we could hack this hard problem. With skills and experiences in so many disciplines, surely we would be thoughtful with this thing we were building. Our class was also all white.

What we ended up building was, arguably, cool from a tech standpoint. It had fairly good geolocation (good for walking tours around campus) in a place where geolocation had historically been difficult, it had cool augmented reality tools that let you overlay archival photos on what your camera was seeing, and we had collected a bunch of new documentary video and audio from the remaining living residents of the Ward One community.

I remember demo-ing the app at a showcase towards the end of the semester. Some university administrators were there, my parents were there, and most of all, the members of the Ward One community were there. They loved it. It was like magic to them, seeing their childhood home come to life on a screen.

At the end of the event, we asked for feedback from the Ward One residents. One piece has stuck with me; it came from 72-year-old Mattie Johnson Roberson: “We don’t want this app to make anyone feel bad, we just want to show people why we were proud to live in Ward One.”

We had work to do.

hello from the valley

Remember how on the first day of class Ethan was all like, “we’re here to walk you into the valley of depression about the consequences of technology, and then back out of it so you don’t feel the need to apply to that public policy program”?

Well, hi. I’m Blakeley, a second-year, possibly-going-to-graduate-in-June master’s student here at the Media Lab. I’ve got about 48 Google Chrome tabs open right now, and half of them are filled with queries like “harvard kennedy school admission how” and the other half are “STS phd program jobs after graduation.” So… yeah. Maybe I could use some introspection as to how I got here.

this is my valley! look at all the anxiety over technology! just look at it all! 

I came to the Media Lab last year with a background in math and computer science. Upon reading Cathy O’Neil’s Weapons of Math Destruction I was STOKED to combat algorithmic bias, injustice, and as per usual, the patriarchy of tech. The goal of my main project last year, Turing Box, was relatively simple: let’s build a two-sided platform where on one side algorithm developers will upload their AI systems and on the other side examiners (maybe social scientists, maybe other computer scientists) will evaluate those algorithms for potential bias or harm. Algorithms with good evaluations will receive seals of approval, certifying that they are, well, at a minimum, the least worst algorithm there is for a particular task.
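To make that concrete, here is a hypothetical sketch – my own illustration, not Turing Box’s actual code – of the kind of check an examiner on such a platform might run against a submitted system: comparing approval rates across demographic groups (a demographic parity gap). The audit decisions and group labels are invented.

```python
# A hypothetical examiner-side check (not Turing Box's actual code): how much
# does a submitted model's approval rate differ across demographic groups?
from collections import defaultdict

def demographic_parity_gap(decisions, groups):
    """decisions: list of 0/1 model outputs; groups: group label per decision."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        approvals[group] += decision
    rates = {g: approvals[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

# Toy audit: decisions a submitted model made on an evaluation dataset.
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
group_ids = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, rates = demographic_parity_gap(decisions, group_ids)
print(rates)  # {'A': 0.75, 'B': 0.25}
print(gap)    # 0.5 -> the kind of disparity an examiner might flag
```

A single number like this is obviously nowhere near a “seal of approval” on its own, which is part of what started to worry me (more on that below).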

Turing Box garnered a lot of praise upon its announcement. Companies wanted to use it internally, social scientists were excited about accessibility, and as a tool to explain algorithmic bias, it excelled. On the other hand, it also endured a lot of fair criticism, mostly about the framing of algorithms as agents with behavior to be studied and the lack of credit to those in STS who laid the groundwork.

But it wasn’t the criticism that worried me. What worried me was that Turing Box, if placed in the very market it was creating, wouldn’t hold up against other versions of the platform. I could imagine a hundred scenarios in which the platform could fail as a societally beneficial tool. What if big tech companies flooded the markets in an adversarial way? What if the majority of evaluators only came from a particular company and were taught to recognize the behavior of their own algorithms so that they could give them higher evaluations? How do you recommend algorithms to examiners to evaluate without, essentially, setting the scientific agenda in the same way that platforms like Twitter determine our political discourse? How do we ensure that our platform doesn’t just offer measurement as a solution to real societal problems?

but what if our site gets popular and everyone adversarially uses it and it makes it seem like we’ve solved algorithmic bias but all we’ve actually done is measured a bunch of really useless stuff and then we have no way of knowing until real harm actually occurs

I’ve since, uh, pivoted. The goal of my master’s thesis is to construct an “AI and Ethics” curriculum for middle schoolers (or, as I’m trying to advertise it, “just a better AI curriculum that includes ethical considerations because we should have been doing that in the first place” – it’s not really catching on yet…).

So, why am I here? There are a few reasons. First, I’d love to walk back out of the valley. Second, I don’t want to build a curriculum where I walk my students into the valley and leave them stranded. I’m looking forward to learning about value-sensitive design and participatory design because I’d like to integrate these techniques into my own curriculum. Third, I really, really want to graduate on time. My parents already took vacation days.

“Why do you write like you’re running out of time?” Um, because I WANT TO GRADUATE IN JUNE. 

If you’re interested in discussing what an AI+Ethics curriculum might look like for middle schoolers, I’d love to chat!