The ML community needs a provisional morality

Just a few days ago, many of us found ourselves in a state of stupefaction as we learned the circumstances under which Dr Timnit Gebru was pushed out of her position at Google. Since then, many conversations have sprung up about the technicalities of the forced resignation and the dubious explanations given by Google. First, let me reaffirm my support for Timnit, thank her for her contributions to the field, and express my hope that she promptly finds a place that will support her essential work. Now I would like to talk about a point in Google's response that I find worrisome.

I have not yet had the chance to read the paper at the core of this ordeal, but luckily the MIT Tech Review summed up its main points. The paper pointed to some of the main risks of deploying large language models: the environmental cost, the impossibility of auditing the massive amount of training data as well as the model itself, the concentration of research efforts on these models at the expense of more environmentally friendly ones or of approaches that model language differently, and the very harmful mistakes these models make when they are trusted blindly.

It is crucial to have a critical conversation about all four of these points, but here I would like to focus specifically on the second one: the impossibility of auditing the massive amount of training data as well as the model itself. In his comments, Dr Jeff Dean said that the reason the paper was barred from publication by Google's internal reviewing committee was that

“it didn’t include important findings on how models can be made more efficient and actually reduce overall environmental impact, and it didn’t take into account some recent work at Google and elsewhere on mitigating bias in language models. Highlighting risks without pointing out methods for researchers and developers to understand and mitigate those risks misses the mark on helping with these problems.”

This statement immediately irked me because it brushes off the fundamental issue of the content of the training data as just another technical problem. Not only that, but it also tries to defuse the concerns with the assurance that research happening at Google or elsewhere will eventually crack this problem and make it disappear. This might be true, but I believe this framing “misses the mark on helping with these problems”.

To help make my point clearer, let me tell you about a PhD student who has been using videos scraped off the internet to generate artificial pornographic clips. This student, who has been active under the pseudonym GeneratedPorn (or GP), did not audit the data he used to train his generative model. When journalists told him that his data contained videos of women who had been sexually abused on camera, his answer was:

“I again can't verify the back story behind hundreds of thousands of images, but I can assume some of the images in the dataset might have an exploitative power dynamic behind them. I'm not sure if it's even possible to blacklist exploitative data if it's been scraped from the web. I need to consider this a bit more.”

Later in the article, he says: “The researcher in me feels like 'if it's been published online it's open source and fair game' however the budding capitalist in me feels like that violates IP in some sense. I'm a bit conflicted. I've personally accepted that any data I ever create as an individual will be used by others for profit or research. […] Now that the abuse is present I can opt to not use that data and source data from elsewhere. Others in the area may not care and may decide to use it anyway. It's quite difficult to screen for this data completely. Doing a google image search for 'female standing nude' gives you a bunch of Czech Casting images [the ones containing the reported abuse]. Throwing on the flag '-"czech"' catches a lot of them, but some still get through the cracks.”

You can clearly see three stages in his reflection: first he says there is nothing he can do about the problem, then he tries to justify his indifference, and finally he brushes off the problem as a technical one that can be solved with time, effort, and the right preprocessing pipeline. This article shocked me to the core. I had a hard time dealing with the stark contrast between GP's complete indifference and the very real women who not only had been abused but now had to live with the knowledge that their abuse was part of the training data for a generative model. On top of that, I just could not process the fact that GP and I are technically part of the same research community.

Of course, this is a very extreme example, and I am not trying to compare Google's actions to GP's, but the underlying reasoning sounds similar to me. It holds that everything is a technical issue that will be solved if we just “consider this a bit more”, that it is all a question of finding the right cleaning or de-biasing methods. Maybe. Maybe we will find sound solutions to these problems. But can we really be satisfied with the answer that research might, at some point in the future, solve them? My question is: how should we act now, while our knowledge of these models and of the data they were trained on remains imperfect? Because we know that these models can make very damaging mistakes, this becomes, I believe, a question of morality, and more specifically a question of provisional morality.

René Descartes first discussed the idea of a provisional morality in Part III of his Discourse on the Method. He writes:

“Now, before starting to rebuild your house, it is not enough simply to pull it down, to make provision for materials and architects (or else train yourself in architecture), and to have carefully drawn up the plans; you must also provide yourself with some other place where you can live comfortably while building is in progress. Likewise, lest I should remain indecisive in my actions while reason obliged me to be so in my judgements, and in order to live as happily as I could during this time, I formed for myself a provisional moral code consisting of just three or four maxims, which I should like to tell you about.”

In other words, as long as doubts persisted in his mind about what was right, Descartes decided how to act according to simple moral principles. These principles were provisional and meant to evolve as he gained more knowledge and progressively resolved his doubts.

What is our provisional morality when it comes to large machine learning models? Do we acknowledge our blind spots, pause deployment, and welcome those, like Timnit and others, who have been trying to raise awareness? Or do we deploy these models first and solve problems later? Is it an insurmountable problem that these models need massive amounts of data to train? Can we curate high-quality data? Is everything just a technical problem waiting to be solved, or are there fundamental flaws in our approach? Where do we stand as a field that deploys models affecting millions, if not billions, of people, when we know that minorities have consistently been given the short end of the stick by these models?

These are the questions that the ethical AI community has been asking and that Google does not seem to want to answer. Dismissing legitimate concerns by pointing to ongoing research is not an appropriate answer. What we need is an answer to the question: what do we do in the meantime?

Layla El Asri
