Keio University

[Special Feature: AI and Intellectual Property Rights] Takehiro Ohya: AI, Evil, and Ethics

Published: June 5, 2023

Writer Profile

  • Takehiro Ohya

    Professor, Faculty of Law


Does Evil AI Exist?

For example, if a driver, rather than braking suddenly and risking injury to themselves, chose to hit three pedestrians on a sidewalk, we would say they made a bad choice and condemn them as responsible for their bad act. Or we might evaluate them as an evil person, a problematic individual from a moral perspective. Now, in a similar situation, if an autonomous vehicle controlled by AI exhibited the same behavior to protect the passenger inside, would that be the AI's choice or act? Would that AI be evil?

Of course, regarding the result, we might want to say it is inappropriate in the sense that the damage is great or disproportionate to the value that could have been protected, or that it is poorly made as an AI. However, the "badness" here concerns technical skill and likely has nothing to do with moral good and evil. If we were to point out an ethical problem here, the target would not be the AI but the manufacturer who designed it.

We generally consider only moral agents—beings that make certain choices or perform certain acts in response to moral principles or claims—to possess attributes such as good or evil (no one would criticize a volcanic eruption as a moral evil for causing casualties). And at least at the current stage, AI, often described as "weak AI," does not possess its own mind or consciousness (the stage where it does is called "strong AI"). No matter what an AI does, it is merely producing a certain response to instructions from a user, and we do not hold such an entity responsible for its acts or choices. AI cannot be evil. But what I want to address here is that this is precisely why it is a problem.

Disinformation and the Existence of Intent

In the context of cybersecurity, when addressing the issue of factually incorrect information circulating widely through social media and other channels, we strive to clarify the distinction between "disinformation" and "misinformation." According to general definitions, the former is information intentionally disseminated or created by an agent aiming to cause a certain effect in society, whereas the latter refers to information that lacks such intent and is simply incorrect in its content.

Typical examples of the former are the claims Russia attempted to spread during its invasion of Ukraine, such as that far-right forces hold significant positions within the Ukrainian military and are committing various human rights violations against Russian-speaking residents in the east. In contrast, the claim that water recovered from the Tokyo Electric Power Company's Fukushima Daiichi Nuclear Power Plant still contains dangerous levels of radioactive material even after purification by ALPS (Advanced Liquid Processing System) is, in itself, a mere factual error—misinformation—given that the concentration of tritium it contains is less than one-seventh of the WHO international standard for drinking water. While some people may assert such dangers out of anxiety or simple ignorance, such errors are merely mistakes that occur with a certain probability (just as a certain number of elementary school students fail to recite their multiplication tables) and are likely not a matter of moral good or evil.

Of course, since there are cases where intent intervenes—such as wanting to attack a government or gain an advantage in an election by spreading false information—the difference between disinformation and misinformation can only be relative, depending on the context. What we should note here, however, is that the existence of intent—the desire to guide the recipient's reaction to the information in a certain direction—is an important element distinguishing the two. Conversely, this means we possess a standard method of evaluating the reliability of information from this perspective.

For example, Article 38, Paragraph 1 of the Constitution of Japan stipulates that "No person shall be compelled to testify against himself" (the privilege against self-incrimination). We generally assume that people tend to avoid incriminating themselves and will try to give testimony favorable to themselves to that end. If state power distorts that tendency through compulsion, the fairness of the trial is impaired; the purpose of the provision is presumably to prohibit this.

Assuming this provision is observed, suppose a defendant gives testimony disadvantageous to themselves without being compelled. If, for instance, they confess to being the perpetrator or to having clear murderous intent, we can consider such testimony correct unless there are special circumstances, such as taking the blame to protect a third party. Conversely, claims that one is not actually the perpetrator, or bore no malice, we would likely check strictly, since they may rest on some fabrication. An empirical rule is at work here: fabricated testimony that departs from the truth mainly works in one's own favor (therefore, disadvantageous testimony is highly likely to be fact, containing no fabrication). In general, we estimate the reliability of another party's statements, or of the information itself, by assuming that the person we are interacting with has certain intentions and acts to pursue their own self-interest.

What Generative AI Produces

Generative AI—especially conversational AI such as ChatGPT, which creates information such as text interactively—has recently gained social attention and become a topic in many contexts owing to its rapid technical evolution, and it is known to exhibit a phenomenon called hallucination. Originally a word meaning an illusion or phantom, here it refers to a complete falsehood based on neither fact nor evidence.

For example, in response to the question "Who is Takehiro Ohya?", it is known to present an answer far from the truth, such as "a professional wrestler active in the United States in the 1980s," quite confidently—though since this "dialogue" takes place through text prompts, the confidence is merely a one-sided impression on our part. Worse, if you demand sources or evidence for that content, it may list plausible-looking literature or information sources. The literature on that list, however, may not exist, or even if it does, its content may be completely different.

Of course, conversational AI does not only return information full of such hallucinations; in many cases, it produces reasonably reliable and factually accurate responses. It has been reported that when the language model behind ChatGPT was updated to the latest version, GPT-4, in March 2023, it achieved a score in the top 10% of test-takers on a mock U.S. Bar Exam, reaching a level that would pass a real exam.

To repeat, what should be pointed out here is that this is precisely why it is a problem. ChatGPT presents very high-level answers corresponding to the truth in some cases and returns completely incorrect information (hallucinations) in others. But at present we cannot predict when or how the problematic cases will occur, and unless we possess sufficient knowledge ourselves, we cannot determine which is happening at any given moment.

For example, if a user asks a question about something they already know—as when the author deliberately asks about Takehiro Ohya—it is easy to verify the truth of the answer. But that is only the limited case of checking the performance of conversational AI; in general, people search for things they do not know well, have text in languages they cannot read translated automatically, and generate pictures they cannot draw. If we entrust certain acts to AI precisely because it surpasses us in knowledge and ability, does that not mean we must be prepared for it to become normal that we cannot tell whether the AI has succeeded or failed?

Furthermore, as pointed out earlier, since AI possesses neither the workings of a mind nor what such workings produce—intent and motivation—we cannot evaluate the certainty of its information by means of inference such as asking whether a situation is advantageous or disadvantageous for the actor. Its impression of confidence is always the same. In ordinary dialogue, we take into account who made a question or statement and in what situation, even when the wording is identical. If asked "What does negligence mean?" by a fellow customer at a ramen shop where the news happens to be playing, one would give a dictionary-like explanation. But what if asked by a professor of criminal law in a seminar? We would assume that, given their position, they must already know the general answer, and precisely because they nevertheless ask, we would infer some intent—confirmation, irony, or interrogation—and squeeze out a response while gripped by anxiety. AI does not understand the force (the illocutionary act) that a statement carries in this way and does not show such human-like reactions. This, too, will likely add to the difficulty of distinguishing between the two.

It should be said that we are right now about to step into a treasure trove in which landmines are buried at a certain rate.

Subhuman, Superhuman, and Althuman

One of the traditional problems of ethics lies in how the privileged status of us humans is grounded. Why do dogs and cats not have human rights, while being born as an individual of the human species guarantees certain rights by that fact alone? If something separates us from them, what kind of attribute is it?

As I have pointed out before, we have traditionally taken animals as the point of comparison, repeatedly searching for attributes in which we humans are superior to them and claiming those as the basis for our status as subjects of human rights ("The Outer Other and the Inner Other: Rights of Animals and AI," Ronkyu Jurist No. 22 (Yuhikaku, 2017)). Typically, elements such as reason and the use of language correspond to this. Animals are assumed to be subhuman: sharing commonalities with us, but falling short of humans in certain respects.

However, the generative and conversational AI we now face is about to reach a level that far surpasses the average human in precisely these traditional conditions of humanity (it is certainly not the case that any human of ordinary intelligence can pass the U.S. Bar Exam). Moreover, it does not die, it does not forget what it has learned, and it will accumulate experience and evolve to further heights. Nick Bostrom pointed out that when such AI reaches the stage called superintelligence, humanity will face existential risk—the total annihilation of intelligent life or the permanent disappearance of the possibility of its development (Nick Bostrom, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014).

However, I understand the future form of AI not as a superhuman assumed on the extension of the straight line that runs from animals to humans—a being superior to humanity on a linear scale of evolution—but rather as an althuman, a fundamentally different kind of being ("Fallibility and Vulnerability in AI," in Makoto Usami (ed.), Law and Society Changed by AI: To Think Deeply About the Near Future (Iwanami Shoten, 2020)). Even dogs and cats are widely known to show behaviors indicating inner emotions—hiding the evidence or looking away, as if feeling guilty, when caught performing a forbidden act (or at least behaviors we can readily understand as such). The characteristic of AI lies in the fact that it has no such inner consciousness or intent, and that we therefore cannot assume integrity in its actions; in that sense, it is more appropriate to think of AI as something fundamentally different from us living beings.

Toward Coexistence

We are facing such a fundamental other, and yet, in view of the convenience it will likely provide, we are choosing coexistence with it. If utilizing AI improves efficiency in various aspects of social life, fewer resources will be needed to produce the same amount of happiness, and the global environment will therefore be damaged less. For us, who are seeking the form of a sustainable society as a framework for the coexistence of global humanity, the option of abandoning the benefits such AI brings is ethically difficult to justify. That is why the question of how we can coordinate the coexistence of the two and create a society that combines each other's strengths will surely be the central issue of future AI ethics.

AI ethics was never, in the first place, about the ethics that AI itself should possess; it questions ethics regarding AI, providing a framework for evaluating and guiding human behavior related to AI. This is clear if we consider that animal ethics does not attempt to present principles of action to animals. The issue is what attitude we take toward AI and under what kind of governance we place it.

And to discuss such issues, our primary goal should surely be to establish, legally and technically, a framework in which we can recognize, evaluate, and in some cases exclude what AI has produced, distinguishing it from the achievements of us humans. For if we cannot recognize AI-generated content as such, we can neither discuss the problems it may contain, such as hallucinations, nor consider methods for excluding it.

Discussions have already begun in the EU on mandating that information created by generative AI be labeled as such, as part of a comprehensive AI regulation bill (e.g., "EU to Mandate Labeling for Generative AI," Asahi Shimbun, April 29, 2023). The greatest challenge for current AI ethics, it may be said, is to make appropriately traceable who, in what form, and on the basis of what learning results caused generative AI to produce information, and to build a framework of social governance grounded in the evaluation of those results.

*Affiliations and titles are as of the time of publication.