Coexisting With A Paperclip Maximizer

In this post, we consider the paperclip maximizer problem proposed by [1] in the context of weak intelligence for machine learning algorithms. After an overview of the learning process in the weak intelligence paradigm, we critique this empirical knowledge acquisition under a mathematical framework subject to human bias. Then, we pose open questions to the field of artificial intelligence on how the paperclip maximizer phenomenon is emerging around us.

Keywords — artificial intelligence, decision-making process, empiricism, optimization problem.

The Optimization Problem In Machine Learning

Machine learning algorithms for autonomous decision-making are leading a new industrial revolution in today’s society. These algorithms include recommendation systems, which suggest actions to users based on their preferences, and control systems, which manage and regulate the behaviour of a machine or a large-scale plant. Applications include e-commerce advertising, video games, self-driving cars, smart houses, and robotics [2]. Even though these applications are relevant for a digital economy, there are several challenges in bringing machine learning into industries with critical safety constraints, such as healthcare and food production.

According to [3], the transition from the literature to real-world applications in machine learning raises open questions concerning human-machine interaction and black-box models. Those questions come from the fact that we humans cannot yet understand the feature extraction operations inside the learning process. Hence, the algorithm’s result seems like "magic": it might be correct, but we do not know how the algorithm reached it.

Machine learning has a mathematical foundation based on probability, control and optimization theory. In particular, the formulation of decision-making algorithms involves an optimization problem of some objective function. For instance, an objective such as "build as many paperclips as possible" is a well-defined optimization problem that may be solved via machine learning.
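
One way to write this objective down is sketched below. The notation is our assumption and is not taken from [1]: \pi is a decision policy, \Pi is the set of policies the machine may choose from, c_t is the number of paperclips produced at step t, and T is the time horizon.

```latex
\pi^{*} = \arg\max_{\pi \in \Pi} \; \mathbb{E}\left[ \sum_{t=0}^{T} c_t \,\middle|\, \pi \right]
```

Note that nothing in this formulation mentions resources, safety, or human oversight; the objective only counts paperclips.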

Let us consider an autonomous system whose objective is to build as many paperclips as possible. In principle, this objective is harmless for society, and with current technology, one of the worst-case scenarios is that the system runs out of resources. However, [1] considers the possibility that the paperclip maximizer may learn how to obtain unlimited resources to build paperclips at the expense of all the natural resources on the planet. Thus, an inoffensive objective turns out to be a potential danger for society, not only because of inadequate resource usage but also because it may be impossible to stop and control the system.

Weak And Strong Intelligence

The dystopian scenario of a paperclip maximizer raises alarms about possible machine learning developments. For instance, [1] suggests a future where a super-intelligence inside a computer, which is in charge of essential natural resources, does not cooperate with humanity. One possible reason for this unwanted behaviour is that the computer requires those natural resources to fulfill its job. Another is that the machine associates cooperation with humanity with non-optimal results. In both scenarios, the machine’s priority is to make decisions with optimal results.

The super-intelligence idea comes from the strong intelligence concept, where machines may develop thinking processes like those of humans. Under this concept, machines may have emotions and dreams like any human being, without needing a biological evolution process. In contrast, the concept known as weak intelligence suggests that machines can only simulate intelligence [2]. The simulation happens as mathematical operations with a logical structure inside a capable processing unit, with no human-like thinking process. Indeed, current machine learning technology lies on the weak intelligence side.

In both types of intelligence, a paperclip maximizer’s existence is possible due to the learning process and the self-improving idea behind machine learning. On the one hand, a machine with strong intelligence is conscious of its decisions and capable of exceeding human capabilities. On the other hand, a machine with weak intelligence works to fulfill an objective without awareness of its existence. For instance, the lack of self-awareness and the need for self-improvement may create situations where the machine hurts itself or its environment to achieve a goal, since it does not see the consequences of its actions [4]. In this post, our interest lies in weak intelligence, since it is the paradigm of today’s technology.

The Learning Process In Weak Intelligence

The weak intelligence paradigm in machine learning algorithms has foundations in several branches of mathematics and computer science. However, those algorithms show unexpected behaviours in real-life scenarios due to the gap between theory and application: the theory relies on assumptions that do not hold in reality. As evidence of those unexpected behaviours, we examine the case of "lazy agents," which are algorithms that take no meaningful actions in the environment; in other words, they conclude that doing nothing is the optimal behaviour for a given task [5]. Although this behaviour seems counter-intuitive at first sight, the equations behind the algorithm demand "laziness."
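
A toy sketch may help to see how an objective can demand "laziness." It is not taken from [5]; the costs, rewards, probabilities, and function names below are hypothetical values chosen only for illustration. When acting costs more than it earns in expectation, an agent that maximizes expected return will prefer to do nothing.

```python
# Toy model of a "lazy agent". All constants are illustrative assumptions.
ACTION_COST = 1.0      # hypothetical cost paid whenever the agent acts
SUCCESS_REWARD = 5.0   # hypothetical reward when the action succeeds
SUCCESS_PROB = 0.1     # hypothetical probability that the action succeeds


def expected_return(action: str) -> float:
    """Expected one-step return of an action under the toy model."""
    if action == "act":
        return SUCCESS_PROB * SUCCESS_REWARD - ACTION_COST
    return 0.0  # "wait" neither earns nor costs anything


def best_action() -> str:
    """Pick the action with the highest expected return."""
    return max(["act", "wait"], key=expected_return)


if __name__ == "__main__":
    # Acting has a negative expected return here, so waiting is chosen.
    print(best_action())
```

The agent is not malfunctioning in this sketch: given the numbers the designer supplied, inaction really is the optimum.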

Let us analyze the decision-making process for autonomous agents based on the mentioned optimization framework. The first step is to observe the environment and retrieve information from sensors. Then, the machine processes the data and makes decisions based on that information. Finally, the machine performs actions according to those decisions [6]. The data processing step implies an optimization problem that translates a real-life scenario into mathematical language. In particular, Bayesian thinking plays a significant role in this language by providing a notion of uncertainty in the decision-making process. Indeed, some environments, like the stock market, electronic sensors, and human interaction, require modelling the hypothesis space under a probabilistic framework to capture the system’s complex behaviour.
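
The observe, decide, and act steps can be sketched as a simple loop. This is a minimal illustration under our own assumptions and not the formulation in [6]: the environment is a single number, the sensor adds Gaussian noise, and the decision rule is a proportional correction towards a target. All function names and parameters are hypothetical.

```python
import random


def observe(state: float, noise: float, rng: random.Random) -> float:
    """Step 1: read a noisy sensor measurement of the environment."""
    return state + rng.gauss(0.0, noise)


def decide(measurement: float, target: float, gain: float = 0.5) -> float:
    """Step 2: turn the measurement into a decision (a proportional correction)."""
    return gain * (target - measurement)


def act(state: float, action: float) -> float:
    """Step 3: apply the decision to the environment and return the new state."""
    return state + action


def run(steps: int = 50, target: float = 10.0,
        noise: float = 0.1, seed: int = 0) -> float:
    """Repeat the observe-decide-act loop and return the final state."""
    rng = random.Random(seed)
    state = 0.0
    for _ in range(steps):
        measurement = observe(state, noise, rng)
        action = decide(measurement, target)
        state = act(state, action)
    return state


if __name__ == "__main__":
    print(run())
```

The agent never sees the true state, only the noisy measurement, which is why a notion of uncertainty matters in the data processing step.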

The hypothesis space is the set of possible mathematical models that the machine may use to fit decisions into optimal behaviour. Neural networks, linear models, and kernels are examples of such models. The researchers design the hypothesis space with a particular heuristic rule suited to the machine’s application. This space also carries information about the environment and the agent’s structure. Still, this information is subject to the designer’s choice.

As we can perceive from this learning process, the machine gains knowledge through empirical methods based on observation. The overall learning is founded on trial and error over some hypothesis space. In principle, the intelligent agent has a hypothesis on how to perform optimal actions with respect to some objective, e.g. minimizing production costs; then, the agent tries the hypothesis in an environment and computes some evaluation metrics; finally, with those metrics, the agent fine-tunes its decisions for further attempts.
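
The trial and error procedure can be sketched as follows. This is an illustration under assumptions of our own: the hypothesis space is a hand-picked list of four linear models, the environment follows y = 2x plus noise, and the evaluation metric is the mean squared error. The names HYPOTHESES, environment, evaluate, and learn are hypothetical.

```python
import random

# Hypothesis space chosen by the designer: linear models y = w * x.
HYPOTHESES = [0.5, 1.0, 2.0, 3.0]


def environment(x: float, noise: float, rng: random.Random) -> float:
    """Unknown to the agent: the true relation is y = 2x plus Gaussian noise."""
    return 2.0 * x + rng.gauss(0.0, noise)


def evaluate(w: float, samples: list) -> float:
    """Evaluation metric: mean squared error of hypothesis w on the samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def learn(n_trials: int = 20, noise: float = 0.05, seed: int = 0) -> float:
    """Try inputs, observe outcomes, and keep the best-scoring hypothesis."""
    rng = random.Random(seed)
    samples = []
    best = HYPOTHESES[0]
    for _ in range(n_trials):
        x = rng.uniform(-1.0, 1.0)                      # trial
        samples.append((x, environment(x, noise, rng)))  # observation
        best = min(HYPOTHESES, key=lambda w: evaluate(w, samples))  # fine-tuning
    return best


if __name__ == "__main__":
    print(learn())
```

The agent can only ever return one of the four slopes in HYPOTHESES. If the true relation were not in the list, the agent would still return the closest candidate, which illustrates how the designer’s choice of hypothesis space bounds what the machine can learn.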

The learning process may look like a positivist framework for machines, where the source of knowledge is the observation, and the agent systematically tries a hypothesis. However, the current learning techniques for weak intelligence incur prior bias from the human designers and researchers. Such bias is visible in the modelling framework for the environment, the training methods, and the data used in the learning. Hence, we may doubt the machine’s ability to retrieve knowledge from the environment since it is subject to human bias.

A Biased Empiricism

The reality of biased knowledge leads us to a scenario where the machines have corrupted intelligence. At first glance, the machine replicates a scientific method to make decisions on the environment; observation, hypothesis testing, and experimentation are steps that the machine pursues towards an optimal behaviour. However, the computational capabilities of a machine are not yet comparable with the brain and human reasoning. Moreover, as a consequence of the immature human-machine communication through a mathematical language, we can expect irrational behaviour in the machine.

This stage of weak intelligence not only lacks self-awareness; it also depends on human expertise. Because instructing the machine requires manipulating abstract mathematical objects, the scenario where humans cannot give the machine an accurate instruction is more common than expected [7]. Indeed, transparency is still an open question in the machine learning community, since researchers and end users are not entirely aware of the machine’s internal processes.

[4] proposes a series of vulnerabilities that the optimization-based learning approach may have. For instance, the agent wants to self-improve by exploring the environment under the hypothesis space. As a result, the agent’s irrational behaviour leads to a narrow self-improving attempt that ignores factors not included in the hypothesis space. Furthermore, the agent’s actions tend towards preserving the objective without any ethical judgment about it.

Similarly, this objective may lead to inadequate actions, such as "laziness," that contribute to unwanted situations of self-harm and environmental damage. Likewise, the agent’s resource management becomes a critical metric to consider in the learning process. Unless the human explicitly tells the machine to take care of the resources, it only pays attention to the optimization objective. Thus, the machine may choose actions as if unlimited resources were available, which is an unrealistic assumption.

To overcome those vulnerabilities, the research community may consider ideas like constant supervision of the agent, where a human or another agent follows the machine’s decisions and evaluates those actions with some external metric different from the objective function. Another strategy consists of modifying the objective to explicitly consider some notion of risk, such as resource usage, so that the agent becomes aware of possible unwanted actions. Additionally, the agent may seek experimentation methods in the hypothesis space where the trial and error approach follows some heuristics to minimize the errors. However, mistakes by an agent with no self-awareness are inevitable, so we may think of an unbreakable environment where the agent can make as many mistakes as it needs. Likewise, we can modify this environment to prepare the agent for worst-case scenarios by deliberately setting up conditions that are undesirable for the agent [8]. Of course, there may be cases where the agent cannot compute a suitable decision for those scenarios due to the lack of a sufficiently rich hypothesis space.
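
The risk-aware objective mentioned above can be sketched with a toy example. It is an illustration under our own assumptions and not a method from [8]: production is one paperclip per unit of resource, and the risk term is a quadratic penalty on resource usage with a hypothetical weight of 0.1.

```python
def paperclips(resources_used: float) -> float:
    """Hypothetical production model: one paperclip per unit of resource."""
    return float(resources_used)


def risk_aware_objective(resources_used: float,
                         penalty_weight: float = 0.1) -> float:
    """Paperclips produced minus a quadratic penalty on resource usage."""
    return paperclips(resources_used) - penalty_weight * resources_used ** 2


def best_usage(candidates, objective):
    """Return the candidate resource usage that maximizes the objective."""
    return max(candidates, key=objective)


if __name__ == "__main__":
    candidates = range(0, 101)  # the agent may use 0 to 100 resource units
    print(best_usage(candidates, paperclips))            # naive objective
    print(best_usage(candidates, risk_aware_objective))  # risk-aware objective
```

With the naive objective, the agent uses every available unit. With the penalty, the optimum moves to a moderate usage. The penalty weight is still a human design choice, so this strategy reduces the risk without removing the designer’s bias.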

As we can see from the machine learning process, (weak) intelligent agents follow knowledge-gaining methods similar to those of humans. Those methods take observation and experimentation via trial and error as a basis. However, the lack of rationality and self-awareness prevents the agent from having an independent learning process. Hence, the machine’s decisions carry a bias from the researcher in charge of designing it.

Discussion

This essay discusses the possibility of a paperclip maximizer in the current early stage of machine learning research. This issue comes from vulnerabilities in weak intelligence caused by the lack of self-awareness and by bias from design preferences. Immature weak intelligence may raise several concerns among the research community and policymakers. So, we must develop strategies to overcome the imminent consequences of a machine that is given relevant tasks but executes them unacceptably.

At this point, we must ask who is responsible for the machine’s behaviour. On the one hand, the researchers induce a bias in the agent’s decisions. On the other hand, the policymakers accept such behaviour for usage in society. Also, society takes advantage of the agent to pursue some activity. Perhaps we are looking at a shared responsibility for machine learning developments.

Even though the learning process of machines is still immature and dependent on human expertise, we may see similarities in how machines and humans acquire knowledge through empiricism. Given current technology, a strong intelligence is still an improbable event in the near future. However, as humanity, we should prepare for a situation where humans and machines cooperate on a common goal.

In the meantime, we coexist with (weak) intelligent agents in some critical applications, like healthcare and transportation. Hence, the research community must develop security layers to prevent potential danger. Constant supervision, risk-aware objectives, and trial-without-error experimentation are some possible approaches to tackle this issue.

We find that positivist ideas for knowledge creation based on observation require a rational agent, which an agent with weak intelligence is not. Furthermore, mathematical language can communicate a message through a channel with well-established rules. But if humans transmit an imprecise message to the machine, the learning process results in faulty actions with unwanted consequences. So, the paperclip maximizer seems like a plausible possibility in the current machine learning era due to the irrationality of (weak) intelligent agents.

References

[1] N. Bostrom, “The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents,” Minds & Machines, vol. 22, no. 2, pp. 71–85, May 2012, doi: 10.1007/s11023-012-9281-3.

[2] S. J. Russell, P. Norvig, and E. Davis, Artificial intelligence: a modern approach, 3rd ed. Upper Saddle River: Prentice Hall, 2010.

[3] G. Dulac-Arnold, D. Mankowitz, and T. Hester, “Challenges of Real-World Reinforcement Learning,” arXiv:1904.12901 [cs, stat], Apr. 2019, Accessed: Aug. 18, 2020. [Online]. Available: http://arxiv.org/abs/1904.12901.

[4] S. Omohundro, “The basic AI drives,” in Frontiers in Artificial Intelligence and Applications, 2008, vol. 171, pp. 483–492, [Online]. Available: https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf.

[5] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, “Concrete Problems in AI Safety,” arXiv:1606.06565 [cs], Jul. 2016, Accessed: Dec. 29, 2020. [Online]. Available: http://arxiv.org/abs/1606.06565.

[6] R. S. Sutton and A. G. Barto, Reinforcement learning: an introduction, Second edition. Cambridge, Massachusetts: The MIT Press, 2018.

[7] P. de Blanc, “Ontological Crises in Artificial Agents’ Value Systems,” arXiv:1105.3821 [cs], May 2011, Accessed: Dec. 29, 2020. [Online]. Available: http://arxiv.org/abs/1105.3821.

[8] J. García and F. Fernández, “A comprehensive survey on safe reinforcement learning,” Journal of Machine Learning Research, vol. 16, no. 1, pp. 1437–1480, 2015.