Yahoo Web Search

Search results

  1. Jun 5, 2022 · Eliezer Yudkowsky. AGI Ruin: A List of Lethalities. Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Preamble: (If you're already familiar with all basics and don't want any preamble, skip ahead to Section B for technical difficulties of alignment proper.)

    • Overview
    • Why We Need to Solve the AI Alignment Problem
    • Why AI Alignment Is Hard
    • Alignment Techniques
    • The AI Safety Field

    Introduction

    This post is a summary of Eliezer Yudkowsky’s post “AGI Ruin: A List of Lethalities”. I wrote it because I think the original post is longer and less organized than I would like it to be. The purpose of this post is to summarize the main points in the original post and structure the points in a new layout that I hope will make it easier to quickly read and reference each point. The summary contains the following sections: 1. Overview 2. Why we need to solve the AI alignment problem 3. Why AI...

    An AI is aligned if it doesn’t cause an existential catastrophe

    When Eliezer says that AI alignment is lethally difficult, he is not talking about the challenge of creating perfect or ‘provable’ alignment. He argues that, with the incomplete alignment methods we have today, even an outcome in which any humans survive is unlikely.

    AI can be much smarter than humans

    AI can be much smarter than humans and use information much more efficiently than humans when making decisions or forming beliefs about the world. For example, AlphaZero learned to be superhuman at Go in only a few days.

    AI alone could be very dangerous for humans

    A misaligned AI smarter than humans could cause human extinction. The AI would not need a robot body to be dangerous because it could convince one or more humans to carry out its plans or hack into human infrastructure to make use of it. The AI could invent dangerous technologies such as advanced nanotechnology that we would not be able to defend against.

    We can’t just decide not to build AGI

    There is a lot of computer hardware in the world, such as GPUs, and many people have access to it. AI software and hardware are continually improving, and completely halting progress would require everyone in the world to agree to stop. Many actors are working on AGI research, and even if one or more of them refrain from making progress, the other actors can still go on to create AGI. If many organizations decided to...

    Our time is limited and we will probably only get one chance

    Making AGI safe will probably be very difficult because:

    1. We have limited time to solve the problem: we probably need to solve the AI alignment problem before AGI is created, and the amount of time we have is unknown.
    2. We will probably only get one chance to solve the AI alignment problem: problems with sub-AGI systems can be noticed and patched, but it may not be possible to program an AGI multiple times because a misaligned AGI could kill you before you can take further action. The AI al...

    Human feedback would not work on an AGI

    If a weak AI produces a harmful output, the output can be labeled as negative and the AI can learn not to produce it again. However, this technique would not work on an AGI because it could be powerful enough to produce an output that kills its operators before they can give feedback on the output.
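
    To make the feedback loop described above concrete, here is a minimal toy sketch (mine, not code from the original post): the "AI" is just a softmax policy over a few canned outputs, and a hypothetical human labeler scores each sampled output as harmful or not. A REINFORCE-style update then pushes probability away from negatively labeled outputs. The section's point is that this loop only works if the operators survive long enough to supply the label.

    ```python
    # Toy sketch (illustration only, not from the original post).
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical canned outputs and the labels a human reviewer would assign.
    outputs = ["helpful answer", "refusal", "harmful plan"]
    harmful = {"harmful plan"}

    logits = np.zeros(len(outputs))  # the toy policy's parameters
    lr = 0.5

    def policy():
        p = np.exp(logits - logits.max())
        return p / p.sum()

    for step in range(200):
        p = policy()
        i = rng.choice(len(outputs), p=p)
        reward = -1.0 if outputs[i] in harmful else 1.0  # human feedback signal
        grad = -p
        grad[i] += 1.0                # gradient of log p[i] w.r.t. the logits
        logits += lr * reward * grad  # reinforce good outputs, suppress bad ones

    print(dict(zip(outputs, policy().round(3))))
    # "harmful plan" ends up with near-zero probability -- but only because a
    # labeler was there to score every sampled output after the fact.
    ```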

    Scalable alignment is hard

    Instead, we might need to come up with alignment solutions that work for weak AI and generalize to superhuman AI. The problem is that progressing from subhuman to superhuman intelligence is a big distributional shift that may break many alignment techniques. Some weaknesses in an alignment solution may not be visible at low levels of intelligence and only materialize once the AI achieves superintelligence. Therefore, an AI could behave safely at first but only become dangerous after reaching...
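
    As a toy illustration of why a flaw can stay invisible until capability increases (my sketch, not an example from the original post), consider a proxy objective that agrees with the true objective on the inputs a weak optimizer can reach, but diverges on the inputs a strong optimizer can reach. "Capability" is crudely modeled here as the radius of the search space, an assumption made purely for illustration.

    ```python
    # Toy sketch (illustration only, not from the original post).
    import numpy as np

    rng = np.random.default_rng(1)

    def true_value(x):
        return -x**2                # what we actually want: stay near zero

    def proxy_value(x):
        return -x**2 + 0.05 * x**4  # matches near zero, rewards extreme x

    def optimize(search_radius, n_candidates=10_000):
        """Pick the candidate the proxy likes best; the optimizer never sees true_value."""
        xs = rng.uniform(-search_radius, search_radius, n_candidates)
        best = xs[np.argmax(proxy_value(xs))]
        return best, true_value(best)

    for radius in (2.0, 10.0):
        x, v = optimize(radius)
        print(f"search radius {radius:>4}: chose x = {x:+.2f}, true value = {v:+.1f}")
    # radius 2.0: the chosen x sits near 0, so the proxy looks perfectly aligned.
    # radius 10.0: the optimizer finds the extreme x the proxy loves and the true
    # objective collapses -- the flaw was there all along, just not yet visible.
    ```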

    Two alignment techniques

    1. CEV sovereign: build a sovereign AI that is programmed to carry out our coherent extrapolated volition (CEV). This is unlikely to work because our values are complex and it is unlikely that we will encode them successfully into the AI in one attempt.
    2. Corrigible AI: build an AI that doesn’t do exactly what we want but is corrigible, so that we can switch it off. The problem with this approach is that corrigibility conflicts with the instrumentally convergent goal of self-preservation. Yudkowsky...

    Problems with interpretability

    1. Currently, we don’t know what AIs are thinking, and we don’t know how to inspect AIs for dangerous motives.
    2. Even if we created a misaligned AI and had interpretability tools advanced enough to know what it was thinking, we would still not know how to create an aligned AI.
    3. Optimizing against visible unaligned thoughts optimizes for aligned thoughts, but also for the ability to hide unaligned thoughts or otherwise reduce interpretability.
    4. We might not be able to evaluate the thoughts or plans of an...

    Bright-eyed youngsters and cynical old veterans

    Often bright-eyed youngsters go into a field with optimism and then learn that the problem they hoped to solve is harder than they thought. They become cynical old veterans who warn newcomers about the difficulty of the problem. The field of AGI may never have cynical old veterans, because a bright-eyed youngster who creates a misaligned AGI will die before learning from the mistake and becoming a cynical old veteran. Therefore, AGI researchers may always be unrealistically optimistic. The sol...

    AI safety researchers are not making real progress

    Most researchers are working on the kinds of problems that are easy enough to make progress on, rather than on difficult problems where they might fail. The field is not making real progress, and there is no method for determining whether the research is actually reducing AI risk.

    Alignment mindset

    Yudkowsky is able to notice serious problems associated with AI alignment solutions but does not know how he does this or how to train other people to have this mindset. The security mindset is similar and can be taught.

  2. Jun 10, 2022 · Eliezer Yudkowsky | Analysis. Preamble: (If you’re already familiar with all basics and don’t want any preamble, skip ahead to Section B for technical difficulties of alignment proper.) I have several times failed to write up a well-organized list of reasons why AGI will kill you.

  3. Mar 29, 2023 · Pausing AI Developments Isn’t Enough. We Need to Shut it All Down. 11 minute read. By Eliezer Yudkowsky. March 29, 2023 6:01 PM...

  4. Decision theorist Eliezer Yudkowsky has a simple message: superintelligent AI could probably kill us all. So the question becomes: Is it possible to build powerful artificial minds that are obedient, even benevolent?

  5. Mar 30, 2023 · Eliezer Yudkowsky is a researcher, writer, and philosopher on the topic of superintelligent AI. AGI Ruin (blog post): https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities.

  6. Feb 22, 2024 · This scenario is outlined by Yudkowsky on multiple occasions, most prominently in his post on the LessWrong forum, AGI Ruin: A List of Lethalities. Additional discussions of this scenario appear on the Lex Fridman Podcast #368, The Logan Bartlett Show, EconTalk, and Bankless podcasts, along with an archived livestream with the Center for ...
