【OpenAI元従業員が語るAGIのリスクと安全対策の必要性】英語解説を日本語で読む【2024年6月8日｜@TheAIGRID】

2024年6月9日 15:39

AIの安全性に関する懸念が高まっている現在、多くの人々がAGI（人工汎用知能）の登場を予測し、そのリスクについて議論しています。OpenAIの元従業員や現従業員、そして他のAIリーダーたちが署名した公開書簡が「righttowarn.ai」に掲載されており、これはAI技術の利点とともに、それが引き起こす可能性のある深刻なリスクについて述べています。この書簡には、AIが既存の不平等をさらに強化し、操作や誤情報の拡散などのリスクをもたらすことへの懸念が記されています。また、AIシステムの制御が失われた場合には人類の絶滅につながる可能性があるとも警告しています。これらのリスクに対処するためには、科学コミュニティ、政策立案者、そして一般市民のガイダンスが必要であると強調しています。一方、OpenAIの元従業員であるレオポルド・アッシェンブレンナーやダニエル・ココタジロもAI安全性についての懸念を表明しています。アッシェンブレンナーは、今後数年間でAGIが登場し、それが国家安全保障や地政学的な対立を引き起こす可能性があると警告しています。彼は、AI研究を自動化するAGIシステムが出現することで、研究の進展が急速に加速し、人間レベルを超えた超知能に至ると予測しています。ココタジロも、OpenAIが安全性研究に十分な投資を行わないことを批判しており、企業が自身の利益のために安全性を軽視していると述べています。
公開日：2024年6月8日
※動画を再生してから読むのがオススメです。

He is saying, by the end of this decade, we will have AGI.

彼は、この10年の終わりまでに私たちはAGI（人工汎用知能）を手にするだろうと言っています。

It seems like AI safety is on everybody's mind lately.

最近は、誰もがAIの安全性を気にしているようです。

Now that OpenAI basically dissolved all of their non-disparagement clauses with former employees, more people are coming out and talking about how worried they are.

OpenAIは元従業員との非中傷条項を基本的に解消したため、より多くの人々が出てきて、自分たちがどれだけ心配しているかについて話しています。

That's what we're going to talk about today.

それが今日話すことです。

First, I want to go over this letter at righttowarn.ai.

まず、righttowarn.aiにあるこの書簡を確認したいと思います。

This is an open letter signed by multiple OpenAI current employees and former employees and other AI leaders talking about the state of AI and the state of AI safety.

これは、複数のOpenAIの現在の従業員や元従業員、他のAIリーダーが署名したオープンレターで、AIの状態とAIの安全性について話しています。

We are current and former employees at frontier AI companies, and we believe in the potential of AI technology to deliver unprecedented benefits to humanity.

私たちは、フロンティアAI企業の現在の従業員や元従業員であり、AI技術が人類に前例のない利益をもたらす可能性を信じています。

We also understand the serious risks posed by these technologies.

また、これらの技術が引き起こす深刻なリスクを理解しています。

These risks range from the further entrenchment of existing inequalities to manipulation and misinformation.

これらのリスクは、既存の不平等のさらなる定着から操作や誤情報までさまざまです。

These two things are what I am most worried about.

これら2つが私が最も心配していることです。

To the loss of control of autonomous AI systems, potentially resulting in human extinction.

自律型AIシステムの制御を失うこと、それが人類の絶滅につながる可能性があります。

Very, very scary, but something that I don't believe is going to happen.

非常に、非常に恐ろしいですが、私はそれが起こるとは信じていません。

That's just my core belief.

それが私の根本的な信念です。

AI companies themselves have acknowledged these risks, as have governments across the world and other AI experts.

AI企業自体がこれらのリスクを認めており、世界中の政府や他のAI専門家も同様です。

They give a bunch of references where they actually point out times where companies, governments, and AI experts have talked about the risks of AI.

彼らは、企業、政府、AI専門家がAIのリスクについて話した時点を実際に指摘している参照文献をたくさん提供しています。

We are hopeful that these risks can be adequately mitigated with sufficient guidance from the scientific community, policymakers, and the public.

私たちは、これらのリスクが科学コミュニティ、政策立案者、および一般市民からの十分な指導によって適切に緩和されることを期待しています。

However, AI companies have strong financial incentives to avoid effective oversight.

ただし、AI企業には効果的な監督を回避するための強力な財務的インセンティブがあります。

That's really what we're seeing today.

それが今日私たちが見ている実際の状況です。

A lot of people, including Ilya who just left, including Jan who just left as the head of AI super alignment at OpenAI are basically saying companies are optimizing for increases in quality and other productization of AI instead of AI safety.

退社したばかりのイリヤや、OpenAIのAIスーパーアライメントの責任者を退いたばかりのヤンを含む多くの人々が、企業はAIの安全性ではなく、AIの品質向上やその他の製品化に向けて最適化している、と基本的に言っています。

We do not believe bespoke structures of corporate governance are sufficient to change this.

私たちは、各社が独自に設計した企業統治の仕組みだけでは、これを変えるのに十分ではないと考えています。

AI companies possess substantial non-public information about the capabilities and limitations of their systems, the adequacy of their protective measures, and the risk levels of different kinds of harm.

AI企業は、自社システムの能力や限界、保護対策の適切さ、さまざまな被害のリスクレベルについて、重要な非公開情報を持っています。

However, they currently have only weak obligations to share some of this information with governments and none with civil society.

しかしながら、彼らは現在、この情報の一部を政府と共有する義務が弱く、市民社会とはまったく共有する義務がありません。

Another reason to be very bullish and very proactive in the open source AI community, because if it's all out there, then we can talk about it.

オープンソースのAIコミュニティで非常に前向きで積極的であるべきもう1つの理由は、すべてが公開されている場合、私たちはそれについて話すことができるからです。

We can examine it.

私たちはそれを調査することができます。

We can dive into the details together.

一緒に詳細に突っ込むことができます。

It just allows for a much more hardened AI infrastructure.

それは、はるかに強固なAIインフラを可能にします。

We do not think they can all be relied upon to share it voluntarily.

私たちは、すべての企業が自発的にそれを共有してくれると当てにできるとは考えていません。

The letter goes on to talk about some of the confidentiality agreements put in place by OpenAI and other leading frontier AI companies.

その手紙は、OpenAIや他の先駆的なAI企業が設けたいくつかの機密保持契約についても言及しています。

So long as there is no effective government oversight of these corporations, and it's tough to have government oversight without also having regulatory capture, because who within the government knows better about AI than the AI companies themselves?

これらの企業に対する効果的な政府の監督がない限り、とあります。ただ、規制の虜（regulatory capture）を伴わずに政府が監督を行うのは難しいことです。政府の中に、AI企業自身よりもAIに詳しい人などいるでしょうか？

The government is going to lean on AI companies to inform their decisions.

政府は、意思決定の判断材料をAI企業に頼ることになるでしょう。

AI companies are incentivized to make it as difficult as possible for upstart AI companies to compete because that is how incentives work.

AI企業は、インセンティブが働く仕組みであるため、新興AI企業が競争するのをできるだけ困難にするように動機づけされています。

Current and former employees are among the few people who can hold them accountable to the public.

現在の従業員や元従業員は、彼らを一般市民に対して責任を負わせることができる数少ない人々の中にいます。

Yet broad confidentiality agreements block us from voicing our concerns, except to the very companies that may be failing to address these concerns.

しかし、広範な機密保持契約により、私たちは、まさにこれらの懸念に対処できていないかもしれない当の企業に対して以外、懸念を表明することができません。

Ordinary whistleblower protections are insufficient because they focus on illegal activity, whereas many of the risks we are concerned about are not yet regulated.

通常の告発者保護は不十分です。なぜなら、それらは違法行為に焦点を当てており、私たちが懸念しているリスクの多くはまだ規制されていないからです。

That is an interesting concept.

それは興味深い概念です。

None of this stuff is illegal.

これらのものはいずれも違法ではありません。

Thus, they are not covered under whistleblower status, but who knows if they should actually be illegal or not because this is all very new stuff.

したがって、これらは告発者の地位の対象外ですが、これがすべて非常に新しいものであるため、実際に違法であるべきかどうかは誰もわかりません。

Some of us reasonably fear various forms of retaliation given the history of such cases across the industry.

業界全体でのそのような事例の歴史を考えると、私たちの中には、さまざまな形の報復を恐れる人がいるのも無理はありません。

We therefore call upon advanced AI companies to commit to these principles, that the company will not enter into or enforce any agreement that prohibits disparagement or criticism of the company for risk-related concerns nor retaliate for risk-related criticism by hindering any vested economic benefit.

したがって、私たちは先進的なAI企業にこれらの原則を守るよう求めます。つまり、企業がリスクに関連する懸念についての中傷や批判を禁止する契約を締結または執行しないこと、また、リスクに関連する批判への報復として、権利が確定した経済的利益を妨げないことです。

That is a direct statement to OpenAI.

OpenAIに対する直接的な声明です。

This is exactly what OpenAI had in their documentation upon an employee leaving the company.

これは、従業員が会社を去る際にOpenAIの文書に記載されていた内容そのままです。

But they did recently say they're going to waive all of it.

しかし、最近彼らはそれら全てを免除すると言っていました。

They are getting better, but this letter really directly addresses the entire industry.

彼らは改善していますが、この手紙は本当に業界全体に直接的に対処しています。

If we want to say something after we leave a company about the risk profile of the company with respect to AI, we should be able to do that.

会社を去った後に、AIに関連した会社のリスクプロファイルについて何か言いたい場合、それをすることができるべきです。

I agree, they should be able to.

同意します、それをすることができるべきです。

Obviously, they shouldn't be able to share trade secrets.

もちろん、彼らが企業秘密を共有できるべきではありません。

That makes a lot of sense because these trade secrets are incredibly valuable and they could just go to another company, although that happens anyways.

それは非常に理にかなっています、なぜならこれらの取引秘密は非常に価値があり、彼らは別の会社に行く可能性があるからです、たとえそれがどうせ起こることだとしても。

But the fact that they can't criticize the company's risk-related activities doesn't make sense to me.

しかし、会社のリスク関連活動を批判することができないという事実は私には理解できません。

Next, that the company will facilitate a verifiably anonymous process for current and former employees to raise risk-related concerns to the company's board, to regulators, and to an appropriate independent organization with relevant experience.

次に、現従業員および元従業員が、リスクに関連する懸念を会社の取締役会、規制当局、関連する経験を持つ適切な独立組織に提起できるよう、匿名性を検証できるプロセスを会社が整備すること。

That's great.

それは素晴らしいです。

I don't know if companies are going to agree to that, though.

しかし、企業がそれに同意するかどうかはわかりません。

That's basically asking them to invest their resources into something that they're going to view as harming them.

基本的には、彼らにとって自分たちを害すると見なすものにリソースを投資するよう求めていることです。

I don't agree with that, by the way.

私はそれに同意しません、ところで。

But whether they're going to be willing to do it is something else.

しかし、それを行う意志があるかどうかは別の問題です。

I think number one is probably much more likely.

私は、1番目の原則のほうがおそらくずっと実現の可能性が高いと思います。

Next, that the company will support a culture of open criticism and allow its current and former employees to raise risk-related concerns about its technologies to the public, to the company's board, to regulators, or to an appropriate independent organization with relevant experience, so long as trade secrets and other intellectual property interests are appropriately protected.

次に、企業秘密やその他の知的財産上の利益が適切に保護される限りにおいて、会社がオープンな批判の文化を支援し、現従業員および元従業員が自社技術に関するリスク上の懸念を、一般の人々、会社の取締役会、規制当局、または関連する経験を持つ適切な独立組織に提起できるようにすること。

This is good, but this seems just very hand-wavy to me.

これは良いですが、私には非常に曖昧に思えます。

There's no way to actually measure or enforce this.

これを実際に測定または強制する方法はありません。

Even if they said, yeah, we're going to allow it, the culture can still be if somebody goes out and criticizes the AI safety of OpenAI.

たとえ彼らが「はい、許可します」と言ったとしても、誰かが外に出てOpenAIのAIの安全性を批判した場合、社内の文化は依然として次のようなものでありえます。

In the subconscious of each employee, the manager, the board, the directors, they're still going to know that this person went behind their backs and did something that kind of hurt the company, and they're going to think negatively about that person.

各従業員、マネージャー、取締役会、取締役の潜在意識の中では、この人物が裏で何かしら会社に害を及ぼす行動をしたことをまだ知っているだろうし、その人に対して否定的に考えるだろう。

This is a pretty weak point in my mind, although I agree with it in principle.

原則としては同意するが、私の考えではかなり弱い点だと思う。

Next and last, that the company will not retaliate against current and former employees who publicly share risk-related confidential information after other processes have failed.

次に、そして最後に、他の手続きが失敗した後で現従業員や元従業員がリスクに関連する機密情報を公に共有しても、会社は報復しないこと。

It continues, we accept that any effort to report risk-related concerns should avoid releasing confidential information unnecessarily.

続けて、リスクに関連する懸念を報告するためのあらゆる取り組みは、不必要に機密情報を公開しないようにするべきだと受け入れる。

Therefore, once an adequate process for anonymously raising concerns to the company's board, to regulators, and to an appropriate independent organization with relevant expertise exists, we accept that concerns should be raised through such a process initially.

したがって、適切なプロセスが会社の取締役会、規制当局、関連する専門機関に匿名で懸念を提起するために存在する場合、そのようなプロセスを通じて懸念が初めに提起されるべきだと受け入れる。

However, as long as such a process does not exist, current and former employees should retain their freedom to report their concerns to the public.

ただし、そのようなプロセスが存在しない限り、現従業員および元従業員は懸念を公に報告する自由を保持すべきです。

Most of these sound exactly the same to me.

これらのほとんどは私には全く同じように聞こえる。

In principle, I agree.

原則としては同意する。

I'm not sure how they're going to get these companies to agree with it.

これらの企業をどのようにしてこれに同意させるかはよくわからない。

The only one that is kind of cut and dry, very obvious and very easy, is that the company will not enter into or enforce any agreement that prohibits disparagement for risk-related concerns.

唯一、かなり明確で明白で簡単なのは、会社がリスクに関連する懸念に対する中傷を禁止する契約に入ることやその契約を履行することをしないことだ。

That one seems obvious.

その点は明らかだ。

Companies should do that.

企業はそうすべきだ。

It doesn't really cost them anything.

それは実際には企業には何もコストがかからない。

There is a kind of level of commitment to safety that a company is going to have to take, but that's a good thing.

企業が取る必要がある安全への取り組みのレベルがあるが、それは良いことだ。

We have the people who signed.

署名した人々がいる。

We have Jacob Hilton, formerly OpenAI.

ジェイコブ・ヒルトン、以前はOpenAI。

That's one name that I recognize.

私が認識する名前の1つだ。

William Saunders, I recognize, formerly OpenAI, and a bunch of former and current OpenAI anonymous employees.

ウィリアム・サンダース、私が認識する、以前はOpenAI、およびいくつかの元および現在のOpenAI匿名従業員。

We also have Yoshua Bengio, who, as we see here, Canadian computer scientist, most noted for his work on artificial neural networks.

また、ここで見るように、カナダのコンピューターサイエンティストであり、人工ニューラルネットワークに関する研究で最も知られているヨシュア・ベンジオもいる。

Very, very prominent in the AI research field.

AI研究分野で非常に著名だ。

We have Geoffrey Hinton, who is also known as the godfather of AI, a British-Canadian computer scientist, and Stuart Russell, who is another British computer scientist.

AIのゴッドファーザーとしても知られる、イギリス系カナダ人のコンピューターサイエンティストであるジェフリー・ヒントンと、同じくイギリスのコンピューターサイエンティストであるスチュアート・ラッセルもいます。

These are all very well-versed, well-experienced people to be talking about this subject.

これらは、この話題について話すのに非常に精通し、経験豊富な人々です。

That came out on June 4th, which, as of recording, was yesterday.

それは6月4日に出たもので、録音時点では昨日でした。

We have Leopold Aschenbrenner, I hope I'm pronouncing this right, who just did an interview on Dwarkesh's podcast, where he talks about a bunch of things, including a trillion-dollar cluster, scaling AGI, the CCP espionage, leaving OpenAI and starting an AGI investment firm, dangers of outsourcing clusters to the Middle East, and so on.

次に、レオポルド・アッシェンブレンナーです。発音が合っているといいのですが。彼はつい最近Dwarkeshのポッドキャストでインタビューを受け、1兆ドル規模のクラスター、AGIへのスケーリング、CCPのスパイ活動、OpenAIを離れてAGI投資会社を立ち上げたこと、中東にクラスターを外部委託する危険性など、多くのことについて話しています。

Here is his LinkedIn profile.

こちらが彼のLinkedInプロフィールです。

Columbia University valedictorian, worked at OpenAI for just about a year and a half.

コロンビア大学を卒業生総代（首席）で卒業し、OpenAIで約1年半働いていました。

He left because of AI safety concerns as well.

彼はAIの安全性の懸念から辞めました。

He actually wrote a pretty in-depth paper about AI, AGI, AI safety, etc.

実際、AI、AGI、AIの安全性などについてかなり詳細な論文を書いています。

I'll touch on a couple things from this.

私はこれからいくつか触れます。

Over the past year, the talk of the town, and they're talking about San Francisco, the cradle of innovation at this point, the talk of the town has shifted from $10 billion compute clusters to $100 billion to trillion-dollar compute clusters.

この1年で、街の話題（ここで言う街とはサンフランシスコ、現時点でのイノベーションの揺りかごのことです）は、100億ドルの計算クラスターから1000億ドル、さらには1兆ドルの計算クラスターへと移りました。

Every six months, another zero is added.

半年ごとに、ゼロが追加されています。

He's talking about the scaling now.

彼は今、スケーリングについて話しています。

By the end of the decade, American electricity production will have grown tens of percent from the shale fields of Pennsylvania to the solar farms of Nevada.

この10年の終わりまでに、アメリカの電力生産は、ペンシルベニア州のシェールフィールドからネバダ州の太陽光発電所まで、何十パーセントも成長するでしょう。

Hundreds of millions of GPUs will hum.

何億ものGPUが鳴り響くでしょう。

The AGI race has begun.

AGIレースが始まりました。

He says by 2025, 2026, these machines will outpace many college graduates.

彼は2025年、2026年までに、これらの機械が多くの大学卒業生を凌駕するだろうと言っています。

By the end of the decade, they will be smarter than you or I. We will have super intelligence.

この10年の終わりまでに、彼らはあなたや私よりも賢くなるでしょう。私たちは超知能を持つことになります。

He is saying by the end of this decade, we will have AGI.

彼は、この10年の終わりまでに、私たちがAGIを持つことになると言っています。

Along the way, national security forces not seen in half a century will be unleashed.

その過程で、半世紀にわたって見られなかった国家安全保障上の力が解き放たれるでしょう。

Before long, The Project will be on.

間もなく、「ザ・プロジェクト」が始動するでしょう。

If we're lucky, we'll be in an all-out race with the CCP.

もし運が良ければ、私たちはCCPとの全面的な競争に入ることになります。

That's China's government.

それは中国政府のことです。

If we're unlucky, an all-out war.

もし運が悪ければ、全面戦争になるかもしれません。

He seems to be extremely concerned with CCP and their espionage of US AI labs.

彼は、CCPと、米国のAI研究所に対する彼らのスパイ活動を非常に懸念しているようです。

The CCP has been known to use espionage against US corporations.

CCPはUS企業に対してスパイ活動を行っていることで知られています。

I don't think his concern is unwarranted.

彼の懸念は根拠のないものではないと思います。

Here he goes on to say, NVIDIA analysts still think 2024 might be close to the peak.

ここで彼は、NVIDIAのアナリストたちは2024年がピークに近いと考えていると述べています。

Mainstream pundits are stuck on the willful blindness of it's just predicting the next word.

メインストリームの専門家たちは、次の単語を予測するだけだという意図的な盲目性に固執しています。

That's how transformers work currently.

それが現在のトランスフォーマーの動作方法です。

They see only hype and business as usual.

彼らには、誇大宣伝と、いつもどおりのビジネスしか見えていません。

At most, they entertain another internet-scale technological change.

せいぜい、彼らは別のインターネット規模の技術変革を考えています。

Before long, the world will wake up.

間もなく、世界は目を覚ますでしょう。

But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs that have situational awareness.

しかし今のところ、状況認識を持っている人はおそらく数百人で、そのほとんどはサンフランシスコやAI研究所にいます。

Basically an awareness of what's coming.

基本的には、何が来るかを認識していることです。

If you're watching these videos, you probably have an idea as well.

こうしたビデオを見ているなら、おそらくあなたにもおおよその見当がついているでしょう。

I'm just going to touch on the summaries of his very deep paper.

私は彼の非常に深い論文の要約に触れるだけです。

But I recommend you read it in depth.

しかし、あなたにはそれを詳しく読むことをお勧めします。

You can find it at situational-awareness.ai.

それはsituational-awareness.aiで見つけることができます。

Let's just go over what this very, very smart person in AI thinks.

ここでは、AIの非常に、非常に賢い人がどう考えているかを見てみましょう。

From GPT-4 to AGI counting the OOMs, orders of magnitude.

GPT-4からAGIへ：OOM（orders of magnitude、桁数）を数える。

AGI by 2027 is strikingly plausible.

2027年までにAGIが実現するというのは、驚くほど現実味があります。

GPT-2 to GPT-4 took us from preschooler to smart high schooler abilities in four years.

四年間で、GPT-2からGPT-4への進化により、幼稚園児から賢い高校生の能力を身につけました。

Tracing trend lines in compute, 0.4 orders of magnitude per year, algorithmic efficiencies, and unhobbling gains from chatbot to agent.

計算資源（年間0.4桁）、アルゴリズム効率、そしてチャットボットからエージェントへの「アンホブリング」（足かせを外すこと）による向上、それぞれのトレンドラインをたどると、

We should expect another preschooler to high schooler-sized qualitative jump by 2027.

2027年までに、もう1つの幼児から高校生サイズの質的な飛躍を期待すべきです。
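
The "orders of magnitude" arithmetic above can be made concrete with a minimal sketch. It assumes only the 0.4 OOMs/year compute figure quoted in the transcript; the function name `growth_factor` and the four-year window are illustrative choices, not numbers taken from the paper.

```python
def growth_factor(ooms_per_year: float, years: float) -> float:
    """Convert a steady rate in orders of magnitude (OOMs) per year
    into the total multiplicative factor accumulated over `years`."""
    return 10 ** (ooms_per_year * years)


# 0.4 OOMs/year is the compute trend quoted in the transcript.
# The 4-year window is an assumption made here purely for illustration.
compute_factor = growth_factor(0.4, 4)
print(f"compute grows about {compute_factor:.0f}x over 4 years")

# "Every six months, another zero is added" corresponds to 2 OOMs/year,
# which is a factor of 100 in one year (e.g. $10 billion to $1 trillion).
cluster_factor = growth_factor(2, 1)
print(f"cluster cost grows {cluster_factor:.0f}x in 1 year")
```

One OOM is a factor of ten, so rates in OOMs per year add in the exponent while the underlying quantity multiplies. That is why seemingly small per-year figures compound into very large factors over a few years.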

Here we can see the increase in performance for different tests against AI.

ここでは、異なるテストに対するパフォーマンスの向上が見られます。

As we can see, so back in 1998, it took a long time to have these little increases.

私たちが見ているように、1998年当時はこれらのわずかな増加には長い時間がかかりました。

But then all of a sudden, starting at the end of the 2010s and especially accelerating into the last couple years, we're seeing basically vertical lines, increases that are tremendous.

しかし、突然、2010年代末から特に過去数年間に加速し始め、基本的に垂直線を描く、膨大な増加が見られます。

He goes on to say from AGI to superintelligence, the intelligence explosion.

彼はAGIから超知能へと、知能爆発について述べています。

AI progress won't stop at human level.

AIの進歩は人間のレベルで止まらないでしょう。

Hundreds of millions of AGIs could automate AI research, compressing a decade of algorithmic progress into less than a year.

何億ものAGIがAI研究を自動化し、10年分のアルゴリズムの進歩を1年未満に圧縮する可能性があります。

We would rapidly go from human level to vastly superhuman AI systems.

私たちは急速に人間レベルからはるかに超人的なAIシステムに移行するでしょう。

The power and the peril of superintelligence would be dramatic.

超知能の力と危険は劇的であろう。

He is saying beyond AGI, we're going to have superintelligence systems, potentially hundreds or thousands of them, that are doing the AI research themselves, compounding the output of that research, compounding the intelligence that they have.

彼は、AGIを超えて、AI研究を自ら行い、その研究の成果を複利化し、持っている知能を複利化する可能性がある、数百または数千の超知能システムが登場するだろうと述べています。

He goes on to talk about the race to a trillion-dollar cluster.

彼は、1兆ドルのクラスターへの競争について話し続けます。

We know that there were rumors Sam Altman is raising $7 trillion for a cluster himself.

サム・アルトマンが自ら7兆ドルを調達してクラスターを作るという噂があることは知っています。

The most extraordinary techno capital acceleration has been set in motion.

最も驚くべきテクノキャピタルの加速が始まりました。

As AI revenue grows rapidly, many trillions of dollars will go into GPU, data center and power build out before the end of the decade.

AI収益が急速に成長するにつれ、この10年の終わりまでに何兆ドルもの資金がGPU、データセンター、電力の整備に投入されるでしょう。

The industrial mobilization, including growing US electricity production by tens of percent will be intense.

工業動員、米国の電力生産を数十パーセント増やすことも含めて、激しいものになるでしょう。

Lock down the labs, security for AGI.

研究所をロックダウンし、AGIのセキュリティを確保してください。

The nation's leading AI labs treat security as an afterthought.

国内の主要なAI研究所は、セキュリティを二の次にしています。

Currently, they're basically handing the key secrets for AGI to the CCP on a silver platter, securing the AGI secrets and weights against the state actor threat will be an immense effort and we're not on track.

現在、彼らは基本的に、AGIの重要な秘密を中国共産党にお膳立てして差し出しているようなものです。国家主体の脅威からAGIの秘密と重みを守るには膨大な努力が必要であり、私たちはその軌道に乗っていません。

Reading this, if I understand correctly, he does believe in closed source.

これを読む限り、私の理解が正しければ、彼はクローズドソースを支持しています。

He wants to lock down the weights.

彼は重みをロックしたいと思っています。

He doesn't want to share them openly.

彼はそれらを公開で共有したくないと思っています。

He thinks that they are too valuable and too dangerous.

彼はそれらがあまりにも貴重であり、危険すぎると考えています。

Next, super alignment.

次に、スーパーアライメントです。

Reliably controlling AI systems much smarter than we are is an unsolved technical problem.

私たちよりもはるかに賢いAIシステムを確実に制御することは、未解決の技術的問題です。

While it is a solvable problem, okay, so that's very hopeful.

「それは解決可能な問題ではあるものの」とあります。なるほど、これはとても希望が持てますね。

He thinks it's solvable.

彼はそれが解決可能だと考えています。

Things could easily go off the rails during a rapid intelligence explosion, which we are having right now.

急速な知性爆発中に簡単に事態が狂ってしまう可能性がありますが、それは今起こっていることです。

Managing this will be extremely tense.

これを管理するのは、極めて緊迫したものになるでしょう。

Failure could easily be catastrophic.

失敗は簡単に壊滅的になる可能性があります。

He is definitely in the category of AI doomer, but he has hope that we can achieve it.

彼は間違いなくAIドゥーマーのカテゴリーに属していますが、私たちがそれを達成できると希望を持っています。

Just not on the path we're on right now.

ただし、現在の進行方向ではありません。

I'm going to skip over the rest of these sections and I want to go on to another person who just left OpenAI.

これらのセクションの残りをスキップして、OpenAIを辞めた別の人物に進みたいと思います。

This is Daniel Kokotajlo.

これはダニエル・ココタジロです。

I hope I'm pronouncing that right.

私が正しく発音していることを願っています。

He just resigned from OpenAI.

彼はOpenAIを辞任しました。

He just posted this on June 4th.

彼はちょうど6月4日にこれを投稿しました。

This is after OpenAI said they're going to release all ex-employees from that disparagement clause.

これは、OpenAIが元従業員全員をあの非中傷条項から解放すると述べた後のことです。

Let's read it.

読みましょう。

Here is a tweet.

こちらがツイートです。

In April, I resigned from OpenAI after losing confidence that the company would behave responsibly in its attempt to build artificial general intelligence, AI systems that are generally smarter than humans.

4月、私はOpenAIを辞職しました。同社が汎用人工知能（AGI）、すなわち全般的に人間より賢いAIシステムを構築しようとするにあたり、責任ある行動を取るという確信を失ったからです。

I joined with the hope that we would invest much more in safety research as our systems became more capable.

私は、システムの能力が高まるにつれて、安全性の研究にもっと多くの投資をするようになると期待して入社しました。

But OpenAI never made this pivot.

しかし、OpenAIはこの方向転換を行いませんでした。

People started resigning when they realized this.

これを理解した人々が辞職し始めました。

I was not the first or last to do so.

私は最初でも最後でもありませんでした。

Another is Jan, who was the head of super alignment.

もう一人は、スーパーアライメントの責任者であるJanです。

He explicitly said, I was promised, I think he said, 20% of compute resources for super alignment research and they were not given it.

彼ははっきりと言いました。スーパーアライメント研究に、確か計算リソースの20%が約束されていたのに、それが与えられなかった、と。

When I left, I was asked to sign paperwork with a non-disparagement clause that would stop me from saying anything critical of the company, we already know about that now.

私が去る際、会社を批判する発言を一切できなくする非中傷条項を含む書類に署名するよう求められました。このことは、私たちももうすでに知っていますね。

It was clear from the paperwork and my communications with OpenAI that I would lose my vested equity in 60 days if I refused to sign.

契約書やOpenAIとのやり取りからはっきりと分かりました。拒否すれば60日以内に私の株式報酬を失うことになると。

Some documents and emails are visible here and actually I made a video about this.

ここにいくつかの文書やメールが見えますが、実際に私はこのことについてビデオを作りました。

Part five, my wife and I thought hard about it and decided that my freedom to speak up in the future was more important than the equity.

第五部、妻と私はよく考え、将来自由に意見を述べることが株式よりも重要だと決断しました。

I told OpenAI that I could not sign because I did not think the policy was ethical.

私は、その方針が倫理的でないと考えたため、OpenAIに署名できないと伝えました。

They accepted my decision, we parted ways.

彼らは私の決定を受け入れ、別れました。

That's really nice of them to do.

彼らがそれをしてくれたのは本当に親切です。

They basically didn't have to pay out.

基本的には支払う必要はありませんでした。

But they also did not get that signature.

しかし、彼らはまたその署名を得ることもできませんでした。

But at this point, I think it doesn't matter.

しかし、この時点では、それは重要ではないと思います。

He still, I believe, should get his vested equity.

彼は依然として、私は信じています、彼の株式報酬を受け取るべきです。

Next, the systems that labs like OpenAI are building have the capacity to do enormous good.

次に、OpenAIのような研究所が構築しているシステムは、莫大な利益をもたらす能力を持っています。

But if we are not careful, they can be destabilizing in the short term and catastrophic in the long term.

しかし、慎重でなければ、それらは短期的には不安定化をもたらし、長期的には壊滅的な結果を招く可能性があります。

These systems are not ordinary software.

これらのシステムは普通のソフトウェアではありません。

They are artificial neural nets that learn from massive amounts of data.

それらは膨大な量のデータから学習する人工ニューラルネットです。

There is a rapidly growing scientific literature on interpretability, alignment and control.

解釈可能性、アライメント、制御に関する科学文献は急速に増えています。

But these fields are still in their infancy.

しかし、これらの分野はまだ黎明期にあります。

I mentioned in a previous video that I am going to do a video about interpretability, which I find to be fascinating.

以前のビデオで、解釈可能性についてのビデオを作ると言いましたが、それは私が魅力的だと思うものです。

Anthropic AI just published a paper about interpretability and how they are starting to see that it is possible.

Anthropic AIは解釈可能性に関する論文を発表したばかりで、それが可能だとわかり始めていることを述べています。

Basically, by the way, what that means is traditionally AI is like a black box.

ちなみに、これが基本的に何を意味するかというと、従来AIはブラックボックスのようなものだということです。

The algorithm is a black box.

アルゴリズムはブラックボックスです。

You put in an input, something happens and you get the output.

入力を行うと何かが起こり、出力が得られます。

What we wanna be able to do is see into that black box to understand, okay, why did this input result in this output?

私たちができるようになりたいのは、そのブラックボックスの中を覗いて、なぜこの入力がこの出力につながったのかを理解することです。

Which nodes within the AI system were activated as part of that process?

そのプロセスの中で、AIシステム内のどのノードが活性化されたのか？

We don't have a good understanding of that today.

現在、それについては良い理解がありません。
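
The black-box idea above can be illustrated with a toy sketch: instead of looking only at the output, record which units activate along the way. Everything here is hypothetical and hand-picked for illustration (the function `forward_with_trace`, the weights, the two-layer shape); real interpretability research on large models is far more involved than inspecting a handful of numbers.

```python
import math


def forward_with_trace(x, weights, biases):
    """Run a tiny fully connected network with tanh units and
    record the activations of every layer, not just the final output."""
    trace = []
    activation = x
    for layer_weights, layer_biases in zip(weights, biases):
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + bias)
            for row, bias in zip(layer_weights, layer_biases)
        ]
        trace.append(activation)
    return activation, trace


# Hypothetical toy parameters, not taken from any real model.
weights = [
    [[1.0, -1.0], [0.5, 0.5]],  # layer 1: 2 inputs -> 2 hidden units
    [[1.0, 1.0]],               # layer 2: 2 hidden units -> 1 output
]
biases = [[0.0, 0.0], [0.0]]

output, trace = forward_with_trace([1.0, 1.0], weights, biases)
for layer_index, layer in enumerate(trace, start=1):
    # The unit with the largest absolute activation in this layer.
    strongest = max(range(len(layer)), key=lambda i: abs(layer[i]))
    print(f"layer {layer_index}: activations={layer}, strongest unit={strongest}")
```

For the input `[1.0, 1.0]`, the first hidden unit cancels to zero while the second one fires, so the trace shows which unit carried the signal to the output. The open problem described in the transcript is doing this kind of attribution meaningfully at the scale of modern models.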

Next, there is a lot we don't understand about how these systems work and whether they will remain aligned to human interests as they get smarter and possibly surpass human level intelligence in all arenas.

次に、これらのシステムがどのように機能するのか、そして、より賢くなり、場合によってはあらゆる分野で人間レベルの知能を超えたときにも人間の利益に沿い続けるのかについて、私たちが理解していないことがたくさんあります。

Meanwhile, there is little to no oversight over this technology.

一方、この技術に対する監督はほとんどないか、まったくないです。

Instead, we rely on the companies building them to self-govern, even as profit motives and excitement about the technology push them to move fast and break things.

代わりに、それらを構築している企業が自己規制することを頼りにしていますが、利益動機や技術への興奮が彼らを急いで物事を壊す方向に押し進めています。

If you're not familiar with that statement, move fast and break things, that is from Facebook in the very early days that was plastered everywhere.

もしその声明に馴染みがない場合、「急いで進み、物事を壊す」というのは、Facebookの非常に初期の時代からどこにでも貼られていたものです。

That was their internal motto, move fast, break things.

その内部のモットーは、「速く動いて、物事を壊す」というものでした。

But that no longer works.

しかし、それはもはや機能していません。

Actually, that doesn't work in a lot of industries.

実際、多くの産業ではそれは機能しません。

Healthcare, for example.

例えば、医療業界です。

You don't wanna move fast and break things.

速く動いて物事を壊したくないですね。

Otherwise, you get something like Theranos where they moved fast, broke things and put people's lives at risk.

さもないと、Theranosのような状況になります。彼らは速く動いて物事を壊し、人々の命を危険にさらしました。

Silencing researchers and making them afraid of retaliation is dangerous when we are currently some of the only people in a position to warn the public.

私たちが現在、一般市民に警告できる立場にある数少ない人間の一部である以上、研究者を沈黙させ、報復を恐れさせることは危険です。

Agreed, I applaud OpenAI for promising to change these policies, yes.

同意します。OpenAIがこれらの方針を変更することを約束していることに賞賛します。

It's concerning that they engaged in these intimidation tactics for so long and only course corrected under public pressure.

彼らがこれらの威圧的な手法に長い間従事し、公衆の圧力の下でしかコース修正しなかったことは懸念されます。

Why does anybody ever course correct, especially a company?

そもそも、人はなぜ軌道修正するのでしょうか、特に企業は？

It's because you got caught.

それは、あなたが捕まったからです。

It's also concerning that leaders who signed off on these policies claim they didn't know about them.

また、これらの方針を承認したリーダーたちが、それについて知らなかったと主張していることも懸念されます。

That is surprising.

それは驚きです。

We owe it to the public who will bear the brunt of these dangers to do better than this.

これらの危険の矢面に立つことになる一般市民のために、私たちはもっとうまくやる責任があります。

Reasonable minds can disagree about whether AGI will happen soon, but it is foolish to put so few resources into preparing.

分別のある人々の間でも、AGIがすぐに実現するかどうかについては意見が分かれるでしょうが、準備にこれほどわずかなリソースしか投じないのは愚かです。

He goes on to talk about righttowarn.ai, which I just reviewed at the beginning of this video.

彼は続けて、このビデオの冒頭でちょうど取り上げたrighttowarn.aiについて話しています。

Elon Musk with his amazing single exclamation mark reply.

イーロン・マスクは、感嘆符ひとつだけという、彼らしい見事な返信をしています。

Next, I wanna go back to situational-awareness.ai.

次に、situational-awareness.aiに戻りたいと思います。

Remember, this is Leopold, former OpenAI employee.

覚えておいてください、これは元OpenAIの従業員であるLeopoldです。

He has this really interesting chart right here.

彼はここに本当に興味深いチャートを持っています。

He has on the left AGI and a few different categories for safety, and then superintelligence, which is beyond AGI. So, required alignment technique.

左側にAGIと、安全性に関するいくつかのカテゴリーがあり、その隣にAGIを超えたスーパーインテリジェンスがあります。まず、必要なアライメント技術です。

We know AGI, RLHF, reinforcement learning through human feedback, plus plus.

AGIについては、私たちが知っているRLHF（人間のフィードバックを通じた強化学習）の発展形、「RLHF++」です。

But for superintelligence, it is novel, qualitatively different technical solutions that we don't know yet.

しかし、スーパーインテリジェンスに関しては、私たちがまだ知らない画期的で質的に異なる技術的解決策が必要です。

What happens if we fail with AGI?

AGIに失敗した場合、何が起こるのでしょうか？

The stakes are low.

失敗の代償は小さいです。

What happens if we fail with superintelligence?

スーパーインテリジェンスに失敗した場合、何が起こるのでしょうか？

Catastrophic.

壊滅的です。

Architecture and algorithms.

アーキテクチャとアルゴリズム。

Familiar descendants of current systems, fairly benign safety properties.

現行システムの馴染み深い派生物で、かなり穏やかな安全性を持っています。

For superintelligence, architecture alien, designed by previous generations, super smart AI system.

スーパーインテリジェンスに関しては、アーキテクチャは異質なもので、前の世代の超賢いAIシステムによって設計されます。

AI designing AI, designing AI, to the point where it is so alien to us that we basically don't know what's going on.

AIがAIを設計し、AIを設計し、私たちにとって非常に異質な状態になるまで、基本的に何が起こっているのかわからなくなります。

I've actually talked about something similar in a different context and that's programming.

実際、私は異なる文脈で似たようなことについて話しています。それはプログラミングです。

I am a big believer that in 10 years, we're not gonna need developers anymore.

私は、10年後には開発者はもう必要ないと信じています。

There's a few reasons for that.

その理由はいくつかあります。

But one is code as it looks today only looks that way because humans are really bad at programming.

そのうちの1つは、今日のコードがあのような見た目をしているのは、人間がプログラミングが本当に苦手だからにすぎない、ということです。

It has to look like that.

それはそのように見える必要があります。

It has to look as much like natural language as possible.

できるだけ自然言語に似せる必要があります。

However, in the future, if AI is writing all the code and eventually we're going to have Large Language Models computing directly on end devices, then all of a sudden it doesn't need to look like natural language at all.

しかし、将来、AIがすべてのコードを書いていて、最終的には大規模言語モデルがエンドデバイス上で直接計算するようになると、突然、それは全く自然言語のように見える必要はありません。

In fact, it could just be symbols or it could be something we can't even imagine.

実際、それは単なる記号であるか、私たちが想像できないものであるかもしれません。

There's a lot to think about there, but it's kind of the same thing.

考えることがたくさんありますが、基本的には同じことです。

If AI is designing a system, it could look like nothing we've seen before and we just might not understand what we're looking at.

AIがシステムを設計している場合、これまで見たことのないものに見えるかもしれず、私たちが見ているものを理解できないかもしれません。

Backdrop world is normal and super intelligence world is crazy, okay.

背景（backdrop）については、AGIでは世界は普通で、スーパーインテリジェンスでは世界は狂乱状態です。なるほど。

Last, our ability to understand it, AGI, we can understand what the systems are doing, how they work and whether they're aligned.

最後に、私たちの理解能力です。AGIについては、システムが何をしているか、どのように機能しているか、アライメントされているかどうかを私たちは理解できます。

We're not quite there.

まだそこには至っていません。

I think he's talking about in the future.

彼は将来のことを話していると思います。

For super intelligence, we have no ability to understand what's going on, how to tell if systems are still aligned or benign, what the systems are doing.

スーパーインテリジェンスに関しては、何が起こっているのかを理解する能力がなく、システムが整合しているかどうか、悪意がないかどうか、システムが何をしているかを理解する能力がありません。

We are entirely reliant on trusting the AI systems.

私たちは完全にAIシステムを信頼するしかありません。

Transition in less than a year with very little time to get decisions right.

移行は1年未満で起こり、正しい判断を下すための時間はほとんどありません。

His timeline seems very short in my mind, but what do you think?

私の考えでは、彼のタイムラインは非常に短いように思えますが、あなたはどう思いますか？

Let me know in the comments.

コメントで教えてください。

But I do agree that we're going to get to this point where we no longer have the ability to just understand what's going on under the hood, in the black box.

しかし、私は同意します。私たちは、もはやフードの下で何が起こっているかを理解する能力を持たなくなる時点に到達すると思います。

At that point, we do have to just trust AI and that does not seem good to me.

その時点で、私たちは単にAIを信頼する必要があり、それは私にとって良くないように思えます。

I think we need to be able to always trust, but verify, trust, but verify.

私たちは常に「信頼せよ、されど検証せよ」ができなければならないと思います。信頼せよ、されど検証せよ、です。

That's all I'm going to cover from situational awareness today.

今日「Situational Awareness」から取り上げるのはここまでです。

But if you want to see me do a full video all about every section in here, let me know.

しかし、ここにあるすべてのセクションについて完全なビデオを見たい場合は、教えてください。

It is quite complex at times and there's a lot of meat on these bones.

時々かなり複雑で、これらのポイントには多くの要素があります。

Let me know in the comments.

コメントで教えてください。

I'm happy to really do a deep dive, take a bunch of notes and try to convey it as I understand it.

私は本当に深く掘り下げて、たくさんのメモを取り、私が理解しているように伝えようとします。

If you enjoyed this video, please consider giving a like and subscribe and I'll see you in the next one.

このビデオを楽しんでいただけたら、いいねや購読を考えていただけると嬉しいです。次のビデオでお会いしましょう。
