The post The Rocket Alignment Problem appeared first on Machine Intelligence Research Institute.


(*Somewhere in a not-very-near neighboring world…*)

**ALFONSO:** Hello, Beth. I’ve noticed a lot of speculations lately about “spaceplanes” being used to attack cities, or possibly becoming infused with malevolent spirits that inhabit the celestial realms so that they turn on their own engineers.

I’m rather skeptical of these speculations. Indeed, I’m a bit skeptical that airplanes will be able to even rise as high as stratospheric weather balloons anytime in the next century. But I understand that your institute wants to address the potential problem of malevolent or dangerous spaceplanes, and that you think this is an important present-day cause.

**BETH:** That’s… really not how we at the Mathematics of Intentional Rocketry Institute would phrase things.

The problem of malevolent celestial spirits is what all the news articles are focusing on, but we think the real problem is something entirely different. We’re worried that there’s a difficult, theoretically challenging problem which modern-day rocket punditry is mostly overlooking. We’re worried that if you aim a rocket at where the Moon is in the sky, and press the launch button, the rocket may not actually end up at the Moon.

**ALFONSO:** I understand that it’s very important to design fins that can stabilize a spaceplane’s flight in heavy winds. That’s important spaceplane safety research and someone needs to do it.

But if you were working on that sort of safety research, I’d expect you to be collaborating tightly with modern airplane engineers to test out your fin designs, to demonstrate that they are actually useful.

**BETH:** Aerodynamic designs are important features of any safe rocket, and we’re quite glad that rocket scientists are working on these problems and taking safety seriously. That’s not the sort of problem that we at MIRI focus on, though.

**ALFONSO:** What’s the concern, then? Do you fear that spaceplanes may be developed by ill-intentioned people?

**BETH:** That’s not the failure mode we’re worried about right now. We’re more worried that right now, *nobody* can tell you how to point your rocket’s nose such that it goes to the moon, nor indeed *any* prespecified celestial destination. Whether Google or the US Government or North Korea is the one to launch the rocket won’t make a pragmatic difference to the probability of a successful Moon landing from our perspective, because right now *nobody knows how to aim any kind of rocket anywhere*.

**ALFONSO:** I’m not sure I understand.

**BETH:** We’re worried that even if you aim a rocket at the Moon, such that the nose of the rocket is clearly lined up with the Moon in the sky, the rocket won’t go to the Moon. We’re not sure what a realistic path from the Earth to the moon looks like, but we suspect it might not be a very straight path, and it may not involve pointing the nose of the rocket at the moon at all. We think the most important thing to do next is to advance our understanding of rocket trajectories until we have a better, deeper understanding of what we’ve started calling the “rocket alignment problem”. There are other safety problems, but this rocket alignment problem will probably take the most total time to work on, so it’s the most urgent.

**ALFONSO:** Hmm, that sounds like a bold claim to me. Do you have a reason to think that there are invisible barriers between here and the moon that the spaceplane might hit? Are you saying that it might get very very windy between here and the moon, more so than on Earth? Both eventualities could be worth preparing for, I suppose, but neither seem likely.

**BETH:** We don’t think it’s particularly likely that there are invisible barriers, no. And we don’t think it’s going to be especially windy in the celestial reaches — quite the opposite, in fact. The problem is just that we don’t yet know how to plot *any* trajectory that a vehicle could realistically take to get from Earth to the moon.

**ALFONSO:** Of course we can’t plot an actual trajectory; wind and weather are too unpredictable. But your claim still seems too strong to me. Just aim the spaceplane at the moon, go up, and have the pilot adjust as necessary. Why wouldn’t that work? Can you prove that a spaceplane aimed at the moon won’t go there?

**BETH:** We don’t think we can *prove* anything of that sort, no. Part of the problem is that realistic calculations are extremely hard to do in this area, after you take into account all the atmospheric friction and the movements of other celestial bodies and such. We’ve been trying to solve some drastically simplified problems in this area, on the order of assuming that there is no atmosphere and that all rockets move in perfectly straight lines. Even those unrealistic calculations strongly suggest that, in the much more complicated real world, just pointing your rocket’s nose at the Moon also won’t make your rocket end up at the Moon. I mean, the fact that the real world is more complicated doesn’t exactly make it any *easier* to get to the Moon.

**ALFONSO:** Okay, let me take a look at this “understanding” work you say you’re doing…

Huh. Based on what I’ve read about the math you’re trying to do, I can’t say I understand what it has to do with the Moon. Shouldn’t helping spaceplane pilots exactly target the Moon involve looking through lunar telescopes and studying exactly what the Moon looks like, so that the spaceplane pilots can identify particular features of the landscape to land on?

**BETH:** We think our present stage of understanding is much too crude for a detailed Moon map to be our next research target. We haven’t yet advanced to the point of targeting one crater or another for our landing. We can’t target *anything* at this point. It’s more along the lines of “figure out how to talk mathematically about curved rocket trajectories, instead of rockets that move in straight lines”. Not even realistically curved trajectories, right now, we’re just trying to get past straight lines at all –

**ALFONSO:** But planes on Earth move in curved lines all the time, because the Earth itself is curved. It seems reasonable to expect that future spaceplanes will also have the capability to move in curved lines. If your worry is that spaceplanes will only move in straight lines and miss the Moon, and you want to advise rocket engineers to build rockets that move in curved lines, well, that doesn’t seem to me like a great use of anyone’s time.

**BETH:** You’re trying to draw much too direct of a line between the math we’re working on right now, and actual rocket designs that might exist in the future. It’s *not* that current rocket ideas are almost right, and we just need to solve one or two more problems to make them work. The conceptual distance that separates anyone from solving the rocket alignment problem is *much greater* than that.

Right now everyone is *confused* about rocket trajectories, and we’re trying to become *less confused*. That’s what we need to do next, not run out and advise rocket engineers to build their rockets the way that our current math papers are talking about. Not until we stop being *confused* about extremely basic questions like why the Earth doesn’t fall into the Sun.

**ALFONSO:** I don’t think the Earth is going to collide with the Sun anytime soon. The Sun has been steadily circling the Earth for a long time now.

**BETH:** I’m not saying that our goal is to address the risk of the Earth falling into the Sun. What I’m trying to say is that if humanity’s present knowledge can’t answer questions like “Why doesn’t the Earth fall into the Sun?” then we don’t know very much about celestial mechanics and we won’t be able to aim a rocket through the celestial reaches in a way that lands softly on the Moon.

As an example of work we’re presently doing that’s aimed at improving our understanding, there’s what we call the “tiling positions” problem. The tiling positions problem is how to fire a cannonball from a cannon in such a way that the cannonball circumnavigates the earth over and over again, “tiling” its initial coordinates like repeating tiles on a tessellated floor –

**ALFONSO:** I read a little bit about your work on that topic. I have to say, it’s hard for me to see what firing things from cannons has to do with getting to the Moon. Frankly, it sounds an awful lot like Good Old-Fashioned Space Travel, which everyone knows doesn’t work. Maybe Jules Verne thought it was possible to travel around the earth by firing capsules out of cannons, but the modern study of high-altitude planes has completely abandoned the notion of firing things out of cannons. The fact that you go around talking about firing things out of cannons suggests to me that you haven’t kept up with all the innovations in airplane design over the last century, and that your spaceplane designs will be completely unrealistic.

**BETH:** We know that rockets will not actually be fired out of cannons. We really, really know that. We’re intimately familiar with the reasons why nothing fired out of a modern cannon is ever going to reach escape velocity. I’ve previously written several sequences of articles in which I describe why cannon-based space travel doesn’t work.

**ALFONSO:** But your current work is all about firing something out a cannon in such a way that it circles the earth over and over. What could that have to do with any realistic advice that you could give to a spaceplane pilot about how to travel to the Moon?

**BETH:** Again, you’re trying to draw much too straight a line between the math we’re doing right now, and direct advice to future rocket engineers.

We think that if we could find an angle and firing speed such that an ideal cannon, firing an ideal cannonball at that speed, on a perfectly spherical Earth with no atmosphere, would lead to that cannonball entering what we would call a “stable orbit” without hitting the ground, then… we might have understood something really fundamental and important about celestial mechanics.

Or maybe not! It’s hard to know in advance which questions are important and which research avenues will pan out. All you can do is figure out the next tractable-looking problem that confuses you, and try to come up with a solution, and hope that you’ll be less confused after that.

**ALFONSO:** You’re talking about the cannonball hitting the ground as a problem, and how you want to avoid that and just have the cannonball keep going forever, right? But real spaceplanes aren’t going to be aimed at the ground in the first place, and lots of regular airplanes manage to not hit the ground. It seems to me that this “being fired out of a cannon and hitting the ground” scenario that you’re trying to avoid in this “tiling positions problem” of yours just isn’t a failure mode that real spaceplane designers would need to worry about.

**BETH:** We are not worried about real rockets being fired out of cannons and hitting the ground. That is not why we’re working on the tiling positions problem. In a way, you’re being far too optimistic about how much of rocket alignment theory is already solved! We’re not so close to understanding how to aim rockets that the kind of designs people are talking about now *would* work if only we solved a particular set of remaining difficulties like not firing the rocket into the ground. You need to go more meta on understanding the kind of progress we’re trying to make.

We’re working on the tiling positions problem because we think that being able to fire a cannonball at a certain instantaneous velocity such that it enters a stable orbit… is the sort of problem that somebody who could really actually launch a rocket through space and have it move in a particular curve that really actually ended with softly landing on the Moon would be able to solve *easily*. So the fact that we can’t solve it is alarming. If we can figure out how to solve this much simpler, much more crisply stated “tiling positions problem” with imaginary cannonballs on a perfectly spherical earth with no atmosphere, which is a lot easier to analyze than a Moon launch, we might thereby take one more incremental step towards eventually becoming the sort of people who could plot out a Moon launch.

**ALFONSO:** If you don’t think that Jules-Verne-style space cannons are the wave of the future, I don’t understand why you keep talking about cannons in particular.

**BETH:** Because there’s a lot of sophisticated mathematical machinery already developed for aiming cannons. People have been aiming cannons and plotting cannonball trajectories since the sixteenth century. We can take advantage of that existing mathematics to say exactly how, if we fired an ideal cannonball in a certain direction, it would plow into the ground. If we try talking about rockets with realistically varying acceleration, we can’t even manage to prove that a rocket like that *won’t* travel around the Earth in a perfect square, because with all that realistically varying acceleration and realistic air friction it’s impossible to make any sort of definite statement one way or another. Our present understanding isn’t up to it.
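(*An aside from our own world: the idealized setup Beth describes, an ideal cannonball above a perfectly spherical, airless Earth, can be sketched numerically. This is only an illustration, not MIRI's actual math; the 200 km firing altitude, step size, and integrator are arbitrary choices of mine:*)

```python
import math

# Ideal cannonball above a perfectly spherical, airless Earth,
# integrated with semi-implicit Euler steps.
GM = 3.986e14        # Earth's gravitational parameter, m^3/s^2
R = 6.371e6          # Earth's radius, m
r0 = R + 200e3       # firing altitude: 200 km up (an arbitrary choice)

def final_radius(speed, steps=10000, dt=1.0):
    """Fire horizontally at `speed` (m/s); return the distance from
    Earth's center when the simulation ends or the ground is hit."""
    x, y = r0, 0.0
    vx, vy = 0.0, speed
    for _ in range(steps):
        r = math.hypot(x, y)
        if r < R:                     # plowed into the ground
            return r
        vx -= GM * x / r**3 * dt      # gravity points at Earth's center
        vy -= GM * y / r**3 * dt
        x += vx * dt
        y += vy * dt
    return math.hypot(x, y)

v_circ = math.sqrt(GM / r0)   # ~7.8 km/s: the "stable orbit" speed
# Below that speed, the cannonball plows into the ground; at that
# speed, it keeps circling at a nearly constant radius, "tiling"
# its initial position over and over.
```

(*The "tiling" answer falls out of the condition that gravitational acceleration exactly supplies the circular acceleration, `v² / r = GM / r²`, which is the kind of crisp statement the idealization buys you.*)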

**ALFONSO:** Okay, another question in the same vein. Why is MIRI sponsoring work on adding up lots of tiny vectors? I don’t even see what that has to do with rockets in the first place; it seems like this weird side problem in abstract math.

**BETH:** It’s more like… at several points in our investigation so far, we’ve run into the problem of going from a function about time-varying accelerations to a function about time-varying positions. We kept running into this problem as a blocking point in our math, in several places, so we branched off and started trying to analyze it explicitly. Since it’s about the pure mathematics of points that don’t move in discrete intervals, we call it the “logical undiscreteness” problem. Some of the ways of investigating this problem involve trying to add up lots of tiny, varying vectors to get a big vector. Then we talk about how that sum seems to change more and more slowly, approaching a limit, as the vectors get tinier and tinier and we add up more and more of them… or at least that’s one avenue of approach.
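(*Another aside from our own world: the limiting process Beth gestures at is what we would call a Riemann sum, recovering a change in position by summing velocity-times-timeslice vectors. A minimal sketch; the particular velocity function `v(t) = (cos t, sin t)` is an illustrative choice of mine:*)

```python
import math

def summed_displacement(n_steps, t_end=math.pi):
    """Approximate the displacement of a point with time-varying
    velocity v(t) = (cos t, sin t) by adding up n_steps tiny vectors."""
    dt = t_end / n_steps
    dx = dy = 0.0
    for i in range(n_steps):
        t = i * dt
        dx += math.cos(t) * dt   # one tiny vector per time slice
        dy += math.sin(t) * dt
    return dx, dy

# As the vectors get tinier and more numerous, the sum changes more
# and more slowly, approaching the limit (0, 2):
for n in (10, 100, 1000):
    print(summed_displacement(n))
```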

**ALFONSO:** I just find it hard to imagine people in future spaceplane rockets staring out their viewports and going, “Oh, no, we don’t have tiny enough vectors with which to correct our course! If only there was some way of adding up even more vectors that are even smaller!” I’d expect future calculating machines to do a pretty good job of that already.

**BETH:** Again, you’re trying to draw much too straight a line between the work we’re doing now, and the implications for future rocket designs. It’s not like we think a rocket design will almost work, but the pilot won’t be able to add up lots of tiny vectors fast enough, so we just need a faster algorithm and then the rocket will get to the Moon. This is foundational mathematical work that we think might play a role in multiple basic concepts for understanding celestial trajectories. When we try to plot out a trajectory that goes all the way to a soft landing on a moving Moon, we feel confused and blocked. We think part of the confusion comes from not being able to go from acceleration functions to position functions, so we’re trying to resolve our confusion.

**ALFONSO:** This sounds suspiciously like a philosophy-of-mathematics problem, and I don’t think that it’s possible to progress on spaceplane design by doing philosophical research. The field of philosophy is a stagnant quagmire. Some philosophers still believe that going to the moon is impossible; they say that the celestial plane is fundamentally separate from the earthly plane and therefore inaccessible, which is clearly silly. Spaceplane design is an engineering problem, and progress will be made by engineers.

**BETH:** I agree that rocket design will be carried out by engineers rather than philosophers. I also share some of your frustration with philosophy in general. For that reason, we stick to well-defined mathematical questions that are likely to have actual answers, such as questions about how to fire a cannonball on a perfectly spherical planet with no atmosphere such that it winds up in a stable orbit.

This often requires developing new mathematical frameworks. For example, in the case of the logical undiscreteness problem, we’re developing methods for translating between time-varying accelerations and time-varying positions. You can call the development of new mathematical frameworks “philosophical” if you’d like — but if you do, remember that it’s a very different kind of philosophy than the “speculate about the heavenly and earthly planes” sort, and that we’re always pushing to develop new mathematical frameworks or tools.

**ALFONSO:** So from the perspective of the public good, what’s a good thing that might happen if you solved this logical undiscreteness problem?

**BETH:** Mainly, we’d be less confused and our research wouldn’t be blocked and humanity could actually land on the Moon someday. To try and make it more concrete – though it’s hard to do that without actually knowing the concrete solution – we might be able to talk about incrementally more realistic rocket trajectories, because our mathematics would no longer break down as soon as we stopped assuming that rockets moved in straight lines. Our math would be able to talk about exact curves, instead of a series of straight lines that approximate the curve.

**ALFONSO:** An exact curve that a rocket follows? This gets me into the main problem I have with your project in general. I just don’t believe that any future rocket design will be the sort of thing that can be analyzed with absolute, perfect precision so that you can get the rocket to the Moon based on an absolutely plotted trajectory with no need to steer. That seems to me like a bunch of mathematicians who have no clue how things work in the real world, wanting everything to be perfectly calculated. Look at the way Venus moves in the sky; usually it travels in one direction, but sometimes it goes retrograde in the other direction. We’ll just have to steer as we go.

**BETH:** That’s not what I meant by talking about exact curves… Look, even if we can invent logical undiscreteness, I agree that it’s futile to try to predict, in advance, the precise trajectories of all of the winds that will strike a rocket on its way off the ground. Though I’ll mention parenthetically that things might actually become calmer and easier to predict, once a rocket gets sufficiently high up –

**ALFONSO:** Why?

**BETH:** Let’s just leave that aside for now, since we both agree that rocket positions are hard to predict exactly during the atmospheric part of the trajectory, due to winds and such. And yes, if you can’t exactly predict the initial trajectory, you can’t exactly predict the later trajectory. So, indeed, the proposal is definitely not to have a rocket design so perfect that you can fire it at exactly the right angle and then walk away without the pilot doing any further steering. The point of doing rocket math isn’t that you want to predict the rocket’s exact position at every microsecond, in advance.

**ALFONSO:** Then why obsess over pure math that’s too simple to describe the rich, complicated real universe where sometimes it rains?

**BETH:** It’s true that a real rocket isn’t a simple equation on a board. It’s true that there are all sorts of aspects of a real rocket’s shape and internal plumbing that aren’t going to have a mathematically compact characterization. What MIRI is doing isn’t the right degree of mathematization for all rocket engineers for all time; it’s the mathematics for us to be using right now (or so we hope).

To build up the field’s understanding incrementally, we need to talk about ideas whose consequences can be pinpointed precisely enough that people can analyze scenarios in a shared framework. We need enough precision that someone can say, “I think in scenario X, design Y does Z”, and someone else can say, “No, in scenario X, Y actually does W”, and the first person responds, “Darn, you’re right. Well, is there some way to change Y so that it would do Z?”

If you try to make things realistically complicated at this stage of research, all you’re left with is verbal fantasies. When we try to talk to someone with an enormous flowchart of all the gears and steering rudders they think should go into a rocket design, and we try to explain why a rocket pointed at the Moon doesn’t necessarily end up at the Moon, they just reply, “Oh, my rocket won’t do *that*.” Their ideas have enough vagueness and flex and underspecification that they’ve achieved the safety of nobody being able to prove to them that they’re wrong. It’s impossible to incrementally build up a body of collective knowledge that way.

The goal is to start building up a library of tools and ideas we can use to discuss trajectories formally. Some of the key tools for formalizing and analyzing *intuitively* plausible-seeming trajectories haven’t yet been expressed using math, and we can live with that for now. We still try to find ways to represent the key ideas in mathematically crisp ways whenever we can. That’s not because math is so neat or so prestigious; it’s part of an ongoing project to have arguments about rocketry that go beyond “Does not!” vs. “Does so!”

**ALFONSO:** I still get the impression that you’re reaching for the warm, comforting blanket of mathematical reassurance in a realm where mathematical reassurance doesn’t apply. We can’t obtain a mathematical certainty of our spaceplanes being absolutely sure to reach the Moon with nothing going wrong. That being the case, there’s no point in trying to pretend that we can use mathematics to get absolute guarantees about spaceplanes.

**BETH:** Trust me, I am not going to feel “reassured” about rocketry no matter what math MIRI comes up with. But, yes, of course you can’t obtain a mathematical assurance of any physical proposition, nor assign probability 1 to any empirical statement.

**ALFONSO:** Yet you talk about proving theorems – proving that a cannonball will go in circles around the earth indefinitely, for example.

**BETH:** Proving a theorem about a rocket’s trajectory won’t ever let us feel comfortingly certain about where the rocket is actually going to end up. But if you can prove a theorem which says that your rocket would go to the Moon if it launched in a perfect vacuum, maybe you can attach some steering jets to the rocket and then have it actually go to the Moon in real life. Not with 100% probability, but with probability greater than zero.

The point of our work isn’t to take current ideas about rocket aiming from a 99% probability of success to a 100% chance of success. It’s to get past an approximately 0% chance of success, which is where we are now.

**ALFONSO:** Zero percent?!

**BETH:** Modulo Cromwell’s Rule, yes, zero percent. If you point a rocket’s nose at the Moon and launch it, it does not go to the Moon.

**ALFONSO:** I don’t think future spaceplane engineers will actually be that silly, if direct Moon-aiming isn’t a method that works. They’ll lead the Moon’s current motion in the sky, and aim at the part of the sky where the Moon will appear on the day the spaceplane is a Moon’s distance away. I’m a bit worried that you’ve been talking about this problem so long without considering such an obvious idea.

**BETH:** We considered that idea very early on, and we’re pretty sure that it still doesn’t get us to the Moon.

**ALFONSO:** What if I add steering fins so that the rocket moves in a more curved trajectory? Can you prove that no version of that class of rocket designs will go to the Moon, no matter what I try?

**BETH:** Can you sketch out the trajectory that you think your rocket will follow?

**ALFONSO:** It goes from the Earth to the Moon.

**BETH:** In a bit more detail, maybe?

**ALFONSO:** No, because in the real world there are always variable wind speeds, we don’t have infinite fuel, and our spaceplanes don’t move in perfectly straight lines.

**BETH:** Can you sketch out a trajectory that you think a simplified version of your rocket will follow, so we can examine the assumptions your idea requires?

**ALFONSO:** I just don’t believe in the general methodology you’re proposing for spaceplane designs. We’ll put on some steering fins, turn the wheel as we go, and keep the Moon in our viewports. If we’re off course, we’ll steer back.

**BETH:** … We’re actually a bit concerned that standard steering fins may stop working once the rocket gets high enough, so you won’t actually find yourself able to correct course by much once you’re in the celestial reaches – like, if you’re already on a good course, you can correct it, but if you screwed up, you won’t just be able to turn around like you could turn around an airplane –

**ALFONSO:** Why not?

**BETH:** We can go into that topic too; but even given a simplified model of a rocket that you *could* steer, a walkthrough of the steps along the path that simplified rocket would take to the Moon would be an important step in moving this discussion forward. Celestial rocketry is a domain that we expect to be unusually difficult – even compared to building rockets on Earth, which is already a famously hard problem because they usually just explode. It’s not that everything has to be neat and mathematical. But the overall difficulty is such that, in a proposal like “lead the moon in the sky,” if the core ideas don’t have a certain amount of solidity about them, it would be equivalent to firing your rocket randomly into the void.

If it feels like you don’t know for sure whether your idea works, but that it might work; if your idea has many plausible-sounding elements, and to you it feels like nobody has been able to *convincingly* explain to you how it would fail; then, in real life, that proposal has a roughly 0% chance of steering a rocket to the Moon.

If it seems like an idea is extremely solid and clearly well-understood, if it feels like this proposal should definitely take a rocket to the Moon without fail in good conditions, then maybe under the best-case conditions we should assign an 85% subjective credence in success, or something in that vicinity.

**ALFONSO:** So uncertainty automatically means failure? This is starting to sound a bit paranoid, honestly.

**BETH:** The idea I’m trying to communicate is something along the lines of, “If you can reason rigorously about why a rocket should definitely work in principle, it might work in real life, but if you have anything less than that, then it definitely won’t work in real life.”

I’m not asking you to give me an absolute mathematical proof of empirical success. I’m asking you to give me something more like a sketch for how a simplified version of your rocket could move, that’s sufficiently determined in its meaning that you can’t just come back and say “Oh, I didn’t mean *that*” every time someone tries to figure out what it actually does or pinpoint a failure mode.

This isn’t an unreasonable demand that I’m imposing to make it impossible for any ideas to pass my filters. It’s the primary bar all of us have to pass to contribute to collective progress in this field. And a rocket design which can’t even pass that conceptual bar has roughly a 0% chance of landing softly on the Moon.


The post September 2018 Newsletter appeared first on Machine Intelligence Research Institute.

On the fundraising side, we received a $489,000 grant from the Long-Term Future Fund, a $150,000 AI Safety Retraining Program grant from the Open Philanthropy Project, and an amazing surprise $1.02 million grant from “Anonymous Ethereum Investor #2”!

- New research forum posts: Reducing Collective Rationality to Individual Optimization in Common-Payoff Games Using MCMC; History of the Development of Logical Induction
- We spoke at the Human-Aligned AI Summer School in Prague.
- MIRI advisor Blake Borgeson has joined our Board of Directors, and DeepMind Research Scientist Victoria Krakovna has become a MIRI research advisor.

- The Open Philanthropy Project is accepting a new round of applicants for its AI Fellows Program.


The post Summer MIRI Updates appeared first on Machine Intelligence Research Institute.

In our last major updates—our 2017 strategic update and fundraiser posts—we said that our current focus is on technical research and executing our biggest-ever hiring push. Our supporters responded with an incredible show of support at the end of the year, putting us in an excellent position to execute on our most ambitious growth plans.

In this post, I’d like to provide some updates on our recruiting efforts and successes, announce some major donations and grants that we’ve received, and provide some other miscellaneous updates.

In brief, our major announcements are:

- We have **two new full-time research staff** hires to announce.
- We’ve received **$1.7 million in major donations and grants**, $1 million of which came through a tax-advantaged fund for Canadian MIRI supporters.

For more details, see below.

I’m happy to announce the addition of two new research staff to the MIRI team:

**Buck Shlegeris**: Before joining MIRI, Buck worked as a software engineer at PayPal, and he was the first employee at Triplebyte. He previously studied at the Australian National University, majoring in CS and minoring in math and physics, and he has presented work on data structure synthesis at industry conferences. In addition to his research at MIRI, Buck is also helping with recruiting.

**Ben Weinstein-Raun**: Ben joined MIRI after spending two years as a software engineer at Cruise Automation, where he worked on the planning and prediction teams. He previously worked at Counsyl on their automated genomics lab, and helped to found Hacksburg, a hackerspace in Blacksburg, Virginia. He holds a BS from Virginia Tech, where he studied computer engineering.

This year we’ve run a few different programs to help us work towards our hiring goals, and to more generally increase the number of people doing AI alignment research:

- We’ve been co-running a **series of invite-only workshops** with the Center for Applied Rationality (CFAR), targeted at potential future hires who have strong engineering backgrounds. Participants report really enjoying the workshops, and we’ve found them very useful for getting to know potential research staff hires.^{1} If you’d be interested in attending one of these workshops, send Buck an email.

- We helped run the **AI Summer Fellows Program** with CFAR. We had a large and extremely strong pool of applicants, with over 170 applications for 30 slots (versus 50 applications for 20 slots in 2017). The program this year was more mathematically flavored than in 2017, and concluded with a flurry of new analyses by participants. On the whole, the program seems to have been more successful at digging into AI alignment problems than in previous years, as well as more successful at seeding ongoing collaborations between participants, and between participants and MIRI staff.

- We ran a ten-week **research internship program** this summer, from June through August.^{2} This included our six interns attending AISFP and pursuing a number of independent lines of research, with a heavy focus on tiling agents. Among other activities, interns looked for Vingean reflection in expected utility maximizers, distilled early research on subsystem alignment, and built on Abram’s Complete Class Theorems approach to decision theory.

In related news, we’ve been restructuring and growing our operations team to ensure we’re well positioned to support the research team as we grow. Alex Vermeer has taken on a more general support role as our process and projects head. In addition to his donor relationships and fundraising focus, Colm Ó Riain has taken on a central role in our recruiting efforts as our head of growth. Aaron Silverbook is now heading operations; we’ve brought on Carson Jones as our new office manager; and long-time remote MIRI contractor Jimmy Rintjema is now our digital infrastructure lead.^{3}

On the fundraising side, I’m happy to announce that we’ve received several major donations and grants.

First, following our $1.01 million donation from an anonymous Ethereum investor in 2017, we’ve received a huge new donation of **$1.02 million** from “Anonymous Ethereum Investor #2”, based in Canada! The donation was made through Rethink Charity Forward’s recently established tax-advantaged fund for Canadian MIRI supporters.^{4}

Second, the departing administrator of the Long-Term Future Fund, Nick Beckstead, has recommended a **$489,000** grant to MIRI, aimed chiefly at funding improvements to organizational efficiency and staff productivity.

Together, these contributions have helped ensure that we remain in the solid position we were in following our 2017 fundraiser, as we attempt to greatly scale our team size. Our enormous thanks for this incredible support, and further thanks to RC Forward and the Centre for Effective Altruism for helping build the infrastructure that made these contributions possible.

We’ve also received a **$150,000 AI Safety Retraining Program** grant from the Open Philanthropy Project to provide stipends and guidance to a few highly technically skilled individuals. The goal of the program is to free up 3–6 months of time for strong candidates to spend on retraining, so that they can potentially transition to full-time work on AI alignment. Buck is currently selecting candidates for the program; to date, we’ve made two grants to individuals.^{5}

The LessWrong development team has launched a beta for the **AI Alignment Forum**, a new research forum for technical AI safety work that we’ve been contributing to. I’m very grateful to the LW team for taking on this project, and I’m really looking forward to the launch of the new forum.

Finally, we’ve made substantial progress on the **tiling problem**, which we’ll likely be detailing later this year. See our March research plans and predictions write-up for more on our research priorities.

We’re very happy about these newer developments, and we’re particularly excited to have Buck and Ben on the team. We have a few more big announcements coming up in the not-so-distant future, so stay tuned.

1. Ben was a workshop participant, which eventually led to him coming on board at MIRI.
2. We also have another research intern joining us in the fall.
3. We’ve long considered Jimmy to be full-time staff, but he isn’t officially an employee since he lives in Canada.
4. H/T to Colm for setting up a number of tax-advantaged giving channels for international donors. If you’re a MIRI supporter outside the US, make sure to check out our Tax-Advantaged Donations page.
5. We aren’t taking formal applications, but if you’re particularly interested in the program or have questions, you’re welcome to shoot Buck an email.

The post Summer MIRI Updates appeared first on Machine Intelligence Research Institute.

The post August 2018 Newsletter appeared first on Machine Intelligence Research Institute.

- New posts to the new AI Alignment Forum: Buridan’s Ass in Coordination Games; Probability is Real, and Value is Complex; Safely and Usefully Spectating on AIs Optimizing Over Toy Worlds
- MIRI Research Associate Vadim Kosoy wins a $7500 AI Alignment Prize for “The Learning-Theoretic AI Alignment Research Agenda.” Applications for the prize’s next round will be open through December 31.
- Interns from MIRI and the Center for Human-Compatible AI collaborated at an AI safety research workshop.
- This year’s AI Summer Fellows Program was very successful, and its one-day blogathon resulted in a number of interesting write-ups, such as Dependent Type Theory and Zero-Shot Reasoning, Conceptual Problems with Utility Functions (and follow-up), Complete Class: Consequentialist Foundations, and Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet.
- See Rohin Shah’s alignment newsletter for more discussion of recent posts to the new AI Alignment Forum.

- The Future of Humanity Institute is seeking project managers for its Research Scholars Programme and its Governance of AI Program.

The post July 2018 Newsletter appeared first on Machine Intelligence Research Institute.

- A new paper: “Forecasting Using Incomplete Models”
- New research write-ups and discussions: Prisoners’ Dilemma with Costs to Modeling; Counterfactual Mugging Poker Game; Optimization Amplifies
- Eliezer Yudkowsky, Paul Christiano, Jessica Taylor, and Wei Dai discuss Alex Zhu’s FAQ for Paul’s research agenda.
- We attended EA Global in SF, and gave a short talk on “Categorizing Variants of Goodhart’s Law.”
- Roman Yampolskiy’s forthcoming anthology, *Artificial Intelligence Safety and Security*, includes reprinted papers by Nate Soares (“The Value Learning Problem”) and by Nick Bostrom and Eliezer Yudkowsky (“The Ethics of Artificial Intelligence”).
- Stuart Armstrong’s 2014 primer on AI risk, *Smarter Than Us: The Rise of Machine Intelligence*, is now available as a free web book at smarterthan.us.

- OpenAI announces that their OpenAI Five system “has started to defeat amateur human teams at Dota 2” (plus an update). Discussion on LessWrong and Hacker News.
- Rohin Shah, a PhD student at the Center for Human-Compatible AI, comments on recent alignment-related results in his regularly updated Alignment Newsletter.

The post New paper: “Forecasting using incomplete models” appeared first on Machine Intelligence Research Institute.

We consider the task of forecasting an infinite sequence of future observations based on some number of past observations, where the probability measure generating the observations is “suspected” to satisfy one or more of a set of incomplete models, i.e., convex sets in the space of probability measures.

This setting is in some sense intermediate between the realizable setting where the probability measure comes from some known set of probability measures (which can be addressed using e.g. Bayesian inference) and the unrealizable setting where the probability measure is completely arbitrary.

We demonstrate a method of forecasting which guarantees that, whenever the true probability measure satisfies an incomplete model in a given countable set, the forecast converges to the same incomplete model in the (appropriately normalized) Kantorovich-Rubinstein metric. This is analogous to merging of opinions for Bayesian inference, except that convergence in the Kantorovich-Rubinstein metric is weaker than convergence in total variation.
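To make the notion of an incomplete model concrete, here is a hedged toy sketch, not the paper’s actual forecasting method: an incomplete model over i.i.d. coin flips can be represented as the convex set of measures whose single-flip bias lies in an interval. The helper names (`satisfies_incomplete_model`, `forecast`), the `slack` heuristic, and the interval projection are all invented for illustration; the paper’s guarantees concern the Kantorovich–Rubinstein metric and countable sets of models, neither of which this sketch attempts.

```python
import math

# Toy sketch (illustrative assumptions, NOT the paper's algorithm):
# an "incomplete model" over coin flips, represented as the convex set
# of measures whose single-flip bias p lies in [p_low, p_high]. The
# model constrains the bias but says nothing else about the sequence.

def satisfies_incomplete_model(p_low, p_high, empirical_freq, n, slack=2.0):
    """Crude consistency check: is the observed frequency within ~slack
    worst-case standard deviations of the interval [p_low, p_high]?"""
    margin = slack * math.sqrt(0.25 / max(n, 1))  # worst-case sd of the mean
    return p_low - margin <= empirical_freq <= p_high + margin

def forecast(history, p_low=0.4, p_high=0.6):
    """Naive forecaster: if the data looks consistent with the incomplete
    model, project the empirical frequency onto the interval (the nearest
    bias satisfying the model); otherwise fall back on the raw frequency."""
    n = len(history)
    freq = sum(history) / n if n else 0.5
    if satisfies_incomplete_model(p_low, p_high, freq, n):
        return min(max(freq, p_low), p_high)  # project into [p_low, p_high]
    return freq

print(forecast([1, 0, 1, 1, 0, 1]))  # consistent sample, projected to 0.6
print(forecast([1] * 100))           # all heads: model rejected, returns 1.0
```

The point of the sketch is only to convey what it means for data to “satisfy” a convex set of measures: when the data is compatible with the set, the forecaster commits to the set’s constraints without pretending to know a single generating measure.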

Kosoy’s work builds on logical inductors to create a cleaner (purely learning-theoretic) formalism for modeling complex environments, showing that the methods developed in “Logical induction” are useful for applications in classical sequence prediction unrelated to logic.

“Forecasting using incomplete models” also shows that the intuitive concept of an “incomplete” or “partial” model has an elegant and useful formalization related to Knightian uncertainty. Additionally, Kosoy shows that using incomplete models to generalize Bayesian inference allows an agent to make predictions about environments that can be as complex as the agent itself, or more complex — as contrasted with classical Bayesian inference.

For more of Kosoy’s research, see “Optimal polynomial-time estimators” and the Intelligent Agent Foundations Forum.

*Get notified every time a new technical paper is published.*
