The post The Rocket Alignment Problem appeared first on Machine Intelligence Research Institute.


(*Somewhere in a not-very-near neighboring world…*)

**ALFONSO:** Hello, Beth. I’ve noticed a lot of speculations lately about “spaceplanes” being used to attack cities, or possibly becoming infused with malevolent spirits that inhabit the celestial realms so that they turn on their own engineers.

I’m rather skeptical of these speculations. Indeed, I’m a bit skeptical that airplanes will be able to even rise as high as stratospheric weather balloons anytime in the next century. But I understand that your institute wants to address the potential problem of malevolent or dangerous spaceplanes, and that you think this is an important present-day cause.

**BETH:** That’s… really not how we at the Mathematics of Intentional Rocketry Institute would phrase things.

The problem of malevolent celestial spirits is what all the news articles are focusing on, but we think the real problem is something entirely different. We’re worried that there’s a difficult, theoretically challenging problem which modern-day rocket punditry is mostly overlooking. We’re worried that if you aim a rocket at where the Moon is in the sky, and press the launch button, the rocket may not actually end up at the Moon.

**ALFONSO:** I understand that it’s very important to design fins that can stabilize a spaceplane’s flight in heavy winds. That’s important spaceplane safety research and someone needs to do it.

But if you were working on that sort of safety research, I’d expect you to be collaborating tightly with modern airplane engineers to test out your fin designs, to demonstrate that they are actually useful.

**BETH:** Aerodynamic designs are important features of any safe rocket, and we’re quite glad that rocket scientists are working on these problems and taking safety seriously. That’s not the sort of problem that we at MIRI focus on, though.

**ALFONSO:** What’s the concern, then? Do you fear that spaceplanes may be developed by ill-intentioned people?

**BETH:** That’s not the failure mode we’re worried about right now. We’re more worried that right now, *nobody* can tell you how to point your rocket’s nose such that it goes to the moon, nor indeed *any* prespecified celestial destination. Whether Google or the US Government or North Korea is the one to launch the rocket won’t make a pragmatic difference to the probability of a successful Moon landing from our perspective, because right now *nobody knows how to aim any kind of rocket anywhere*.

**ALFONSO:** I’m not sure I understand.

**BETH:** We’re worried that even if you aim a rocket at the Moon, such that the nose of the rocket is clearly lined up with the Moon in the sky, the rocket won’t go to the Moon. We’re not sure what a realistic path from the Earth to the moon looks like, but we suspect it might not be a very straight path, and it may not involve pointing the nose of the rocket at the moon at all. We think the most important thing to do next is to advance our understanding of rocket trajectories until we have a better, deeper understanding of what we’ve started calling the “rocket alignment problem”. There are other safety problems, but this rocket alignment problem will probably take the most total time to work on, so it’s the most urgent.

**ALFONSO:** Hmm, that sounds like a bold claim to me. Do you have a reason to think that there are invisible barriers between here and the moon that the spaceplane might hit? Are you saying that it might get very very windy between here and the moon, more so than on Earth? Both eventualities could be worth preparing for, I suppose, but neither seem likely.

**BETH:** We don’t think it’s particularly likely that there are invisible barriers, no. And we don’t think it’s going to be especially windy in the celestial reaches — quite the opposite, in fact. The problem is just that we don’t yet know how to plot *any* trajectory that a vehicle could realistically take to get from Earth to the moon.

**ALFONSO:** Of course we can’t plot an actual trajectory; wind and weather are too unpredictable. But your claim still seems too strong to me. Just aim the spaceplane at the moon, go up, and have the pilot adjust as necessary. Why wouldn’t that work? Can you prove that a spaceplane aimed at the moon won’t go there?

**BETH:** We don’t think we can *prove* anything of that sort, no. Part of the problem is that realistic calculations are extremely hard to do in this area, after you take into account all the atmospheric friction and the movements of other celestial bodies and such. We’ve been trying to solve some drastically simplified problems in this area, on the order of assuming that there is no atmosphere and that all rockets move in perfectly straight lines. Even those unrealistic calculations strongly suggest that, in the much more complicated real world, just pointing your rocket’s nose at the Moon also won’t make your rocket end up at the Moon. I mean, the fact that the real world is more complicated doesn’t exactly make it any *easier* to get to the Moon.

**ALFONSO:** Okay, let me take a look at this “understanding” work you say you’re doing…

Huh. Based on what I’ve read about the math you’re trying to do, I can’t say I understand what it has to do with the Moon. Shouldn’t helping spaceplane pilots exactly target the Moon involve looking through lunar telescopes and studying exactly what the Moon looks like, so that the spaceplane pilots can identify particular features of the landscape to land on?

**BETH:** We think our present stage of understanding is much too crude for a detailed Moon map to be our next research target. We haven’t yet advanced to the point of targeting one crater or another for our landing. We can’t target *anything* at this point. It’s more along the lines of “figure out how to talk mathematically about curved rocket trajectories, instead of rockets that move in straight lines”. Not even realistically curved trajectories, right now, we’re just trying to get past straight lines at all –

**ALFONSO:** But planes on Earth move in curved lines all the time, because the Earth itself is curved. It seems reasonable to expect that future spaceplanes will also have the capability to move in curved lines. If your worry is that spaceplanes will only move in straight lines and miss the Moon, and you want to advise rocket engineers to build rockets that move in curved lines, well, that doesn’t seem to me like a great use of anyone’s time.

**BETH:** You’re trying to draw much too direct of a line between the math we’re working on right now, and actual rocket designs that might exist in the future. It’s *not* that current rocket ideas are almost right, and we just need to solve one or two more problems to make them work. The conceptual distance that separates anyone from solving the rocket alignment problem is *much greater* than that.

Right now everyone is *confused* about rocket trajectories, and we’re trying to become *less confused*. That’s what we need to do next, not run out and advise rocket engineers to build their rockets the way that our current math papers are talking about. Not until we stop being *confused* about extremely basic questions like why the Earth doesn’t fall into the Sun.

**ALFONSO:** I don’t think the Earth is going to collide with the Sun anytime soon. The Sun has been steadily circling the Earth for a long time now.

**BETH:** I’m not saying that our goal is to address the risk of the Earth falling into the Sun. What I’m trying to say is that if humanity’s present knowledge can’t answer questions like “Why doesn’t the Earth fall into the Sun?” then we don’t know very much about celestial mechanics and we won’t be able to aim a rocket through the celestial reaches in a way that lands softly on the Moon.

As an example of work we’re presently doing that’s aimed at improving our understanding, there’s what we call the “tiling positions” problem. The tiling positions problem is how to fire a cannonball from a cannon in such a way that the cannonball circumnavigates the earth over and over again, “tiling” its initial coordinates like repeating tiles on a tessellated floor –

**ALFONSO:** I read a little bit about your work on that topic. I have to say, it’s hard for me to see what firing things from cannons has to do with getting to the Moon. Frankly, it sounds an awful lot like Good Old-Fashioned Space Travel, which everyone knows doesn’t work. Maybe Jules Verne thought it was possible to travel around the earth by firing capsules out of cannons, but the modern study of high-altitude planes has completely abandoned the notion of firing things out of cannons. The fact that you go around talking about firing things out of cannons suggests to me that you haven’t kept up with all the innovations in airplane design over the last century, and that your spaceplane designs will be completely unrealistic.

**BETH:** We know that rockets will not actually be fired out of cannons. We really, really know that. We’re intimately familiar with the reasons why nothing fired out of a modern cannon is ever going to reach escape velocity. I’ve previously written several sequences of articles in which I describe why cannon-based space travel doesn’t work.

**ALFONSO:** But your current work is all about firing something out a cannon in such a way that it circles the earth over and over. What could that have to do with any realistic advice that you could give to a spaceplane pilot about how to travel to the Moon?

**BETH:** Again, you’re trying to draw much too straight a line between the math we’re doing right now, and direct advice to future rocket engineers.

We think that if we could find an angle and firing speed such that an ideal cannon, firing an ideal cannonball at that speed, on a perfectly spherical Earth with no atmosphere, would lead to that cannonball entering what we would call a “stable orbit” without hitting the ground, then… we might have understood something really fundamental and important about celestial mechanics.

Or maybe not! It’s hard to know in advance which questions are important and which research avenues will pan out. All you can do is figure out the next tractable-looking problem that confuses you, and try to come up with a solution, and hope that you’ll be less confused after that.

**ALFONSO:** You’re talking about the cannonball hitting the ground as a problem, and how you want to avoid that and just have the cannonball keep going forever, right? But real spaceplanes aren’t going to be aimed at the ground in the first place, and lots of regular airplanes manage to not hit the ground. It seems to me that this “being fired out of a cannon and hitting the ground” scenario that you’re trying to avoid in this “tiling positions problem” of yours just isn’t a failure mode that real spaceplane designers would need to worry about.

**BETH:** We are not worried about real rockets being fired out of cannons and hitting the ground. That is not why we’re working on the tiling positions problem. In a way, you’re being far too optimistic about how much of rocket alignment theory is already solved! We’re not so close to understanding how to aim rockets that the kind of designs people are talking about now *would* work if only we solved a particular set of remaining difficulties like not firing the rocket into the ground. You need to go more meta on understanding the kind of progress we’re trying to make.

We’re working on the tiling positions problem because we think that being able to fire a cannonball at a certain instantaneous velocity such that it enters a stable orbit… is the sort of problem that somebody who could really actually launch a rocket through space and have it move in a particular curve that really actually ended with softly landing on the Moon would be able to solve *easily*. So the fact that we can’t solve it is alarming. If we can figure out how to solve this much simpler, much more crisply stated “tiling positions problem” with imaginary cannonballs on a perfectly spherical earth with no atmosphere, which is a lot easier to analyze than a Moon launch, we might thereby take one more incremental step towards eventually becoming the sort of people who could plot out a Moon launch.

**ALFONSO:** If you don’t think that Jules-Verne-style space cannons are the wave of the future, I don’t understand why you keep talking about cannons in particular.

**BETH:** Because there’s a lot of sophisticated mathematical machinery already developed for aiming cannons. People have been aiming cannons and plotting cannonball trajectories since the sixteenth century. We can take advantage of that existing mathematics to say exactly how, if we fired an ideal cannonball in a certain direction, it would plow into the ground. If we try talking about rockets with realistically varying acceleration, we can’t even manage to prove that a rocket like that *won’t* travel around the Earth in a perfect square, because with all that realistically varying acceleration and realistic air friction it’s impossible to make any sort of definite statement one way or another. Our present understanding isn’t up to it.
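(*An aside from our own world: the idealized setup Beth describes, an ideal cannonball above a perfectly spherical, airless Earth, can be sketched numerically. This is only an illustration, not MIRI's actual math; the 200 km firing altitude, step size, and integrator are arbitrary choices of mine:*)

```python
import math

# Ideal cannonball above a perfectly spherical, airless Earth,
# integrated with semi-implicit Euler steps.
GM = 3.986e14        # Earth's gravitational parameter, m^3/s^2
R = 6.371e6          # Earth's radius, m
r0 = R + 200e3       # firing altitude: 200 km up (an arbitrary choice)

def final_radius(speed, steps=10000, dt=1.0):
    """Fire horizontally at `speed` (m/s); return the distance from
    Earth's center when the simulation ends or the ground is hit."""
    x, y = r0, 0.0
    vx, vy = 0.0, speed
    for _ in range(steps):
        r = math.hypot(x, y)
        if r < R:                     # plowed into the ground
            return r
        vx -= GM * x / r**3 * dt      # gravity points at Earth's center
        vy -= GM * y / r**3 * dt
        x += vx * dt
        y += vy * dt
    return math.hypot(x, y)

v_circ = math.sqrt(GM / r0)   # ~7.8 km/s: the "stable orbit" speed
# Below that speed, the cannonball plows into the ground; at that
# speed, it keeps circling at a nearly constant radius, "tiling"
# its initial position over and over.
```

(*The "tiling" answer falls out of the condition that gravitational acceleration exactly supplies the circular acceleration, `v² / r = GM / r²`, which is the kind of crisp statement the idealization buys you.*)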

**ALFONSO:** Okay, another question in the same vein. Why is MIRI sponsoring work on adding up lots of tiny vectors? I don’t even see what that has to do with rockets in the first place; it seems like this weird side problem in abstract math.

**BETH:** It’s more like… at several points in our investigation so far, we’ve run into the problem of going from a function about time-varying accelerations to a function about time-varying positions. We kept running into this problem as a blocking point in our math, in several places, so we branched off and started trying to analyze it explicitly. Since it’s about the pure mathematics of points that don’t move in discrete intervals, we call it the “logical undiscreteness” problem. Some of the ways of investigating this problem involve trying to add up lots of tiny, varying vectors to get a big vector. Then we talk about how that sum seems to change more and more slowly, approaching a limit, as the vectors get tinier and tinier and we add up more and more of them… or at least that’s one avenue of approach.
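(*Another aside from our own world: the limiting process Beth gestures at is what we would call a Riemann sum, recovering a change in position by summing velocity-times-timeslice vectors. A minimal sketch; the particular velocity function `v(t) = (cos t, sin t)` is an illustrative choice of mine:*)

```python
import math

def summed_displacement(n_steps, t_end=math.pi):
    """Approximate the displacement of a point with time-varying
    velocity v(t) = (cos t, sin t) by adding up n_steps tiny vectors."""
    dt = t_end / n_steps
    dx = dy = 0.0
    for i in range(n_steps):
        t = i * dt
        dx += math.cos(t) * dt   # one tiny vector per time slice
        dy += math.sin(t) * dt
    return dx, dy

# As the vectors get tinier and more numerous, the sum changes more
# and more slowly, approaching the limit (0, 2):
for n in (10, 100, 1000):
    print(summed_displacement(n))
```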

**ALFONSO:** I just find it hard to imagine people in future spaceplane rockets staring out their viewports and going, “Oh, no, we don’t have tiny enough vectors with which to correct our course! If only there was some way of adding up even more vectors that are even smaller!” I’d expect future calculating machines to do a pretty good job of that already.

**BETH:** Again, you’re trying to draw much too straight a line between the work we’re doing now, and the implications for future rocket designs. It’s not like we think a rocket design will almost work, but the pilot won’t be able to add up lots of tiny vectors fast enough, so we just need a faster algorithm and then the rocket will get to the Moon. This is foundational mathematical work that we think might play a role in multiple basic concepts for understanding celestial trajectories. When we try to plot out a trajectory that goes all the way to a soft landing on a moving Moon, we feel confused and blocked. We think part of the confusion comes from not being able to go from acceleration functions to position functions, so we’re trying to resolve our confusion.

**ALFONSO:** This sounds suspiciously like a philosophy-of-mathematics problem, and I don’t think that it’s possible to progress on spaceplane design by doing philosophical research. The field of philosophy is a stagnant quagmire. Some philosophers still believe that going to the moon is impossible; they say that the celestial plane is fundamentally separate from the earthly plane and therefore inaccessible, which is clearly silly. Spaceplane design is an engineering problem, and progress will be made by engineers.

**BETH:** I agree that rocket design will be carried out by engineers rather than philosophers. I also share some of your frustration with philosophy in general. For that reason, we stick to well-defined mathematical questions that are likely to have actual answers, such as questions about how to fire a cannonball on a perfectly spherical planet with no atmosphere such that it winds up in a stable orbit.

This often requires developing new mathematical frameworks. For example, in the case of the logical undiscreteness problem, we’re developing methods for translating between time-varying accelerations and time-varying positions. You can call the development of new mathematical frameworks “philosophical” if you’d like — but if you do, remember that it’s a very different kind of philosophy than the “speculate about the heavenly and earthly planes” sort, and that we’re always pushing to develop new mathematical frameworks or tools.

**ALFONSO:** So from the perspective of the public good, what’s a good thing that might happen if you solved this logical undiscreteness problem?

**BETH:** Mainly, we’d be less confused and our research wouldn’t be blocked and humanity could actually land on the Moon someday. To try and make it more concrete – though it’s hard to do that without actually knowing the concrete solution – we might be able to talk about incrementally more realistic rocket trajectories, because our mathematics would no longer break down as soon as we stopped assuming that rockets moved in straight lines. Our math would be able to talk about exact curves, instead of a series of straight lines that approximate the curve.

**ALFONSO:** An exact curve that a rocket follows? This gets me into the main problem I have with your project in general. I just don’t believe that any future rocket design will be the sort of thing that can be analyzed with absolute, perfect precision so that you can get the rocket to the Moon based on an absolutely plotted trajectory with no need to steer. That seems to me like a bunch of mathematicians who have no clue how things work in the real world, wanting everything to be perfectly calculated. Look at the way Venus moves in the sky; usually it travels in one direction, but sometimes it goes retrograde in the other direction. We’ll just have to steer as we go.

**BETH:** That’s not what I meant by talking about exact curves… Look, even if we can invent logical undiscreteness, I agree that it’s futile to try to predict, in advance, the precise trajectories of all of the winds that will strike a rocket on its way off the ground. Though I’ll mention parenthetically that things might actually become calmer and easier to predict, once a rocket gets sufficiently high up –

**ALFONSO:** Why?

**BETH:** Let’s just leave that aside for now, since we both agree that rocket positions are hard to predict exactly during the atmospheric part of the trajectory, due to winds and such. And yes, if you can’t exactly predict the initial trajectory, you can’t exactly predict the later trajectory. So, indeed, the proposal is definitely not to have a rocket design so perfect that you can fire it at exactly the right angle and then walk away without the pilot doing any further steering. The point of doing rocket math isn’t that you want to predict the rocket’s exact position at every microsecond, in advance.

**ALFONSO:** Then why obsess over pure math that’s too simple to describe the rich, complicated real universe where sometimes it rains?

**BETH:** It’s true that a real rocket isn’t a simple equation on a board. It’s true that there are all sorts of aspects of a real rocket’s shape and internal plumbing that aren’t going to have a mathematically compact characterization. What MIRI is doing isn’t the right degree of mathematization for all rocket engineers for all time; it’s the mathematics for us to be using right now (or so we hope).

To build up the field’s understanding incrementally, we need to talk about ideas whose consequences can be pinpointed precisely enough that people can analyze scenarios in a shared framework. We need enough precision that someone can say, “I think in scenario X, design Y does Z”, and someone else can say, “No, in scenario X, Y actually does W”, and the first person responds, “Darn, you’re right. Well, is there some way to change Y so that it would do Z?”

If you try to make things realistically complicated at this stage of research, all you’re left with is verbal fantasies. When we try to talk to someone with an enormous flowchart of all the gears and steering rudders they think should go into a rocket design, and we try to explain why a rocket pointed at the Moon doesn’t necessarily end up at the Moon, they just reply, “Oh, my rocket won’t do *that*.” Their ideas have enough vagueness and flex and underspecification that they’ve achieved the safety of nobody being able to prove to them that they’re wrong. It’s impossible to incrementally build up a body of collective knowledge that way.

The goal is to start building up a library of tools and ideas we can use to discuss trajectories formally. Some of the key tools for formalizing and analyzing *intuitively* plausible-seeming trajectories haven’t yet been expressed using math, and we can live with that for now. We still try to find ways to represent the key ideas in mathematically crisp ways whenever we can. That’s not because math is so neat or so prestigious; it’s part of an ongoing project to have arguments about rocketry that go beyond “Does not!” vs. “Does so!”

**ALFONSO:** I still get the impression that you’re reaching for the warm, comforting blanket of mathematical reassurance in a realm where mathematical reassurance doesn’t apply. We can’t obtain a mathematical certainty of our spaceplanes being absolutely sure to reach the Moon with nothing going wrong. That being the case, there’s no point in trying to pretend that we can use mathematics to get absolute guarantees about spaceplanes.

**BETH:** Trust me, I am not going to feel “reassured” about rocketry no matter what math MIRI comes up with. But, yes, of course you can’t obtain a mathematical assurance of any physical proposition, nor assign probability 1 to any empirical statement.

**ALFONSO:** Yet you talk about proving theorems – proving that a cannonball will go in circles around the earth indefinitely, for example.

**BETH:** Proving a theorem about a rocket’s trajectory won’t ever let us feel comfortingly certain about where the rocket is actually going to end up. But if you can prove a theorem which says that your rocket would go to the Moon if it launched in a perfect vacuum, maybe you can attach some steering jets to the rocket and then have it actually go to the Moon in real life. Not with 100% probability, but with probability greater than zero.

The point of our work isn’t to take current ideas about rocket aiming from a 99% probability of success to a 100% chance of success. It’s to get past an approximately 0% chance of success, which is where we are now.

**ALFONSO:** Zero percent?!

**BETH:** Modulo Cromwell’s Rule, yes, zero percent. If you point a rocket’s nose at the Moon and launch it, it does not go to the Moon.

**ALFONSO:** I don’t think future spaceplane engineers will actually be that silly, if direct Moon-aiming isn’t a method that works. They’ll lead the Moon’s current motion in the sky, and aim at the part of the sky where the Moon will appear on the day the spaceplane is a Moon’s distance away. I’m a bit worried that you’ve been talking about this problem so long without considering such an obvious idea.

**BETH:** We considered that idea very early on, and we’re pretty sure that it still doesn’t get us to the Moon.

**ALFONSO:** What if I add steering fins so that the rocket moves in a more curved trajectory? Can you prove that no version of that class of rocket designs will go to the Moon, no matter what I try?

**BETH:** Can you sketch out the trajectory that you think your rocket will follow?

**ALFONSO:** It goes from the Earth to the Moon.

**BETH:** In a bit more detail, maybe?

**ALFONSO:** No, because in the real world there are always variable wind speeds, we don’t have infinite fuel, and our spaceplanes don’t move in perfectly straight lines.

**BETH:** Can you sketch out a trajectory that you think a simplified version of your rocket will follow, so we can examine the assumptions your idea requires?

**ALFONSO:** I just don’t believe in the general methodology you’re proposing for spaceplane designs. We’ll put on some steering fins, turn the wheel as we go, and keep the Moon in our viewports. If we’re off course, we’ll steer back.

**BETH:** … We’re actually a bit concerned that standard steering fins may stop working once the rocket gets high enough, so you won’t actually find yourself able to correct course by much once you’re in the celestial reaches – like, if you’re already on a good course, you can correct it, but if you screwed up, you won’t just be able to turn around like you could turn around an airplane –

**ALFONSO:** Why not?

**BETH:** We can go into that topic too; but even given a simplified model of a rocket that you *could* steer, a walkthrough of the steps along the path that simplified rocket would take to the Moon would be an important step in moving this discussion forward. Celestial rocketry is a domain that we expect to be unusually difficult – even compared to building rockets on Earth, which is already a famously hard problem because they usually just explode. It’s not that everything has to be neat and mathematical. But the overall difficulty is such that, in a proposal like “lead the moon in the sky,” if the core ideas don’t have a certain amount of solidity about them, it would be equivalent to firing your rocket randomly into the void.

If it feels like you don’t know for sure whether your idea works, but that it might work; if your idea has many plausible-sounding elements, and to you it feels like nobody has been able to *convincingly* explain to you how it would fail; then, in real life, that proposal has a roughly 0% chance of steering a rocket to the Moon.

If it seems like an idea is extremely solid and clearly well-understood, if it feels like this proposal should definitely take a rocket to the Moon without fail in good conditions, then maybe under the best-case conditions we should assign an 85% subjective credence in success, or something in that vicinity.

**ALFONSO:** So uncertainty automatically means failure? This is starting to sound a bit paranoid, honestly.

**BETH:** The idea I’m trying to communicate is something along the lines of, “If you can reason rigorously about why a rocket should definitely work in principle, it might work in real life, but if you have anything less than that, then it definitely won’t work in real life.”

I’m not asking you to give me an absolute mathematical proof of empirical success. I’m asking you to give me something more like a sketch for how a simplified version of your rocket could move, that’s sufficiently determined in its meaning that you can’t just come back and say “Oh, I didn’t mean *that*” every time someone tries to figure out what it actually does or pinpoint a failure mode.

This isn’t an unreasonable demand that I’m imposing to make it impossible for any ideas to pass my filters. It’s the primary bar all of us have to pass to contribute to collective progress in this field. And a rocket design which can’t even pass that conceptual bar has roughly a 0% chance of landing softly on the Moon.


The post September 2018 Newsletter appeared first on Machine Intelligence Research Institute.

On the fundraising side, we received a $489,000 grant from the Long-Term Future Fund, a $150,000 AI Safety Retraining Program grant from the Open Philanthropy Project, and an amazing surprise $1.02 million grant from “Anonymous Ethereum Investor #2”!

- New research forum posts: Reducing Collective Rationality to Individual Optimization in Common-Payoff Games Using MCMC; History of the Development of Logical Induction
- We spoke at the Human-Aligned AI Summer School in Prague.
- MIRI advisor Blake Borgeson has joined our Board of Directors, and DeepMind Research Scientist Victoria Krakovna has become a MIRI research advisor.

- The Open Philanthropy Project is accepting a new round of applicants for its AI Fellows Program.


The post Summer MIRI Updates appeared first on Machine Intelligence Research Institute.

In our last major updates—our 2017 strategic update and fundraiser posts—we said that our current focus is on technical research and executing our biggest-ever hiring push. Our supporters responded with an incredible show of support at the end of the year, putting us in an excellent position to execute on our most ambitious growth plans.

In this post, I’d like to provide some updates on our recruiting efforts and successes, announce some major donations and grants that we’ve received, and provide some other miscellaneous updates.

In brief, our major announcements are:

- We have **two new full-time research staff** hires to announce.
- We’ve received **$1.7 million in major donations and grants**, $1 million of which came through a tax-advantaged fund for Canadian MIRI supporters.

For more details, see below.

I’m happy to announce the addition of two new research staff to the MIRI team:

**Buck Shlegeris**: Before joining MIRI, Buck worked as a software engineer at PayPal, and he was the first employee at Triplebyte. He previously studied at the Australian National University, majoring in CS and minoring in math and physics, and he has presented work on data structure synthesis at industry conferences. In addition to his research at MIRI, Buck is also helping with recruiting.

**Ben Weinstein-Raun**: Ben joined MIRI after spending two years as a software engineer at Cruise Automation, where he worked on the planning and prediction teams. He previously worked at Counsyl on their automated genomics lab, and helped to found Hacksburg, a hackerspace in Blacksburg, Virginia. He holds a BS from Virginia Tech, where he studied computer engineering.

This year we’ve run a few different programs to help us work towards our hiring goals, and to more generally increase the number of people doing AI alignment research:

- We’ve been co-running a **series of invite-only workshops** with the Center for Applied Rationality (CFAR), targeted at potential future hires who have strong engineering backgrounds. Participants report really enjoying the workshops, and we’ve found them very useful for getting to know potential research staff hires.^{1} If you’d be interested in attending one of these workshops, send Buck an email.

- We helped run the **AI Summer Fellows Program** with CFAR. We had a large and extremely strong pool of applicants, with over 170 applications for 30 slots (versus 50 applications for 20 slots in 2017). The program this year was more mathematically flavored than in 2017, and concluded with a flurry of new analyses by participants. On the whole, the program seems to have been more successful at digging into AI alignment problems than in previous years, as well as more successful at seeding ongoing collaborations between participants, and between participants and MIRI staff.

- We ran a ten-week **research internship program** this summer, from June through August.^{2} This included our six interns attending AISFP and pursuing a number of independent lines of research, with a heavy focus on tiling agents. Among other activities, interns looked for Vingean reflection in expected utility maximizers, distilled early research on subsystem alignment, and built on Abram’s Complete Class Theorems approach to decision theory.

In related news, we’ve been restructuring and growing our operations team to ensure we’re well positioned to support the research team as we grow. Alex Vermeer has taken on a more general support role as our process and projects head. In addition to his donor relationships and fundraising focus, Colm Ó Riain has taken on a central role in our recruiting efforts as our head of growth. Aaron Silverbook is now heading operations; we’ve brought on Carson Jones as our new office manager; and long-time remote MIRI contractor Jimmy Rintjema is now our digital infrastructure lead.^{3}

On the fundraising side, I’m happy to announce that we’ve received several major donations and grants.

First, following our $1.01 million donation from an anonymous Ethereum investor in 2017, we’ve received a huge new donation of **$1.02 million** from “Anonymous Ethereum Investor #2”, based in Canada! The donation was made through Rethink Charity Forward’s recently established tax-advantaged fund for Canadian MIRI supporters.^{4}

Second, the departing administrator of the Long-Term Future Fund, Nick Beckstead, has recommended a **$489,000** grant to MIRI, aimed chiefly at funding improvements to organizational efficiency and staff productivity.

Together, these contributions have helped ensure that we remain in the solid position we were in following our 2017 fundraiser, as we attempt to greatly scale our team size. Our enormous thanks for this incredible support, and further thanks to RC Forward and the Centre for Effective Altruism for helping build the infrastructure that made these contributions possible.

We’ve also received a **$150,000 AI Safety Retraining Program** grant from the Open Philanthropy Project to provide stipends and guidance to a few highly technically skilled individuals. The goal of the program is to free up 3–6 months of time for strong candidates to spend on retraining, so that they can potentially transition to full-time work on AI alignment. Buck is currently selecting candidates for the program; to date, we’ve made two grants to individuals.^{5}

The LessWrong development team has launched a beta for the **AI Alignment Forum**, a new research forum for technical AI safety work that we’ve been contributing to. I’m very grateful to the LW team for taking on this project, and I’m really looking forward to the launch of the new forum.

Finally, we’ve made substantial progress on the **tiling problem**, which we’ll likely be detailing later this year. See our March research plans and predictions write-up for more on our research priorities.

We’re very happy about these newer developments, and we’re particularly excited to have Buck and Ben on the team. We have a few more big announcements coming up in the not-so-distant future, so stay tuned.

1. Ben was a workshop participant, which eventually led to him coming on board at MIRI.
2. We also have another research intern joining us in the fall.
3. We’ve long considered Jimmy to be full-time staff, but he isn’t officially an employee since he lives in Canada.
4. H/T to Colm for setting up a number of tax-advantaged giving channels for international donors. If you’re a MIRI supporter outside the US, make sure to check out our Tax-Advantaged Donations page.
5. We aren’t taking formal applications, but if you’re particularly interested in the program or have questions, you’re welcome to shoot Buck an email.

The post Summer MIRI Updates appeared first on Machine Intelligence Research Institute.

The post August 2018 Newsletter appeared first on Machine Intelligence Research Institute.

- New posts to the new AI Alignment Forum: Buridan’s Ass in Coordination Games; Probability is Real, and Value is Complex; Safely and Usefully Spectating on AIs Optimizing Over Toy Worlds
- MIRI Research Associate Vadim Kosoy wins a $7500 AI Alignment Prize for “The Learning-Theoretic AI Alignment Research Agenda.” Applications for the prize’s next round will be open through December 31.
- Interns from MIRI and the Center for Human-Compatible AI collaborated at an AI safety research workshop.
- This year’s AI Summer Fellows Program was very successful, and its one-day blogathon resulted in a number of interesting write-ups, such as Dependent Type Theory and Zero-Shot Reasoning, Conceptual Problems with Utility Functions (and follow-up), Complete Class: Consequentialist Foundations, and Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet.
- See Rohin Shah’s alignment newsletter for more discussion of recent posts to the new AI Alignment Forum.

- The Future of Humanity Institute is seeking project managers for its Research Scholars Programme and its Governance of AI Program.

The post July 2018 Newsletter appeared first on Machine Intelligence Research Institute.

- A new paper: “Forecasting Using Incomplete Models”
- New research write-ups and discussions: Prisoners’ Dilemma with Costs to Modeling; Counterfactual Mugging Poker Game; Optimization Amplifies
- Eliezer Yudkowsky, Paul Christiano, Jessica Taylor, and Wei Dai discuss Alex Zhu’s FAQ for Paul’s research agenda.
- We attended EA Global in SF, and gave a short talk on “Categorizing Variants of Goodhart’s Law.”
- Roman Yampolskiy’s forthcoming anthology, *Artificial Intelligence Safety and Security*, includes reprinted papers by Nate Soares (“The Value Learning Problem”) and by Nick Bostrom and Eliezer Yudkowsky (“The Ethics of Artificial Intelligence”).
- Stuart Armstrong’s 2014 primer on AI risk, *Smarter Than Us: The Rise of Machine Intelligence*, is now available as a free web book at smarterthan.us.

- OpenAI announces that their OpenAI Five system “has started to defeat amateur human teams at Dota 2” (plus an update). Discussion on LessWrong and Hacker News.
- Rohin Shah, a PhD student at the Center for Human-Compatible AI, comments on recent alignment-related results in his regularly updated Alignment Newsletter.

The post New paper: “Forecasting using incomplete models” appeared first on Machine Intelligence Research Institute.

We consider the task of forecasting an infinite sequence of future observations based on some number of past observations, where the probability measure generating the observations is “suspected” to satisfy one or more of a set of incomplete models, i.e., convex sets in the space of probability measures.

This setting is in some sense intermediate between the realizable setting where the probability measure comes from some known set of probability measures (which can be addressed using e.g. Bayesian inference) and the unrealizable setting where the probability measure is completely arbitrary.

We demonstrate a method of forecasting which guarantees that, whenever the true probability measure satisfies an incomplete model in a given countable set, the forecast converges to the same incomplete model in the (appropriately normalized) Kantorovich-Rubinstein metric. This is analogous to merging of opinions for Bayesian inference, except that convergence in the Kantorovich-Rubinstein metric is weaker than convergence in total variation.
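To make the notion of an incomplete model concrete, here is a hedged toy sketch, not the paper’s actual forecasting method: an incomplete model over i.i.d. coin flips can be represented as the convex set of measures whose single-flip bias lies in an interval. The helper names (`satisfies_incomplete_model`, `forecast`), the `slack` heuristic, and the interval projection are all invented for illustration; the paper’s guarantees concern the Kantorovich–Rubinstein metric and countable sets of models, neither of which this sketch attempts.

```python
import math

# Toy sketch (illustrative assumptions, NOT the paper's algorithm):
# an "incomplete model" over coin flips, represented as the convex set
# of measures whose single-flip bias p lies in [p_low, p_high]. The
# model constrains the bias but says nothing else about the sequence.

def satisfies_incomplete_model(p_low, p_high, empirical_freq, n, slack=2.0):
    """Crude consistency check: is the observed frequency within ~slack
    worst-case standard deviations of the interval [p_low, p_high]?"""
    margin = slack * math.sqrt(0.25 / max(n, 1))  # worst-case sd of the mean
    return p_low - margin <= empirical_freq <= p_high + margin

def forecast(history, p_low=0.4, p_high=0.6):
    """Naive forecaster: if the data looks consistent with the incomplete
    model, project the empirical frequency onto the interval (the nearest
    bias satisfying the model); otherwise fall back on the raw frequency."""
    n = len(history)
    freq = sum(history) / n if n else 0.5
    if satisfies_incomplete_model(p_low, p_high, freq, n):
        return min(max(freq, p_low), p_high)  # project into [p_low, p_high]
    return freq

print(forecast([1, 0, 1, 1, 0, 1]))  # consistent sample, projected to 0.6
print(forecast([1] * 100))           # all heads: model rejected, returns 1.0
```

The point of the sketch is only to convey what it means for data to “satisfy” a convex set of measures: when the data is compatible with the set, the forecaster commits to the set’s constraints without pretending to know a single generating measure.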

Kosoy’s work builds on logical inductors to create a cleaner (purely learning-theoretic) formalism for modeling complex environments, showing that the methods developed in “Logical induction” are useful for applications in classical sequence prediction unrelated to logic.

“Forecasting using incomplete models” also shows that the intuitive concept of an “incomplete” or “partial” model has an elegant and useful formalization related to Knightian uncertainty. Additionally, Kosoy shows that using incomplete models to generalize Bayesian inference allows an agent to make predictions about environments that can be as complex as the agent itself, or more complex — as contrasted with classical Bayesian inference.

For more of Kosoy’s research, see “Optimal polynomial-time estimators” and the Intelligent Agent Foundations Forum.

*Get notified every time a new technical paper is published.*
