Fast and frugal trees allow you to make rapid decisions based on a few pieces of information. You can easily carry them out in your head. Surprisingly, the accuracy of these decisions rivals those made by gold-standard methods like logistic regression, especially when predicting out of sample.

Intrigued? Check out this post by Nathaniel Phillips and the new R package he’s created to construct, visualize, and test fast and frugal trees. For all you judgment and decision making researchers out there, Phillips will also be presenting the package at the annual meeting of the Society for Judgment and Decision Making (SJDM) in Boston in November 2016. If you know R, you could be building fast and frugal trees today!
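To make the idea concrete, here is a minimal sketch of how a fast and frugal tree decides, written in Python rather than with Phillips’s R package; the triage cues and their ordering are invented for illustration:

```python
# A fast and frugal tree asks about one cue at a time; each cue either
# exits with a decision or passes to the next cue. This hypothetical
# triage tree (cues invented for illustration) decides whether a
# chest-pain patient is high risk.

def triage(st_segment_elevated, chief_complaint_chest_pain, other_risk_factors):
    """Classify a patient as 'high risk' or 'low risk' using three cues."""
    if st_segment_elevated:             # cue 1: exit 'high risk' if present
        return "high risk"
    if not chief_complaint_chest_pain:  # cue 2: exit 'low risk' if absent
        return "low risk"
    # cue 3: the final cue must exit in both directions
    return "high risk" if other_risk_factors else "low risk"

print(triage(False, True, False))  # -> low risk
```

Note the structure that makes such trees easy to carry out in your head: every cue but the last has a single exit, so most decisions are made after one or two questions.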

The post Turn your tough decisions into simple rules appeared first on Decision Science News.

The Operations, Information and Decisions Department at the Wharton School is home to faculty with a diverse set of interests in behavioral economics, decision-making, information technology, information-based strategy, operations management, and operations research. We are seeking applicants for a full-time, tenure-track faculty position at any level: Assistant, Associate, or Full Professor. Applicants must have a Ph.D. (expected completion by June 2017 is preferred but by June 30, 2018 is acceptable) from an accredited institution and have an outstanding research record or potential in the OID Department’s areas of research. The appointment is expected to begin July 1, 2017.

More information about the Department is available at:

https://oid.wharton.upenn.edu/index.cfm

Interested individuals should complete and submit an online application via our secure website, and must include:

• A curriculum vitae

• A job market paper

• (Applicants for an Assistant Professor position) Three letters of recommendation submitted by references

To apply, please visit this web site:

https://oid.wharton.upenn.edu/faculty/faculty-recruiting/

Further materials, including (additional) papers and letters of recommendation, will be requested as needed.

To ensure full consideration, materials should be received by November 1st, 2016.

Contact:

OID Department

The Wharton School

University of Pennsylvania

3730 Walnut Street

500 Jon M. Huntsman Hall

Philadelphia, PA 19104-6340

The University of Pennsylvania is an affirmative action/equal opportunity employer. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.

The post Professorship in Operations, Information, and Decisions Department at Wharton appeared first on Decision Science News.

It just got a lot easier to simulate the performance of simple heuristics.

Jean Czerlinski Whitmore, a software engineer at Google with a long history in modeling cognition, and Daniel Barkoczi, a postdoctoral fellow at the Max Planck Institute for Human Development, have created heuristica: an R package to model the performance of simple heuristics. It implements the heuristics covered in the first chapters of Simple Heuristics That Make Us Smart, such as Take The Best, the unit-weighted linear model, and more. The package also includes data, such as the original German cities data set, which has become a benchmark for testing heuristic models of choice, cited in hundreds of papers.

A good place to start is the README, along with the vignettes:

- Comparing the performance of simple heuristics using the Heuristica R package
- How to make your own heuristic
- Reproducing Results (Take The Best vs. regression on city population)

Here’s the heuristica package’s home on CRAN and here’s a description of the package in the authors’ own words:

The heuristica R package implements heuristic decision models, such as Take The Best (TTB) and a unit-weighted linear model. The models are designed for two-alternative choice tasks, such as which of two schools has a higher drop-out rate. The package also wraps more well-known models like regression and logistic regression into the two-alternative choice framework so all these models can be assessed side-by-side. It provides functions to measure accuracy, such as an overall `percentCorrect` and, for advanced users, some confusion matrix functions. These measures can be applied in-sample or out-of-sample. The goal is to make it easy to explore the range of conditions in which simple heuristics are better than more complex models. Optimizing is not always better!
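To give a flavor of what Take The Best does, here is an illustrative re-implementation in Python; this is a sketch of the heuristic itself, not the heuristica package’s API:

```python
# Take The Best for a two-alternative choice: check cues in order of
# validity; the first cue that discriminates between the two options
# decides. Cue values are 1, 0, or None (unknown). Illustrative
# sketch, not the heuristica package's implementation.

def take_the_best(cues_a, cues_b):
    """Return 'A', 'B', or 'guess' given cue vectors ordered by validity."""
    for a, b in zip(cues_a, cues_b):
        if a is None or b is None or a == b:
            continue                 # cue doesn't discriminate; move on
        return "A" if a > b else "B"
    return "guess"                   # no cue discriminated

# Which of two cities is larger? Cues might be has-soccer-team,
# is-a-capital, has-university, in decreasing order of validity.
print(take_the_best([1, 1, 0], [1, 0, 1]))  # second cue decides -> A
```

Notice that the decision stops at the first discriminating cue: the third cue favors B, but it is never consulted.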

The post Heuristica: An R package for testing models of binary choice appeared first on Decision Science News.

Some experts seem pretty sure Donald Trump will be the next US President. Michael Moore wrote an article entitled 5 Reasons Why Trump Will Win.

Prediction maven Nate Silver warned a few days ago: “Don’t think people are really grasping how plausible it is that Trump could become president. It’s a close election right now.”

Despite this, we think that Hillary Clinton is going to win.

And we’ve put our money where our mouth is. There’s a prediction market called PredictIt in which US citizens in most states can legally bet on events happening or not. There’s an $850 limit on any contract, but you can get around it in the following way.

As the figure up top shows, we’ve placed two bets:

- We bet $799.50 that the next President will not be a Republican. That is, we bought 1,230 shares of “no” on that contract at 65 cents each. If the next President is indeed not a Republican, we’ll be able to sell those shares for a dollar each, or $1,230. Otherwise we lose our money.
- We bet $849.87 that Hillary Clinton will be the next President. That is, we bought 1349 shares of “yes” on that contract at 63 cents. If Hillary wins, we’ll be able to sell our shares for $1349. Otherwise we lose our money.

So, we’ve bet $1,649.37. If Hillary wins, we’ll have $2,579 (minus the market’s 10% fee on profits). If Trump or some other Republican wins, we’ll have bupkis.
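If you want to check the arithmetic, here it is in a few lines of Python, using the share counts implied by the dollar amounts above; as in the text, the market’s 10% fee on profits is left out of the payout figure:

```python
# Check the combined-bet arithmetic: cost of the two positions and
# the payout if both pay off ($1 per winning share). PredictIt's 10%
# fee on profits is deliberately omitted here, as in the post's totals.

no_gop_shares, no_gop_price = 1230, 0.65   # "no" on next-President-is-Republican
clinton_shares, clinton_price = 1349, 0.63  # "yes" on Clinton-wins

cost = no_gop_shares * no_gop_price + clinton_shares * clinton_price
payout = no_gop_shares * 1.00 + clinton_shares * 1.00

print(round(cost, 2))  # -> 1649.37
print(payout)          # -> 2579.0
```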

The post We’ve bet that Hillary Clinton will win appeared first on Decision Science News.

Decision Science News readers presumably like to read about judgment and decision making, behavioral economics, and policy research. They may therefore be interested in the Behavioral Science and Policy Association’s PolicyShop blog, which covers all these topics.

The post Behavioral Science and Policy Association PolicyShop Blog appeared first on Decision Science News.

This winter, the Consumer Financial Protection Bureau (CFPB) will host its second research conference on consumer finance.

We encourage the submission of a variety of research. This includes, but is not limited to, work on: the ways consumers and households make decisions about borrowing, saving, and financial risk-taking; how various forms of credit (mortgage, student loans, credit cards, installment loans, etc.) affect household well-being; the structure and functioning of consumer financial markets; distinct and underserved populations; and relevant innovations in modeling or data. A particular area of interest for the CFPB is the dynamics of households’ balance sheets.

A deliberate aim of the conference is to connect the core community of consumer finance researchers and policymakers with the best research being conducted across the wide range of disciplines and approaches that can inform the topic. Disciplines from which we hope to receive submissions include, but are not limited to, economics, the behavioral sciences, cognitive science, and psychology.

The conference’s scientific committee includes:

- Adair Morse (University of California Berkeley, Haas School of Business)
- Annette Vissing-Jorgensen (University of California Berkeley, Haas School of Business)
- Colin Camerer (California Institute of Technology)
- Eric Johnson (Columbia University, Columbia Business School)
- Jonathan Levin (Stanford University)
- Jonathan Parker (Massachusetts Institute of Technology, Sloan School of Management)
- José-Victor Rios-Rull (University of Pennsylvania)
- Judy Chevalier (Yale School of Management)
- Matthew Rabin (Harvard University)
- Susan Dynarski (University of Michigan)

Authors may submit complete papers or detailed abstracts that include preliminary results. All submissions should be made in electronic PDF format to CFPB ResearchConference at cfpb.gov by Friday, August 26th, 2016.

Please remember to include contact information on the cover page for the corresponding author. Please submit questions or concerns to Worthy.Cho at cfpb.gov.

The post Second Consumer Financial Protection Bureau (CFPB) Research Conference on Consumer Finance (December 15-16th, 2016) appeared first on Decision Science News.

At Decision Science News, we are always on the lookout for rules of thumb.

Our colleague Justin Rao was thinking it would be useful to express calories as a percentage of daily calories. So instead of a Coke being 150 calories, you could think of it as 7.5% of your daily calories. Or whatever. The whatever is key.

This is an example of putting unfamiliar numbers in perspective.

So, we were then interested to see if there would be an easy rule of thumb for people to calculate how many calories per day they should be eating, so that they could re-express foods as a percentage of that.

We found some calorie guidelines on the Web published by the US government. With the help of Jake Hofman, we used Hadley Wickham’s rvest package to scrape them and his other tools to process and plot them.

The result is above. If you have any ideas on how to fit it elegantly, let us know.

We tried a number of fits. Lines are good for heuristics, so we made a bi-linear fit to the raw data (in points). We’re all grownups reading this blog, so let’s focus on the lines to the right of the peak.

Time to make the heuristics. For women, you need about 65 fewer calories per day for every decade after age 20. For men, you need about 105 fewer calories per day for every decade after age 20. Or let’s just say 70 and 100 to keep it simple.

So, if you have an opposite-sex life partner (OSLP?), keep in mind that you may need to cut back by more or fewer calories than the person across the table as you age together. Same-sex life partner (SSLP?): cut back by the same amount. Just don’t go beyond the range of the chart. The guidelines suggest even sedentary men shouldn’t eat fewer than 2,000 calories a day at any age. For women, that number is 1,650.
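The rules of thumb above fit in a short function. The per-decade slopes and the floors come from the text; the age-20 baseline is left as a parameter, since the chart rather than the text supplies it, so the 2,400-calorie figure in the example is purely hypothetical:

```python
# Rule of thumb from the post: after age 20, cut roughly 100 calories
# per decade for men and 70 for women, but never go below the floors
# of 2,000 (men) / 1,650 (women) calories a day. The age-20 baseline
# comes from the chart, so it's a parameter here.

def daily_calories(age, sex, baseline_at_20):
    """Estimated daily calories for a sedentary adult, per the heuristic."""
    per_decade = 100 if sex == "male" else 70
    floor = 2000 if sex == "male" else 1650
    decades = max(0, (age - 20) / 10)
    return max(floor, baseline_at_20 - per_decade * decades)

# e.g., a 50-year-old man with a hypothetical 2,400-calorie baseline:
print(daily_calories(50, "male", 2400))  # -> 2100.0
```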

REFERENCES

Barrio, Pablo J., Daniel G. Goldstein, & Jake M. Hofman. (2016). Improving comprehension of numbers in the news. ACM Conference on Human Factors in Computing Systems (CHI ’16). [Download]

The R code, below, has some other attempts at plots in it. You may be most interested in it as a way to see rvest in action. Or just to get the data.

The post How many calories should you eat per day? appeared first on Decision Science News.

Two recent events in the UK made it look like prediction markets’ predictions aren’t worth much.

The soccer team Leicester City won the Premier League title despite the markets putting the odds of them doing so at 5,000 to 1 (.02%).

Last week, people in the UK voted to leave the European Union. A few hours before it was certain they would exit, a prediction market put their probability of leaving at 10%. See the figure above from PredictIt. The x-axis is roughly time before the outcome was certain; the y-axis can be interpreted as the probability of exit (70 cents = 70%). The price jumped from 10% to 90% in just five hours.

Analysts like to “explain” market results, coming up with reasons why an event was a failure of the prediction market. For instance, in the two events above, the Wall Street Journal, perhaps correctly, claims the bets were unduly influenced by London bettors: through big London bets, the odds moved to reflect what Londoners believed instead of the sentiment of the crowd. In predicting a Brexit, the sentiment of the crowd is exactly what you want.

Whenever the prediction market is far on the wrong side of 50%, explanations will arise as to why the prediction market was wrong. Let’s take a step back here.

A desirable property of a prediction market is that it is calibrated. To be calibrated, events that it predicts to be 90% likely should occur 90% of the time. Events that it predicts to be 10% likely should occur 10% of the time.

If events that it predicts to be 10% likely (e.g. Brexit) occur 0% of the time, the prediction market has a problem. It is over-estimating the chances.
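To make “calibrated” concrete, here is one simple way to check it: bin the predictions and compare each bin’s average prediction to the observed frequency of the event. This Python sketch uses made-up forecasts, not any market’s actual data:

```python
# Calibration check: bucket predicted probabilities and compare each
# bucket's average prediction to the observed frequency of the event.
# Illustrative sketch with invented forecasts, not market data.

from collections import defaultdict

def calibration_table(predictions, outcomes, n_bins=10):
    """Return {bin_midpoint: (mean_prediction, observed_frequency)}."""
    bins = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # e.g. 0.95 -> bin 9
        bins[b].append((p, y))
    table = {}
    for b, pairs in sorted(bins.items()):
        ps, ys = zip(*pairs)
        table[(b + 0.5) / n_bins] = (sum(ps) / len(ps), sum(ys) / len(ys))
    return table

# A calibrated forecaster: 10% predictions that come true 1 time in 10,
# and 90% predictions that come true 9 times in 10.
preds = [0.1] * 10 + [0.9] * 10
outs = [1] + [0] * 9 + [1] * 9 + [0]
print(calibration_table(preds, outs))
```

In a well-calibrated market, each bin’s two numbers are close to each other, which is exactly the diagonal-line pattern in the Hypermind figure.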

Looking at the calibration of prediction markets across many events, one will see that they are typically very well calibrated. Take, for instance, the Hypermind prediction market. The figure below shows close to 500 events that it predicted. If the market were perfectly calibrated, the points would fall along the diagonal line (i.e., 10%-likely events would happen 10% of the time, 20%-likely events would happen 20% of the time, and so on).

It’s very well calibrated. Here’s a similar chart for U.S. primaries at PredictWise.

So the next time a market “misses” a 10% prediction, let us keep in mind that it needs to miss 10% of those predictions to stay calibrated. As Émile Servan-Schreiber mentions in this post, “It is perhaps a bit ironic to note that the data from the Brexit question slightly improved [the prediction market’s] overall calibration. It is as if the occurrence of an unlikely event was long overdue in order to better match predicted probabilities to observed outcomes!”

You may wish that prediction markets had perfect foresight and predicted only 0% or 100%. We do, too. But the world is an imperfect place.

The post Prediction markets have to occasionally “get it wrong” to be calibrated appeared first on Decision Science News.

Robert Muenchen has done a rather in-depth analysis of the popularity of various packages for statistical computation. More detail here.

We were surprised to see that R has passed SAS in scholarly use. We were surprised because we assumed this would have happened years ago. Muenchen, too, had predicted it would happen in 2014.

In addition, we were surprised to see that SPSS holds the number one spot, and by a fair margin. We can only think of one person who uses SPSS.

Perhaps we’re guilty of false consensus reasoning.

A chart further down the post shows that Python is rapidly growing in use. That meshes with what we observe. Many of our PhD student interns come in using Python for data analysis. One intern we worked with, Chris Riederer, is even writing a version of dplyr in Python, calling it, of course, dplython.

The post R is now the number two statistical computing program in scholarly use appeared first on Decision Science News.

Say you’ve got 30 numbers and a strong urge to estimate their standard deviation. But you’ve left your computer at home. Unless you’re really good at mentally squaring and summing, it’s pretty hard to compute a standard deviation in your head. But there’s a heuristic you can use:

Subtract the smallest number from the largest number and divide by four

Let’s call it the “range over four” heuristic. You could, and probably should, be skeptical. You could want to see how accurate the heuristic is. And you could want to see how the heuristic’s accuracy depends on the distribution of numbers you are dealing with.

Fine.

We generated random numbers from four distributions, pictured above. We nickname them (from top to bottom): floor, left of center, normalish, and uniform. They’re all beta distributions. If you want more detail, they’re the same beta distributions studied in Goldstein and Rothschild (2014). See the code below for parameters.

We vary two things in our simulation:

1) The number of observations on which we’re estimating the standard deviation.

2) The distributions from which the observations are drawn.

With each sample, we compute the actual standard deviation and compare it to the heuristic’s estimate of the standard deviation. We do this many times and take the average. Because we like the way mape sounds, we used mean absolute percent error (MAPE) as our error metric. Enough messing around. Let’s show the result.

There you have it. With about 30 to 40 observations, we could get an average absolute error of less than 10 percent for three of our distributions, even the skewed ones. With more observations, the error grew for those distributions.

With the uniform distribution, error was over 15 percent in the 30-40 observation range. We’re fine with that. We don’t tend to measure too many things that are uniformly distributed.

Another thing that set the uniform distribution apart is that its error continued to go down as more observations were added. Why is this? The standard deviation of a uniform distribution between 0 and 1 is 1/sqrt(12), or 0.289. The heuristic, if it were lucky enough to draw a 1 and a 0 as its sample range, would estimate the standard deviation as 1/4, or 0.25. So, as the sample size increases, the error for the uniform distribution should drop to a MAPE of 13.4% and flatten out. The graph shows it is well on its way toward doing so.
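The simulation is easy to re-create. Here is a rough sketch in Python (the original analysis was done in R); the beta parameters below are placeholders for one “normalish” shape, not necessarily the exact ones from Goldstein and Rothschild (2014):

```python
# Rough re-enactment of the simulation: draw samples from a beta
# distribution, compare the "range over four" estimate to the actual
# sample SD, and report mean absolute percent error (MAPE). The
# beta(5, 5) parameters are placeholders for a "normalish" shape.

import random
import statistics

def mape_range_over_four(alpha, beta, n_obs, n_trials=2000, seed=1):
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        xs = [rng.betavariate(alpha, beta) for _ in range(n_obs)]
        sd = statistics.stdev(xs)          # actual sample SD
        est = (max(xs) - min(xs)) / 4      # the heuristic
        errors.append(abs(est - sd) / sd)
    return 100 * sum(errors) / len(errors)

# MAPE (%) for a "normalish" beta(5, 5) with 30 observations:
print(round(mape_range_over_four(5, 5, 30), 1))
```

Varying `n_obs` and the beta parameters reproduces the two dimensions of the simulation described above.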

REFERENCES

Browne, R. H. (2001). Using the sample range as a basis for calculating sample size in power calculations. The American Statistician, 55(4), 293-298.

Hozo, S., Djulbegovic, B., & Hozo, I. (2005). Estimating the mean and variance from the median, range, and the size of a sample. BMC Medical Research Methodology, 5(1), 1.

Ramírez, A., & Cox, C. (2012). Improving on the range rule of thumb. Rose-Hulman Undergraduate Mathematics Journal, 13(2).

Wan, X., Wang, W., Liu, J., & Tong, T. (2014). Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Medical Research Methodology, 14(1), 135.

Want to play with it yourself? R code below. Thanks to Hadley Wickham for creating tools like dplyr and ggplot2, which take R to the next level.

The post Can’t compute the standard deviation in your head? Divide the range by four. appeared first on Decision Science News.
