<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Carl Shan</title>
 <link href="http://carlshan.com/atom.xml" rel="self"/>
 <link href="http://carlshan.com/"/>
 <updated>2020-07-15T11:59:28-07:00</updated>
 <id>http://carlshan.com/</id>
 <author>
   <name>Carl Shan</name>
   <email>carl@carlshan.com</email>
 </author>

 
 <entry>
   <title>From Chicago to Finland: A Journey</title>
   <link href="http://carlshan.com/2016/04/07/finland.html"/>
   <updated>2016-04-07T00:00:00-07:00</updated>
   <id>http://carlshan.com/2016/04/07/finland</id>
   <content type="html">&lt;h1 id=&quot;from-chicago-to-finland-a-journey&quot;&gt;From Chicago to Finland: A Journey&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;7 April 2016 - Finland &lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;somewhere-in-northern-finland-november-2015&quot;&gt;Somewhere in northern Finland, November 2015&lt;/h2&gt;

&lt;p&gt;The sun slowly crept behind my head, its morning rays prickling my neck. The air breezing around us was dry and chilly. Each breath I took felt simultaneously dehydrating and frigid. I glanced at the scraggly forest of trees lining the road and looked down at my phone. I peered at the blue dot supposedly indicating where I was. I found that it believed I was still in Oulu, a city I had left nearly an hour ago.&lt;/p&gt;

&lt;p&gt;Sighing, I looked up at my friend Aatash with a frown.&lt;/p&gt;

&lt;p&gt;We were lost.&lt;/p&gt;

&lt;p&gt;The two of us were trudging along Metsokangas, a small town in northern Finland. Or at least that was the plan. Instead we found ourselves lost in the Finnish countryside. Aatash and I were looking for Metsokangas Comprehensive School, a Finnish public school with around 1000 students. We had scheduled an early morning meeting with Kalle, the gregarious principal who had responded to our requests for a tour.&lt;/p&gt;

&lt;p&gt;However instead of finding ourselves in the town of Metsokangas, our inept navigation caused us to get off at the wrong bus stop. When we discovered our mistake, we tried to head back the way we came, knowing that the correct stop would be somewhere on the path.&lt;/p&gt;

&lt;p&gt;We were surrounded on all sides by clumps of tall and skinny trees that tapered off sharply, appearing as bundles of upright green needles. Unfortunately, the uniformity of the landscape left us few landmarks to work with. We weren’t sure which way to go.&lt;/p&gt;

&lt;p&gt;As I glanced at the barren fields around us, I thought back to the chain of events that had led to this point.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;seven-months-earlier-in-my-apartment-in-chicago-illinois-april-2015&quot;&gt;Seven months earlier. In my apartment in Chicago, Illinois. April 2015.&lt;/h2&gt;

&lt;p&gt;I clicked “Send” on the email I had spent hours writing and rewriting, pushed back my chair and got up to stretch. My lips felt dry and my heart drummed an erratic staccato.&lt;/p&gt;

&lt;p&gt;There. I had done it.&lt;/p&gt;

&lt;p&gt;After a quick gulp of water, I anxiously opened my “Sent” folder to reread the email I had just delivered. My eyes darted over the lines of text.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Hey Emma,&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Carl here. I just wanted to shoot you a note that, unfortunately, after some introspection, I decided that going back to school wasn’t exactly what I wanted to do this year.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;That’s a shame because I’m sure I would have had a blast in both the MS program, and in learning about the ways to get involved in CMU educational projects, like yours. Nevertheless, there are other things I want to explore before making a decision to return to school.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;I also plan on continuing honing my skills in service of making a difference in parts of the world that I care about, education included, so I sincerely hope that our paths may cross again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Best,
Carl&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As I reread my email to the professor who could have been my advisor, I wasn’t only trying to confirm that I had sent it. I was also taking a last glance over the shoulder at the road not taken. Rereading this email was the final regret-tinged look at the choice I didn’t make, as if by poring over each word in my email I could peer into the hazy Multiverse of Could-Have-Beens, hoping to catch a glimpse at other futures I might have had.&lt;/p&gt;

&lt;p&gt;I was rereading my email to grapple with the question I guess many people spend much of their lives untangling: did I make the right choice?&lt;/p&gt;

&lt;p&gt;After all it was only a month earlier that I had yelped with joy and surprise when I had received my admissions letter to Carnegie Mellon University’s Machine Learning department. This department was founded in the late 1990s and was the first in the United States to focus on the field of machine learning. Many important discoveries in the field of machine learning had been made there. Over the years the department graduated a number of notable alumni who now held significant roles within academia and industry.&lt;/p&gt;

&lt;p&gt;I had even received a scholarship, which both surprised and thrilled me.&lt;/p&gt;

&lt;p&gt;The honeymoon period lasted a number of days. But the triumphant confidence in knowing what I would be doing for the next few years slowly began to trickle away. Instead, a gnawing discomfort crept into my mind. After all, why was I so interested in going to graduate school in machine learning? It wasn’t an option that had been on my radar even just six months prior.&lt;/p&gt;

&lt;p&gt;Slowly but surely, a batch of doubts was beginning to simmer in my head.&lt;/p&gt;

&lt;p&gt;I remember sitting at home one day in March, trying to wrap my head around the doubts I was beginning to have about my decision to enroll in graduate school. Why had I chosen to go to graduate school? Was it what I really wanted to do?
It was the fact that I was now willing to ask these questions that was causing the pangs of self-doubt. The truth was that I had opportunistically stumbled my way into being accepted into graduate school. I had been working in a position where most of my coworkers had at least a Masters, and a number of them had PhDs. That environment, combined with the unquestioned belief that more education and credentials are a “safe” or “correct” choice, gave me career myopia.&lt;/p&gt;

&lt;p&gt;Don’t get me wrong. I had reasons for getting a graduate degree in machine learning beyond acquiring additional credentials. While I was applying to graduate school I was part of a program called the Data Science for Social Good Fellowship (DSSG). Through DSSG I worked as a data scientist in Chicago, where I collaborated with nonprofits and school districts to use my background in statistics and computer science to solve organizational challenges.&lt;/p&gt;

&lt;p&gt;In this role, I noticed how valuable my technical background was to the social sector, as there was a &lt;a href=&quot;/2014/06/22/dssg-week3.html&quot;&gt;dearth of technical professionals&lt;/a&gt; applying to work with nonprofits or governments. If I built upon my technical abilities by getting another degree in machine learning, I reasoned that I could make even more impact.&lt;/p&gt;

&lt;p&gt;Thus the immediate rewards of getting another technical degree seemed obvious. The costs appeared negligible. The burden of proof had sneakily crept from a positive one to a negative one: from “Why should I go to graduate school?” to “Why &lt;em&gt;shouldn’t&lt;/em&gt; I?”&lt;/p&gt;

&lt;p&gt;From that point on, I had a one-track mind, focusing my attention on how to optimize for getting accepted to graduate school rather than on understanding what I wanted out of my life or career.&lt;/p&gt;

&lt;p&gt;But just as in love, any relationship cobbled together from momentary infatuation may have an initial sharp peak but also, in inevitable symmetry, a steep plummet. My fling with graduate school was simply that: a fling. The initial euphoria I experienced couldn’t conceal the cracks between me and Carnegie Mellon: I didn’t think I wanted to continue working as a data scientist, I didn’t know what I’d use my degree towards, and I hadn’t explored other career options to learn if getting a graduate degree was necessary or valuable.&lt;/p&gt;

&lt;p&gt;During this period of self-doubt, I drew inspiration from my friend Cindy’s courageous decision to decline her offer to Harvard Medical School. After confronting an uncomfortable truth — that the career she and her parents had spent years preparing for wasn’t what she wanted — Cindy turned down Harvard with no other opportunities in hand.[1] Her decision to trust herself and plunge into uncertainty emboldened me to take my own leap of faith.&lt;/p&gt;

&lt;p&gt;It became clear to me that attending graduate school was something I decided on without deliberate reflections on my goals, values or passions. These underlying bedrocks of commitments weren’t there. I didn’t want, need nor even particularly &lt;em&gt;like&lt;/em&gt; the idea of more formal schooling.&lt;/p&gt;

&lt;p&gt;And just like that, it was over.&lt;/p&gt;

&lt;p&gt;So then, what was next?&lt;/p&gt;

&lt;hr /&gt;
&lt;h2 id=&quot;somewhere-in-northern-finland-november-2015-1&quot;&gt;Somewhere in northern Finland, November 2015.&lt;/h2&gt;

&lt;p&gt;I was relieved.&lt;/p&gt;

&lt;p&gt;After 35 minutes of wandering, Aatash and I finally found ourselves back on the main road. Or at least that’s what we suspected. Google Maps hadn’t been particularly cooperative when we asked it to pinpoint our exact location.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/finland/metsokangas.jpeg&quot; alt=&quot;Entrance to Metsokangas&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The entrance to Metsokangas Comprehensive School&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Nevertheless, we could now see buildings poking over the gangly forestry around us. As we progressed towards these beacons, we were filled with hope. There was one building in particular that stuck out. It was too large to be a home, and too garishly colorful to be an office building.&lt;/p&gt;

&lt;p&gt;Our confidence increasing with every step, Aatash and I prepared ourselves for our first look inside the heralded Finnish education system. We were excited. In education circles, Finland’s education system is often lauded as a model to emulate. Finnish students consistently score near the top of international exams, and teaching is regularly seen as a highly desirable profession for ambitious Finns to pursue.&lt;/p&gt;

&lt;p&gt;As we came closer to the building, we could tell that it was indeed what we had been looking for. We had found the Metsokangas Comprehensive School.&lt;/p&gt;

&lt;p&gt;We had finally arrived.&lt;/p&gt;

&lt;hr /&gt;
&lt;h2 id=&quot;four-months-earlier-in-berkeley-california-july-2015&quot;&gt;Four months earlier. In Berkeley, California. July 2015.&lt;/h2&gt;
&lt;p&gt;In the months following my decision to decline graduate school, I wrapped up my Fellowship and headed back to the Bay Area. Technically I was working a summer job as a data scientist at an enterprise machine learning startup. But it was really simply an internship I had taken when I was still convinced that I was going to need a summer internship to bridge my Fellowship and starting graduate school in the fall. Now that my fall had been drastically altered, I needed to figure out what I would be investing my time and energies into instead.&lt;/p&gt;

&lt;p&gt;When I was in Chicago, I had kept in touch with a few friends of mine that I had met while in college. One of them, Aatash, had been my roommate my senior year. The other, Andrew, had been introduced through a close mutual friend.&lt;/p&gt;

&lt;p&gt;When I landed back in the Bay, I got a chance to meet up with each of them individually.&lt;/p&gt;

&lt;p&gt;Over a meal of Asian-fusion tacos in Oakland with Aatash, I learned that he had recently left his job. When we sat down, he plopped a thick copy of Diane Ravitch’s &lt;em&gt;The Life and Death of the Great American School System&lt;/em&gt; on our table. I had heard of Diane Ravitch and her numerous books. Her critical, and oftentimes controversial, stances on the modern wave of education reform were something I loosely remembered encountering in a few college courses.&lt;/p&gt;

&lt;p&gt;“Why are you reading this book on education history?” I asked, my interest piqued.&lt;/p&gt;

&lt;p&gt;Aatash grinned at my question, as if expecting it. He replied that after leaving his job, he’d become increasingly interested in deepening his knowledge of the US K-12 education system. This book was just the latest in a series he’d been reading.&lt;/p&gt;

&lt;p&gt;Now, this wasn’t completely out of the blue. Aatash and I became close friends in college in part due to our shared enthusiasm for pursuing a career in education. We had quickly become friends over excited and idealistic conversations. The fact that we were also both diehard NBA fans didn’t hurt.[2] (A few years ago I wrote an article that explained how I ended up becoming inspired to work in education. &lt;a href=&quot;/2014/06/08/dssg-week1.html&quot;&gt;You can read it here.&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;But that day in Oakland, over a meal involving copious amounts of spicy glazed pork belly, we found ourselves again discussing the same issues and interests that had brought us together the first time we met. It became clear to us that with extended time on our hands, there was a way to take these shared passions and work on them together.&lt;/p&gt;

&lt;p&gt;As we chatted over the meal, I brought up what I had heard from our mutual friend Andrew. After graduating from college, Andrew deliberately chose to hold off on the rat race of job hunting.&lt;/p&gt;

&lt;p&gt;Instead, Andrew spent time rigorously researching different areas that seemed to hold potential for social impact (criminal justice policies, campaign finance reform and global warming, to name a few) and tried to evaluate which field he could make the most difference in. It just so happened that when I had flown back into the Bay, Andrew was beginning his investigation into the field of education.&lt;/p&gt;

&lt;p&gt;When I told this to Aatash, the two of us wondered whether there was some way we could team up with Andrew. The three of us could come up with the topics in education we each wanted to learn about, the books we wanted to read, and the organizations we wanted to visit.&lt;/p&gt;

&lt;p&gt;As we shared ideas, I remembered something.&lt;/p&gt;

&lt;p&gt;When I was writing &lt;a href=&quot;http://www.thedatasciencehandbook.com&quot;&gt;The Data Science Handbook&lt;/a&gt;, I had interviewed a data scientist named Clare Corthell.&lt;/p&gt;

&lt;p&gt;She had created her own &lt;a href=&quot;http://datasciencemasters.org/&quot;&gt;“Open Source Data Science Masters”&lt;/a&gt; as a way to switch careers from product design into data science. She quit her job and spent 6 months reading textbooks, taking online courses and building projects to learn the skills required to become a data scientist.&lt;/p&gt;

&lt;p&gt;Her story received wide attention, especially after she eventually landed a job as a data scientist. Her tale made me think that something similar could be done for education.&lt;/p&gt;

&lt;p&gt;And thus, the Self-Guided Masters in Education was born.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;finally-at-metsokongas-comprehensive-school-in-northern-finland-november-2015&quot;&gt;Finally at Metsokangas Comprehensive School in northern Finland. November 2015.&lt;/h2&gt;
&lt;p&gt;Soon after arriving at Metsokangas Comprehensive School we met with Kalle, the school’s energetic principal. He eagerly led Aatash and me on a tour of the school’s campus, explaining that we were just one of the many groups that had visited Metsokangas that year. After Microsoft named the school a “Showcase School”, a recognition given to schools that are “intentionally redesign[ing] learning spaces … [and] driv[ing] personalized learning”, Kalle found hordes of admiring visitors from around the world beating on his doors. Aatash and I were simply the latest wave.&lt;/p&gt;

&lt;p&gt;As he explained this to us, we took in the sights around us.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/finland/metsokangas_students.png&quot; alt=&quot;Students Working&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Students working in the hallways of Metsokangas. Photo Credit: &lt;a href=&quot;http://buildingcommunitybridges.weebly.com/blog/the-presentation-of-metsokangas-comprehensive-school&quot;&gt;Building Community Bridges&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We found students lounging in the hallways, idly chatting with friends as they worked on assignments. When Aatash asked why so many students were working outside of a classroom, Kalle explained to us the school’s philosophy of encouraging students to work in comfortable areas and positions, rather than boxing them within orderly classrooms.&lt;/p&gt;

&lt;p&gt;As we walked through the facilities, Aatash and I noticed that nearly all the classrooms could be connected simply by folding in the collapsible wall partitioning two rooms. Kalle told us that the idea was to encourage greater interdisciplinary collaboration between different classrooms. In addition, unlike most classrooms in the United States, each class we visited had two teachers. One could teach while the other walked around the classroom, gently shushing or encouraging students along the way. This effectively halved the student-to-teacher ratio. I was surprised by this co-teaching approach, having never seen it before.&lt;/p&gt;

&lt;p&gt;One particular memory from the visit stands out: a 7th grade Home Economics class. I remember blinking in surprise as I entered the classroom. Many of the young students were wielding large cutting knives. They stood around a kitchen listening diligently as a teacher explained something in Finnish.&lt;/p&gt;

&lt;p&gt;“They’re going through the cooking unit right now,” Kalle whispered to me. I gaped at him, dumbfounded at the level of trust and responsibility these students were given. He grinned at the look on my face.&lt;/p&gt;

&lt;p&gt;“Home Economics is a mandatory course for all Finnish students. Students learn about nutrition, budgeting, cooking and many other topics that we think you need to know.”&lt;/p&gt;

&lt;p&gt;As we left the classroom I marveled at the trust these students received (imagine the uproar if a group of American 7th graders were using sharp knives as part of a course) as well as the practical curriculum of the Finnish system.&lt;/p&gt;

&lt;p&gt;Throughout the tour, Aatash and I got a chance to speak with students, teachers and administrators. We learned about the school’s history, funding, pedagogical approach and organization. We even got to play with some of the students during one of the breaks.&lt;/p&gt;

&lt;p&gt;By the time the visit was over, Aatash and I felt utterly stuffed full of experiences to digest and ruminate on later. We had chosen to visit Metsokangas Comprehensive School as part of our tour of the Finnish school system, hoping to see some of the best that Finland had to offer. We were not disappointed.&lt;/p&gt;

&lt;p&gt;A comprehensive analysis of all that we learned in Finland will be saved for a future article, but I’ve listed some of the highlights below:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Little standardized testing&lt;/strong&gt;: Finnish students famously score near the top of international assessments in math and science. Yet students in Finland are rarely assessed on standardized tests. This runs contrary to the high-stakes testing that the American accountability movement is pushing, demonstrating that it’s possible to build a high-performing education system without loads of testing.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Professional autonomy&lt;/strong&gt;: As a result of the lack of testing, a culture of professional autonomy is created for teachers, as there isn’t a constant threat of removal due to students’ performance on tests.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;More responsibility&lt;/strong&gt;: Finnish schools trust students with greater responsibility, and do so far more often, than US schools do. The Home Economics story above is a good example.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Special education&lt;/strong&gt;: In Finland, nearly 50% of students are classified as “special learners” at one point or another. Unlike Americans, who have long maintained a culture of stigma against “special education”, Finns don’t see being a “special learner” as something negative. Many more students in Finland receive special education instruction, and the category is defined differently than in the US. The label is not necessarily related to a developmental or physical disability; rather, it refers to any sort of difficulty with learning reading and writing, mathematics, or foreign languages. Students who are classified as “special learners” receive extra tutoring and support to help them catch up.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;img src=&quot;/images/finland/metsokangas_friends.jpeg&quot; alt=&quot;Aatash and I&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Aatash and I near Metsokangas. We definitely didn’t pick the best lighting for this photo.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;the-self-guided-masters-in-education-present-time&quot;&gt;The Self-Guided Masters in Education. Present time.&lt;/h2&gt;

&lt;p&gt;As Aatash, Andrew and I planned our curriculum for the Self-Guided Education Masters (SGEM), we decided to visit Finland. Finland’s education system is commonly lauded in education circles as, within the span of 50 years, it leapt to become one of the highest-performing national systems in the world.&lt;/p&gt;

&lt;p&gt;One of our goals during the SGEM was to investigate what makes for a “high-performing education system” and why countries like Finland had one.&lt;/p&gt;

&lt;p&gt;We also created the SGEM to structure and explain what we were doing to our friends, parents and other curious individuals. It contains a list of books, “capstone projects” and topics we wanted to learn. We even put up a website describing what it is, what our goals are and why we chose to do this rather than go through a more traditional Masters degree. If your curiosity has been whetted and you want to learn more, you can check out the website at &lt;a href=&quot;http://www.understanding.education&quot;&gt;www.understanding.education&lt;/a&gt; (it’s quite the domain, right?).&lt;/p&gt;

&lt;p&gt;The three of us who created this program are hoping to use it as a way to structure what we want to learn and work on. For example, we recently submitted a proposal for a project-based learning high school to a &lt;a href=&quot;http://www.xqsuperschool.org&quot;&gt;national competition&lt;/a&gt;, and we’ll also be putting online a “Master’s thesis” that synthesizes what we’ve done and learned.&lt;/p&gt;

&lt;p&gt;Our goal in creating the Self-Guided Masters in Education is to use it as a compass to point us in the direction of where we want to spend the next few years of our lives. Would it be to launch a school? Create a company or nonprofit? Or should we become teachers, or even work in a district’s central office?&lt;/p&gt;

&lt;p&gt;As of right now, just as it was in northern Finland, our exact destination isn’t clear. The future is still hazy and blotted with unanswered questions, but I can’t help but feel that I made the right choice in creating my own path rather than attending Carnegie Mellon.&lt;/p&gt;

&lt;p&gt;By pursuing an independently created curriculum rather than entering a formal graduate program, I’ve found myself more comfortable with uncertainty with each passing day. Through the SGEM, I’ve also found myself starting from “square one” in asking myself the hard questions: what drives me? What are my long-term life goals? What do I value?&lt;/p&gt;

&lt;p&gt;These are the sorts of questions that I didn’t typically have the space to think about when I was working. Or if I happened to think about them, I usually relegated them to the bucket of “Things That Are Important But Not Urgent” and always found a reason to prioritize them below more immediate tasks.&lt;/p&gt;

&lt;p&gt;However, by deliberately carving out space in my life for unstructured thinking, I’ve been probing deeper into what I want out of life.&lt;/p&gt;

&lt;p&gt;Through this probing, one of the conclusions I’ve come to in the past few months is how important it is for me to align my job with my core identity. My past jobs as a data scientist and project manager paid well and were high-status professions, but I felt uneasy with them both because I didn’t identify strongly as a “data scientist” or a “project manager.”&lt;/p&gt;

&lt;p&gt;Pursuing another technical graduate degree would have meant taking another step down a professional path in technology. But why go deeper into a career that didn’t fit who I am?&lt;/p&gt;

&lt;p&gt;Instead, I feel a greater sense of identification with descriptions like “educator” and “teacher” and believe my next job will be more aligned with these terms. As a result of these nuggets of self-knowledge, I’ve become increasingly confident that the next job I take will be more authentic to my identity and goals.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;During the first few months of pursuing the SGEM, I felt a great deal of confusion, anxiety and self-doubt. What if I made the wrong choice by turning down graduate school? Who in the world makes up their own “masters” degree? What if this was all a half-baked idea and a waste of my time?&lt;/p&gt;

&lt;p&gt;I remember feeling similarly afraid and anxious when I was lost in the Finnish countryside, on my way to Metsokangas.&lt;/p&gt;

&lt;p&gt;But over time, these feelings of angst about my professional decision peacefully disappeared. They were replaced by a calm confidence in this journey. Where does this confidence come from? It stems from one very simple fact: despite not enrolling in a graduate school, I’m finding myself learning more than ever.&lt;/p&gt;

&lt;p&gt;Thus I believe that, just like in Finland, I will eventually find my way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Footnotes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;[1] You can read more about Cindy’s story in the first chapter of my book, &lt;a href=&quot;http://www.collegeuncensored.org&quot;&gt;College Uncensored&lt;/a&gt;. Her story is worth a read.&lt;/p&gt;

&lt;p&gt;[2] Unfortunately, Aatash’s favorite team is the Los Angeles Lakers. But hey, nobody’s perfect!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>How We Priced Our Book With An Experiment</title>
   <link href="http://carlshan.com/2015/05/27/dsh-experiment.html"/>
   <updated>2015-05-27T00:00:00-07:00</updated>
   <id>http://carlshan.com/2015/05/27/dsh-experiment</id>
   <content type="html">&lt;h1 id=&quot;how-we-priced-our-book-with-an-experiment&quot;&gt;How We Priced Our Book With An Experiment&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;27 May 2015 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Summary: We conducted a large experiment to test pricing strategies for our book and came to some very surprising findings about allowing customers to pay what they wanted.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Specifically, we found strong evidence that we should let customers pay what they want, which would help us earn more money and more readers when compared with traditional pricing models. We hope our findings can inspire other authors, musicians and creators to look into pay-what-you-want pricing and run experiments of their own.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;introduction-pay-what-you-want&quot;&gt;Introduction: Pay What You Want?&lt;/h2&gt;
&lt;p&gt;My co-authors (Henry Wang, William Chen, Max Song) and I have been working on our book, The Data Science Handbook, for over a year now. Shortly before launch, we asked ourselves an important question that many authors face: how much should we charge for our book?&lt;/p&gt;

&lt;p&gt;We had heard of Pay-What-You-Want (PWYW) models, where readers can purchase the book for any amount they want (or at least above a threshold you set). However, many authors and creators worry that only a small percentage of people will contribute in a PWYW pricing model, and that these contributors will opt for meager amounts in the $1-$5 range.&lt;/p&gt;

&lt;p&gt;On the other hand, we also felt that PWYW was an exciting model to try. A PWYW model would allow us to get the book out to as many people as possible without putting the book behind a paywall. We also had an inkling that this experimental pricing model would increase exposure for our book.&lt;/p&gt;

&lt;p&gt;So we set out to answer this simple question: &lt;strong&gt;how should we price our book?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As practicing statisticians and data scientists, we thought of no better way to decide this than to run a large-scale experiment. The following section details exactly what we tested and discovered.&lt;/p&gt;

&lt;h2 id=&quot;tldr---letting-customers-pay-what-they-want-wins-the-day&quot;&gt;TL;DR - Letting Customers Pay What They Want Wins the Day&lt;/h2&gt;
&lt;p&gt;We experimented with 7 different pricing models pre-launch, with our subscriber base of 5,700 people. In these 7 different models, we compared different pricing schemes, including fixed prices at $19 and $29, along with several Pay What You Want (PWYW) models with varying minimum amounts and suggested price points.&lt;/p&gt;

&lt;p&gt;Before the experiment began, we had agreed to choose whichever variant maximized the two things we cared about: the total number of readers and net revenue (later on, we’ll explain how we prioritized the two).&lt;/p&gt;

&lt;p&gt;Before conducting the experiment, we thought that setting a fixed price at $29, like a traditional book, would lead to the maximum revenue.&lt;/p&gt;

&lt;p&gt;After we analyzed our results, to our surprise, we discovered strong statistical evidence that &lt;strong&gt;with a PWYW model for our book, we could significantly expand our readership (by 4x!) while earning at least as much revenue (and potentially even more) as either of the fixed-priced variants.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;the-prices-we-tested-setting-up-our-experiment&quot;&gt;The Prices We Tested: Setting Up Our Experiment&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;On notation: throughout this post, PWYW models will be described as (Minimum Price/Suggested Price). For example, ($0/$19) means ($0 Minimum Price, $19 Suggested Price).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Through a sign-up page on our website, we’ve been continuously gathering email addresses of individuals interested in our book throughout the process of promoting the Data Science Handbook.&lt;/p&gt;

&lt;p&gt;We conducted this pricing experiment before the official launch of the book by letting our 5,700 subscribers pre-order a special early release of the book. The following diagram shows our experimental setup:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/experiment-setup.png&quot; alt=&quot;experiment setup&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We started the early release pre-order process on Monday, April 20th. We stopped the pre-orders one week later, so that we could analyze our results.&lt;/p&gt;

&lt;p&gt;Through Gumroad, we tracked data on the number of people who landed on each link, whether they purchased, and how much they chose to pay.&lt;/p&gt;

&lt;p&gt;Note: To guard against people buying the book who were not originally assigned to that bucket (for example, those who inadvertently stumbled across our links online), we filtered out all email addresses that purchased a book through a variant that they were not explicitly assigned to. This gave us more confidence in the rigor of our statistical analyses.&lt;/p&gt;

&lt;h2 id=&quot;what-we-found-experiment-results&quot;&gt;What We Found: Experiment Results&lt;/h2&gt;

&lt;p&gt;The roughly 800 users in each of our experimental buckets went through a funnel, where they clicked through the email to visit the purchase page, and then decided whether or not to purchase. We collected data on user behavior in this funnel, as well as the price they paid.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/funnel.png&quot; alt=&quot;conversion funnel&quot; /&gt;&lt;/p&gt;

&lt;p&gt;For each of the experimental variants, we collected data on 6 key metrics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Email CTR - # of people who clicked through to the purchase page / # of people who received the email. The emails were identical, minus the link and a short section about the price.&lt;/li&gt;
  &lt;li&gt;Conversion Rate - # of purchases / # of people who clicked through to the purchase page&lt;/li&gt;
  &lt;li&gt;Total Sales - # of sales, regardless of whether a reader paid $0 or $100&lt;/li&gt;
  &lt;li&gt;Net Revenue - Total revenue generated, minus fees from Gumroad&lt;/li&gt;
  &lt;li&gt;Mean Sales Price - Average sales price that people paid&lt;/li&gt;
  &lt;li&gt;Max Sales Price - Largest sales price paid in that bucket&lt;/li&gt;
&lt;/ul&gt;
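
&lt;p&gt;As a rough sketch of how these six metrics relate to one another, the Python snippet below computes them for a single bucket. The function name, argument names and numbers are made up for illustration and are not our actual data; the fee total is passed in directly rather than modelled on Gumroad’s real fee schedule.&lt;/p&gt;

```python
def bucket_metrics(emails_sent, clicks, payments, total_fees):
    """Compute the six metrics for one pricing bucket.

    payments is a list of amounts paid, one entry per sale (a $0 sale still counts).
    total_fees is the total platform fee for the bucket (hypothetical input).
    """
    total_sales = len(payments)
    gross = sum(payments)
    return {
        "email_ctr": clicks / emails_sent,
        "conversion_rate": total_sales / clicks,
        "total_sales": total_sales,
        "net_revenue": gross - total_fees,
        "mean_sales_price": gross / total_sales if total_sales else 0.0,
        "max_sales_price": max(payments, default=0.0),
    }


# Illustrative numbers only: 800 emails, 200 click-throughs, 6 sales.
example = bucket_metrics(
    emails_sent=800,
    clicks=200,
    payments=[0, 0, 5, 10, 19, 30],
    total_fees=4.0,
)
```

&lt;p&gt;Note that free downloads count toward Total Sales and pull the Mean Sales Price down, which is why the two metrics need to be read together.&lt;/p&gt;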

&lt;p&gt;Below, you’ll see some plots on how each pricing variant performed on each metric. Each of the seven circles represents a different pricing variant, with the area of the circle being proportional to the magnitude it represents. The larger the circle, the “better” that pricing variant did in terms of our metrics.&lt;/p&gt;

&lt;p&gt;The blue circles are the variants that were fixed at $19 and $29. The orange circles are the PWYW variants.&lt;/p&gt;

&lt;p&gt;The X-axis of the following plots describes the minimum prices we offered: free, $10, $19 (this was a fixed price), $20 and $29 (also fixed). The Y-axes are the prices we suggested when we were using a PWYW variant: $19 and $29.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/pwyw-vs-fixed.jpg&quot; alt=&quot;pwyw vs fixed&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/metric-plots.png&quot; alt=&quot;plots&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Looking above, it’s no surprise that a PWYW model of ($0/$19) had the highest conversion rate (upper right plot) and, as a result, the greatest number of people who downloaded the book. After all, you can get it for free!&lt;/p&gt;

&lt;p&gt;Much to our surprise, many of our readers who got this variant paid much more than $0. In fact, as you can see above in the “Mean Sales Price” plot in the bottom left corner, our average purchase price was about $9. Some readers even paid $30.&lt;/p&gt;

&lt;p&gt;To see how payments were distributed within each variant, we also looked at the histogram of payments for each of the 5 PWYW variants:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/sales-distribution.png&quot; alt=&quot;sales distribution&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It’s again no surprise to see a large chunk of purchases at the minimum. However, you can also see fairly sizable clumps of readers who paid amounts around $5, $10, $15 and $20 (and even some who paid in the $30-$50 range).&lt;/p&gt;

&lt;p&gt;In fact, readers seemed to like paying amounts that were multiples of $5, perhaps because these are nice round numbers.&lt;/p&gt;
&lt;h2 id=&quot;surprising-insights-on-pay-what-you-want&quot;&gt;Surprising Insights on Pay What You Want&lt;/h2&gt;
&lt;h3 id=&quot;you-can-earn-as-much-from-a-pwyw-model-and-possibly-more-as-from-a-fixed-price-model&quot;&gt;You Can Earn As Much from a PWYW model (and possibly more) as from a Fixed Price model&lt;/h3&gt;
&lt;p&gt;Traditional advice told us that we should price our book at a high, fixed price point, since people interested in advancing their careers will typically pay a premium for a book that helps them do exactly that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;However, our ($0/$19) variant was ranked second in total revenue generated (tying with a fixed price of $29).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/net-revenue.png&quot; alt=&quot;net revenue&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In fact, if anything, the data lends credence to the belief that &lt;strong&gt;you can earn even more from PWYW than from setting a fixed price.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What do we mean by that?&lt;/p&gt;

&lt;p&gt;Well, our ($0/$19) variant actually made nearly twice as much money as fixing the price at $19. The difference in earnings was large, and it strongly suggests that our book would make more money if we made it free with a suggested price of $19 than if we fixed the price at $19.[1]&lt;/p&gt;

&lt;p&gt;This was an incredible result, since it suggested that with a PWYW model, we could generate the same amount of revenue as a fixed price model, while attracting 3-4x more readership!&lt;/p&gt;

&lt;h3 id=&quot;higher-suggested-price-didnt-translate-to-higher-average-payments-but&quot;&gt;Higher Suggested Price Didn’t Translate to Higher Average Payments. But…&lt;/h3&gt;
&lt;p&gt;The “suggested” price didn’t seem to have a large impact on the price people paid. Compare the mean purchase prices between $19 suggested and $29 suggested in both the $0 minimum variants and the $10 minimum variants.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/mean-sales.png&quot; alt=&quot;mean sales price&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As you can see, moving the suggested price from $19 to $29 in both cases increased average purchase price by only $1.&lt;/p&gt;

&lt;p&gt;However, we don’t mean to imply the suggested price had zero effect. In fact, &lt;strong&gt;the data lends support to actually having a lower suggested price.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Look at what happened to conversion rates when we changed the suggested price from $19 to $29. In both cases we tested ($0 minimum and $10 minimum), the lower suggested price had a higher conversion rate and ultimately drove more revenue.[2]&lt;/p&gt;

&lt;p&gt;Therefore, it seems that even if the average sale price stays roughly the same across suggested prices, &lt;em&gt;total sales&lt;/em&gt; increase when you have a lower suggested price. This is perhaps due to certain readers being turned off by a higher suggested price, even if they could get the book for $0.&lt;/p&gt;

&lt;p&gt;Just imagine seeing a piece of chocolate being offered for free, but having a suggested price of $100. You might scoff at the absurdly high suggested price and refuse the candy, despite being able to take it for nothing.&lt;/p&gt;

&lt;p&gt;On the other hand, if you were offered the same scenario, but this time the free candy had a suggested price of just $0.25, you may see this as fair and be much more inclined to part with your quarter.&lt;/p&gt;

&lt;h2 id=&quot;try-it-out-for-yourself&quot;&gt;Try It Out For Yourself&lt;/h2&gt;
&lt;p&gt;We think that all of these findings should spur authors and creators to test their own product pricing. &lt;a href=&quot;http://www.gumroad.com&quot;&gt;Gumroad&lt;/a&gt;, our sales platform, makes it remarkably easy to create product variants, which you can email out to randomized batches of your followers. Or, you can use the suite of A/B testing tools to ensure that different visitors to your website receive different product links.&lt;/p&gt;

&lt;p&gt;By doing so, you may discover that you could reach a larger audience, while also earning higher revenue.&lt;/p&gt;

&lt;p&gt;[1] This result &lt;em&gt;just&lt;/em&gt; missed the cutoff for statistical significance. The actual p-value comparing $0/$19 with a fixed $19 was 0.057, missing our threshold of 0.05 necessary to qualify as statistically significant. Nevertheless, the very low p-value is a strongly suggestive result in favor of a PWYW model.&lt;/p&gt;

&lt;p&gt;[2] Beyond being practically significant, this was also statistically significant with a p-value close to 0.&lt;/p&gt;
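
&lt;p&gt;The footnotes above do not say which test produced these p-values. As one generic way to make a comparison like the one in footnote [1], the sketch below runs a two-sided permutation test on per-recipient revenue for two buckets. It is an illustrative assumption, not a record of the analysis we ran; the function name, parameters and sample numbers are all hypothetical.&lt;/p&gt;

```python
import random
from bisect import bisect_left


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in mean revenue.

    group_a and group_b hold revenue per email recipient, with 0 for
    everyone who did not buy. Returns the share of random relabelings
    whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    shuffled_diffs = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / n_b
        shuffled_diffs.append(abs(mean_a - mean_b))

    # Count the shuffled differences that are at least as large as observed.
    shuffled_diffs.sort()
    at_least_as_extreme = len(shuffled_diffs) - bisect_left(shuffled_diffs, observed)
    return at_least_as_extreme / n_permutations


# Made-up buckets: most recipients pay nothing, a few pay something.
pwyw_bucket = [0] * 90 + [5, 5, 10, 10, 10, 15, 19, 19, 20, 30]
fixed_bucket = [0] * 96 + [19, 19, 19, 19]
p_value = permutation_p_value(pwyw_bucket, fixed_bucket, n_permutations=2000)
```

&lt;p&gt;Including the zeros for recipients who never bought matters here: it makes the test compare revenue per person emailed, which is what a pricing decision ultimately depends on.&lt;/p&gt;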

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;If you want to be notified when my next article is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>How Data Science Can Be Used For Social Good</title>
   <link href="http://carlshan.com/2015/01/08/data-science-social-good.html"/>
   <updated>2015-01-08T00:00:00-08:00</updated>
   <id>http://carlshan.com/2015/01/08/data-science-social-good</id>
   <content type="html">&lt;h1 id=&quot;how-data-science-can-be-used-for-social-good&quot;&gt;How Data Science Can Be Used For Social Good&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;08 Jan 2015 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/images/give_directly/give_directly.png&quot; alt=&quot;Give Directly&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Credit: Google Images&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In 2013 Kush Varshney, a researcher at IBM, signed up through a nonprofit called DataKind to volunteer his technical skills on pro bono projects. DataKind’s flagship program, DataCorps, assembles teams of data scientists to partner with social organizations like governments, foundations or NGOs for three- to six-month collaborations to clean, analyze, visualize and otherwise use data to make the world a better place.&lt;/p&gt;

&lt;p&gt;Kush, who holds a PhD in electrical engineering and computer science from MIT, was promptly contacted by DataKind to work on a project with GiveDirectly. He was joined by another team member, Brian Abelson, himself now a data scientist at an open data search company. The two of them were brought together to tackle a challenging problem for the nonprofit.&lt;/p&gt;

&lt;p&gt;GiveDirectly conducts direct cash transfers to low-income families in Uganda and Kenya through mobile payments. These donations are given with no strings attached, trusting that the poor know best how to use the money. One of the top-rated charities on GiveWell, GiveDirectly has been evaluated in randomized controlled trials, with strong positive results.&lt;/p&gt;

&lt;p&gt;GiveDirectly’s model is to conduct direct cash transfers to villages with a large number of residents in poverty. However, to assess which villages these are, the organization relied upon staff members to individually visit villages in Uganda and Kenya and assess the relative poverty of the inhabitants.&lt;/p&gt;

&lt;p&gt;When I spoke with Kush he described some drawbacks of this method, saying, “This method could be costly in both time required to visit each site, and in using donations to help pay wages for inspections that could otherwise be going directly to the poor.”&lt;/p&gt;

&lt;p&gt;Together with GiveDirectly, Kush and Brian sought a better way to accomplish this task.&lt;/p&gt;

&lt;p&gt;Enter data science.&lt;/p&gt;

&lt;h2 id=&quot;what-is-data-science&quot;&gt;What Is Data Science?&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/images/give_directly/venn_diagram.png&quot; alt=&quot;Data Science Venn Diagram&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Credit: Drew Conway - The Data Science Venn Diagram&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Data science is an emerging discipline that combines techniques of computer science, statistics, mathematics, and other computational and quantitative disciplines to analyze large amounts of data for better decision making. The field arose in response to the fast growing amount of information and the need for computational tools to augment humans in understanding and using that data.&lt;/p&gt;

&lt;p&gt;Rayid Ghani, Director of the Data Science for Social Good Fellowship and former Chief Scientist for Obama, noted that “the power of data science is typically harnessed in a spectrum with the following two extremes: helping humans in discovering new knowledge that can be used to inform decision making, or through automated predictive models that are plugged into operational systems and operate autonomously.” Put plainly, these two ways of using data can be summarized as turning data into knowledge, or converting data into action.&lt;/p&gt;

&lt;p&gt;Chiefly responsible for wrangling findings and crafting models using the data is an emerging profession: the data scientist. The “scientist” portion of the title conjures a vision of academia, partially as a result of many data scientists holding advanced STEM degrees, but it also paints a false picture of a data scientist as someone holed up in the research lab of an organization tinkering away on esoteric questions. This view characterizes the data scientist as peering into the depths of “Big Data” in pursuit of knowledge.&lt;/p&gt;

&lt;p&gt;Rayid debunks this myth, saying that “frequently, however, the challenge in data science is not the science, but rather the understanding and formulation of the problem; the knowledge of how to acquire and use the right data; and once all that work is done, how to operationalize the results of the entire process.” Accordingly, the real role of a data scientist should be thought of as much more embedded in the core of a company or nonprofit, directly shaping the scope and direction of the organization’s products and services.&lt;/p&gt;

&lt;p&gt;The handiwork of data scientists can be found in a plethora of products we interact with every day. Facebook uses data from each visit to tailor the posts you see in your News Feed. Amazon takes account of what you’ve purchased to recommend other items for purchase. PayPal roots out fraudulent behavior by analyzing the data from seller-buyer transactions.&lt;/p&gt;

&lt;p&gt;So far, most of the uses of data science have been towards business objectives. The technology, financial services and advertising industries are rife with opportunities to convert data into profit. But now, more and more innovative social sector organizations like GiveDirectly are catching on to how technology and data science can be used to solve their problems.&lt;/p&gt;

&lt;p&gt;Organizations like Rayid’s Data Science for Social Good Fellowship, Y Combinator-backed nonprofit Bayes Impact, and DataKind are popping up to fund, train and deploy excellent data scientists to tackle pressing social issues.&lt;/p&gt;

&lt;h2 id=&quot;data-science-in-action&quot;&gt;Data Science In Action&lt;/h2&gt;
&lt;p&gt;In the case of GiveDirectly, Kush and Brian were tasked with using their computational data science skills to help discover where the poorest villages were located, so that donations could be channeled to households with the highest needs.&lt;/p&gt;

&lt;p&gt;To do this, Kush and Brian used GiveDirectly’s knowledge that an indication of the poverty of a household is the type of roofing of their home. Kush told me that in Kenya, “poorer families tended to live in homes with thatched roofs. On the other hand, a home with a metal roof typically meant the family was well-to-do enough to purchase a more sturdy shelter.”&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/give_directly/thatched_metal.png&quot; alt=&quot;Thatched vs. Metal Roofs&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Credit: GiveDirectly&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Using this knowledge, Kush and Brian used Google Maps to extract satellite images of the various villages in Kenya and deployed an algorithm that used the coloring of each roof to determine whether it was made of metal or thatch. Doing this across all of the houses in a village gave an estimate of the level of poverty in that village.&lt;/p&gt;
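
&lt;p&gt;To make the idea concrete, here is a toy Python sketch. It is not Kush and Brian’s actual algorithm: it assumes each roof has already been reduced to an average (R, G, B) color, labels it by the nearest of two made-up reference colors, and reports the share of thatched roofs in a village. The reference colors, function names and sample values are all hypothetical.&lt;/p&gt;

```python
# Hypothetical reference colors (R, G, B). A real system would learn
# these from labeled satellite images rather than hard-code them.
REFERENCE_COLORS = {
    "metal": (200, 200, 210),
    "thatch": (120, 90, 60),
}


def classify_roof(rgb):
    """Label a roof by whichever reference color is closest to its average color."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(rgb, REFERENCE_COLORS[label]))

    return min(REFERENCE_COLORS, key=squared_distance)


def thatched_share(roof_colors):
    """Fraction of roofs in a village classified as thatch."""
    labels = [classify_roof(rgb) for rgb in roof_colors]
    return labels.count("thatch") / len(labels)


# Four made-up roofs: two greyish and bright, two brownish.
village = [(190, 195, 205), (110, 85, 70), (130, 100, 65), (210, 205, 215)]
share = thatched_share(village)
```

&lt;p&gt;Villages could then be ranked by this share, with a higher share of thatched roofs read as a sign of greater need.&lt;/p&gt;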

&lt;p&gt;In early 2014, GiveDirectly piloted this algorithm to detect poverty levels in 50 different villages in Kenya. It was doing so in one of its largest campaigns, moving $4 million to households all over western Kenya.&lt;/p&gt;

&lt;p&gt;By employing Kush and Brian’s algorithm, GiveDirectly eliminated over 100 days of manual inspection of each village. In doing so, it saved over $4,000, allowing GiveDirectly to fund four more households.&lt;/p&gt;

&lt;p&gt;Excited by the potential of data science to help families escape poverty more effectively, GiveDirectly is now in discussions with Kush, Brian and DataKind about how their algorithm can be used even more precisely and scaled to additional villages.&lt;/p&gt;

&lt;h2 id=&quot;potential-to-build-the-future&quot;&gt;Potential To Build The Future&lt;/h2&gt;

&lt;p&gt;As an increasing volume of information is generated by the world, there will be more opportunities to apply data science towards socially meaningful causes. What if we could help guidance counselors predict which students were the most likely to drop out, and then design successful interventions around them? What if we could improve parole decisions, reduce prison overcrowding and lower recidivism?&lt;/p&gt;

&lt;p&gt;Examples of how data science can be applied to the social sector include:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Reduce crime and recidivism: Predictive modeling can be used to assess whether an inmate would be likely to reoffend, informing the parole decision.&lt;/li&gt;
  &lt;li&gt;Give tailored feedback and content to students: Adaptive tutoring software can be used to model how much students are learning and understanding, tailoring problems to each student.&lt;/li&gt;
  &lt;li&gt;Spot nutrition deficiencies: Data tools can be built that monitor vitamin and mineral intake, warning users of deficiencies in their dietary and health habits.&lt;/li&gt;
  &lt;li&gt;Early prevention of shootings: Network-based analyses of gangs can be used to predict where and when future shootings will occur.&lt;/li&gt;
  &lt;li&gt;Diagnose diseases early on: Genetic, imaging, and EMR data can be leveraged to provide early diagnosis of conditions such as Parkinson’s, M.S., and autism.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s clear that we can be optimistic about how data scientists can use the data at their fingertips for social good. As an emerging technological frontier, data science is in a position of immense potential. As a result, there is much to explore about how we can use it to push the human race forward.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;
&lt;p&gt;Targeting direct cash transfers to the extremely poor (2014), Kush Varshney and Brian Abelson&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write about data science applied to social causes. If you want to be notified when my next post is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Weeks 7-12: Summer Wrapup</title>
   <link href="http://carlshan.com/2014/10/13/dssg-week7-12.html"/>
   <updated>2014-10-13T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/10/13/dssg-week7-12</id>
   <content type="html">&lt;h1 id=&quot;weeks-7-12-summer-wrapup&quot;&gt;Weeks 7-12: Summer Wrapup&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;13 October 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the final post in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago. While I had intended to post once a week, I ended up falling short of my goals. Work from DSSG piled up, making it tough to write thoughtful posts on a weekly schedule.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Nevertheless, I intend for this to be a wrap-up post that summarizes the work that my team and I did. Reading this will allow you to glean the different experiences, learnings and findings I encountered over the summer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You can read my last post here:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://carlshan.com/2014/07/12/dssg-week6.html&quot;&gt;&lt;em&gt;Week 6: Progress Thus Far&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;
&lt;p&gt;&lt;img src=&quot;/images/health_leads_logo.png&quot; alt=&quot;Health Leads&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“It is health that is real wealth and not pieces of gold and silver.”&lt;/em&gt;&lt;br /&gt;
- Mahatma Gandhi&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;President Obama’s Affordable Care Act enacted broad reforms across the United States’ healthcare system. While the healthcare landscape has changed drastically, one important fact has remained constant: a person’s health is affected significantly by non-medical factors.&lt;/p&gt;

&lt;p&gt;For example, a patient with an asthmatic condition caused by a moldy apartment will not be cured simply with better medicine. She needs a better apartment, and yet our health care system is not traditionally set up to handle these non-medical issues.&lt;/p&gt;

&lt;p&gt;During this summer’s DSSG Fellowship, our team – Chris Bopp, Cindy Chen, Isaac McCreery, myself and mentor Young-Jin Kim – worked with a nonprofit called Health Leads to apply data science to address these non-medical needs, to help patients get access to basic resources vital for a healthy life.&lt;/p&gt;

&lt;h2 id=&quot;health-leads&quot;&gt;Health Leads&lt;/h2&gt;
&lt;p&gt;In 1996, Harvard sophomore Rebecca Onie was a volunteer at Greater Boston Legal Services, assisting low-income clients with housing problems. She found herself speaking with clients facing health issues brought on by their poverty. Some lived in dilapidated apartments, infested with rodents and insects. Others couldn’t afford basic necessities like food. Modern medicine was largely ineffective against these issues. Doctors were trained to treat medical ills, not social ones.&lt;/p&gt;

&lt;p&gt;Inspired by her experiences, Rebecca launched a health services nonprofit called Health Leads, which recruits and trains college students to work closely with patients who are referred by doctors and who need basic resources such as food, transportation, or housing. These college students, called “Advocates” in Health Leads lexicon, learn about each patient’s needs, and meticulously dig up resource providers – food banks, employment opportunities, childcare services – that can fulfill them.&lt;/p&gt;

&lt;p&gt;In the nearly two decades since Health Leads’ inception, its impact on the health landscape has been tremendous. In 2013 alone, Health Leads Advocates worked with over 11,000 patients to connect them with basic services and resources.&lt;/p&gt;

&lt;h2 id=&quot;the-problem&quot;&gt;The Problem&lt;/h2&gt;

&lt;p&gt;Serving a predominantly low-income patient population can pose a challenge for Health Leads. Some patients will lack stable, permanent housing or employment. Others may not own a cell phone on which they can be consistently reached. Health Leads noticed that these circumstances affected their work with some patients: despite Advocates’ best efforts, a proportion of their clients would disconnect from working with the program. These clients would be unreachable, not returning phone calls and ultimately Advocates would be forced to close their cases – never knowing if these clients received the basic resources they needed.&lt;/p&gt;

&lt;p&gt;Below is an image displaying the phone calls made to a random group of 200 different patients and whether they responded or not. Half of the clients worked with Health Leads through the completion of their case and the other half ultimately disconnected from Health Leads’ program.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/success_disconnect.png&quot; alt=&quot;Patient Disconnection vs. Success&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(The cases with negative days are ones where Health Leads took down the patient’s information but didn’t begin working with them until a few days later.)&lt;/p&gt;

&lt;p&gt;Just at a glance, there appear to be pretty clear differences between the two groups. Most obviously, the disconnected patients seem to have many more failed communication attempts (red dots) than successful ones (green dots).&lt;/p&gt;

&lt;p&gt;However, Health Leads wanted to know: exactly what are the factors that contribute to a patient disconnecting from Health Leads? How does the difficulty of a patient’s need play into the problem? What other factors might be important to consider?&lt;/p&gt;

&lt;p&gt;Against the backdrop of these pressing questions, Health Leads came to our DSSG team to use data to help discover some answers.&lt;/p&gt;

&lt;h3 id=&quot;the-challenges&quot;&gt;The Challenges&lt;/h3&gt;

&lt;p&gt;When we began tackling the problem, we ran into a slew of challenges. Unlike in the internet world where companies can track every iota of data down to the click, nonprofits serve their clients in person – meaning data must be manually recorded, rather than passively accumulated.&lt;/p&gt;

&lt;p&gt;Furthermore, the factors we discover to be influencing patient outcomes may be outside of Health Leads’ control. What if we found that the most significant indicators of patients’ success were gender or age? It would be hard to translate a finding like this into concrete actions for Advocates.&lt;/p&gt;

&lt;h2 id=&quot;our-findings&quot;&gt;Our Findings&lt;/h2&gt;

&lt;p&gt;Over the summer, our team worked through the data to distill insight, discovering findings that Health Leads can use to improve their practice.&lt;/p&gt;

&lt;p&gt;For example, we developed a “Patient Complexity Index” that tries to capture the probability that a patient will disconnect from Health Leads. We incorporate information about the type of resources this patient requires and historic performance information about the Health Leads clinic where the patient is served. For instance, needs involving employment or housing are typically much harder to resolve than needs around childcare or transportation. The success rates of each of these resource connections also vary per desk. We found that different Health Leads sites specialize in different types of resource connections.&lt;/p&gt;
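
&lt;p&gt;As a toy illustration of the idea (not the model we actually built), the Python sketch below estimates a historical disconnect rate for every combination of site and need type, and then scores a new patient by averaging those rates over the patient’s needs. The record layout, function names and sample data are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def disconnect_rates(history):
    """Historical disconnect rate per (site, need type).

    history is an iterable of (site, need_type, disconnected) records,
    where disconnected is True if that case ended in a disconnection.
    """
    counts = defaultdict(lambda: [0, 0])  # [disconnected cases, all cases]
    for site, need, disconnected in history:
        counts[(site, need)][0] += int(disconnected)
        counts[(site, need)][1] += 1
    return {key: lost / total for key, (lost, total) in counts.items()}


def complexity_index(site, needs, rates):
    """Average historical disconnect rate over one patient's needs at one site."""
    return sum(rates[(site, need)] for need in needs) / len(needs)


# Made-up history for a single desk: housing cases disconnect more often.
history = [
    ("desk_a", "housing", True), ("desk_a", "housing", True),
    ("desk_a", "housing", False), ("desk_a", "housing", True),
    ("desk_a", "childcare", False), ("desk_a", "childcare", False),
    ("desk_a", "childcare", True), ("desk_a", "childcare", False),
]
rates = disconnect_rates(history)
index = complexity_index("desk_a", ["housing", "childcare"], rates)
```

&lt;p&gt;Keying the rates by both site and need type is what lets the score reflect that different desks are better at different kinds of resource connections.&lt;/p&gt;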

&lt;p&gt;By combining this information, Health Leads can more accurately quantify the difficulty of each patient so that more experienced Advocates can work with patients with more complex needs. By doing so, Health Leads can better address each patient’s different circumstances, lowering the chance that they’ll disconnect.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/patient_needs.png&quot; alt=&quot;Patient Needs&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A Need Complexity Index can help quantify the difficulty of these patients’ needs&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Furthermore, Health Leads currently standardizes the intervals at which Advocates call patients: a minimum of once every 10 days. The findings from the data confirmed previous Health Leads research that Advocates should try to get in touch with patients frequently in the beginning stages of building a relationship. When an Advocate successfully contacts a client in the first month, that one successful phone call significantly decreases the likelihood of disconnection:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/patient_call_frequency.png&quot; alt=&quot;Call Frequency&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Health Leads should call new clients frequently in the first month&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;We presented our findings and models to Health Leads at the end of this summer, and our results validate Health Leads’ emphasis on regular follow up. We believe that the information we provided reinforces organizational strategies that can increase client engagement: calling clients regularly and leveraging communication tools such as text messaging. By investigating the different factors influencing a patient’s likelihood to disconnect, our team’s findings have pointed to important steps that Health Leads can continue to take to ensure that more people get the resources they need for a healthy life.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write about data science applied to social causes. If you want to be notified when my next post is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 6: Progress Thus Far</title>
   <link href="http://carlshan.com/2014/07/12/dssg-week6.html"/>
   <updated>2014-07-12T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/07/12/dssg-week6</id>
   <content type="html">&lt;h1 id=&quot;week-6-progress-thus-far&quot;&gt;Week 6: Progress Thus Far&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;12 July 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the sixth in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/em&gt;&lt;br /&gt;
&lt;em&gt;You can read my last post here:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://carlshan.com/2014/07/05/dssg-week5.html&quot;&gt;&lt;em&gt;Week 5: Learning and Doing&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Throughout the DSSG Fellowship, it’s been clear that my team is quite unique – unlike other groups, we were tasked with two separate projects with two different partners: Health Leads and The Chicago Alliance to End Homelessness.&lt;/p&gt;

&lt;p&gt;However, after spending a few weeks wrestling with the challenges of context-switching between different projects, tangoing with multiple parties through different communication channels, and wading through raw and smelly data, our team decided to break up into two sub-groups that would each tackle a different project.&lt;/p&gt;

&lt;p&gt;I ended up gravitating toward the problems presented by Health Leads.&lt;/p&gt;

&lt;p&gt;Now that it’s been just over six weeks since the Fellowship began, it seems worthwhile to assess what we’ve been able to accomplish up to this point, and what is still left to be done.&lt;/p&gt;
&lt;h2 id=&quot;health-leads&quot;&gt;Health Leads&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/images/health_leads_rebranding.jpg&quot; alt=&quot;Health Leads&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;the-goal&quot;&gt;The Goal&lt;/h3&gt;
&lt;p&gt;I’ve &lt;a href=&quot;http://carlshan.com/2014/06/22/dssg-week3.html&quot;&gt;written briefly about Health Leads before&lt;/a&gt;, recounting the story of how Rebecca Onie founded the organization upon discovering the hidden link between social services and debilitating health conditions. To briefly summarize Health Leads’ mission: many health clinic patients experience health concerns that are brought on more by social ills than by medical ones. Asthma can be treated with medication, but not if the root cause is a mold-infested apartment.&lt;/p&gt;

&lt;p&gt;Health Leads trains college students to work with patients referred by health service providers, identifying each patient’s needs and helping them acquire the resources that satisfy those needs.&lt;/p&gt;

&lt;p&gt;Unfortunately, Health Leads is seeing a large number of their patients drop off. After one or two successful contacts, many patients stop returning phone calls. They don’t reply to emails and may live transitory lives, making direct mail a difficult channel for reaching them.&lt;/p&gt;

&lt;p&gt;Our team is sifting through the collection of interaction data Health Leads has provided us with and bringing to light the possible reasons why a patient may disengage. In the end, we also hope to provide insights as to how Health Leads could direct their energy and activities to boost patient responsiveness in a way that increases the chances patients will receive the resources they need.&lt;/p&gt;

&lt;h3 id=&quot;the-challenges&quot;&gt;The Challenges&lt;/h3&gt;
&lt;p&gt;What we quickly realized upon tackling this project was that Health Leads had yet to seriously determine exactly what it meant for a patient to be “engaged.” To be fair, even in the world of technology product management this definition can be difficult to pin down. Groupon and Zynga, both struggling companies, certainly saw high usage numbers in their heyday by the usual measures of engagement. However, unlike in the web world, where companies can track every iota of data down to the click, non profits oftentimes have to make do with infrequently collected data that must be actively (and sometimes, painfully) recorded rather than passively accumulated.&lt;/p&gt;

&lt;p&gt;Translating this into practice packs a painful twofold punch. Not only do we not have a great deal of data (our entire dataset totals less than 250 MB), but a large portion of it is afflicted with data quality issues. We see fields with low coverage, data clearly generated by user error, and other cleanliness issues that raise our eyebrows.&lt;/p&gt;

&lt;p&gt;All this presents a rather challenging scenario. After all, it’s hard to do data science without good data.&lt;/p&gt;

&lt;p&gt;In addition to data concerns, I also mentioned earlier that nailing down the exact definition of engagement is proving to be a challenge. The difficulty lies in translating a nebulous human intuition into a rigorous formulation. If we were to proceed on the wrong calculation of engagement, any statistical machine learning methods we build to model it become suspect.&lt;/p&gt;

&lt;p&gt;Our team had initially run a logistic regression attempting to predict outcome as a function of responsiveness, only to discover that my calculation of responsiveness was off. However, upon recalculating it I learned that the accuracy of my predictions was actually higher with the erroneous calculation, presenting quite a conundrum.&lt;/p&gt;

&lt;p&gt;Furthermore, even beyond the practical implementation concerns our team has, there are higher level questions that we’re asking ourselves. Namely, we’re questioning the underlying assumption of the entire problem: does higher engagement actually increase the chance of a successful patient outcome?&lt;/p&gt;

&lt;p&gt;After all, if the answer is a resounding ‘No’, then the entire foundation upon which we’ve been working crumbles into sand. Unfortunately, there are some small inklings that we’re finding possibly pointing in this direction. Tentatively, we believe this surprising finding is due more to low-quality data and an iffy definition of engagement than actual causal processes in the real world. Nevertheless, finding this raises a red flag in our minds.&lt;/p&gt;

&lt;p&gt;Finally, one last challenge may be that the factors we end up discovering as influencing patient outcome may be ones outside of Health Leads’ control. Perhaps the most significant indicators of patients successfully acquiring necessary resources are variables such as gender or age. It may be quite difficult for Health Leads to translate findings like these into operationalizable steps their Advocates can take.&lt;/p&gt;

&lt;h3 id=&quot;the-adventure-continues&quot;&gt;The Adventure Continues&lt;/h3&gt;
&lt;p&gt;The previous section might have come off as pessimistic. But I didn’t mean it to be. On review, none of the challenges in my list are insurmountable or dead ends. In fact, there are a number of reasons to be quite positive about what I, as a data scientist, can do to help Health Leads achieve their vision of creating a healthcare system in which all patients’ basic resource needs are adequately addressed.&lt;/p&gt;

&lt;p&gt;For starters, our team has already started to think about ways to more carefully redefine and incorporate engagement as a measurement of patient outcome. We think that our initial finding of a disconnect between patient outcome and engagement is due more to faulty wiring at the definition level than to an actual lack of relationship between the two factors.&lt;/p&gt;

&lt;p&gt;With Health Leads’ help, we’re also thinking of ways of engineering more substantive and accurate features from the data we have available that can paint a more nuanced and informative story about how a patient moves through the process of getting the resources they need. As an example, one road we’re exploring is, rather than summarizing engagement as a single number averaged across multiple interactions, to vectorize engagement by calculating it at various points throughout a patient’s relationship with Health Leads.[1]&lt;/p&gt;

&lt;p&gt;Even if engagement ends up proving a dud – bearing little to no predictive significance on a patient’s outcome – this itself would be a landmark discovery for Health Leads. And based upon their stellar team and impressive organizational quality that I’ve observed up to this point, I have no doubt that they’ll thoughtfully incorporate this finding into their model so as to better continue serving the health and social service needs of individuals all over America.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;
&lt;p&gt;[1] Another opportunity we’ll be exploring as we continue working with Health Leads will be to directly predict the outcome of a patient, rather than using engagement as a proxy.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 5: Learning and Doing</title>
   <link href="http://carlshan.com/2014/07/05/dssg-week5.html"/>
   <updated>2014-07-05T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/07/05/dssg-week5</id>
   <content type="html">&lt;h1 id=&quot;week-5-learning-and-doing&quot;&gt;Week 5: Learning and Doing&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;5 July 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the fifth in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/em&gt;&lt;br /&gt;
&lt;em&gt;You can read my last post here:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.carlshan.com/2014/06/30/dssg-week4.html&quot;&gt;&lt;em&gt;Week 4: Bringing Humans To The Data&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;My number one goal in participating in the 2014 DSSG Fellowship was to create something of value. By something valuable, I had in mind a piece of software that helped satisfy the needs of the non profit partners I would work with.&lt;/p&gt;

&lt;p&gt;Now halfway through the Fellowship, I’ve increasingly noticed a tension between my goal and that of the Fellowship. As mentioned in &lt;a href=&quot;http://carlshan.com/2014/06/08/dssg-week1.html&quot;&gt;my initial reflections&lt;/a&gt; about DSSG, the goals of the Fellowship were focused around:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“(a) helping Fellows learn how to use various techniques and tools and (b) developing each Fellow’s interests in working towards social good, open government and open science.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Both of these are also priorities of mine, but I believe that they could both be achieved primarily through creating a valuable piece of software. Rather than through listening to lectures or reading, I find myself learning most productively (as measured by amount of content retained per unit of time) when I have the chance to apply new ideas in practice.[1] Being exposed to and navigating this tension has made me more aware of how the different goals of learning versus doing translate into mindsets and behaviors.&lt;/p&gt;

&lt;p&gt;As someone who came into the Fellowship leaning more towards the doing camp, my mindset towards problem-solving is to focus on effectiveness, and not necessarily on efficiency. I work with the implicit assumption that my code won’t be as clean or properly abstracted as I would like it to be. I take an iterative approach where I take small stabs at the problem and refine my code as I build up my understanding of the problem.&lt;/p&gt;

&lt;p&gt;When I come across edge cases in the data (e.g., a client who has a negative age, or a field with multiple values that really mean the same thing), I put aside my curiosity to dig further, make a mental note to explore it later and exclude this data from my analyses. With less hand-wringing about how to deal with strange outliers or edge cases, I default towards simplicity and building the least complex model. In fact, the first model my team and I looked at for &lt;a href=&quot;https://healthleadsusa.org/&quot;&gt;Health Leads&lt;/a&gt; was one of the simplest possible: a single-variable logistic regression.&lt;/p&gt;

&lt;p&gt;As a result of prioritizing doing over learning, I work primarily in &lt;a href=&quot;http://ipython.org/notebook.html&quot;&gt;IPython Notebook&lt;/a&gt;, a web-based Python interpreter. Only after properly mapping out and charting the territory of the problem do I then try to translate the code I’ve hacked together into more cleanly abstracted modules and scripts.[2]&lt;/p&gt;

&lt;p&gt;In contrast to my attitude when my aim is doing, when I am clearly optimizing for learning, I focus on efficiency of process rather than effectiveness. Ira Glass, the host of the spectacular radio show This American Life, once said that it’s your taste combined with the relentless effort to materialize it in your work that will &lt;a href=&quot;http://www.goodreads.com/quotes/309485-nobody-tells-this-to-people-who-are-beginners-i-wish&quot;&gt;propel you to greatness&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When I’m optimizing for learning, I’m driven more by the curiosity to understand than by the goal of achieving. I work more slowly, pausing to try to understand the edge cases. By poking around the shells of these outliers, searching for cracks, I end up discovering potholes to fill up in my knowledge. In this state of mind, progress on my work feels slower, but the density of learning is much higher.&lt;/p&gt;

&lt;p&gt;Putting these thoughts into the framework of cognitive theories of behavior, I suspect that prioritizing learning over doing aligns your mindset with an attitude of &lt;a href=&quot;http://en.wikipedia.org/wiki/Practice_(learning_method)&quot;&gt;deliberate practice&lt;/a&gt;, key to becoming great at your craft.&lt;/p&gt;

&lt;p&gt;In this way, learning is complementary to doing. A strong burst of effort towards learning the fundamentals, the shortcuts and the heuristics is the precursor to getting a ton done.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/learning_curve.png&quot; alt=&quot;The Learning/Doing Curve&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As the graph I made above shows, I suspect the beginning of a new project will require a heavy commitment towards learning. However, each of the dips in the curve represents a cycle of work that attempts to meet a project deadline happening at the trough of the curve. The oscillations that occur afterwards represent the various stumbling blocks that you encounter during the course of a project.&lt;/p&gt;

&lt;p&gt;As I progress I’ll have to keep in mind just how much I want to optimize towards learning versus doing, and try to feel out where the various inflection points are.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;[1] I’ve noticed that I’ve been in the state of ‘flow’ more often when I’m in the act of creation, such as through writing or coding, than when I’m passively absorbing information, such as through watching a talk. One intermediate point between these two different ends is that I’ve also noticed I can easily go into a state of ‘flow’ when I’m absorbing information through reading. However even when I read I find myself underlining key passages, jotting notes and counterpoints and actively thinking about the content.&lt;/p&gt;

&lt;p&gt;[2] No matter how much I focus on doing, I can’t escape my aesthetic preference for clean, well-factored code. I spent last night obsessing over how to speed up a function. I left the office happily at 1:30am with the code running faster by a factor of about 8.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write weekly posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 4: Bringing Humans To The Data</title>
   <link href="http://carlshan.com/2014/06/30/dssg-week4.html"/>
   <updated>2014-06-30T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/06/30/dssg-week4</id>
   <content type="html">&lt;h1 id=&quot;week-4-bringing-humans-to-the-data&quot;&gt;Week 4: Bringing Humans To The Data&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;30 June 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the fourth in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/em&gt;
&lt;em&gt;You can read the last post here:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.carlshan.com/2014/06/22/dssg-week3.html&quot;&gt;&lt;em&gt;Week 3: Three Ways of Creating Value&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by clicking &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;here&lt;/a&gt;. You can also subscribe via RSS to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Last week, my team spent the bulk of our time visiting a series of different locations in Chicago. We spent time at a social services community center, a homeless shelter and the headquarters of an anti-poverty non profit.&lt;/p&gt;

&lt;p&gt;Although each of the visits was remarkably distinct in terms of atmosphere and people, they were all driven by the same underlying motivation: to bring the human context to the data.&lt;/p&gt;

&lt;p&gt;From each visit, our team was exposed to a series of insights and understandings about the problem we were helping these organizations tackle. Each visit helped me build greater empathy for the human beings behind the numbers.&lt;/p&gt;

&lt;h2 id=&quot;empathy&quot;&gt;Empathy&lt;/h2&gt;

&lt;p&gt;Empathy is the ability to understand the feelings, beliefs, values, ideas and worldviews of other people.&lt;/p&gt;

&lt;p&gt;I believe that a great many problems in the world stem first and foremost from a lack of empathy.&lt;/p&gt;

&lt;p&gt;In my personal life, I notice that conflict and discord arise more frequently through muddled communication than through malicious intent. In most conflicts, neither party truly seeks to harm the other, but neither can brush aside enough of their pride to apologize or admit wrongdoing. Misunderstanding and subsequently refusing to understand each other, both parties cast themselves as each other’s enemy.&lt;/p&gt;

&lt;p&gt;Similarly, in theology, each of the seven deadly sins is incubated by a self-idolatry that seduces us into an indulgent dismissiveness of others.&lt;/p&gt;

&lt;p&gt;In politics, misunderstanding and imperialism rampage when more effort is put behind sharpening weapons to fight than sharpening minds to understand.&lt;/p&gt;

&lt;p&gt;Even in technology, where engineers are often caricaturized as emotionless machines, empathy prevails as the superior strategy. Entrepreneur, hacker and investor Paul Graham wrote famously[1]:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Empathy is probably the single most important difference between a good hacker and a great one. Some hackers are quite smart, but when it comes to empathy are practically solipsists. It’s hard for such people to design great software, because they can’t see things from the user’s point of view.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I found that visiting the various locations, speaking with social workers, understanding the context the data was collected in, and seeing the constraints they operated within brought concreteness to abstraction. What before was just a sequence of rows in a table became tiny stories, each a brief glimpse into a slice of someone’s life.&lt;/p&gt;

&lt;p&gt;I found myself much more motivated to work on the problems that I could now directly connect with being part of someone else’s life.&lt;/p&gt;

&lt;h2 id=&quot;seeing-in-higher-dimensions&quot;&gt;Seeing In Higher Dimensions&lt;/h2&gt;

&lt;p&gt;As a data scientist, beyond generating empathetic awareness, spending time with the people behind the data I’m working with has a high degree of impact on the quality of work I produce as well.&lt;/p&gt;

&lt;p&gt;In short, hearing the stories and details uncovered during my team’s visits added another dimension to our data set. Literally.&lt;/p&gt;

&lt;p&gt;In linear algebra, there exists the mathematical notion of a basis. Put simply, a basis is a set of vectors that encodes all the information in a space of a particular dimension. For example, the vectors lying along the X and Y axes are sufficient to encode all information in two dimensions – any point in 2D can be described by how far it is along the X and Y axes.&lt;/p&gt;

&lt;p&gt;In order for a set of vectors to encode the most information, each vector has to be sufficiently different from the others. Otherwise you have vectors that are so homogeneous that they create an “echo chamber” effect: the vectors look enough alike that they all repeat each other, bringing nothing unique to the table. In order to increase the amount of information a set of vectors can encode (thereby also increasing the number of ‘dimensions’ that set of vectors is said to represent) you need a vector that is so different, it juts out perpendicular to the rest.[2]&lt;/p&gt;

&lt;p&gt;Similarly, I believe there is an analogous extension to the real world.&lt;/p&gt;

&lt;p&gt;Our visits brought forth a human context to the data that clarified why some outliers existed and why some rows had more nulls than others, and also seeded some initial hypotheses that our team could then analyze. The new set of information we received from those we spoke with added complexity to the data, but it also contextualized it.&lt;/p&gt;

&lt;p&gt;When I have only my limited set of understandings about the world, I cannot conceive of those that are outside of the span of my worldviews. The set of vectors that represent my knowledge and experiences simply isn’t enough to accurately capture the full complexity of the world.&lt;/p&gt;

&lt;p&gt;However, by adding other individuals’ perspectives, my basis increases in size. By talking with friends who vehemently maintain that technology is eroding human civilization, for example, I am able to come to further clarity on their viewpoints and see sides of arguments I was blind to before.&lt;/p&gt;

&lt;p&gt;As a data scientist, I feel that I begin to see in higher dimensions when I add the human element.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;[1] &lt;a href=&quot;http://www.paulgraham.com/hp.html&quot;&gt;Hackers and Painters&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[2] A friend described to me how you could think of the acquisition of knowledge as trying to approximate an infinite-dimensional space with a finite number of vectors. No matter how many unique and orthogonal vectors you have, you will never know everything.&lt;/p&gt;

&lt;p&gt;Thanks to Vrushank Vora and Michael Lai for giving feedback and comments on this essay.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 3: Three Ways of Creating Value</title>
   <link href="http://carlshan.com/2014/06/22/dssg-week3.html"/>
   <updated>2014-06-22T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/06/22/dssg-week3</id>
   <content type="html">&lt;h1 id=&quot;week-3-three-ways-of-creating-value&quot;&gt;Week 3: Three Ways of Creating Value&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;22 June 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the third in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/em&gt;
&lt;em&gt;You can read the first two posts here:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.carlshan.com/2014/06/08/dssg-week1.html&quot;&gt;&lt;em&gt;Week 1: It’s not about the data, it’s about the problems you’re solving.&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.carlshan.com/2014/06/15/dssg-week2.html&quot;&gt;&lt;em&gt;Week 2: Why Doing Data Science in Non Profits is Different from Industry&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by clicking &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;here&lt;/a&gt;. You can also subscribe via RSS to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Non profits and cutting edge.&lt;/p&gt;

&lt;p&gt;Two ideas that are almost never associated with one another.&lt;/p&gt;

&lt;p&gt;A few days ago, a friend and I were discussing exactly this issue: working in the non-profit sector typically means that you will not be working at the cutting edge of technology.&lt;/p&gt;
&lt;h2 id=&quot;non-profits-and-innovation&quot;&gt;Non Profits and Innovation&lt;/h2&gt;

&lt;p&gt;Most non-profits do not consider themselves technology organizations, and as a result they do not align their resources towards focusing on expanding the technical frontier. And even if a non-profit did set its sights on inventing the newest technological gadget, they would likely lack the bank account balance and technical talent to actualize this goal.[1]&lt;/p&gt;

&lt;p&gt;However, while it may be true that working in the non-profit space is unlikely to yield a return for those looking to work on The Next Big Invention, I believe that there is another compelling opportunity to create immense social value that is unique to the non-profit space.&lt;/p&gt;

&lt;p&gt;But in order to properly explain my idea, I will first need to elucidate two other ways that I see of creating value.&lt;/p&gt;

&lt;h2 id=&quot;value-through-new-organizational-models&quot;&gt;Value Through New Organizational Models&lt;/h2&gt;

&lt;p&gt;In 1996, when Rebecca Onie was a sophomore at Harvard College, she came across a brilliant strategy to improve the health conditions for underserved populations. While volunteering at an organization that provided legal counsel to the poor, Rebecca found herself speaking with many clients who had serious health issues stemming from their poverty. Sick children with terrible asthma were living in moldy homes. Impoverished families suffered from onsets of various diseases, all caused by a lack of adequate nutrition.&lt;/p&gt;

&lt;p&gt;Rebecca saw that modern medicine would be largely ineffective against these issues. Asthma medication, no matter how good, will not be able to counter the effects of a dusty and dilapidated apartment on a child’s lungs. Doctors were trained to treat medical ills, not social ones.&lt;/p&gt;

&lt;p&gt;This experience helped her crystallize an idea that would turn into an organization called Health Leads.[2]&lt;/p&gt;

&lt;p&gt;Nowadays, in clinics where Health Leads operates, doctors can “prescribe” patients to visit a Health Leads Advocate who will work one-on-one with each individual to ensure that they receive the basic resources (food, clean water, housing improvements) required to live a healthy life. In 2013, Health Leads worked with &lt;a href=&quot;https://healthleadsusa.org/what-we-do/strategy-impact/&quot;&gt;over 11,000 patients&lt;/a&gt; to help them acquire the resources they need. This is preventative health care at its finest.&lt;/p&gt;

&lt;p&gt;I believe this story clearly illuminates the first way of creating social value: creating organizations that, through a novel operational model, more accurately allocate resources towards solving a problem.&lt;/p&gt;

&lt;h2 id=&quot;value-through-new-technologies&quot;&gt;Value Through New Technologies&lt;/h2&gt;

&lt;p&gt;On the West Coast, another method of delivering value is being championed: creating technologies that serve human needs and wants in more efficient and effective ways.&lt;/p&gt;

&lt;p&gt;But first, a clarification. Up until this point I’ve been using the terms ‘technology’ and ‘value’ rather loosely.&lt;/p&gt;

&lt;p&gt;To be more explicit, by technology I mean tools and machinery created through scientific research. This is to be distinguished from new organizational or social structures, which, unlike technology, arise from novel arrangements of human relationships rather than novel arrangements of atoms and bits. Thus, while democracy was certainly an innovative idea when conceptualized in 6th century BC Athens, it was not a new technology.&lt;/p&gt;

&lt;p&gt;And by value, I mean the successful fulfillment of human needs and wants, moderated by considerations of ethics and morality. Therefore social value is simply the fulfillment of society’s needs and wants, under the same constraints.[3]&lt;/p&gt;

&lt;p&gt;Going back to the value that technology is producing, there are obvious cases to observe: Uber’s success has come from an app that enables users to quickly snag a ride with the driver closest to them. The Khan Academy leverages the power of the Internet to serve up thousands of bite-sized videos, tutoring students on topics ranging from trigonometry to art history. Google is developing a cadre of revolutionary technologies, from self-driving cars that could reduce traffic-related accidents &lt;a href=&quot;http://www.computerworld.com/s/article/9243518/Self_driving_cars_could_save_more_than_21_700_lives_450B_a_year&quot;&gt;by over 90%&lt;/a&gt; to &lt;a href=&quot;http://www.google.com/loon/&quot;&gt;hot air balloons that provide internet access&lt;/a&gt; in developing countries.&lt;/p&gt;

&lt;p&gt;Simply put: new technologies can improve quality of life through creating better products and services that serve social needs and wants.&lt;/p&gt;

&lt;p&gt;It is through the combination of these two ways of creating value that I’ve identified and will discuss a third.&lt;/p&gt;

&lt;h2 id=&quot;value-through-arbitrage&quot;&gt;Value Through Arbitrage&lt;/h2&gt;

&lt;p&gt;The writer William Gibson once observed, “The future is already here – it’s just not evenly distributed.”&lt;/p&gt;

&lt;p&gt;Most students who graduate from the best engineering or mathematics programs don’t think of going into the non profit space as their default job choice. The pay is lousy compared to for-profits, the field is generally regarded as slow-moving and it doesn’t seem to be a great way to kickstart a career.&lt;/p&gt;

&lt;p&gt;As a result, important public sectors such as education, government, health care and non profits see a &lt;a href=&quot;http://www.fordfoundation.org/pdfs/news/afutureoffailure.pdf&quot;&gt;dearth of technical talent&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Simultaneously, the field of data science has grown into a relatively mainstream division of numerous technology companies. Yet incredibly, few non profits have even adopted the idea of establishing clear organizational metrics, let alone employing data scientists!&lt;/p&gt;

&lt;p&gt;This is precisely where an organization like the Data Science for Social Good Fellowship comes into the picture.&lt;/p&gt;

&lt;p&gt;Through recruiting individuals who have both a high degree of technical competence and a desire to address a social cause, the Fellowship finds a third way of creating value: through human arbitrage. By human arbitrage, I mean the transferring of individuals with valuable skills from a field where those skills are common to another where they are rare.&lt;/p&gt;

&lt;p&gt;Thus the innovation DSSG discovered is a relatively simple one: create a human capital pipeline that funnels people with valuable skills and abilities into important domains, where their backgrounds and skills are incredibly rare. Machine learning and computer science aren’t novel things in themselves, but applying them to the non profit space certainly is.&lt;/p&gt;

&lt;p&gt;Rather than creating new operational models, or inventing new technologies, DSSG melds the two together to form a potent third force.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The sheer simplicity of this idea belies its powerful punch. Through this third path, data scientists working in the non-profit space can create social value that few others in the field can – rather than optimizing ad retargeting models, they can focus on solving problems that truly matter.&lt;/p&gt;

&lt;p&gt;I may not be working at the cutting edge of technology, but I do feel like I’m working at the cutting edge of defining a new way of creating value in the world.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;
&lt;p&gt;[1] To be fair, most for-profit companies would also fail for the same reasons.&lt;/p&gt;

&lt;p&gt;[2] This summer, my team and I will be working with Health Leads to improve engagement levels with the patients that come to the organization. You can read a more &lt;a href=&quot;http://dssg.io/projects/&quot;&gt;detailed description of my projects here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;[3] I recognize that the terms ‘ethics’ and ‘morality’ are themselves rather nebulous. However, rather than chasing down a rabbit hole of formally defining abstract terms until I reinvent the entire field of metaphysics, I would like to simply appeal to common understandings of these ideas.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 2: Why Doing Data Science in Non Profits is Different from Industry</title>
   <link href="http://carlshan.com/2014/06/15/dssg-week2.html"/>
   <updated>2014-06-15T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/06/15/dssg-week2</id>
   <content type="html">&lt;h1 id=&quot;week-2-why-doing-data-science-in-non-profits-is-different-from-industry&quot;&gt;Week 2: Why Doing Data Science in Non Profits is Different from Industry&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;15 June 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the second in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago. You can read &lt;a href=&quot;http://www.carlshan.com/2014/06/08/dssg-week1.html&quot;&gt;the first post here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;via RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;One of the nonprofit partners my team is working with came in to the office this past week. They shared with us the overarching problem they’re facing as an organization, and how they hoped our team could shed some light on potential answers.[1]&lt;/p&gt;

&lt;p&gt;After listening to and reflecting upon the needs of our partner, I’ve come to see that the solutions to their problems are likely to not be all that technically sophisticated. What the non profit needs is not some complex model feeding on thousands of variables and millions of data points. After speaking with a few of the other Fellows, this observation seems to be widely applicable to many of the other projects.&lt;/p&gt;

&lt;p&gt;Since then, I’ve been crystallizing a series of observations about how data science as practiced in the non profit space can be remarkably different from industry data science.&lt;/p&gt;

&lt;h2 id=&quot;needs&quot;&gt;Needs&lt;/h2&gt;

&lt;p&gt;In the technology industry, startups and companies can leverage data science to huge payoffs. Airbnb’s data science team mines the rich set of host-visitor data to build &lt;a href=&quot;http://nerds.airbnb.com/location-relevance/&quot;&gt;conditional-probability models&lt;/a&gt; estimating the likelihood that a user will book in a particular neighborhood. Education startup Knewton uses &lt;a href=&quot;http://www.knewton.com/tech/blog/2013/11/kalman-filter/&quot;&gt;Kalman filters to estimate student ability&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, generally speaking, non profits interested in data science are not looking for technical wizardry. Rather, the needs of these organizations seem very straightforward and can be met with common techniques.&lt;/p&gt;

&lt;p&gt;My project partner is asking for an evaluation of the social support programs they put together, so that they can make informed decisions about where to allocate federal funding. Much of that means doing simple statistical analyses, such as looking at percentages, doing cohort analysis and calculating survival rates.&lt;/p&gt;

&lt;p&gt;There doesn’t seem to be any whiz-bang math to introduce here. There’s no need to break out the most cutting edge machine learning algorithms.&lt;/p&gt;

&lt;p&gt;To be fair, this isn’t something that’s true only of the non profit space. Many for-profit companies also don’t need sophisticated data science teams. I’ve heard that even now, Dropbox’s data team comprises only three people. Nevertheless, I would wager that having a PhD in machine learning would help you solve more problems at, say, Google than it would at the National Breast Cancer Foundation.&lt;/p&gt;

&lt;h2 id=&quot;wheres-the-data&quot;&gt;Where’s The Data?&lt;/h2&gt;

&lt;p&gt;Beyond the question of needs, a key component of effectively conducting data science is having a wealth of quality data. Even if the goals of non profits necessitated complex techniques, machine learning algorithms really succeed in finding patterns when they’re fed large volumes (think gigabytes or petabytes) of high-quality data.&lt;/p&gt;

&lt;p&gt;Unfortunately for many non profits (and for-profit companies as well), this is a dealbreaker.&lt;/p&gt;

&lt;p&gt;Non profits are often launched to advance a praiseworthy social cause. They raise money from individual donors, foundation grants or other charitable giving to fund their organization. However, as non profits are oftentimes not directly selling a product or service, they need to appeal to the emotions or moral beliefs of donors. As a result, powerful personal stories and anecdotes are more potent forces in a non profit’s arsenal than metrics or data.&lt;/p&gt;

&lt;p&gt;And to be fair, this is very understandable. After all, it’s hard to measure exactly how much effective mentoring is happening as a result of your organization.[2] Or whether women are being effectively empowered. Or if people are living more self-actualized lives. These missions just don’t easily lend themselves to measurement.&lt;/p&gt;

&lt;p&gt;Nevertheless, all this poses a problem to applying data science.&lt;/p&gt;

&lt;p&gt;Our partner has only been focusing on reliably collecting data for the past two years. Before then, there was some data entry, but it was performed as more of an afterthought. The data collection process also only happens on rare occasions: when a patient either enters or leaves a social service program.&lt;/p&gt;

&lt;p&gt;There is little information about how an individual patient is doing during the program itself.&lt;/p&gt;

&lt;p&gt;As a result, the data we have is fairly sparse and dirty, and can provide only a limited perspective in evaluating the effectiveness of social programs.&lt;/p&gt;

&lt;p&gt;(Fortunately, there are a few movements to attach greater importance to metrics in the non profit space. These include ones such as &lt;a href=&quot;http://www.gatesfoundation.org/Who-We-Are/Resources-and-Media/Annual-Letters-List/Annual-Letter-2013&quot;&gt;outcomes-driven philanthropy&lt;/a&gt; and &lt;a href=&quot;https://www.ted.com/talks/peter_singer_the_why_and_how_of_effective_altruism&quot;&gt;effective altruism&lt;/a&gt;.)&lt;/p&gt;

&lt;h2 id=&quot;black-boxes&quot;&gt;Black Boxes&lt;/h2&gt;

&lt;p&gt;In a seminal 1993 paper, machine learning researcher Robert Holte analyzed the performance of a variety of different machine learning techniques on common datasets. Surprisingly, he found that the simple methods he studied, which analyze only one variable in the data, “are almost as accurate as more complex rules” looking at many more variables! This led him to conclude that any “additional complexity must be justified” in machine learning models.[3]&lt;/p&gt;

&lt;p&gt;Nowhere has this statement been made clearer to me than in working with non profits.&lt;/p&gt;

&lt;p&gt;It’s likely that most non profit organizations have never heard of any of the algorithms that are all the rage in the data science field today. As a result, individuals in these non profits may be intimidated by black box methods and reluctant to use them, which could result in my partner never touching anything I build for them.&lt;/p&gt;

&lt;p&gt;Holte’s warning about complexity without justification rings even more true here. Even the most impressive tool is only valuable if others trust it enough to use it. This understanding motivates me to create something that meets non profits at their level, rather than indulging in sophistication for sophistication’s sake.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;As a DSSG Fellow this summer, I hope to create and ship something that can be successfully used in advancing the causes of the visionary non profits I’m working with.&lt;/p&gt;

&lt;p&gt;As I progress through my summer, it will be important for me to keep in mind that there will be many difficulties particular to the non profit space that my team and I will need to carefully navigate. We must learn how to effectively deal with sparse and impoverished datasets, as well as negotiate the tension of building something that’s complex enough to solve a challenging problem, while being simple enough to be trusted by decision-makers.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;[1] I’ll be able to share more details about both my project and my partners in future posts.&lt;/p&gt;

&lt;p&gt;[2] This comes from personal experience as someone who founded and directed a national non-profit peer mentoring program for 3 years.&lt;/p&gt;

&lt;p&gt;[3] Robert Holte, &lt;a href=&quot;http://webdocs.cs.ualberta.ca/~holte/Publications/simple_rules.pdf&quot;&gt;“Very Simple Classification Rules Perform Well
on Most Commonly Used Datasets” (1993)&lt;/a&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Week 1: It’s not about the data, it’s about the problems you’re solving.</title>
   <link href="http://carlshan.com/2014/06/08/dssg-week1.html"/>
   <updated>2014-06-08T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/06/08/dssg-week1</id>
   <content type="html">&lt;h1 id=&quot;week-1-its-not-about-the-data-its-about-the-problems-youre-solving&quot;&gt;Week 1: It’s not about the data, it’s about the problems you’re solving.&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;8 June 2014 - Chicago&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the first in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To get automatically notified about new posts, you can subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;. You can also subscribe via &lt;a href=&quot;http://feeds.feedburner.com/carlshan&quot;&gt;RSS&lt;/a&gt; to this blog to get updates.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;This summer I am participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://dssg.io&quot;&gt;Data Science for Social Good Fellowship&lt;/a&gt; is a summer fellowship sponsored by the Eric and Wendy Schmidt Foundation. It selects a cohort of individuals typically with quantitative or social science backgrounds to work in teams to tackle problems brought to them by non-profits, governments and NGOs.&lt;/p&gt;

&lt;p&gt;The inaugural program was conceived of and put together by former Chief Scientist for the Obama 2012 campaign, &lt;a href=&quot;https://twitter.com/rayidghani&quot;&gt;Rayid Ghani&lt;/a&gt;. DSSG launched in the summer of 2013, bringing 36 Fellows together to spend three months working in Chicago.&lt;/p&gt;

&lt;p&gt;The 2014 Fellowship, which I participated in, expanded its class to &lt;a href=&quot;http://dssg.io/people/&quot;&gt;48 Fellows&lt;/a&gt;, drawing a diverse set of graduate students, recent college graduates and working professionals coming from both the hard and social sciences.&lt;/p&gt;

&lt;p&gt;In the 2014 Fellowship, there were a total of 12 projects sponsored by a varied set of partners. The projects ranged from detecting collusion, to evaluating how to best allocate city funds.&lt;/p&gt;

&lt;p&gt;I’ll be able to provide more background and context on my own project as the fellowship progresses.&lt;/p&gt;

&lt;h2 id=&quot;my-background&quot;&gt;My Background&lt;/h2&gt;
&lt;p&gt;How I ended up as a Fellow is a combination of my academic and personal experience.&lt;/p&gt;

&lt;p&gt;I graduated with a bachelor’s in statistics from UC Berkeley in 2013 while also taking a smattering of coursework in computer and information science.&lt;/p&gt;

&lt;p&gt;While developing my knowledge of statistical and computational techniques, I also developed a desire to do socially meaningful work.&lt;/p&gt;

&lt;p&gt;During my first semester in college, I volunteered as a mentor to fifth graders from a local elementary school. What I encountered when I first began volunteering shocked me: I had attended a high-achieving math and science powerhouse high school with a 99% graduation rate; however, the students I mentored came from circumstances that made it immensely difficult for them to enjoy the same type of opportunities that had fallen into my lap as a student. Over 40% of the students in the school were on free or reduced lunch.&lt;/p&gt;

&lt;p&gt;Throughout the course of my volunteering, it became more and more clear to me that the zip code you were born into could more or less determine whether you were able to live a full and rich life.&lt;/p&gt;

&lt;p&gt;This led me to ponder the role of education as a mechanism both for increasing social mobility and for providing life-altering opportunities.&lt;/p&gt;

&lt;p&gt;As a result, through college I worked in a variety of different educational organizations, hoping to increase my exposure to the field such that I could eventually improve life outcomes for students who, through the lottery of birth, did not receive the same lucky breaks I did.&lt;/p&gt;

&lt;p&gt;During this time, I recognized that my technical background provided a unique advantage in creating social change: many individuals drawn to the education or non-profit fields typically do not come from statistics or computer science. This fact, combined with the rising importance of data in augmenting human decisions, suggested to me that by mastering the techniques and tools of a data wrangler, I could perhaps increase my individual leverage in impacting the progress of human civilization.&lt;/p&gt;

&lt;p&gt;This all sounds rather high-minded, lofty, and abstract. Writing it makes me feel mildly embarrassed in the same way that I imagine self-aware salespeople feel when they use phrases like “synergistic partnership” or “paradigm shifting” – the phrases are so ambiguous and overused that they become self-mocking caricatures rather than signifying meaningful ideas.&lt;/p&gt;

&lt;p&gt;Yet it would be disingenuous of me to omit my motivations simply because they sound cheesy. It also seems imprudent to be unwilling to discuss beliefs that sound naive, yet are nevertheless ones I hold to be true.&lt;/p&gt;

&lt;p&gt;Coming into the Chicago Fellowship, I hope to develop these skills in a hyper-applied context by working on the exact problems I would want to be working on even post-Fellowship.&lt;/p&gt;

&lt;h2 id=&quot;the-first-week&quot;&gt;The First Week&lt;/h2&gt;
&lt;p&gt;I landed in Chicago the weekend before the Fellowship began. Flying in, I was slightly apprehensive – I did not feel like I had received an adequate amount of communication regarding program logistics such as what type of project I was assigned to, who my team members were, and an outline of the summer.&lt;/p&gt;

&lt;p&gt;In fact, I had only been emailed details regarding these things the day before my flight.&lt;/p&gt;

&lt;p&gt;However, once I began my first day as a Fellow I was happily surprised at the level of thought and organization that had been put into the summer.&lt;/p&gt;

&lt;p&gt;In the morning of the first day, all the Fellows wrote down different things they were hoping to get out of the summer on sticky notes and arranged them on an empty wall.  Examples included swimming in Lake Michigan, learning how to use cloud computing, and sampling Chicago city cuisine.&lt;/p&gt;

&lt;p&gt;One particularly eye-catching sticky read “Learn what the hell data science is!”&lt;/p&gt;

&lt;h2 id=&quot;values-and-culture&quot;&gt;Values and Culture&lt;/h2&gt;
&lt;p&gt;Afterwards, the program staff explained to us the key values of the Fellowship, which I’ve listed below along with my own personal interpretation of each:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Open&lt;/strong&gt;: We share what we are learning and doing to help more people learn to do what we’re doing.
This value of openness shows up in the platforms used (like public Github repos) as well as within the interpersonal interaction amongst the Fellows.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Curious&lt;/strong&gt;: We keep asking questions because that’s the only way to progress towards meaningful answers.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Applied&lt;/strong&gt;: Everything that we do should drive towards some form of action that changes someone’s life. As we ask questions, something to always prompt ourselves with is: “Is the thing I’ve just learned something that ties back to the real world?”&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Rigorous&lt;/strong&gt;: We will emphasize meaningful results and specific outcomes, while documenting our methods and techniques.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Collaborative, not competitive&lt;/strong&gt;: Achieving our collective goals matters more than assigning who gets the credit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I was very happy to see these cultural tenets explicitly stated. These values lined up very much with my own.&lt;/p&gt;

&lt;h2 id=&quot;goals&quot;&gt;Goals&lt;/h2&gt;
&lt;p&gt;On the first day, it was explained to us that while we were expected to work hard and produce useful output on our projects, it was not the goal of the program to actually build the most robust piece of software that solves our partners’ problem.&lt;/p&gt;

&lt;p&gt;Rather, the goal of the program was focused more around (a) helping Fellows learn how to use various techniques and tools and (b) developing each Fellow’s personal interests working towards social good, open government and open science.&lt;/p&gt;

&lt;p&gt;This surprised me, as one of my personal goals is to actually build and deploy a maintainable system for the partners I’m working with. Although I can see the sense in moderating my goal to being simply one of learning and growing, I’m reluctant to dial down my ambitions simply for the sake of feasibility.&lt;/p&gt;

&lt;p&gt;Not only do I think that it’s reasonable to build out some usable and maintainable software or tool in the summer, I also personally think gunning hard for this goal would be the most productive way to learn.&lt;/p&gt;

&lt;p&gt;The Fellows then got a chance to introduce themselves and intermingle. I learned that of the 48 Fellows, 33 were PhD or Masters students, 12 were recent undergraduates and 3 were working professionals.&lt;/p&gt;

&lt;p&gt;Although the majority of Fellows seemed to come from quantitative or technical backgrounds, there were also Fellows with backgrounds in architecture, psychology and public health.&lt;/p&gt;

&lt;p&gt;This was nice to see, as I subscribe to the belief that impact is produced through a combination of valuable skills and domain expertise. The Fellows with math, statistics or computer science backgrounds are the arsenal with which the social scientists could aim at the correct targets.&lt;/p&gt;

&lt;p&gt;We also had a variety of social events, including a picnic near the beach, a kickoff party on the rooftop of the Civic Opera Building and a boat tour of the city of Chicago. I was very glad to see these social events planned; I believe that great work can only be done within a strong scaffold of tightly-formed relationships, and all of these activities gave me a glimpse into the personal lives and thoughts of each of the Fellows and mentors.&lt;/p&gt;

&lt;p&gt;Overall, I ended the first week feeling fairly optimistic about the trajectory of the coming weeks.&lt;/p&gt;

&lt;h2 id=&quot;reflections&quot;&gt;Reflections&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;The Good&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Impact-oriented fellows&lt;/strong&gt;: I’ve been involved in a variety of different internships and programs and at many of them, I’ve felt out of place. At these organizations, my co-workers were not interested in the same set of ideas I was hoping to discuss, and didn’t share many common values or worldviews.&lt;/p&gt;

&lt;p&gt;I was very happy to see that this was not true with the Fellowship. Many of the Fellows I spoke with shared common passions for social issues, and were interested in discussing ideas ranging from how to best optimize for making social impact to how to accelerate one’s rate of growth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Openness to growth and feedback&lt;/strong&gt;: The program directors seemed very open and receptive to feedback, creating a Hackpad document where Fellows could contribute ideas for how to improve the experience as well as holding a formal feedback session. This contributed towards building a culture of transparency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collaborative work setting&lt;/strong&gt;: The workspace is spacious, open (encouraging collaboration and conversation) and there’s a culture of casual collegiality between everyone.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Bad&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direction-less events&lt;/strong&gt;: There was a planned event that did not seem to directly serve a purpose: a series of non-profits that were not partners came in and gave lengthy talks.&lt;/p&gt;

&lt;p&gt;I wasn’t sure as to whether these problems were ones I should be invested in helping solve, and with a single exception, the speakers didn’t share any useful mental models or suggestions for how to tackle our own projects.&lt;/p&gt;

&lt;p&gt;I admit that this issue may stem from the fact that the purpose of this event was simply not communicated well to the Fellows, giving an impression of aimlessness.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Ugly&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No wifi&lt;/strong&gt;: The wifi had not been installed when the Fellows arrived. As a result, we had to connect to DePaul University’s network (the building we worked in is owned by DePaul), causing traffic overload and nearly unusable connection speeds. This proved to be quite a bottleneck for teams looking to get set up quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No data&lt;/strong&gt;: A variety of different partners still had not yet shared data with their respective teams by the end of the week. Legal agreements were still in review, and processes were still being navigated. This had also happened in the previous year, although I was told that it was much worse then. However, this is a pretty serious showstopper. After all, it’s hard to do data science without data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Untargeted workshops&lt;/strong&gt;: The workshops should have been more segmented based on background – some workshops had students coming in from both the high and low end of the experience spectrum. Workshops also ran overtime, resulting in some Fellows being unable to attend a few due to an instructor having to leave to catch a flight.&lt;/p&gt;

&lt;p&gt;In the future, I think it would be useful to plan multiple workshops for Fellows of different experience levels, and to enforce time constraints.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;It’s important to balance everything I’ve said against the fact that DSSG is still only in its second year of existence. Growing pains are still being worked through, and the level of thought and dedication that I’ve already seen put forth by the Fellowship organizers assures me that my criticisms will no longer apply next year.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by &lt;a href=&quot;https://carlshan.wufoo.com/forms/join-other-readers/&quot;&gt;clicking here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Writing is Distilling</title>
   <link href="http://carlshan.com/2014/05/03/writing-is-distilling.html"/>
   <updated>2014-05-03T00:00:00-07:00</updated>
   <id>http://carlshan.com/2014/05/03/writing-is-distilling</id>
   <content type="html">&lt;h1 id=&quot;writing-is-distilling&quot;&gt;Writing is Distilling&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;3 May 2014 - San Francisco&lt;/p&gt;

&lt;p&gt;There is the widely quoted heuristic, popularized by Malcolm Gladwell’s Outliers, that 10,000 hours of deliberate practice will make you an expert in a particular domain. I think that if I were to choose to spend 10,000 hours deliberately practicing one skill, it would be writing.&lt;/p&gt;

&lt;p&gt;Writing seems to be a process that distills thoughts. Thoughts tend to coalesce in the mind as nebulous and unstructured entities. They may exist as fragmentations of sentences, images, emotions, sounds or in other vague forms. Writing forces one to shine a light onto dusty, unclear thoughts. As I attempt to articulate myself through writing, I notice the cracks, smudges and smears in my mind.&lt;/p&gt;

&lt;p&gt;Writing forces one to be precise, choosing only the most appropriate words to define and clarify one’s ideas. By writing down my thoughts, I can then analyze them for weaknesses, inconsistencies or contradictions. As such, through writing on this blog, I hope to make my thoughts more robust and cogent.&lt;/p&gt;

&lt;p&gt;Blogging publicly further compounds the advantage of writing: knowing that others will be reading my thoughts is one filter that implicitly causes me to carefully analyze and vet my thoughts before typing them.&lt;/p&gt;

&lt;p&gt;However, I do feel that there’s something innately limiting about writing. I think it has to do with the fact that words are artificial human constructs. They are often poor mediums for capturing the intensity of feeling, context and sensation that goes into truly understanding something. A rose by any other name may smell just as sweet, but no matter what evocative name we conjure up for it, the essence of its beauty or fragrance cannot be captured through just words.&lt;/p&gt;

&lt;p&gt;Forms of writing unburdened by the structure of prose, such as poetry, come closer towards approximating the uncapturable. We hear poems being described as ‘beautiful’ more often than we do pieces of prose.&lt;/p&gt;

&lt;p&gt;Nevertheless, I’m still excited by the benefits to the mind that writing will bring.&lt;/p&gt;

&lt;p&gt;Here’s to 10,000 hours.&lt;/p&gt;
</content>
 </entry>
 

</feed>
