Fernando Brito

Open Data Day 2016: João Pessoa, Brazil

Fernando Brito — Fri, 11 Mar 2016 13:08:28 +0000

Motivation

In 2010, the very first edition of the Open Data Day was announced, and João Pessoa was one of the few cities in Brazil to participate. Although we did not have many participants back then, we had good media coverage and the event made our developer community a little bit stronger.

This year, motivated by some friends, I decided to try to organize another edition of the event.

Organization

My plan was to hold a hackathon again, as in 2010. As I did not have much time to spend on the organization itself, I decided to do something modest.

After creating a shared document and a Facebook event, I sent messages to mailing lists and posted on a few Facebook groups, and soon I had people registering and proposing ideas on what to do on the event day.

One of the images used to promote the event online

A research lab from my university (UFPB) contacted me offering help, and we decided to use their lab facilities to host the event. The lab is called Labtransp (Laboratório de Transparência Pública – Public Transparency Lab) and they provided some help in the organization. A few days later we discovered that a power shutdown was planned for the entire campus on the event day, so we had to move the event to the other university campus, which is located just 6 km away from the original place. We thank the Centro de Informática (Informatics Faculty) for offering one of their computer labs for the event.

I also sent several emails to TV news and newspapers, but this year none of them showed interest, except the official university news agency, which posted a nice announcement on the official university website.

A few days before the event, I received an email from Open Knowledge and ILDA (The Latin American Open Data Initiative) stating that my proposal for their mini-grant was accepted. Good news, as we were able to provide free snacks for all participants with this money.

The event

26 people filled in the registration form, but only 13 showed up. From my previous experience, this 50% no-show rate is to be expected for free events. Most of the participants were computer science undergraduate students (including colleagues of mine), but we also had a professor from Labtransp and an employee from our state IT agency (CODATA) attending the event.

My short presentation about the Open Data Day and Open Data

I started the event with a brief presentation about what the Open Data Day is, what to expect and the schedule for the following hours. Afterwards, we discussed and shared interesting public data sources and formed teams to work together. There was plenty of food and drinks for the participants, who were able to focus on exploring and working with open data.

Thanks to Open Knowledge and ILDA, we were able to provide snacks for the participants

One of the teams focused on data from Bolsa Família, one of the biggest social welfare programs in Brazil, which aims to reduce poverty. Their goal was to cross-reference regional and temporal data from this program with social and economic indicators. Tools such as Pentaho and CartoDB were used.

Another team explored datasets related to consumer complaints, in order to try to find which companies were getting most of the complaints and which regions were most dissatisfied with them. This team used mainly R to explore and process the data and Google Charts to try to create visualizations for it.

Teams working on their projects

One participant decided to analyze party affiliations and public payroll data. His goal was to see whether affiliation with specific parties was associated with getting fired from or hired for a public job.

Lastly, my team explored data from João Pessoa’s City Council. We tried to categorize the bills proposed by each city councilor, as we noticed that many of them were petitions for things such as renaming streets, fixing potholes and granting badges and medals to citizens. We developed a scraper (hosted on GitHub) to collect the data from the official website and built a simple website to show this data.

Website developed with the data scraped by my team

Post-Mortem

Our main goal was to increase the awareness around open data and to build a local community of developers and enthusiasts. Although we had only 13 participants, this was the first contact of almost all of them with Open Data.

Unfortunately we planned the event from 8am to 1pm (including the initial presentation and time to form teams), but this was clearly not enough to come up with a final product. None of the teams could deliver or show something ready at the end. For the next edition, we plan to allocate more time and start later (9am or 10am). My team continued the development during the following days in order to fully implement our idea.

The event venue (a computer lab in our university) was fine, with enough desks and chairs, a projector and a good Internet connection, but getting there by public transport was not that easy. As far as I could tell, all participants arrived by private transport. For the next event, we plan to choose a more central and accessible venue.

Our venue: classroom at a public university where I study

Finally, we should definitely start the organization a few weeks earlier, so we have more time to get in touch with TV news and newspapers.

Overall, we consider this edition of the event a small but important step in building a stronger local community of open data enthusiasts, and we are looking forward to Open Data Day 2017. We would also like to ask other groups and individuals to give it a try and organize their own Open Data Day edition in the years to come. It takes just a few hours and a couple of email messages to organize a small edition.

Once again, we thank Labtransp, for helping in the organization, and the Open Knowledge and ILDA for the mini-grant. More pictures can be found in our Dropbox folder.

Open Data Day João Pessoa 2016

Data analysis on Mitacs Globalink 2015 projects: Part 2 – Word Cloud

Fernando Brito — Sat, 27 Sep 2014 13:50:06 +0000

Introduction

Mitacs Globalink Research Internships is a project from Mitacs which allows undergraduate students from Brazil, France, China, India, Mexico, Saudi Arabia, Turkey or Vietnam to do a three-month research internship at a university lab in Canada.

This series of posts is a personal attempt to perform some basic data analysis on the project information, such as project titles and descriptions. Check Part 1 to see my saga on collecting the data.

Motivation

The first question I wanted to answer was: “What are most of the projects about?”. I decided that generating a word cloud from the project titles would be an easy and quick way to get an overview of the keywords and topics used to describe the projects.

In Part 1 of this series, I posted a link to a text file containing all project titles, one per line, exactly as they were written on the Mitacs application platform. This was the data I chose to work with.

I have a professor who really likes word clouds. He uses them in the first lecture of every course he teaches in order to give his students an overview of what the class is going to be about. I asked him which tool he uses and he recommended Wordle.net. Creating a word cloud on this website is as simple as copying and pasting your data and tweaking the colors and fonts.

The first word cloud I generated ended up looking like this:

Version 1 of projects title word cloud

Even this initial version already surprised me. Some words, like “cancer”, were totally unexpected to me. However, after a quick glance, I noticed some room for improvements.

Also, it is important to look at the word counts before drawing any conclusion. The size of each word is relative to the counts of the other words, not to the total number of words. In our example, we may think that hundreds of projects have the word “development” in their title, but this is true for only 82 projects out of 1700+. This shows us how heterogeneous our data is.

In other words, a word cloud alone shows us only which words appear the most, but not how often they actually appear in our data. This may vary a lot depending on our input.
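To make that point concrete, here is a minimal Java sketch of the check described above: counting how many titles contain a word and expressing it as a share of all titles. The class name, method names and sample titles are made up for illustration; only the figures 82 and 1700+ come from the data discussed in this post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class WordShare {

    // Counts how many titles contain the given word (case-insensitive, whole-word match).
    public static long titlesContaining(List<String> titles, String word) {
        String target = word.toLowerCase(Locale.ROOT);
        return titles.stream()
                .filter(title -> Arrays.asList(
                        title.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")).contains(target))
                .count();
    }

    // Share of titles containing the word, as a percentage of all titles.
    public static double sharePercent(long matching, long total) {
        return 100.0 * matching / total;
    }

    public static void main(String[] args) {
        // Hypothetical sample titles, not taken from the real data set.
        List<String> titles = Arrays.asList(
                "Development of a mobile health application",
                "Cancer cell imaging",
                "Polymer development for energy storage",
                "Sustainable energy systems");
        long hits = titlesContaining(titles, "development");
        System.out.printf("%d of %d titles (%.1f%%)%n",
                hits, titles.size(), sharePercent(hits, titles.size()));
    }
}
```

Applied to the numbers above, `sharePercent(82, 1700)` comes out below 5%, even though “development” is one of the biggest words in the cloud.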

Preprocessing

Wordle already does some kind of preprocessing for us, which is very nice. It removes stop words (common words, such as “of”, “and”, “for”, etc) and it is case insensitive (“ANALYSIS” and “analysis” will be grouped).

However, looking at the image above, you may notice we have on the left side, the word “systems”, and on the right side, “system”. I would like to count them as only one word. As the data is not that big, I could manually “search and replace” all the words I would like to group, but this would require me to inspect each generated word cloud and repeat the process many times.

This can be done automatically with a preprocessing technique called stemming, which reduces words to their “root” form (stem), removing plurals, conjugations and derivations. Stemming is not available in Wordle, so I had to do this kind of preprocessing in an external tool. For didactic purposes, I used KNIME, an open-source data analytics tool which has a graphical user interface.
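As a rough illustration of the idea (this is not the algorithm KNIME applies, and real stemmers use many more rules), a minimal Java sketch could group words with one deliberately crude rule: drop a trailing “s”. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class NaiveStemCount {

    // Deliberately crude, hypothetical rule: drop a trailing "s" from words longer
    // than three letters, unless they end in "ss". Real stemmers are far more careful.
    public static String naiveStem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Groups words by their naive stem and counts the size of each group.
    public static Map<String, Integer> countByStem(String[] words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(naiveStem(word), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"systems", "system", "Models", "model", "analysis"};
        // Prints {analysi=1, model=2, system=2}
        System.out.println(countByStem(words));
    }
}
```

Note how “systems” and “system” are now counted together, while “analysis” turns into the unfamiliar “analysi”: grouping improves, but readability can suffer.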

On KNIME I was able to remove French stop words (as Wordle would allow me to remove either English or French stop words, but not both). This was important because some project titles are in French. Moreover, I applied English stemming. KNIME works with workflows. This is the one I made:

My KNIME workflow

Now that I had done the preprocessing I wanted, I returned to Wordle to make a new word cloud. KNIME is also able to create word clouds, but the layout options offered by Wordle are far more appealing (in my opinion). My word cloud “version 2” looked like this:

Version 2 of projects title word cloud: stemming and French stop words removal

Some words got bigger (like “system” and “model”), but this word cloud is somewhat harder to read, as it is now made of “stems”. Take for example the stem “applic”. We are not used to reading this “word”.

Conclusion

So, which version is “better”? The first one or the second one? Which one should you use?

Overall, I believe each version has its own value. Applying stemming can reveal some interesting information, but it may also hide some other things. Now both “mobile” and “mobility” appear as “mobil”, even though they may refer to very different things in different contexts.

I would recommend playing with different settings (stemming, no stemming, etc) and comparing all results, instead of just looking for a “final absolute” version.

Looking at both the word clouds, I was happy with the results. Stems like “cancer”, “health”, “energi”, “polymer” and “sustain” helped me understand what kind of projects are on Mitacs this year. I hope you also had the chance to learn something new on this post. On the next post I will try to find which provinces are offering most projects, drawing the results on a map.

Data analysis on Mitacs Globalink 2015 projects: Part 1 – The Data

Fernando Brito — Mon, 18 Aug 2014 22:17:04 +0000

Introduction

I am interested in taking part in the program, and one of the steps of the application process is to choose between 3 and 7 projects from their list of 1,782 projects (as I write this article). Using their website, you can filter those projects by university, province, language and keywords.

Motivation

I started performing queries with keywords such as “web” and other areas I am familiar with, but soon I realized that there were many other cool projects I could also apply to, so I ended up manually looking through all 1,700+ project titles and writing down in a text file the ones whose prerequisites and description deserved a closer read.

Mitacs Globalink 2015 projects list.

When I was done, I got really curious about the data. “Which province is offering more projects?”, “What is the average amount of projects being offered per professor?“, “What would a word cloud with projects titles look like?”

Getting the data

I could not find any “export” link on the page where they list all the projects. I have some experience with web scraping, but unfortunately their platform is made on Flash :(.

How could I get the data I needed? First thought was: “Hmmm. Maybe it is possible to do some reverse engineering on Flash!“, but I realized they pull the data asynchronously, as soon as I hit the “Projects” page. This data must be coming from somewhere, so I decided to try to monitor the network.

It is important to note that, while the page that loads the Flash component is using HTTPS, the requests made by the application itself are using plain HTTP! This allowed me to see the traffic on Wireshark, for example. I decided, however, to use a “higher-level” approach.

On Chrome’s Developer Tools I was able to find the request I was looking for. Things would have been a lot easier if this data came in plain json or xml, so I would be able to sneak around, but headers informed me they were using “Content-Type:application/x-amf“.

I am not familiar with Flash development, so I had no idea what amf was and if it was possible to open it. It turns out that amf (Action Message Format) “is a binary format used to serialize object graphs such as ActionScript objects and XML, or send messages between an Adobe Flash client and a remote service” (source: Wikipedia).

A quick search on Google revealed lots of tools to decode amf, but none of them worked for this particular file. I ended up trying lots of tools, including: Charles proxy, a Ruby library (rocketamf), three JavaScript libraries (JSAMF, amf, and amfjs), two Python libraries (pyamf and amfast), two Firebug plugins (AMF Explorer and Flashbug), two PHP libraries (Amfphp and SabreAMF), a JMeter plugin, Wireshark, two Fiddler plugins (AMFParser and Fiddle AMF Parser), ServiceCapture web proxy, WebScarab web proxy, minerva, FlashDevelop and some others!

For each tool, I had to read how to install it and how to use it. On some libraries, there was no manual, so I had to look directly into their source code or unit tests.

All tools failed to some extent. Almost all of them gave some sort of decoding error with no further details. Some of the libraries were more specific, stating problems with DSQ externalizable class. The only tool that kept my hopes up was Charles. Charles was able to decode the amf message and show it in its user interface, but there was no way to export the data!

Charles proxy showing the decoded amf.

After almost 8 hours using all the tools mentioned previously, I decided to try something else. In Charles I was able to copy selected elements from the tree, but I had to manually expand the nodes and select them. I spent an additional hour recording macros to expand nodes in Charles, select them and paste them into a text file, but the Charles trial version would close every 30 minutes, and this would take forever!

It was late at night, so I decided to throw in the towel. It was time to ask for help from the gods. I posted a question on Stack Overflow.

The next morning, there was no answer to my question. I was close to giving up. But then, I thought: what if this were a Mitacs project assignment? Would I just give up like that? Of course not!

More research, more tools, and no success.

I was very close to starting to write my own amf deserializer when I found FlashFirebug, a professional tool for debugging Flash applications. The license costs $34.99/yr, but you can get a 2-day trial for $0.50. At this point I thought, “Why not?” and paid for the trial version.

The amf decoder, once again, failed. But, messing around, I noticed that FlashFirebug allowed me to inspect the Flash object, just like an HTML page! Soon I found a DataGrid element holding all the data I needed! But how to export this data? The tool also provided me with an ActionScript live console. A few minutes on Google and I was able to write my first ever ActionScript snippet: a loop iterating over the data used to fill the projects table!

FlashFirebug ActionScript3 console.

Boy, I was happy! After almost 10 hours of work, I finally had what I needed to start the real work, which is analyzing the data. What is usually the most trivial part for me (getting the data) turned out to be a real challenge this time. I learned a lot during the process, but I am glad it is over.

Coming up, I will try to extract some useful information from all this data.

Sharing the data

From the loop shown in the last screenshot, I was able to generate a text file with all project titles, one per line. Using this code, I was able to generate an XML file with all the information I needed, including project descriptions, university names and professor names.

Just as a reminder: this is data gathered from the Mitacs Student platform. You do not even need to log in to see this data. I am just “reorganizing” it. Also, this post was written on August 19th 2014. It seems that more projects may have been added to the list afterwards.

Analysing a Facebook friendship network

Fernando Brito — Sun, 13 Oct 2013 22:25:02 +0000

I am taking a course on social networks this semester. As our first assignment, the teacher asked us to analyze our Facebook friendship network so we would get used to working with some tools.

If you have never heard about social networks as a research topic and have no idea what “analyze a friendship network” could possibly mean, take a look at this picture:

Random Facebook friendship network I found on Google. Source: GriffsGraphs.

Here is a small tutorial on how I did my assignment.

The data

First of all, we need to collect the data that will be used on the analysis. There is an app on Facebook that does all the hard work: netvizz. After allowing it to access some of your information, look for a link to a gdf file of your “personal friend network“.

The tool

Now that we have our raw data, it is time to get our hands dirty. Download the tool Gephi and import the file you just downloaded. On the import wizard, choose “Graph Type: Undirected“, as Facebook friendships are bidirectional. A very boring representation of your friendship network will appear, but fear not!

At this point it would be a good idea to read some of the Gephi documentation. I will quickly report what I did, but by no means should you take it as the best or only way of doing this analysis.

The numbers

On the right side of the window, there should be a tab called “Statistics”. Run the “Modularity” module (!) to use some bad-ass algorithm to detect communities, and the “Avg. Path Length” module to find out who plays an important role in your network by connecting different groups. The results of those calculations are stored as attributes on each node (or edge, where applicable).

The visualization. Part I: Science

Time to make your graph look a little bit better. On the left side of the window, use the “Partition” tab to color the nodes according to their “Modularity Class” and the “Ranking” tab to change their size according to their “Betweenness centrality”.

If you are lucky, you should have something that looks like this:

Ugly graph

The visualization. Part 2: Art

So far, things were pretty straightforward. Click here, press this, and so on. Now it is time to unleash your artistic side. Also on the left side, under “Layout”, there are different algorithms that will make your nodes literally dance on your screen. Play with all of them until you get something you like, and don’t forget to use the “Random Layout” as a means of resetting your work.

After you get something nice, you can also use tools to manually move nodes on your graph. There is also a tool to inspect nodes, so you can see the name of the person represented by a specific node. In case you want to show those names on the graph, there is a button on the lower toolbar for that. Moreover, this page teaches you how to show labels only on some filtered nodes.

The grand finale

So far you have been on the Overview mode, which allows you to edit your graph. There is also a Preview mode which lets you export your beautiful graph to an image file.

Here is what I got:

My Facebook friendship network

The names of the communities (in Portuguese, sorry) were added manually by me in image editing software.

The interpretation

This graph allows you to easily identify communities among your friendships and see people that play a key role by connecting different communities.

Bonus: interactive graph on the web

Gephi lets you export your graph as a gexf file, and the JavaScript library sigma.js has a gexf parser!

On my first attempt to make it work, Gephi (version 0.8.2) did not add an id to the edges in the exported gexf file, and sigma.js did not like that. To fix this issue, I wrote a small Ruby snippet to insert unique ids on all edges. The source code is on GitHub.
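For readers who prefer Java, the same kind of fix can be sketched with the standard XML APIs. This is not the Ruby snippet mentioned above, just a minimal illustration under two assumptions: edges are stored as `<edge>` elements, and it is acceptable to overwrite any existing edge id with a sequential number. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class GexfEdgeIds {

    // Parses the GEXF text, sets a sequential id on every <edge> element
    // (overwriting any id already present) and returns the modified XML.
    public static String addEdgeIds(String gexf) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(gexf)));

            NodeList edges = doc.getElementsByTagName("edge");
            for (int i = 0; i < edges.getLength(); i++) {
                Element edge = (Element) edges.item(i);
                edge.setAttribute("id", String.valueOf(i));
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrap checked XML exceptions so callers do not have to declare them.
            throw new IllegalStateException("Could not process GEXF content", e);
        }
    }

    public static void main(String[] args) {
        // Tiny hypothetical sample, much smaller than a real Gephi export.
        String sample = "<gexf><graph><edges>"
                + "<edge source=\"a\" target=\"b\"/>"
                + "<edge source=\"b\" target=\"c\"/>"
                + "</edges></graph></gexf>";
        System.out.println(addEdgeIds(sample));
    }
}
```

Numbering the edges by their position guarantees that the ids are unique within the file, which is the property the parser was missing.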

The complete integration with sigma.js can be found on a JsFiddle, embedded below (may take some time to load). And yes, I also noticed some encoding problems on the labels.

Building your own Lucene Scorer

Fernando Brito — Fri, 26 Oct 2012 00:45:40 +0000

This post is about Apache Lucene, which is a “high-performance, full-featured text search engine library written entirely in Java”. If you have no idea on what I am talking about, this tutorial is not for you :). Be advised that this is my first month using Lucene, so there is still a chance that everything I say here is just plain wrong :P. Also, I am currently using version 3.6.1.

Doing an assignment from my Information Retrieval class I was faced with the problem of creating my own Scorer class on Lucene. When you create a new IndexSearcher, by default Lucene uses DefaultSimilarity, which is actually cosine similarity (in a Vector Space Model) with different weights such as boosts given when indexing, boosts given in the query, tf*idf and document length norm. A description on how it works exactly can be found on Similarity class documentation and on Lucene Score documentation.

The guys from Lucene have put a lot of effort into finding a good similarity function and their DefaultSimilarity works quite well on most of the cases. However, for one reason or another you may still want to use your own custom function.

When searching on how to do this, all results I found were about customizing the previous similarity function by extending the class and overriding its methods to change (or “disable”) some of the weights. Such examples can be found on LuceneTutorial.com and on blog posts and they all work like this:

IndexSearcher searcher = new IndexSearcher(reader);

searcher.setSimilarity(new DefaultSimilarity() {
    @Override
    public float computeNorm(String field, FieldInvertState state) {
        return 1.0f;
    }
} );

However, I wanted to implement my own simple similarity function totally from scratch. It was not clear to me which single method my class had to implement/override and how I should add it to Lucene’s workflow.

After spending some hours looking on the internet, I finally found a solution in the book “Lucene in Action”, by Erik Hatcher. Do not ask me where I found an online copy of the book.

public static class MyOwnScoreQuery extends CustomScoreQuery {
    private Query query;

    public MyOwnScoreQuery(Query query) {
        super(query);
        this.query = query;
    }

    @Override
    public CustomScoreProvider getCustomScoreProvider(final IndexReader reader) {
        return new CustomScoreProvider(reader) {
            @Override
            public float customScore(int doc,
                    float subQueryScore,
                    float valSrcScore) throws IOException {

                // Insert your math here

                return 1f;
            }
        };
    }
}

Looking at it right now, it does not seem very complicated. But arriving at this only by reading the documentation was impossible for me. As a proof-of-concept I implemented a score function that is just the sum of the frequencies of the query terms in the document.

@Override
public CustomScoreProvider getCustomScoreProvider(final IndexReader reader) {
    return new CustomScoreProvider(reader) {
        @Override
        public float customScore(int doc,
                float subQueryScore,
                float valSrcScore) throws IOException {
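            // Term vectors must have been stored for the "contents" field at index time
            TermFreqVector freqVector = reader.getTermFreqVector(doc, "contents");
            int[] freqs = freqVector.getTermFrequencies();

            // Collect the terms of the original query
            Set<Term> terms = new HashSet<Term>();
            query.extractTerms(terms);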

            int total = 0;

            for (Term term : terms) {
                int index = freqVector.indexOf(term.text());

                if (index != -1) {
                    total += freqs[index];
                }
            }

            return total;
        }
    };
}

Note that I am still checking the index for -1 (term not found) because by default QueryParser uses the OR operator, so there is no guarantee that all terms from the query are going to be present in all of our retrieved documents.

Now that you have your own brand-new Scorer class, it is time to use it! Here is a small example of how you can use your new class in your program.

IndexReader reader = IndexReader.open(FSDirectory.open(new File(INDEX_PATH)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

QueryParser parser = new QueryParser(Version.LUCENE_36, FIELD, analyzer);

Query query = parser.parse("searching something");

CustomScoreQuery customQuery = new MyOwnScoreQuery(query);

ScoreDoc[] hits = searcher.search(customQuery.createWeight(searcher), null, numHits).scoreDocs;

for (int i = 0; i < hits.length; i++) {
   // iterating over the results
   // hits[i].doc gives you the doc
}

As also mentioned in the book, you can use your CustomScoreQuery to apply boosting depending on some rules. This can be done by having queries in a chain:

QueryParser parser = new QueryParser(Version.LUCENE_36, FIELD, analyzer);
Query q = parser.parse("original query");
Query q2 = new MyOwnScoreQuery(q, ..., ...);

The score computed by the previous query in the chain is available through the float subQueryScore parameter of the customScore method that we override, as described in its documentation.

That’s it! As it took me really long to figure this out I thought it would be a nice idea to share it. Please let me know if this was useful to you or if you find any mistakes on what I have said.

PS: Interesting presentation about Lucene that I found while writing this post: http://www.slideshare.net/nitin_stephens/lucene-basics

Zen To Done and me – Part 3

Fernando Brito — Sat, 23 Jun 2012 14:42:40 +0000

This is the third (and last) part of my review on the Zen to Done (ZTD) productivity system, where I write about what it means to me and how I think I can make the most out of it. Expect to find a mix of both the original ZTD habits and my experience applying them. You should read part one and part two before proceeding.

7. Review

On each week you should do a Weekly Review. This is a good time to reflect a little bit and to go over your weekly, monthly and yearly goals and see if you made any progress on them.

The author also asks you to write down your life goals, but I had a hard time trying to figure out what exactly my life goals are. If picking yearly goals is already difficult, think about life goals. I ended up with a rather abstract list that has things like read more, do more physical exercise, spend less time working, practice meditation and so on. A more appropriate name for my list would be Guidelines for Living. However, writing them down and reviewing them every week seems like a great idea!

After reviewing your goals, take some time to organize all the notes you took during the week, to review your calendar, to review your lists and finally, to set the Big Rocks and the Most Important Tasks for each day of the next week. You can find on page 80 a small checklist to help you in your Weekly Review.

Personally, I find this habit very nice. After working and studying during the whole week, spending a small amount of time to review and organize all my tasks feels great. I am trying to wake up early on Saturdays, take a walk on the beach, take a shower, put on some nice music and do my Weekly Review.

8. Simplify

Always try to simplify and minimize your to-do lists. Cross out old tasks, delegate some of them when it is possible and keep only what is essential.

Learn to say no in order to reduce your commitments and remember that you can’t do everything you want at the same time. End your current projects before entering in new ones.

I have to admit that saying no is very hard for me, but it is something that I am getting better at. It is better to say no at the very beginning than to be unable to honor your commitments later.

9. Set Routines

This habit is about writing down small tasks that you have to do every day or week, like washing the dishes, taking the dog out for a walk, going to the gym, doing your Weekly Review, tidying your room, etc. Try to have a routine list for your work, for your studies and for your personal life.

In my case, the improvement comes from writing down the tasks. Everything that frees my mind is welcome.

10. Find Your Passion

Do what you love. Period. This video may inspire you (All Work and All Play).

If you study something that you dislike or have a job that you hate, it’s time to rethink. Working (and studying) in the field that I love has always seemed very natural to me, so I have not much to add beside what is written in the ZTD book.

Conclusion

That’s it. In the ZTD book you can still find 3 more chapters: A Day with Zen to Done, the ZTD FAQ and Resources. I recommend reading them.

I’ve written this series to share my experience with the ZTD methodology, so other people get to know it, and to build a personal “long-term memory”, so when I forget something I can come back to my posts instead of having to read the whole book again. Moreover, writing down these habits and sharing them motivates me even more to follow them.

Hopefully this can be as useful to other people as it was to me. Feel free to share your productivity tips in the comments.

Zen To Done and me – Part 2

Fernando Brito — Thu, 07 Jun 2012 13:28:01 +0000

This is the second part of my review on the Zen to Done (ZTD) productivity system, where I write about what it means to me and how I think I can make the most out of it. Expect to find a mix of both the original ZTD habits and my experience applying them. You should read part one before proceeding.

4. Do

This is where the magic happens. After all, all those tasks on your to-do list aren’t going anywhere by themselves.

ZTD gives you some good old advice, like eliminating all distractions and interruptions and rewarding yourself. You should check it out on page 37.

Personally, the most difficult stages for me are starting and staying focused. Sometimes I had short tasks where I spent more time between deciding to do the task and actually starting it than on the task itself. This is a bad habit that I am still trying to overcome. Realizing how much time I had been spending on this sort of micro procrastination was my first step. Looking at the positive aspects of a task may also help you get started.

To minimize distractions, I decided to turn off as many notification systems as possible, like the ones on my Twitter client (a Google Chrome extension). If you are working on a team and really have to keep an instant messaging client open (GTalk, IRC…), avoid using your personal accounts for it, as you don’t want to get interrupted by friends and family. I also used to keep my GMail permanently open as a pinned tab on Chrome, but its flashing started to distract me.

Another important tip is, as a friend told me, keeping the context. If you are working and someone comes to talk with you, write down what the person said and go back to your task as soon as possible. If you notice that the interruption is going to last more than a few minutes, write down instead what exactly you had been doing on your task. Not losing the context can save you some time.

You surely already know that you shouldn’t be looking at your Facebook or Twitter account while you work, but you insist on doing it. Those tips are already commonplace. However, following them is sometimes quite hard and demands some effort. Police yourself and try to track your distractions.

5. Simple, Trusted System

All in all, this habit is about having a very simple to-do list and calendar and making sure that you check them every day. You may also want to split your list into categories like work, personal and calls to make, so when you want to pick a new task you don’t get overwhelmed by looking at a huge list.

The ZTD book makes it clear that its focus is on the doing, and not on the tools, but it still gives you some nice recommendations on specific tools and software.

Personally, I have been using Remember the Milk, Google Calendar and a small notebook that I always carry with me. This is more a matter of personal taste, so make sure that you try different systems in order to find the ones that fit you best.

6. Organize

Stepping away a little from to-do lists and tasks, this habit is about being organized in your daily life as a whole. You can see it as a process that has rules that must be followed in order to work.

One of the biggest signs that you need help is when you notice that all the flat surfaces in your room/office/house are piled with random papers and objects.

A general rule of thumb is “a place for everything, and everything in its place”. Your goal is to be able to tell right now where all your things are and to have a system to deal with new objects. Do you have a place for your car keys? For your dirty clothes? For your documents (like your passport)? You should.

I don’t have much to tell besides what is written in the ZTD book. Getting organized always seemed very natural to me, but I still keep realizing how awesome it can be, for example when I look for something that I haven’t used for years and it is exactly where it should be.

What’s next

In the next part of this post I will write about the last 4 habits: Review, Simplify, Set Routines and Find Your Passion. You can read it here.

Zen To Done and me – Part 1

Fernando Brito — Sun, 03 Jun 2012 17:15:54 +0000

This week I read a post called 6 blog tips for busy academics by Matt Might. It is interesting how I am not even subscribed to his RSS feed but I end up visiting his blog every week or so, mainly by clicking on links posted on my Twitter timeline. One of my favorite articles there is What every CS major should know.

I can pretty much say that his blog tips inspired me to start a new blog, particularly when he recommends that we “Blog as long-term memory”.

Several weeks ago I found the Zen to Done (ZTD) productivity system. As you can see in the link, the book has 80+ pages, but the pages are small, the font is big and the text is really easy to read. Although I have read most of the book, I wanted to set practical goals and checkpoints to help me measure what I have successfully incorporated into my routine so far and what I still want to try out.

So, this post is a personal “long-term memory” review of what ZTD means to me and how I think I can make the most out of it. The following text is a mix of both the original ZTD habits and my experience applying them. This post is going to be written in 3 parts (part 2, part 3).

Introduction and Why ZTD

Zen To Done is a set of 10 habits to help you get organized and to simplify your life. It started as a mix of Getting Things Done (GTD) and the 7 Habits of Highly Effective People, but with simplicity in mind and a focus on doing, instead of on the system.

You should follow only the habits that make you feel comfortable, and you should pick only one or two at a time.

1. Collect

One of the most important habits to me. When you have an idea or a new task to do, you should write things down immediately. When I say immediately, I really mean it. Our brain is already overwhelmed with lots of things, and it can trick us sometimes. When you get back to your home or office, you should transfer your notes to your master to-do list.

This habit is also about centralizing your inboxes and collecting them together (letters, papers from work, university and so on), so the next habit (processing) can take place.

It may seem too simple or stupid to always carry a small notebook with you, but I have already seen the benefits of doing so. One protip that I can give you is: right before going to bed, write down the tasks that are bothering you (“tomorrow I must not forget to do this, and this and that”). This will help you clear your head (really important) and won’t let you forget those tasks.

2. Process

So now that you have a single pile of papers on your table, some unread emails in your inbox and some notes and tasks in a small notebook in your pocket, it’s time to process them. It is important to not let those piles overflow.

As the author says, “in all cases, don’t leave the item in your inbox”. Make quick and immediate decisions. If an item in your inbox requires you to do something, first check if you are the person who should be doing this. If not, delegate it to the right person as soon as possible. If you are, and the task will take 5 minutes or less, do it immediately. If the task demands more time, add it to your to-do list and remove the item from your inbox.

From my personal experience, using your email inbox as a to-do list is not a very good idea. Your goal is to have your inboxes always empty. I used to read an email and then mark it as unread to remind me to do the tasks that the email required later (even if the task only required a few minutes), but this didn’t work very well for me, probably because I was mixing new unread emails with emails that I had already read but marked as unread.

In one sentence: when you read an email, delegate it, do it right away or create a task on your master to-do list in order to remove this email from your inbox.
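
For those who prefer reading code, the decision rule above can be sketched as a tiny function. This is only my own illustration, not something from the ZTD book: the names `InboxItem`, `is_mine`, `estimated_minutes` and `process` are made up for the example, and the 5-minute threshold is the one mentioned in the text.

```python
from dataclasses import dataclass

# Threshold taken from the text: tasks of 5 minutes or less are done right away.
QUICK_TASK_MINUTES = 5


@dataclass
class InboxItem:
    """A hypothetical inbox item (email, paper, note) that requires an action."""
    description: str
    is_mine: bool           # am I the person who should be doing this?
    estimated_minutes: int  # rough guess of how long the task will take


def process(item: InboxItem) -> str:
    """Return the decision for one item; in every case it leaves the inbox."""
    if not item.is_mine:
        return "delegate"
    if item.estimated_minutes <= QUICK_TASK_MINUTES:
        return "do it now"
    return "add to to-do list"


# Example usage with made-up items.
print(process(InboxItem("Reply to a short email", True, 3)))
print(process(InboxItem("Write the project report", True, 120)))
print(process(InboxItem("Fix the lab printer", False, 30)))
```

Note that no branch returns "leave it in the inbox", which is the whole point of the habit.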

3. Plan

After collecting papers and emails, processing your inboxes and converting the things you have to do to actual tasks on your to-do list, it’s time to choose where to start.

Here the author recommends that, at the beginning of the week, you pick the 4-6 most important tasks (named Big Rocks, from the 7 Habits of Highly Effective People) that you want to accomplish this week. Don’t forget to include at least one task that will help you achieve your yearly goals (more on that later).

Each morning, choose the Most Important Tasks (MITs) of the day (this could include a Big Rock task) and get them done as early as possible, so other tasks don’t pop up before you are able to work on them. When you are done with your MITs, go to your to-do list and check what you can do next.

Personally, funny names aside, simply choosing the most important tasks and writing them on a piece of paper that I carry in my pocket throughout the day has helped me a lot. Sometimes I had simple tasks, like getting a document somewhere on my university campus, that I kept putting off until the day I made them a Most Important Task of the day.

Make sure to include your important tasks there and not to work on anything else before accomplishing them. Reviewing this list with your completed tasks at the end of the day makes you feel nice.

What’s next

In the next part of this post I write about the next 3 habits: Do, Simple Trusted System and Organize. You can read it here.