<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
        <title>Sahil Badyal</title>
        <description>Sahil Badyal</description>
        <link>https://sahilbadyal.com</link>
        <atom:link href="https://sahilbadyal.com/rss.xml" rel="self" type="application/rss+xml" />
        <lastBuildDate>Tue, 10 Feb 2026 05:03:02 +0000</lastBuildDate>
        <pubDate>Tue, 10 Feb 2026 05:03:02 +0000</pubDate>
        <ttl>60</ttl>


        <item>
                <title>The Whats, Whys, and Hows of Reinforcement Learning.</title>
                <description>&lt;style type=&quot;text/css&quot;&gt;
  .img-responsive {
    width: 100%;
    float: center;
    padding-right: 15%;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;../../../../assets/images/Rl_agent.png&quot; alt=&quot;RL Agent&quot; class=&quot;img-responsive&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This post is for anyone who wants to understand the three big questions (WWH) of Reinforcement Learning. 
Whether you are a beginner or a machine learning veteran, chances are that you have heard of Reinforcement Learning 
and have always had an inkling to understand this lesser-known sibling of the popular machine learning paradigms, 
supervised and unsupervised ML. I will try my best to provide concise and hopefully interesting answers to these 
basic questions in this blog post.&lt;/p&gt;

&lt;h2 id=&quot;back-story---how-do-we-learn&quot;&gt;Back Story - How do we learn?&lt;/h2&gt;

&lt;p&gt;From the moment a child is born into this world, it is bombarded with a ton of information through sense organs like the eyes, skin, and nose, and the learning process starts. Our nervous system processes these sensations or stimuli and then controls our physiological system to perform tasks in the real world. During this learning process, we often make mistakes and learn from them. Once we get good at these tasks we usually get positive reinforcement like applause and appreciation (at times self-appreciation). This implicit/explicit feedback helps us continuously gauge and improve our performance at any task or challenge.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Let’s take riding a bicycle for example. Just recall the first time you rode a bicycle and your struggle to balance it. Initially, it would have felt impossible to learn, but eventually, your brain learned to control the posture, pedaling, and movements to bike effortlessly (and now you can flaunt your biking skills). Each time you fell, your brain learned how not to make the same or a similar mistake. This happens because our brain maps these sensations, a.k.a. stimuli, to appropriate actions through a trial-and-error process. Most of us can recall that the frequency of falling from the bike reduced drastically with each trial. As a matter of fact, if we dissect this process, or any learning process in general, we find a couple of essential elements involved: the environment, i.e. the bike and road (and the world around), the agent, i.e. you, and a reward or cost signal, i.e. falling or succeeding. I would argue that these three are crucial elements in any form of learning. This is also the motivation for Reinforcement Learning.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;the-what-&quot;&gt;The What ??&lt;/h2&gt;

&lt;p&gt;Reinforcement Learning (RL) is one of the major paradigms of machine learning and artificial intelligence. It comprises methods that learn behaviors to optimize the performance of an agent (e.g. a robot or an algorithm) at a task by maximizing rewards (or minimizing costs) received from the environment to reach a desired goal. This is unlike other learning paradigms like supervised and unsupervised learning, where learning depends on collected data. These behaviors, also known as policies, are defined by the controls or actions taken by the agent in various situations (system states) of the environment.&lt;/p&gt;

&lt;p&gt;There are primarily two basic elements of Reinforcement Learning, namely an &lt;strong&gt;agent&lt;/strong&gt; and an &lt;strong&gt;environment&lt;/strong&gt;. The entire domain of RL revolves around the interaction between these two elements, as we show in the illustration. These basic elements further contain some sub-elements, and together they provide a general framework for all reinforcement learning problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Environment&lt;/strong&gt; refers to the system that contains the basic building blocks of the problem: a state (the current situation), an interface for the agent to interact with (like robot arms), the manifestation of the task itself (pick up a ball), and the ability to change its state depending upon external interactions or internal processes. It is the world in which our agent resides, and it defines the boundaries or limits on the execution of actions. Mathematically, it is defined as a set of states that are affected both by some internal known/unknown process and by interactions with the agent. The objective of an agent is to use its controls or actions to reach a desired goal state in the environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent&lt;/strong&gt; refers to the actor responsible for solving the problem or achieving the desired objective. It might be a computer, an actuator, a robot, or a virtual game player.  An agent is a complex entity that has four important sub-elements namely policy, cost/reward, cost/value function, and optionally a model. We now discuss them in detail.&lt;/p&gt;

&lt;ul&gt; &lt;li style=&quot;list-style-type: none;&quot;&gt;&lt;b&gt;Policy&lt;/b&gt; refers to the mapping from a state to an action or control, and this defines the behavior of the agent. It is important to note here that the agent doesn&apos;t need to know the actual state of the environment (for example, in a partially observable Markov decision process (POMDP), the agent only has a belief state, which is a probabilistic estimate of the actual environment state). The agent estimates the state (perfect or probabilistic) of the environment as an observation through its sensors and then uses the policy learned through the RL algorithm to reach the goal. It is sufficient to say that the goal of any RL method is to learn an optimal policy. &lt;/li&gt; &lt;br /&gt;

&lt;li style=&quot;list-style-type: none;&quot;&gt; &lt;b&gt; Cost/Reward &lt;/b&gt; is an important signal required to estimate the quality of the learned policy. Generally, every state transition yields an associated reward or cost (except the final state, which has a terminal cost) that is determined by the agent based on some pre-defined metric of the problem statement. For instance, in a search and rescue scenario, the reward could be defined by the proximity to the lost person, whereas in pipeline repair the cost could be defined by the number of damaged sites and their degree of disrepair. It is also important to note that reward and cost are essentially the same signal with the sign reversed, i.e. \\(\mathbf{cost = -reward}\\). For the sake of simplicity, we will use cost in the remainder of this post. Designing the cost signal, i.e. defining the cost of a state transition, is of fundamental importance in formulating an RL problem. It could be easy for some problems and tricky for others (thinking in terms of the quantity or parameter that needs to be optimized might help). &lt;/li&gt; &lt;br /&gt;

&lt;li style=&quot;list-style-type: none;&quot;&gt;&lt;b&gt;Cost/Value Function &lt;/b&gt; maps a state to its overall quality by taking into account the future course of actions, and consequently the states that can be reached from the current state through the policy. A state might have a high immediate cost but a low value of the cost function, suggesting better suitability. The cost (also known as the stage cost) helps to gauge the immediate desirability of a state, whereas the cost function is used to determine its long-term desirability. Another noteworthy difference is that the cost is part of the RL problem formulation, whereas the cost function depends on the algorithm used to estimate it. It is hardly an overstatement that the quality of an RL method depends on the quality of its cost function estimation. &lt;/li&gt; &lt;br /&gt;

&lt;li style=&quot;list-style-type: none;&quot;&gt;&lt;b&gt;Model&lt;/b&gt; is an important but optional sub-element used for planning, which helps us predict the state transitions of an environment in simulation. While there are some problems where it may not be required, as the agent can learn through multiple trials in the real environment, in most practical applications a model is desired. Most RL problems can be formulated as Markov Decision Processes (MDPs), which I will discuss in detail in the next blog.&lt;/li&gt; &lt;/ul&gt;

&lt;p&gt;The agent enters the system and observes the so-called initial state, then uses a policy to reach the &lt;strong&gt;desired goal state&lt;/strong&gt; in the environment. Once the desired state is reached, we terminate the execution of the agent and call this run an &lt;strong&gt;episode&lt;/strong&gt;.&lt;/p&gt;
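&lt;p&gt;To make the agent–environment loop concrete, here is a minimal sketch in Python. Everything in it (the one-dimensional environment, its states, the cost of 1 per step, the hard-coded policy) is invented purely for illustration; a real RL method would learn the policy instead of hard-coding it.&lt;/p&gt;

```python
# A toy episode: an agent on a line of states 0..4 walks to the goal state.
# The environment, costs, and policy are all made up for illustration.

GOAL = 4

def step(state, action):
    """Hypothetical environment: clamp the move, charge cost 1 until the goal."""
    next_state = min(max(state + action, 0), GOAL)
    cost = 0 if next_state == GOAL else 1   # cost = -reward
    return next_state, cost

def policy(state):
    """A trivial hard-coded policy: always step toward the goal."""
    return 1

def run_episode(state=0):
    """One episode: act until the goal state is reached; return total cost."""
    total_cost = 0
    while state != GOAL:
        action = policy(state)
        state, cost = step(state, action)
        total_cost += cost
    return total_cost
```

&lt;p&gt;Starting from state 0, this episode terminates after four steps with a total cost of 3; an RL algorithm would adjust the policy to minimize exactly this accumulated cost.&lt;/p&gt;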

&lt;h2 id=&quot;the-why-&quot;&gt;The Why ??&lt;/h2&gt;

&lt;p&gt;I feel that this is an important question, considering that there are many learning methods available today. Especially with the tremendous success of supervised and semi-supervised machine learning in the last decade, driven by Deep Neural Networks, fast computing machines, and GPUs, this question has become more relevant. But the answer, in my opinion, is very straightforward: supervision is costly and unnatural (but still necessary and relevant). Let me elaborate using the following points.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Annotation is costly&lt;/strong&gt; Most supervised ML algorithms rely on annotated datasets with explicit supervised target signals. More often than not, these efforts require human expertise, and hence the creation of such datasets is costly. Simply put, these algorithms are successful only where there is a straightforward mapping between input and output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Annotation is unnatural&lt;/strong&gt; This might be controversial to some, but I believe that learning mappings explicitly is a form of unnatural learning, in the sense that we do not specify the objectives or tasks for such (supervised and unsupervised) algorithms, nor do we have an environment to interact with (one could argue that minimizing a loss is an objective). But I do want to acknowledge that this may be a sufficient form of learning; we do not need to imitate the bird to build a mechanical bird (the airplane). Reinforcement learning, on the other hand, is a more natural way of learning, i.e. trial and error in an environment. I think that RL is closer to the concept of General AI than its supervised counterparts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Annotation is limited in scope&lt;/strong&gt; This might be the most controversial of all but I believe that algorithms based on annotated systems cannot reach general AI (even in theory). But they are still necessary and invaluable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RL is responsive by design&lt;/strong&gt; i.e. the same algorithm can adapt to a new environment and to changes in input using the cost signal, through the process of failing and learning. Its supervised counterparts might require creating a new dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RL is practical in application&lt;/strong&gt; i.e. we do not need to learn mappings that are statistically rare. If a set of states is very rare, the agent will likely never encounter it and hence does not need to learn it.&lt;/p&gt;

&lt;p&gt;That being said, there are also practical issues with RL, like modeling the environment, the reset problem, defining objectives and costs, etc. Again, I do not want to undermine the importance of supervised learning, and I believe that success in supervised methods is crucial to the eventual success of RL and, in turn, General AI.&lt;/p&gt;

&lt;h2 id=&quot;the-how-&quot;&gt;The How ??&lt;/h2&gt;

&lt;p&gt;Finally, let’s get to the question of how to get started in RL. I created this list of resources to help you learn the basics and get you off the ground.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;http://incompleteideas.net/book/RLbook2020.pdf&quot; target=&quot;_blank&quot;&gt;Reinforcement Learning: An Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.mit.edu/~dimitrib/RLbook.html&quot; target=&quot;_blank&quot;&gt;Reinforcement Learning and Optimal Control&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://sites.ualberta.ca/~szepesva/rlbook.html&quot; target=&quot;_blank&quot;&gt;Algorithms for Reinforcement Learning&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://spinningup.openai.com/en/latest/user/introduction.html#what-this-is&quot; target=&quot;_blank&quot;&gt;Spinning Up in Deep Reinforcement Learning&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;P.S. I will also be posting a simple quick start guide to RL in my next post.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;This post is derived from my &lt;a href=&quot;https://sahilbprojects.s3.ap-southeast-1.amazonaws.com/my-blog/MS_Thesis.pdf&quot; target=&quot;_blank&quot;&gt;MS thesis&lt;/a&gt; at ASU. I would highly appreciate you going through it and providing feedback.&lt;/p&gt;
</description>
                <link>https://sahilbadyal.com/technology/2022/05/07/the-whats-whys-and-hows-of-reinforcement-learning</link>
                <guid>https://sahilbadyal.com/technology/2022/05/07/the-whats-whys-and-hows-of-reinforcement-learning</guid>
                <pubDate>Sat, 07 May 2022 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>Tensorflow Eager vs Pytorch - A systems comparison</title>
                <description>
&lt;style type=&quot;text/css&quot;&gt;
  .img-responsive {
    width: 100%;
    float: center;
    padding-right: 15%;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;../../../../assets/images/TF_vs_Pytorch.jpg&quot; alt=&quot;TF Eager VS Pytorch&quot; class=&quot;img-responsive&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Deep Learning has changed how we look at Artificial Intelligence. What was once studied by a few researchers within the four walls of university AI labs has now become commonplace and ubiquitous in the software industry. Most people attribute this feat to Moore’s Law and the increase in memory on our computing devices, along with the development of GPUs, but there is another significant yet underrated and under-discussed factor with equal if not more impact on the adoption and popularity of deep learning: the deep learning frameworks. You might know them as Tensorflow, PyTorch, Caffe, etc., but whatever your favorite is, I bet you know how important these auto-differentiation systems are today for research work in industry and academia.&lt;/p&gt;

&lt;p&gt;Today Tensorflow and PyTorch are the most widely used, and they have found their niches in industry and academia respectively. But why is that? To answer this, I recently tried to compare these frameworks with respect to their system architecture and design philosophy. So, this article is not about which one performs better (or will train your model 10 seconds faster than the other); it is about why each is better than the other in some aspects. Tensorflow’s latest version, Tensorflow Eager (2.0), supports eager execution and hence is closer to its counterpart than ever.&lt;/p&gt;

&lt;p&gt;Here is why this comparison is important more than ever:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Both are highly popular systems which have been successful in penetrating the market of domain-specific languages (DSLs) for differentiable programs through their features.&lt;/li&gt;
  &lt;li&gt;As mentioned above, both are closer than ever, as they now support an imperative style of programming.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Along with these two reasons, their similarity now also brings their differences and performance comparisons to light. The predilection of the research community towards PyTorch can now be understood and explained in a better way.&lt;/p&gt;

&lt;h2 id=&quot;tensorflow-eager&quot;&gt;Tensorflow Eager&lt;/h2&gt;

&lt;p&gt;Tensorflow Eager is an upgraded version of Tensorflow [4] that supports eager execution, which was not available in its predecessor. The reason for its development is rooted in the wide popularity of PyTorch in the research community for rapid prototyping, which Tensorflow lacked. Tensorflow Eager works as an imperative front-end to Tensorflow that executes operations immediately, and it also includes a JIT (just-in-time) tracer that translates Python functions into dataflow graphs. It is available as an opt-in extension to Tensorflow.&lt;/p&gt;
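&lt;p&gt;To see what “staging imperative code as a dataflow graph” means, here is a toy sketch in plain Python. It is in no way Tensorflow’s actual tracer (which handles control flow, tensors, devices, and much more); it only illustrates the core trick: run the Python function once on symbolic placeholders, record the operations it performs, and later execute the recorded graph on concrete values.&lt;/p&gt;

```python
# Toy "tracing JIT" sketch: stage a Python function into a tiny op graph.
# This is an invented illustration, not Tensorflow's real mechanism.

class Node:
    """A node in the traced graph: an op name plus its input nodes."""
    def __init__(self, op, inputs):
        self.op, self.inputs = op, inputs
    def __add__(self, other):
        return Node("add", [self, other])
    def __mul__(self, other):
        return Node("mul", [self, other])

def trace(fn, n_args):
    """Run fn once on placeholders; the returned Node is the dataflow graph."""
    placeholders = [Node("input", [i]) for i in range(n_args)]
    return fn(*placeholders)

def run(node, args):
    """Interpret the recorded graph on concrete argument values."""
    if node.op == "input":
        return args[node.inputs[0]]
    vals = [run(inp, args) for inp in node.inputs]
    return vals[0] + vals[1] if node.op == "add" else vals[0] * vals[1]

graph = trace(lambda x, y: x * y + x, 2)   # traced once...
result = run(graph, [3, 4])                # ...executed later: 3*4 + 3 = 15
```

&lt;p&gt;The payoff of staging is that the recorded graph, unlike the Python function, can be optimized, serialized, and placed on devices, which is exactly the property the declarative Tensorflow style always had.&lt;/p&gt;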

&lt;h3 id=&quot;pros&quot;&gt;Pros&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Imperative and Pythonic programming, along with support for the earlier declarative style.&lt;/li&gt;
  &lt;li&gt;Since it still has all the features of Tensorflow, it supports a distributed training architecture with the ability to control device placements, etc.&lt;/li&gt;
  &lt;li&gt;It has an excellent abstraction layer for the underlying hardware like CPU, GPU, TPU, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;cons&quot;&gt;Cons&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Since it is an extension of its predecessor, it still adds the cost of learning the declarative style of Tensorflow, or at the very least being familiar with it.&lt;/li&gt;
  &lt;li&gt;The paper shows that Tensorflow Eager performs on par with Tensorflow, which means its throughput is no better than Tensorflow’s.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pytorch&quot;&gt;PyTorch&lt;/h2&gt;

&lt;p&gt;PyTorch is one of the most successful DSLs in the research community because of its imperative design and a high-performing C++ back-end. Its design philosophy is centered around enabling experimentation and research, though sometimes at the cost of some performance.&lt;/p&gt;

&lt;h3 id=&quot;pros-1&quot;&gt;Pros&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Imperative and Pythonic programming.&lt;/li&gt;
  &lt;li&gt;Performance as good as (if not better than) the best DSLs around; it outperforms Tensorflow on throughput.&lt;/li&gt;
  &lt;li&gt;Simplicity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;cons-1&quot;&gt;Cons&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Since it is centered around research and rapid prototyping, its deployment and inference model is not as good as Tensorflow’s. This means it is not the first choice for industry applications.&lt;/li&gt;
  &lt;li&gt;The garbage collector is designed to work only with languages that utilize a reference-counting mechanism, like Swift, Python, and C++, but not scripting languages like Lua.&lt;/li&gt;
  &lt;li&gt;There is a one-pool-per-stream design assumption which can break if there are multiple streams of tensors (which, according to the authors, never happens).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;similarities-in-the-frameworks&quot;&gt;Similarities in the frameworks&lt;/h2&gt;

&lt;h3 id=&quot;design-principles&quot;&gt;Design Principles&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Both champion a Pythonic design.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;features&quot;&gt;Features&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Auto differentiation&lt;/li&gt;
  &lt;li&gt;Imperative and Pythonic programming&lt;/li&gt;
  &lt;li&gt;Both provide parallel execution. PyTorch relies on the multiprocessing module of Python, whereas Tensorflow inherently supports parallelism due to its dataflow graph design.&lt;/li&gt;
  &lt;li&gt;Both now use simple control flow, i.e. while loops and if statements are now simpler.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;differences-in-the-frameworks&quot;&gt;Differences in the frameworks&lt;/h2&gt;

&lt;h3 id=&quot;design-principles-1&quot;&gt;Design Principles&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;PyTorch believes in the philosophy of “Worse is better”, whereas Tensorflow Eager’s design principle is to stage imperative code as dataflow graphs.&lt;/li&gt;
  &lt;li&gt;PyTorch is designed with the research community in mind, whereas Tensorflow Eager still focuses on industrial applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;features-1&quot;&gt;Features&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The design of the control and data flow implementation is very different for the two frameworks. PyTorch uses Python for control flow and its C++ counterpart LibTorch for data flow.&lt;/li&gt;
  &lt;li&gt;Tensorflow Eager can still use staged dataflow-graph-based computation, whereas PyTorch has no such feature.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;learnings&quot;&gt;Learnings&lt;/h2&gt;

&lt;p&gt;Reading these papers, I had a few learnings which I would like to share:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Design principles have a huge impact on the final product and its market (niche, share, etc.).&lt;/li&gt;
  &lt;li&gt;Usability and performance, although part of a tradeoff, can both be increased with some good design and engineering.&lt;/li&gt;
  &lt;li&gt;The declarative style of programming seems to be taking a back seat; as is evident with Tensorflow, most programmers just want to use the languages they are most comfortable in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the end, I want to conclude that both these frameworks are excellent examples of engineering, and it will be interesting to see the future trends in their market share. I hope this has been a useful read for you. Please leave your feedback in the comment section below.&lt;/p&gt;

&lt;p&gt;P.S.&lt;/p&gt;

&lt;p&gt;Here are a few links which I found useful:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://arxiv.org/abs/1903.01855&quot;&gt;Tensorflow Eager Paper&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://papers.nips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library&quot;&gt;PyTorch Paper&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tensorflow.org&quot;&gt;Tensorflow&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://pytorch.org/&quot;&gt;PyTorch&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://dl.acm.org/doi/10.1145/242224.242477&quot;&gt;Domain Specific Languages&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
</description>
                <link>https://sahilbadyal.com/technology/2020/02/10/Tensorflow-vs-PyTorch-A-systems-comparison</link>
                <guid>https://sahilbadyal.com/technology/2020/02/10/Tensorflow-vs-PyTorch-A-systems-comparison</guid>
                <pubDate>Mon, 10 Feb 2020 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>Distributed Tensorflow using AWS S3 and EMR</title>
                <description>
&lt;style type=&quot;text/css&quot;&gt;
  .img-responsive {
    width: 100%;
    float: center;
    padding-right: 15%;
  }
&lt;/style&gt;

&lt;p&gt;In this data age, there is no doubt that extracting information and valuable insights has become one of the most crucial tasks. Not only companies like Google, Facebook, and Microsoft, who are leading the research in deep learning and artificial intelligence, but virtually every tech startup today has an ML pipeline set up for easing business processes, getting insights into customer behavior, improving customer experience, etc. Huge data, combined with the power of GPUs, has taken deep learning a step closer to achieving human-like performance on some tasks like image classification, speech recognition, and self-driving cars. But since the power of hardware on a single machine is limited (and costly as well), going distributed seems the only way ahead. The majority of ML frameworks already provide distributed support, but when it comes to scalability and production-ready scenarios, distributed Tensorflow is probably the one in the lead. Cloud services like Amazon AWS and Google GCP support running a distributed Tensorflow cluster, which undoubtedly has made things quite easy and standard. So, if you are dealing with the problem of horizontally scaling your ML training pipeline, you may continue reading this post, and my learnings will hopefully prove useful.&lt;/p&gt;

&lt;p&gt;First and foremost, you will require a cluster of machines for running distributed training (I would recommend AWS EMR for its fair pricing on spot instances). Once you have the cluster up and running with Tensorflow version 1.12 or higher, things are pretty easy to implement. Tensorflow has eased the process of running distributed training if you use the TF Estimator API. The Estimator API abstracts away all the low-level session-related nuances, so you can focus only on your model architecture and training logic. The only things left to make it distributed are to set up an environment variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; and to set up the data pipeline using the tf.data API. Before configuring the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; environment variable, let’s first understand some important TF distributed terms:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Worker:&lt;/strong&gt; Workers are the machines which store stateless nodes and perform compute-intensive operations using local CPUs/GPUs.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Chief:&lt;/strong&gt; The chief is the master server of the distributed Tensorflow architecture; it coordinates the distributed training strategy while also acting as a worker.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Parameter Server (ps):&lt;/strong&gt; This server stores all the required variables. Workers interact with the PS, and the network bandwidth between them is an important parameter in deciding the number of parameter servers to use.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Cluster Spec:&lt;/strong&gt; This is the specification which tells the master which node is assuming which role in the cluster. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; contains the cluster spec under the “cluster” key.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Evaluator:&lt;/strong&gt; The evaluator is not part of the cluster spec; in Tensorflow it is a separate task that needs to be assigned to its own machine.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Job/Task:&lt;/strong&gt; A task in Tensorflow can be of the chief, worker, ps, or evaluator job type. Each task has an index.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Index:&lt;/strong&gt; The index identifies the machine with respect to the cluster. For example, Worker1 has index 0 and Worker2 has index 1. The same goes for PSs and evaluators.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Client:&lt;/strong&gt; The client is a program which generates a tf.Graph and calls tf.Session.&lt;/li&gt;
&lt;/ol&gt;
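&lt;p&gt;Since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; is just JSON, each process can read it to discover its own role and address. Here is a small sketch of that lookup; the IP addresses below are invented placeholders, not the ones from my cluster.&lt;/p&gt;

```python
# Sketch: how a process can read TF_CONFIG to find its role and address.
# The cluster spec below uses invented placeholder addresses.
import json
import os

os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {
        "chief":  ["10.0.0.1:2222"],
        "worker": ["10.0.0.2:2222", "10.0.0.3:2222"],
        "ps":     ["10.0.0.4:2222"],
    },
    "task": {"type": "worker", "index": 1},
})

conf = json.loads(os.environ["TF_CONFIG"])
cluster, task = conf["cluster"], conf["task"]
my_address = cluster[task["type"]][task["index"]]   # this process's own address
```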

&lt;p&gt;Typical TF architecture looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../../../../assets/images/tf_arch.svg&quot; alt=&quot;tensorflow architecture (src:Tensorflow)&quot; class=&quot;img-responsive&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Src:Official Tensorflow Documentation&lt;/p&gt;

&lt;p&gt;More information about architecture &lt;a href=&quot;https://www.tensorflow.org/guide/extend/architecture&quot;&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;distributed-traning-in-tensorflow&quot;&gt;Distributed Training in Tensorflow:&lt;/h3&gt;
&lt;p&gt;Tensorflow exploits data parallelism through graph replication.&lt;/p&gt;

&lt;h4 id=&quot;types-of-replication&quot;&gt;Types of Replication:&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;In-graph Replication:&lt;/strong&gt;
 A single client (usually on the master server) builds the tf.graph and coordinates with the ps and workers.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Between-Graph Replication:&lt;/strong&gt;
 Each worker has a client and a similar tf.graph of its own, and uses the parameter server to store and fetch variables. This is the default replication type in Tensorflow.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4 id=&quot;types-of-training&quot;&gt;Types of training:&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Synchronous Training:&lt;/strong&gt;
 In this type of training, each client reads the same variables from the ps, applies its computations, and then synchronously writes the updates back to the ps. This is compatible with both replication types.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Asynchronous Training:&lt;/strong&gt;
  Each client runs a training loop independently and updates the parameters in the ps. This is also compatible with both replication types, and it is the default training type.&lt;/li&gt;
&lt;/ol&gt;
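&lt;p&gt;The difference between the two schemes can be sketched with plain numbers (no Tensorflow involved). In the synchronous case all workers read the same parameter value and their gradients are combined into a single update; in the asynchronous case each worker applies its update as soon as it is ready, so later workers see a parameter that earlier workers have already changed. The gradients and learning rate below are invented for illustration.&lt;/p&gt;

```python
# Deterministic toy contrast of synchronous vs asynchronous parameter updates.
# Gradients and learning rate are made-up numbers for illustration.

def synchronous_update(param, worker_grads, lr=0.1):
    """All workers see the same param; gradients are averaged into one step."""
    avg_grad = sum(worker_grads) / len(worker_grads)
    return param - lr * avg_grad

def asynchronous_updates(param, worker_grads, lr=0.1):
    """Each worker updates the shared param as it finishes, one after another."""
    for grad in worker_grads:
        param = param - lr * grad
    return param

grads = [1.0, 2.0, 3.0]
sync_param = synchronous_update(1.0, grads)    # 1.0 - 0.1 * 2.0 = 0.8
async_param = asynchronous_updates(1.0, grads) # 1.0 - 0.1 - 0.2 - 0.3 = 0.4
```

&lt;p&gt;Asynchronous training avoids waiting for the slowest worker, at the price of workers sometimes computing gradients against stale parameter values.&lt;/p&gt;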

&lt;p&gt;More information &lt;a href=&quot;https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/distributed.md&quot;&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; variable looks like this on my Master machine (in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.bashrc&lt;/code&gt;):&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;TF_CONFIG&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;{ &quot;cluster&quot;: { &quot;chief&quot;: [&quot;172.30.11.50:2222&quot;], &quot;worker&quot;: [&quot;172.30.11.219:2222&quot;,&quot;172.30.11.11:2222&quot;,&quot;172.30.11.127:2222&quot;,&quot;172.30.11.108:2222&quot;,&quot;172.30.11.195:2222&quot;,&quot;172.30.11.215:2222&quot;,&quot;172.30.11.249:2222&quot;], &quot;ps&quot;: [&quot;172.30.11.95:2222&quot;,&quot;172.30.11.149:2222&quot;] }, &quot;task&quot;: {&quot;type&quot;: &quot;chief&quot;, &quot;index&quot;: 0} }&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Remember that the environment variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt; will be different for every machine, especially the “index”, based on the role or task the machine is performing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This environment variable, with the corresponding index, needs to be set on every machine in the cluster. I used &lt;a href=&quot;https://gist.github.com/sahilbadyal/aedb1d355d78f7cfea0258d241e54306&quot;&gt;this&lt;/a&gt; simple boot script on my AWS EMR cluster. You also need to make sure your data pipeline is ready for distribution.&lt;/p&gt;
&lt;h3 id=&quot;data-pipeline&quot;&gt;Data Pipeline&lt;/h3&gt;
&lt;p&gt;The following things need to be done:&lt;/p&gt;

&lt;h4 id=&quot;1-input-data-stored-on-s3hdfsany-other-filesystem--so-that-every-machine-can-access-&quot;&gt;1. Input data stored on S3/HDFS/(any other filesystem), so that every machine can access it.&lt;/h4&gt;
&lt;h4 id=&quot;2-sharding-the-data-so-that-every-worker-gets-its-unique-subset-of-data&quot;&gt;2. Sharding the data, so that every worker gets its unique subset of data.&lt;/h4&gt;

&lt;p&gt;To shard dataset use:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;dataset = dataset.shard(TOTAL_WORKERS, WORKER_INDEX)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WORKER_INDEX&lt;/code&gt; here is not the task index in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TF_CONFIG&lt;/code&gt;, because we need to take into account that the chief is also a worker, so its index would be 0 and Worker1’s index would be 1, and so on. This is an important step, as it ensures true data parallelism. &lt;a href=&quot;https://www.tensorflow.org/guide/performance/datasets&quot;&gt;Here&lt;/a&gt; are the best practices for the data pipeline.&lt;/p&gt;
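&lt;p&gt;A tiny sketch of this index bookkeeping (my own convention for the mapping, not a Tensorflow API: the chief takes shard 0 and worker i takes shard i+1):&lt;/p&gt;

```python
# Sketch: derive dataset.shard() arguments from the task spec, counting the
# chief as an extra worker. This mapping is my convention, not a TF API.

def shard_index(task_type, task_index):
    """Chief gets shard 0; worker i gets shard i + 1."""
    if task_type == "chief":
        return 0
    if task_type == "worker":
        return task_index + 1
    raise ValueError("only the chief and workers read training shards")

def total_shards(cluster):
    """Number of data shards: all workers plus the chief."""
    return len(cluster.get("worker", [])) + len(cluster.get("chief", []))

cluster = {"chief": ["host-a:2222"], "worker": ["host-b:2222", "host-c:2222"]}
# dataset = dataset.shard(total_shards(cluster), shard_index("worker", 1))
```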
&lt;h4 id=&quot;3-implement-the-rest-of-the-data-pipeline-as-you-like-and-call-estimator-train-and-evaluate-api&quot;&gt;3. Implement the rest of the data pipeline as you like and call estimator train and evaluate API.&lt;/h4&gt;
&lt;h4 id=&quot;4-storing-the-modelresult-in-s3hdfsany-other-filesystem-accessible-from-the-cluster&quot;&gt;4. Storing the model/result in S3/HDFS/(Any other filesystem) (accessible from the cluster)&lt;/h4&gt;

&lt;p&gt;The good thing about Tensorflow is that it has a surprisingly good S3 connector, so I recommend using it. To use Tensorflow with S3, just add the following:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;In your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.bashrc&lt;/code&gt;:
    &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; &lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;AWS_REGION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;your region&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
    &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; &lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;S3_ENDPOINT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;s3.&amp;lt;your region&amp;gt;.amazonaws.com
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;Your AWS credentials in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.aws/credentials&lt;/code&gt; file&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now all you need to do is run Tensorflow on all the machines (again, I recommend using a script, as I did &lt;a href=&quot;https://gist.github.com/sahilbadyal/3990fe16712cd670e6b39460960e6377&quot;&gt;here&lt;/a&gt;) and voila! You will enter the world of distributed deep learning.&lt;/p&gt;

&lt;p&gt;To run Tensorboard on this distributed cluster, just pass it the path to the model output directory (on S3).&lt;/p&gt;

&lt;p&gt;On AWS EMR this would look like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;python3 -m tensorboard.main --logdir=s3://&amp;lt;path-to-model-output&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I hope this has been a useful read for you. Please leave your feedback in the comment section below.&lt;/p&gt;

&lt;p&gt;P.S.&lt;/p&gt;

&lt;p&gt;Here are a few links which I found useful:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=bRMGoPqsn20&quot;&gt;Google I/O 2018&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://amid.fish/assets/Distributed%20TensorFlow%20-%20A%20Gentle%20Introduction.html&quot;&gt;amid.fish (Distributed Tensorflow)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tensorflow.org&quot;&gt;Tensorflow&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/scaling-up-with-distributed-tensorflow-on-spark-afc3655d8f95&quot;&gt;Scaling up with Distributed Tensorflow on Spark&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
                <link>https://sahilbadyal.com/technology/2019/01/28/Distributed-Tensorflow-Using-AWS-S3-and-EMR</link>
                <guid>https://sahilbadyal.com/technology/2019/01/28/Distributed-Tensorflow-Using-AWS-S3-and-EMR</guid>
                <pubDate>Mon, 28 Jan 2019 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>My Notes on The Business of 21st Century</title>
                <description>
&lt;style type=&quot;text/css&quot;&gt;
  .img-responsive {
    width: 33%;
    float: right;
    padding-right: 15%;
  }
&lt;/style&gt;

&lt;p&gt;I recently got a chance to read a book by &lt;a href=&quot;https://en.wikipedia.org/wiki/Robert_Kiyosaki&quot;&gt;Robert T. Kiyosaki&lt;/a&gt;, an American businessman and author. I thought this knowledge was worth sharing, so I am publishing my notes and a brief review of the book. It talks primarily about how you can start working towards financial freedom, and champions &lt;a href=&quot;https://www.entrepreneur.com/encyclopedia/network-marketing&quot;&gt;network marketing&lt;/a&gt; as an excellent starting point due to its flexibility and low capital requirements. I hope this knowledge proves useful to you at some point in your life.&lt;/p&gt;

&lt;p&gt;The author starts with a few excerpts from his early life, wherein he talks about his rich dad (his friend’s dad), his poor dad (his own dad), and their teachings and impact on his life. He lets us understand the life he has led so far, how his adamant resolve to change his life made him who he is today, and why we as readers should read &lt;a href=&quot;https://www.amazon.in/Business-21st-Century-Robert-Kiyosaki/dp/8183222609&quot;&gt;“The Business of 21st Century”&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;part-1&quot;&gt;Part 1&lt;/h3&gt;
&lt;p&gt;The first part of the book, &lt;strong&gt;“Take control of your life”&lt;/strong&gt;, talks about how the present economic situation of the world is creating a perfect “opportunity” for people to start a business and earn their financial freedom. In doing so, he explains how the 20th-century Industrial Age has given way to the 21st-century Information Age, and how the stale concepts of job security, pensions, and social security are just not valid for this age. Citing the examples of Microsoft and Disney, he argues that in all these economic downturns there is a silver lining: a heap of opportunities waiting to be explored. Another takeaway is the author’s delineation of the four cashflow quadrants. He models the behavior of people in these quadrants and, in doing so, explains why people are reluctant to move from E/S to B/I. Finally, he talks about the entrepreneurial mindset, how it can help you achieve financial freedom, and why the key lies in harnessing passive income.&lt;br /&gt;
&lt;img src=&quot;/assets/images/cashflow-quadrant-with-arrow.png&quot; alt=&quot;cashflow quadrant&quot; class=&quot;img-responsive&quot; /&gt;&lt;/p&gt;

&lt;p&gt;E = Employee&lt;br /&gt;
S = Self-employed or Small-business owner&lt;br /&gt;
B = Business owner&lt;br /&gt;
I = Investor&lt;/p&gt;

&lt;h3 id=&quot;part-2&quot;&gt;Part 2&lt;/h3&gt;
&lt;p&gt;The second part, &lt;strong&gt;“One Business—Eight Wealth-Building Assets”&lt;/strong&gt;, is basically what this book is all about, because it is one thing to talk about concepts and theories and another to offer a practical approach to financial freedom. Here we are introduced to network marketing as a way to start. John Fleming, a friend of the author with 45 years of experience in network marketing, helps us visualize the benefits of network marketing, or multilevel direct sales. He explains how the capital requirement makes the transition to the B/I quadrant challenging, and how network marketing instead leverages your time rather than your money. He also differentiates it from sales, as it is based on duplication rather than excellence or distinction. He talks about assets that generate income, as opposed to active income (income one has to spend time working for). Finally, we get to explore eight benefits of network marketing, which he terms “assets”:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;A Real-World Business Education&lt;/strong&gt;&lt;br /&gt;
       The author and John both explain how it provides one of the most practical business educations, one that cannot be learned at any business school, including a financial education that is rarely taught elsewhere. John lists the key elements of a real-world education:
    &lt;ul&gt;
      &lt;li&gt;An attitude of success&lt;/li&gt;
      &lt;li&gt;Dressing for success&lt;/li&gt;
      &lt;li&gt;Overcoming personal fears, doubts, and lack of confidence&lt;/li&gt;
      &lt;li&gt;Overcoming the fear of rejection&lt;/li&gt;
      &lt;li&gt;Communication skills&lt;/li&gt;
      &lt;li&gt;People skills&lt;/li&gt;
      &lt;li&gt;Time-management skills&lt;/li&gt;
      &lt;li&gt;Accountability skills&lt;/li&gt;
      &lt;li&gt;Practical goal-setting&lt;/li&gt;
      &lt;li&gt;Money-management skills&lt;/li&gt;
      &lt;li&gt;Investing skills&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;A Profitable Path of Personal Development&lt;/strong&gt;&lt;br /&gt;
       Here he talks more about how it helps you develop soft skills and changes your approach to life as a whole and makes you a better person.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;A Circle of Friends Who Share Your Dreams and Values&lt;/strong&gt;&lt;br /&gt;
       Since it involves a network, the author here shows how it can help you make real connections with like-minded people, who would help you develop, as your goals are similar and not conflicting.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The Power of Your Own Network&lt;/strong&gt;&lt;br /&gt;
       The author shows us how powerful a network is because of Metcalfe’s Law:
    &lt;blockquote&gt;
      &lt;p&gt;&lt;strong&gt;v = n^2&lt;/strong&gt;&lt;br /&gt;
       A network’s value grows with the square of its size, not linearly.&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;A Duplicable, Fully Scalable Business&lt;/strong&gt;&lt;br /&gt;
       Here the author walks us through the hows and whys of network marketing. He explains that the power of duplication is the essence of a scalable business, and cites the examples of Henry Ford and Edison, who created networks through simple, duplicable designs.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Incomparable Leadership Skills&lt;/strong&gt;&lt;br /&gt;
       He also talks about leadership skills and their development through network marketing. He explains how helping and motivating others to achieve their goals creates a leader inside of you.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;A Mechanism for Genuine Wealth Creation&lt;/strong&gt;&lt;br /&gt;
       Here the author gives his mantra of wealth creation and separates the idea of wealth from that of money: wealth is measured in time, not money. Following is his four-step path to financial freedom:
    &lt;ul&gt;
      &lt;li&gt;Build a business&lt;/li&gt;
      &lt;li&gt;Reinvest in your business&lt;/li&gt;
      &lt;li&gt;Invest in real estate&lt;/li&gt;
      &lt;li&gt;Let your assets buy luxuries&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Big Dreams and the Capacity to Live Them&lt;/strong&gt;&lt;br /&gt;
       Finally, the author stresses the value of dreams and models different types of dreamers. He talks about how network marketing is an ideal path to start living a dream instead of just dreaming it, though he stresses that it is not the only one.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is also a section where his wife, Kim Kiyosaki, talks in a motivational tone about how women’s innate ability to connect with others makes them ideally suited to excel in the network marketing business.&lt;/p&gt;

&lt;h3 id=&quot;part-3&quot;&gt;Part 3&lt;/h3&gt;

&lt;p&gt;In the final section, titled &lt;strong&gt;“Your Future Starts Now”&lt;/strong&gt;, the author finally sets us free with this knowledge and a few things to check while pursuing this path (i.e. entering the network marketing business):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Who’s running the ship?&lt;/li&gt;
  &lt;li&gt;Does the company offer a proven plan of action?&lt;/li&gt;
  &lt;li&gt;Does the company embrace both business skills and personal development as a regular part of its educational and training programs?&lt;/li&gt;
  &lt;li&gt;Does the company have a strong, high-quality, and highly marketable product line that you can be passionate about?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read &lt;a href=&quot;https://ethanvanderbuilt.com/ftc-bottom-line-multi-level-marketing-plans/&quot;&gt;here&lt;/a&gt; for the things to keep in mind. The author also tells us that it takes no degree, no resignation from a regular job, and no genius, but plain honesty, to achieve real wealth. He tells us how network marketing is a democratic way of wealth creation, and that it takes only persistence and sincerity towards one’s own dream to achieve it. He hopes that readers take a small step towards their happiness, as real wealth creation is not just about money but about quality of life.&lt;/p&gt;

&lt;h3 id=&quot;my-review&quot;&gt;My Review&lt;/h3&gt;

&lt;p&gt;The book is quite engaging, and the author does not make you feel like you are reading it; it has more of a TED-talk (motivational-speech) tone to it. I personally found this book an eye-opener. I do not support or advertise network marketing, and I encourage you all to read more before making a decision. &lt;a href=&quot;https://ethanvanderbuilt.com/2014/12/08/network-marketing-not-for-21st-century/&quot;&gt;Here&lt;/a&gt; is a contradictory blog. Nevertheless, there were a lot of takeaways, but the one I feel strongly about is making an effort to start something. As kids we are provided an education to be employees (or at most freelancers/self-employed), so breaking this barrier and moving towards real wealth (not just money) can only be achieved through books, the experiences of others, and most importantly taking the first baby steps. Most of us don’t have the friends or company to tell us how to be financially free, as the rich hang out with the rich, the poor with the poor, and the middle class with the middle class.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“I do not have a rich dad to tell me the secret recipe of real wealth; that’s why I have books.”&lt;/p&gt;
&lt;/blockquote&gt;
</description>
                <link>https://sahilbadyal.com/books/review/business/2018/06/28/my-notes-on-the-business-of-21st-century</link>
                <guid>https://sahilbadyal.com/books/review/business/2018/06/28/my-notes-on-the-business-of-21st-century</guid>
                <pubDate>Thu, 28 Jun 2018 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>How we developed a Sentiment Prediction model at Ameyo</title>
                <description>
&lt;style type=&quot;text/css&quot;&gt;
  .img-responsive {
    width: 33%;
    float: right;
    padding-right: 15%;
  }
&lt;/style&gt;

&lt;p&gt;In June 2016, the ML team at Drishti Soft Solutions Pvt. Ltd, called “Singularity”, took up the challenge of building a highly accurate sentiment prediction engine to prioritize inbound emails at contact centers. Since the company’s core product, Ameyo, is contact center software with an omni-channel communication interface for channels like email, social media, and voice calls in both inbound and outbound settings, getting email data was not a huge challenge. Initial research and a comparative study of different machine learning methods were done to find the one with the highest accuracy on the task. It was finally decided that using neural networks was the way to go.&lt;/p&gt;

&lt;p&gt;I joined the team in September and was straight away assigned to the task, but I had very limited knowledge and understanding of the field. So my mentor, Bikramjit Roy, suggested a course by Stanford University titled “Natural Language Processing with Deep Learning (CS224n)”. It wouldn’t be wrong to say that this course proved immensely helpful in achieving this objective. A proof of concept using GloVe representations with an RNN had already been done by two interns before me; nevertheless, I still had to research ways to increase the accuracy and implement the final solution in a production setup.&lt;/p&gt;

&lt;p&gt;The first task was getting the email data labelled/tagged for sentiment. Everyone in this field would agree that labeling the data is one of the most important and time-consuming (and in some ways the most boring) tasks in the process. Since this task required human judgement, we first thought of using data-labeling platforms like Figure Eight, Playment, and Mechanical Turk (to name a few), but we soon realized that would unnecessarily increase the project timelines and budget. Hence we came up with the next best approach: use the power of existing predictive models for tagging. I know at first it might sound absurd, but the idea was to deploy the model, make sure it had decent accuracy, get a few clients using it, and then slowly and iteratively involve human tagging for better results. This was part of a fail-fast approach which we all agreed upon. So we used Google, Azure, and IBM Watson for tagging the emails, and added only high-confidence examples (those on which at least two of the three agreed) to our dataset. Using this technique we prepared a dataset of 50,000 high-confidence email-sentiment label pairs. The dataset was balanced, with the same number of examples in the positive and negative sentiment categories.&lt;/p&gt;
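&lt;p&gt;The “at least two of the three agree” rule is just a majority vote over the three providers’ labels; a minimal sketch (the function name is mine, not from our codebase):&lt;/p&gt;

```python
from collections import Counter

def high_confidence_label(labels):
    """Return the majority label if at least two of the three provider
    predictions agree; return None otherwise, so the example can be
    discarded."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= 2 else None
```

&lt;p&gt;For example, three predictions of (positive, positive, negative) keep the email as a positive example, while three mutually disagreeing predictions drop it.&lt;/p&gt;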

&lt;p&gt;The next challenge was cleaning the emails. Email cleaning can be tricky, as emails are generally in HTML format with content wrapped in various tags, ids, and classes, which in turn depend on the source of the email, the browser, the mailbox, etc. They also contain greetings like “Hello”, “Regards”, etc., which often do not contribute to the real sentiment of the email. We used a Java library called Jsoup, which makes HTML parsing smooth and easy. As our preprocessing and feature-extraction pipeline ran on a Java server, Jsoup proved to be an important tool: we could easily extract the email subject and body while removing all the tags. After extracting the email text, we removed the signatures and greetings using a set of words representing frequent email greetings. This completed the data cleaning process. Feature extraction simply meant looking up the GloVe representation of each word in an email. We used the 350-dimensional vectors generated from Common Crawl, which has 50 billion tokens. We limited the input layer size by taking only the first 200 words of each email, as we verified that in most negative-sentiment emails the sentiment is evident from the first 100-200 words.&lt;/p&gt;

&lt;p&gt;The network architecture was rather simple and consisted of a single layer of 64 LSTM units, followed by a fully connected layer with a single neuron at the output. Using a “tanh” nonlinearity, the output was squashed into the [-1, 1] interval to get the sentiment along with a degree measure. We implemented this network in Tensorflow, as it had been tested in production environments and scales well. I already had adequate working knowledge of Tensorflow and knew its serving architecture, but it was the first time I was implementing it with scalability and production scenarios in mind.&lt;/p&gt;
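&lt;p&gt;For illustration, the architecture described above could be sketched in today’s Keras API roughly as follows; this is my reconstruction under the stated dimensions, not the original production code:&lt;/p&gt;

```python
import tensorflow as tf

MAX_TOKENS = 200   # first 200 words of each email
EMBED_DIM = 350    # GloVe vector size used in the post

# Single LSTM layer of 64 units, then one output neuron squashed by
# tanh into [-1, 1]: the sign gives the sentiment, the magnitude a
# degree measure.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_TOKENS, EMBED_DIM)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="tanh"),
])
model.compile(optimizer="adam", loss="mse")
```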

&lt;p&gt;The training was done on the dataset split into train, cross-validation, and test sets. The model showed excellent ability to predict sentiment, with an F1 score greater than 0.85 in both the positive and negative categories on the test set. Even on the production server, new data was being tagged with an impressive 88% accuracy (of course, this accuracy was with respect to the three-way high-confidence tagging and not human judgement).&lt;/p&gt;

&lt;p&gt;This was indeed an enriching experience, but I feel a lot more can be explored in this problem to further increase the accuracy. I am keen on improving my knowledge in this domain, and I hope that reading this post gives you an idea of how to get started on this problem.&lt;/p&gt;

&lt;p&gt;Here are a few links which I found useful:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;http://web.stanford.edu/class/cs224n/syllabus.html&quot;&gt;CS224n&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/sentiment-analysis-using-rnns-lstm-60871fa6aeba&quot;&gt;Sentiment Analysis using LSTM (Towards DataScience)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tensorflow.org&quot;&gt;Tensorflow&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
                <link>https://sahilbadyal.com/technology/2018/03/26/how-we-developed-a-sentiment-prediction-model-at-ameyo</link>
                <guid>https://sahilbadyal.com/technology/2018/03/26/how-we-developed-a-sentiment-prediction-model-at-ameyo</guid>
                <pubDate>Mon, 26 Mar 2018 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>If the evolution was a deeplearning algo</title>
                <description>
&lt;p&gt;This bizarre thought came to me when I stumbled upon a post by a famous celebrity whom I follow on Instagram. The post was about the importance of a daily workout, due to the fact that the human body has evolved to walk 10 miles per day (on average, of course), and if we do not spend enough energy, our metabolism alters and we suffer the consequences we all know.&lt;/p&gt;

&lt;p&gt;But our lifestyle has changed in the last century with the advancements in transportation and communication technology. We are no longer accustomed to walking 10 miles; I highly doubt even 1-2 miles. With our bodies not genetically evolved for this change, are we moving towards the doom of homo sapiens? With technologies like VR, it’s quite possible we might never have to walk again. We could comfortably sit on our sofa and be anywhere we want, do anything we wish. Or, like the ever-evolving creatures that we are, our genes will learn and adapt to this lifestyle. It is this “learning capability” of all life forms that made me pen down this analogy. I am not the first one: historically, AI has been inspired by evolution; notably, genetic algorithms exploit this amazing phenomenon to explore a search space and find an optimal solution. But this article is the reverse analogy, mapping evolution onto deep learning.&lt;/p&gt;

&lt;p&gt;You might want to read about &lt;a href=&quot;https://en.wikipedia.org/wiki/Deep_learning&quot;&gt;deep learning&lt;/a&gt; if you are unfamiliar with it. So, let’s assume life is indeed a big deep learning model, trying to optimize a cost function. But which one? Answering this is the root of this analogy. Let’s defer it for now and move on to evolution first. Wikipedia defines evolution as “a change in the heritable characteristics of biological populations over successive generations”. Notice how the words “change”, “heritable characteristics of biological populations”, and “over successive generations” easily map to “change”, “parameters”, and “epochs/iterations”. So the very definition maps easily onto deep learning. This is a good start; time to dig deep into this analogy.&lt;/p&gt;

&lt;p&gt;Darwin in his book &lt;a href=&quot;https://en.wikipedia.org/wiki/On_the_Origin_of_Species&quot;&gt;“The origin of Species”&lt;/a&gt; talks about two main points:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;All life on Earth is connected and related to each other&lt;/li&gt;
  &lt;li&gt;Modifications of populations by natural selection, where some traits were favored in an environment over others&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What this means is that, firstly, all life forms are the product of a single algorithm, i.e. there is only one model being trained/optimized. Secondly, the differences between life forms are due to external inputs (the environment). So all present life forms are just intermediate representations in this big evolution model. Some representations are grouped under mammals, others under reptiles; the ones underwater are fishes, the skies belong to the birds, and the rest are amphibians. We are all related, as we originated from the same initial representations during our initial epochs. This deep learning model stores all its information in the genes of beings; these genes contain all the information about life, like the parameters of a model.&lt;/p&gt;

&lt;p&gt;But we cannot define an epoch until we define the output. The output is the present state of life. If this is true, then it means there are parallel epochs running. All these parallel epochs share some parameters and change the internal representation according to the inputs they are exposed to. This branching model of epochs, with some different parameters and some shared parameters, leads to changing outcomes. Otherwise, the outcome would have been static.&lt;/p&gt;

&lt;p&gt;This model does not need separate training, as it trains continuously with the birth of each instance/epoch, using and sharing the same parameters as the previous instance/epoch or instances/epochs. This core genome (our parameters) keeps getting trained/improved as the environment changes, to optimize a function we may call the survival function.&lt;/p&gt;

&lt;p&gt;The survival function has to be maximized in every epoch; the ultimate aim of evolution is to increase survival. The loss function is then the inverse of the survival function, an extinction function. As in every such algorithm, some features/representations are unable to optimize this function and hence are erased and replaced by new representations. This extinction function drives backpropagation through the network and causes the features and parameters to change and adapt.&lt;/p&gt;

&lt;p&gt;Since this entire architecture has layers of parameters, with every layer capturing more and more complex feature representations, and has a tree-like structure, it is quite apt to call it a deep learning algorithm. I am quite sure we will keep evolving and adjusting until this model is killed by an OOM. In that case, I hope we find a new machine to run this model on.&lt;/p&gt;

&lt;p&gt;In the words of Elon Musk&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“It’s OK to have your eggs in one basket as long as you control what happens to that basket.”&lt;/p&gt;
&lt;/blockquote&gt;

</description>
                <link>https://sahilbadyal.com/philosophy/technology/2018/02/24/if-the-evolution-was-a-deeplearning-algo</link>
                <guid>https://sahilbadyal.com/philosophy/technology/2018/02/24/if-the-evolution-was-a-deeplearning-algo</guid>
                <pubDate>Sat, 24 Feb 2018 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>Japanese 5-S methodology - A clearer approach to workspace management</title>
                <description>
&lt;p&gt;I had never been so excited about a new year as I was in the December of 2016. The reason was that my paper on Automated Space Layout Generation had been selected for presentation at an IEEE conference (AMIAMS-2017) at MNNIT Allahabad. The presentation was scheduled for the 4th of February, and this being my first publication, my excitement was at peak levels: it was my turn to speak, and for the first time I was a matter of interest to the great intellectuals whom I had always listened to while growing up. These are the people who have been changing the course of the human race for about 200,000 years now; the curious ones, they say: our researchers, scientists, and engineers. Little did I know that this conference would not only give me a platform to speak but also open up domains and perspectives which I had never explored.&lt;/p&gt;

&lt;p&gt;On my first day at the conference, there was a buzz about a guest lecture by Dr. CC Tan. I had no idea that I was about to experience an energy which would inspire me and open the gates of a new domain. There I was, sitting among 100-120 others, as he took the stage. I witnessed the energy and enthusiasm of his delivery. He was there to talk about workplace management: basically, how we can increase organizational productivity by applying the Japanese 5S methodology. He was jumping, running, even kicking at times, utilizing the entire stage to present such a powerful yet simple methodology.&lt;/p&gt;

&lt;p&gt;5S is a roman transliteration of the Japanese words seiri, seiton, seiso, seiketsu, and shitsuke. Since these would not make sense to people with no understanding of Japanese, let me unfold this philosophy for you. In English, the five S’s can be understood as Sort, Structurize, Sanitize, Standardize, and Self-Discipline (Sustain). Let’s understand how they can improve our management by increasing speed and productivity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Seiri (Sort):&lt;/strong&gt;
This is the first step to having a smooth workflow. We need to sort things out: remove distractions, and remove all unnecessary and unwanted workflows, items, obstacles, and processes. For example, if you are in an RnD workspace, you wouldn’t want your contact-center teams/agents seated there taking calls, as it would cause distractions. Everything must have its own clean environment (a clean workspace).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Seiton (Structurize)&lt;/strong&gt;
This step involves setting things in order, in a structure. Things must be arranged in order at specific places; this prevents time being wasted arranging and searching for things. For example, an organization should be divided into teams with common goals, seated together for efficient collaboration. Another example is a separate space for all hardware items in an organization. All these small yet important steps make the workflow easier, improving the speed and productivity of a workplace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Seiso (Sanitize)&lt;/strong&gt;
It means cleaning in the literal sense. If your workplace is not clean, you won’t be able to mentally attach yourself to the place and perform. We must continuously clean our workplace and ensure that it is pleasant to work in. An example is a daily cleaning activity at an organization: dusting, cleaning boards, laptops, machines, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Seiketsu (Standardize)&lt;/strong&gt;
This is an important step in workplace management, as it creates a benchmark of best practices in a working area. We must maintain high standards in our organization’s processes and workflows. For example, every build-test request for a feature must take no more than 2 iterations; another could be that every request to a team should be acknowledged within an hour. In short, every process must have an associated standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shitsuke (Sustain through Self-Discipline)&lt;/strong&gt;
This is one of the most important aspects of any management methodology, as it applies to the people in a process or workplace. All the above-mentioned principles can only be sustained if the people following them are self-disciplined and ensure that these steps are followed iteratively. It requires an effort on all our parts, and hence it is the most important aspect. &lt;em&gt;Any process can only be as good as the people involved in it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These principles are not just theories; many corporate giants knowingly or unknowingly follow them. Toyota, for instance, uses Hirano’s system, which comprises four S’s. The thing that amazed me most was the fifth S and how it defines the sustainability of the other four. This methodology can be used not only for workplaces but for almost everything we do. It does not promote monotony, as some might think: we ought to evolve, learn, and get better, hence it is iterative in nature. I hope this was a thought-provoking read for you.&lt;/p&gt;

&lt;p&gt;For those interested, here is the Wikipedia &lt;a href=&quot;https://en.wikipedia.org/wiki/5S_(methodology)&quot;&gt;link&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It is the realization that matters. Once we understand our impact and role in any process, we can execute our roles better. All we need is the big picture to understand: Why do we work? What do we contribute? How are we making an impact? And, most importantly, who are we?&lt;/p&gt;
&lt;/blockquote&gt;
</description>
                <link>https://sahilbadyal.com/philosophy/2017/02/24/japanese-5s-methodology</link>
                <guid>https://sahilbadyal.com/philosophy/2017/02/24/japanese-5s-methodology</guid>
                <pubDate>Fri, 24 Feb 2017 00:00:00 +0000</pubDate>
        </item>

        <item>
                <title>Summer Internship 2015</title>
                <description>
&lt;p&gt;At the beginning of May, we as a team decided not to go for a conventional internship during our summer internship period in June-July. Instead, we opted to contribute to society by solving a socially relevant problem statement. We had heard about the summer school being organised by Sristi, but had no idea how we could convert this programme into an internship. So our journey began with a series of conversations with Shivam Chawla (a final-year student of Mechanical Engg.), which ultimately led us to a conversation with Hiranmay Mahanta, the MD of Techpedia. He guided us and showed us how we could devote our summer internship time to working on this problem statement (an education application for the mobile platform).&lt;/p&gt;

&lt;p&gt;After his briefing about the whole project, we held a few brainstorming sessions on this problem statement at our &lt;a href=&quot;http://www.iesnith.in/&quot;&gt;Innovation Space&lt;/a&gt; at NIT Hamirpur, and finally we knew how this app would be designed and built. But this was not enough: we still had no idea whether this application was a necessity in the current education scenario. So, in order to learn more about the product’s viability and design, we as a team attended the online video sessions by Prof. Anil K Gupta and Prof. Kate Bissett Johnson (Swinburne), and, most importantly, the orientation on fieldwork by Prof. Rashmi Korjan. You can find more about these sessions &lt;a href=&quot;http://summerschool.sristi.org/summer-school-2015-review-by-nit-hamirpur/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Field Visit and Requirement Elicitation :&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once we were confident and prepared enough for the field visit, we planned trips to Gurukul School in Hamirpur and Govt. Degree College Hamirpur. These two visits gave us a whole new perspective on the app we were making. We noted down the requirements and held more brainstorming sessions. Find out more &lt;a href=&quot;http://summerschool.sristi.org/field-visit-for-education-mobile-app/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Development Phase :&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Finally, we were ready for the development phase. It started on 15th June. We divided the team into three parts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Front End:&lt;br /&gt;
This team was required to be creative, because the entire user interface was their effort and responsibility. Divyanshu Maithani, Himanshu Singh and Devesh Rohan were assigned the work, which they executed brilliantly.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Back End:&lt;br /&gt;
This team had to be the backbone of the application, as the entire server-side configuration was in their hands. Akshendra Pratap Singh and Sahil Badyal were responsible for it.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Knowledge and Data Collection:&lt;br /&gt;
This was probably the most important team, as they had to find the data, learn about both the front end and the back end, and do extensive research. We could have had none better than Sagar Karira and Shashi Dhiman.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Besides these teams, we had Sagar Karira and Sahil Badyal working on documentation and blog posting.&lt;/p&gt;

&lt;p&gt;This phase was the longest and was completed on 17th July. We were in constant touch with Adhish Patel and Hiranmay Mahanta throughout the development and had their constant support.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Beta Phase and Deployment :&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The app has now entered its beta phase and is undergoing rigorous testing. Hopefully, we will fix as many bugs as possible in this phase.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This internship has been an eye-opener and a great experience, as we now have a project that could very well become our startup. I want to thank everyone who has supported and assisted us in our endeavour, especially our parents and God for being kind and merciful.&lt;/p&gt;
&lt;/blockquote&gt;
</description>
                <link>https://sahilbadyal.com/technology/2015/07/24/summer-internship-2015</link>
                <guid>https://sahilbadyal.com/technology/2015/07/24/summer-internship-2015</guid>
                <pubDate>Fri, 24 Jul 2015 00:00:00 +0000</pubDate>
        </item>


</channel>
</rss>
