<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Develop Freedom</title>
    <description>Develop Freedom - Blog by Shubham Chaudhary</description>
    <link>https://shubham.chaudhary.xyz/blog/</link>
    <atom:link href="https://shubham.chaudhary.xyz/blog/feed.xml" rel="self" type="application/rss+xml" />
    <pubDate>Tue, 24 Sep 2024 21:15:25 +0000</pubDate>
    <lastBuildDate>Tue, 24 Sep 2024 21:15:25 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>Sapiens at the Crossroads of the Intelligence Age</title>
        <description>&lt;p&gt;Humanity has always been shaped by &lt;a href=&quot;https://www.britannica.com/story/history-of-technology-timeline&quot;&gt;the tools we create&lt;/a&gt;. From fire and the wheel to agriculture and steam engines, each technological leap has redefined what it means to be human. These advancements didn’t just make life easier; they fundamentally reshaped societies, economies, and power structures. As we enter &lt;a href=&quot;https://ia.samaltman.com&quot;&gt;the &lt;em&gt;Intelligence Age&lt;/em&gt;&lt;/a&gt;, where &lt;a href=&quot;https://www.britannica.com/science/history-of-artificial-intelligence&quot;&gt;artificial intelligence&lt;/a&gt; becomes an &lt;a href=&quot;https://x.com/sama/status/1813984333352649087&quot;&gt;omnipresent resource&lt;/a&gt;, we stand on the cusp of a transformation as significant as the Cognitive Revolution that set Homo sapiens apart from other species. Just as the ability to imagine and communicate abstract ideas allowed humans to dominate the planet, AI has the potential to redefine intelligence itself—both human and artificial—ushering in a new era of progress, but also deep disruption.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/o1-ai-reasoning-performance.webp&quot; alt=&quot;o1-ai-reasoning-performance&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Reflecting on the last decade, working at the intersection of AI and machine learning since the &lt;a href=&quot;https://www.britannica.com/technology/neural-network&quot;&gt;early breakthroughs in deep learning&lt;/a&gt;, I’ve witnessed firsthand the rapid acceleration of what machines can accomplish. What was once a field of theoretical promise became an engine for real-world transformation, especially as deep learning models began scaling with access to more data and computational power. Having contributed to this evolution, building systems ranging from recommendation &amp;amp; search systems to complex data-understanding ML models &amp;amp; data pipelines, I’ve seen how AI’s growth doesn’t just affect the technical domain—it profoundly shapes economies, societies, and global geopolitics.&lt;/p&gt;

&lt;p&gt;As we enter this Intelligence Age, we must ask ourselves: How will these changes ripple through the human experience, as earlier revolutions have done? The challenge isn’t simply technical, but philosophical—how we manage this profound shift will determine the future of humanity.&lt;/p&gt;

&lt;h3 id=&quot;economic-shifts-decoupling-labor-from-value&quot;&gt;Economic Shifts: Decoupling Labor from Value&lt;/h3&gt;

&lt;p&gt;From the dawn of human history, labor has been the bedrock of economic value. Agriculture required physical effort to produce food, just as industry needed human hands to run factories. But AI introduces a new economic paradigm: a world where &lt;a href=&quot;https://openai.com/index/learning-to-reason-with-llms/&quot;&gt;cognitive labor&lt;/a&gt;—the very thing that defined human superiority—can be automated. As machines take on more complex tasks, from financial modeling to legal analysis, we begin to see the decoupling of human labor from economic productivity.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/sapiens-timeline-of-history.jpg&quot; alt=&quot;sapiens-timeline-of-history&quot; /&gt;&lt;/p&gt;

&lt;p&gt;For those of us who have worked within this field, building and scaling AI systems, the implications are clear. AI doesn’t just automate repetitive tasks; it can now handle &lt;a href=&quot;https://openai.com/index/openai-o1-system-card/&quot;&gt;nuanced decision-making processes&lt;/a&gt;, fundamentally altering the role of human workers. In this new economy, value creation increasingly shifts from labor to ownership of technology and data. The control of AI systems and the data that powers them could lead to a concentration of wealth and power in the hands of a few, deepening inequality unless economic models are rethought.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/boston-dynamics-spot-and-stretch.webp&quot; alt=&quot;boston-dynamics-spot-and-stretch&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This decoupling also introduces the possibility of deflationary pressures. As the cost of goods and services declines due to AI-driven efficiencies, traditional economic growth models—based on consumption and inflation—may need to be reevaluated. Central banks and policymakers must rethink how they manage an economy where productivity surges but human labor plays a diminishing role.&lt;/p&gt;

&lt;h3 id=&quot;social-implications-redefining-work-and-human-identity&quot;&gt;Social Implications: Redefining Work and Human Identity&lt;/h3&gt;

&lt;p&gt;Every major technological shift has also reshaped human identity. Just as the Agricultural Revolution changed nomadic tribes into settled communities, and the Industrial Revolution turned craftspeople into factory workers, the Intelligence Age will redefine what it means to work and live. Throughout my career in AI, I’ve seen how automation is already transforming the workplace, shifting labor from routine tasks to more creative or emotional roles. But this is only the beginning.&lt;/p&gt;

&lt;p&gt;Work has historically provided more than just income—it has been central to our sense of purpose and identity. What happens when AI automates many of the tasks that once defined professions? As AI takes over roles in healthcare, finance, manufacturing and even creative industries, society will need to find new ways to derive meaning and purpose outside of traditional employment.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/openai-stages-of-ai.webp&quot; alt=&quot;openai-stages-of-ai&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This shift will require a fundamental rethinking of education and skill development. The skills that AI cannot replicate—creativity, empathy, emotional intelligence—will become the most valuable human traits. Educational systems, long focused on rote learning and standardized testing, will need to adapt to prepare individuals for roles that involve higher-order thinking, adaptability, and emotional labor. AI itself could play a role in this transformation, offering personalized learning experiences tailored to each individual’s needs, democratizing access to high-quality education.&lt;/p&gt;

&lt;p&gt;But here, too, lies the risk of deepening social divides. In societies where access to AI-driven education and tools is unequal, those without access could fall further behind, exacerbating existing inequalities. Just as the Industrial Revolution created a divide between factory owners and workers, the Intelligence Age could create a divide between those who can harness AI and those left behind by it.&lt;/p&gt;

&lt;h3 id=&quot;geopolitical-repercussions-ai-as-the-new-global-power&quot;&gt;Geopolitical Repercussions: AI as the New Global Power&lt;/h3&gt;

&lt;p&gt;AI’s influence won’t stop at the borders of the workplace or the classroom. On the global stage, AI is becoming a tool of geopolitical power, much like nuclear energy or oil once were. Nations that dominate AI research and control the infrastructure that powers it are positioning themselves for economic and military supremacy. I’ve observed how governments and tech companies alike are racing to develop AI systems that can provide strategic advantages in everything from cybersecurity to autonomous weapons systems.&lt;/p&gt;

&lt;p&gt;In many ways, the AI arms race is already underway. Countries with access to vast computational resources and data ecosystems are outpacing those without, setting the stage for a new kind of geopolitical tension. Control over AI isn’t just about economic productivity; it’s about national security, surveillance capabilities, and even the ability to shape global narratives through AI-generated media and information.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/ai-startups-around-world.webp&quot; alt=&quot;ai-startups-around-world&quot; /&gt;&lt;/p&gt;

&lt;p&gt;But AI’s geopolitical influence also presents opportunities for collaboration. Shared AI resources could help nations address global challenges, from climate change to pandemics. AI models are already being used to optimize energy grids, predict environmental changes, and model disease spread. These applications offer a glimpse of how AI, if managed collaboratively, could lead to collective global benefits rather than competitive domination.&lt;/p&gt;

&lt;p&gt;However, without global standards and ethical frameworks, AI’s rise could also deepen global inequalities. Less developed nations, without access to AI infrastructure, could find themselves dependent on AI superpowers, much like resource-poor countries relied on industrialized nations in the past. Navigating this new world order will require not only technological innovation but also diplomacy and international cooperation.&lt;/p&gt;

&lt;h3 id=&quot;ethical-imperatives-navigating-the-future-of-ai&quot;&gt;Ethical Imperatives: Navigating the Future of AI&lt;/h3&gt;

&lt;p&gt;In all of this, one thread remains constant: the ethical responsibility of those who build, deploy, and control AI systems. Throughout my career, I’ve seen the tension between innovation and regulation, between pushing the boundaries of what AI can do and ensuring it is used responsibly. As AI becomes more integrated into society, these ethical questions become even more pressing.&lt;/p&gt;

&lt;p&gt;Bias in AI systems—rooted in the data they are trained on—can perpetuate and even amplify social inequalities. Privacy concerns grow as AI systems rely on vast amounts of personal data to function effectively. These challenges require more than just technical solutions; they demand a shift in how we think about transparency, accountability, and fairness in AI development.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/ai-ethical-perspectives.webp&quot; alt=&quot;ai-ethical-perspectives&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Ethical AI is not just a technical problem but a societal one. Ensuring that AI benefits humanity as a whole, rather than entrenching the power of a few, will require new regulatory frameworks and governance models. This is not about slowing innovation but about guiding it in a way that aligns with our collective values.&lt;/p&gt;

&lt;h3 id=&quot;conclusion-the-next-leap-in-human-evolution&quot;&gt;Conclusion: The Next Leap in Human Evolution&lt;/h3&gt;

&lt;p&gt;Just as the Cognitive Revolution set Homo sapiens apart from other species, the Intelligence Age represents a new chapter in human evolution. AI is not just another tool—it is a new form of intelligence that can augment and, in some cases, replace human &lt;a href=&quot;https://openai.com/index/introducing-openai-o1-preview/&quot;&gt;decision-making&lt;/a&gt;. But as with every leap forward, the stakes are high. How we manage this new form of intelligence will determine whether the Intelligence Age becomes an era of unprecedented human flourishing or deep social and economic disruption.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://ourworldindata.org/technology-long-run&quot;&gt;journey&lt;/a&gt; from the Cognitive Revolution to the Agricultural Revolution, to the Industrial Age, and now to the Intelligence Age has always been one of adaptation. Our species has thrived because of our ability to not only develop new tools but to reshape our societies around them. The question now is not whether AI will change the world—it undoubtedly will—but whether we are prepared to shape that change in ways that are equitable, ethical, and sustainable.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ai/intelligence-age/long-term-technology-timeline.webp&quot; alt=&quot;long-term-technology-timeline&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As we navigate this &lt;a href=&quot;https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/&quot;&gt;new era&lt;/a&gt;, we must keep asking the big questions—about labor, value, identity, power, and ethics. Only then can we ensure that the Intelligence Age enhances, rather than diminishes, the human experience. This is the next stage in our evolutionary story, and how we respond will define the future of humanity.&lt;/p&gt;
</description>
        <pubDate>Tue, 24 Sep 2024 08:22:26 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/ai/sapiens-intelligence-age</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/ai/sapiens-intelligence-age</guid>
        
        <category>artificial-intelligence</category>
        
        <category>ai</category>
        
        <category>ml</category>
        
        
      </item>
    
      <item>
        <title>Why We Sleep - Book Review</title>
        <description>&lt;h2 id=&quot;-rating&quot;&gt;📚 Rating&lt;/h2&gt;
&lt;p&gt;4.5 ⭐ / 5 🌟&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.goodreads.com/book/show/34466963-why-we-sleep&quot; style=&quot;float: left; padding-right: 20px&quot;&gt;&lt;img border=&quot;0&quot; alt=&quot;Why We Sleep: Unlocking the Power of Sleep and Dreams&quot; src=&quot;https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1556604137l/34466963._SX98_.jpg&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;https://www.goodreads.com/book/show/34466963-why-we-sleep&quot;&gt;Why We Sleep: Unlocking the Power of Sleep and
Dreams&lt;/a&gt; by &lt;a href=&quot;https://www.goodreads.com/author/show/17598726.Matthew_Walker&quot;&gt;Matthew Walker&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;My goodreads rating: &lt;a href=&quot;https://www.goodreads.com/review/show/3198659584&quot;&gt;5 of 5 stars&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://www.goodreads.com/review/list/38698703-shubham-chaudhary&quot;&gt;View all my reviews&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;review&quot;&gt;Review&lt;/h2&gt;
&lt;p&gt;I started reading this book last year. The first three chapters gave me a clear enough picture that I paused the book and started giving myself a full 8 hours of uninterrupted sleep time. Now, with a whole year of regular 8-hour sleep behind me, as I finish this book I feel really good about picking it up in the first place.&lt;/p&gt;

&lt;p&gt;Matthew Walker is a professor of neuroscience and psychology who runs a sleep lab at UC Berkeley and has a twenty-plus-year research career, and the book is grounded in that research. The tone at times gets very serious; given how much we ignore sleep in our lives, that makes sense. The author tries hard to convince the reader to sleep enough hours, and at least in my experience, he accomplished this goal.&lt;/p&gt;

&lt;p&gt;Chapter 1 discusses sleep in general. Chapter 2 goes into circadian rhythm, sleep pressure, and the effects of caffeine, jet lag, etc. on our sleep. Chapter 3 goes into the details of sleep cycles, the research behind them, etc.&lt;/p&gt;

&lt;h2 id=&quot;details&quot;&gt;Details&lt;/h2&gt;
&lt;p&gt;Personally, after starting the regular 8-hour uninterrupted sleep schedule, I have been able to wake up without any alarms in exactly 7.5 hours daily. I have personally felt the effects of caffeine; as a result, I stopped drinking Diet Coke and switched to decaffeinated coffee beans. The half-life of caffeine is 5–7 hours 🤯, so don’t drink caffeine, and definitely don’t drink caffeine after 2–3 pm. Scheduling the thermostat to be lower at night and higher around wake-up time has also been super useful.&lt;/p&gt;

&lt;h3 id=&quot;sleep-architecture&quot;&gt;Sleep Architecture&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/review/book/why-we-sleep/sleep-architecture.jpg&quot; alt=&quot;sleep-architecture&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Understanding the architecture of sleep, shown in the hypnogram above (Figure 8 in the book), also comes in handy.
We sleep in 1.5-hour cycles, where the initial cycles are mainly filled with NREM sleep and the later cycles are filled with REM sleep.
We typically surface into the waking stage around the 3, 6, and 7.5-hour marks.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/review/book/why-we-sleep/sleep-brainwaves.jpg&quot; alt=&quot;sleep-brainwaves&quot; /&gt;&lt;/p&gt;

&lt;p&gt;NREM sleep performs the work of weeding out and removing unnecessary connections.
REM sleep is when we save our learning, strengthen the neural connections, and dream.
The last 1.5-hour section is when we get the maximum amount of REM sleep.
This also means sleep hours aren’t interchangeable: you can’t wake up early and finish your sleep later by taking a nap.
After fixing my sleep deficiency, I almost always wake up around the 6-hour mark (4 × 1.5 hr); now that I understand why, I sleep more to finish the last 1.5-hour part of the sleep cycle.&lt;/p&gt;

&lt;h3 id=&quot;sleep-cycle&quot;&gt;Sleep Cycle&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/review/book/why-we-sleep/sleep-cycle.jpg&quot; alt=&quot;sleep-cycle&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Understanding the 24-hour circadian rhythm of the suprachiasmatic nucleus and the sleep pressure of adenosine has also come in handy (Figures 4, 5, and 6).
There’s genetic evidence for chronotypes, aka night owls vs morning larks.
It’s important to understand which one you are; not everyone benefits from waking up at 5 am, but everyone definitely benefits from sleeping a full 5 sleep cycles.
Also, you can’t fix your sleep schedule by staying awake the entire night (trust me, I’ve tried); the book explains why.
You have to shift your sleep schedule gradually, going to bed 0.5–1 hour earlier every day.&lt;/p&gt;

&lt;h3 id=&quot;other&quot;&gt;Other&lt;/h3&gt;

&lt;p&gt;There were some great tips in the later chapters, like what to do when you are sleepy and have to drive. Answer: DON’T.
There are tips on how to reduce sleep pressure, but in general, avoid driving in such situations.
Vehicular accidents caused by drowsy driving exceed those caused by alcohol and drugs. It’s crazy that we don’t see warnings about lack of sleep the way we do for alcohol.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/review/book/why-we-sleep/sleep-loss.jpg&quot; alt=&quot;sleep-loss&quot; /&gt;&lt;/p&gt;

&lt;p&gt;After reading the chapter about age and sleep, I started recommending that my parents also sleep a full 8 hours.&lt;/p&gt;

&lt;h3 id=&quot;how-to-consume&quot;&gt;How to consume?&lt;/h3&gt;

&lt;p&gt;It is great to combine the print book with the ebook and audiobook.
All the chapters are self-contained and can be read in any order.
The last few chapters are great candidates for listening via audiobook.&lt;/p&gt;

&lt;p&gt;Scribd offers a subscription service where you can read books, audiobooks, and a lot more.
You can find the &lt;a href=&quot;https://www.scribd.com/book/359457264/Why-We-Sleep-Unlocking-the-Power-of-Sleep-and-Dreams&quot;&gt;book&lt;/a&gt; and &lt;a href=&quot;https://www.scribd.com/audiobook/360543909/Why-We-Sleep-Unlocking-the-Power-of-Sleep-and-Dreams&quot;&gt;audiobook&lt;/a&gt; for Why We Sleep on Scribd.
You can use this &lt;a href=&quot;https://www.scribd.com/g/6s7m7u&quot;&gt;promo link to get 60 days of free subscription&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;what-worked-for-me-to-get-good-sleep&quot;&gt;What worked for me to get good sleep&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Understand the relation between circadian rhythm and sleep pressure.&lt;/li&gt;
  &lt;li&gt;Lower the temperature of the room to around 20 °C.&lt;/li&gt;
  &lt;li&gt;Stick to a sleep schedule; even when you can’t, still give yourself 8 hours. iPhones have a great sleep schedule + DND feature.&lt;/li&gt;
  &lt;li&gt;Sleeping 8 hours in parts is not the same as sleeping 8 continuous hours: the first half is filled with NREM sleep, the second half with REM sleep.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;twelve-tips-for-healthy-sleep&quot;&gt;Twelve Tips for Healthy Sleep&lt;/h3&gt;
&lt;p&gt;These are the twelve tips from the book’s appendix.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Stick to a sleep schedule. Set an alarm for bedtime.&lt;/li&gt;
  &lt;li&gt;Try to exercise at least thirty minutes on most days but not later than two to three hours before your bedtime.&lt;/li&gt;
  &lt;li&gt;Avoid caffeine and nicotine. Coffee, colas, certain teas,
and chocolate contain the stimulant caffeine, and its
effects can take as long as eight hours to wear off fully.
Therefore, a cup of coffee in the late afternoon can make
it hard for you to fall asleep at night.&lt;/li&gt;
  &lt;li&gt;Avoid alcoholic drinks before bed. Having a nightcap or
alcoholic beverage before sleep may help you relax, but
heavy use robs you of REM sleep, keeping you in the
lighter stages of sleep.&lt;/li&gt;
  &lt;li&gt;Avoid large meals and beverages late at night.&lt;/li&gt;
  &lt;li&gt;If possible, avoid medicines that delay or disrupt your sleep.&lt;/li&gt;
  &lt;li&gt;Don’t take naps after 3 p.m. Naps can help make up for
lost sleep, but late afternoon naps can make it harder to
fall asleep at night.&lt;/li&gt;
  &lt;li&gt;Relax before bed. Don’t overschedule your day so that no
time is left for unwinding. A relaxing activity, such as
reading or listening to music, should be part of your
bedtime ritual.&lt;/li&gt;
  &lt;li&gt;Take a hot bath before bed. The drop in body
temperature after getting out of the bath may help you
feel sleepy, and the bath can help you relax and slow
down so you’re more ready to sleep.&lt;/li&gt;
  &lt;li&gt;Dark bedroom, cool bedroom, gadget-free bedroom. Get
rid of anything in your bedroom that might distract you
from sleep, such as noises, bright lights, an
uncomfortable bed, or warm temperatures. You sleep
better if the temperature in the room is kept on the cool
side.&lt;/li&gt;
  &lt;li&gt;Have the right sunlight exposure.
Daylight is key to regulating daily sleep patterns.
Get in natural sunlight for at least thirty minutes each day, wake up with the sun.&lt;/li&gt;
  &lt;li&gt;Don’t lie in bed awake. If you find yourself still awake after
staying in bed for more than twenty minutes or if you are
starting to feel anxious or worried, get up and do some
relaxing activity until you feel sleepy.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;quotes&quot;&gt;Quotes&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;“There does not seem to be one major organ within the body, or process within the brain, that isn’t optimally enhanced by sleep (and detrimentally impaired when we don’t get enough)”&lt;/li&gt;
  &lt;li&gt;“The uneven back-and-forth interplay between NREM and REM sleep is necessary to elegantly remodel and update our neural circuits at night, and in doing so manage the finite storage space within the brain. Forced by the known storage capacity imposed by a set number of neurons and connections within their memory structures, our brains must find the “sweet spot” between retention of old information and leaving sufficient room for the new. Balancing this storage equation requires identifying which memories are fresh and salient, and which memories that currently exist are overlapping, redundant, or simply no longer relevant.”&lt;/li&gt;
  &lt;li&gt;“A key function of deep NREM sleep, which predominates early in the night, is to do the work of weeding out and removing unnecessary neural connections. In contrast, the dreaming stage of REM sleep, which prevails later in the night, plays a role in strengthening those connections.”&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;Excerpt From
Why We Sleep: Unlocking the Power of Sleep and Dreams
Matthew Walker
This material may be protected by copyright.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
        <pubDate>Sat, 05 Mar 2022 01:56:37 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/review/book/why-we-sleep</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/review/book/why-we-sleep</guid>
        
        <category>books</category>
        
        <category>book-review</category>
        
        
      </item>
    
      <item>
        <title>Autonomous and Electric Vehicle Space</title>
<description>&lt;p&gt;A quarter of global energy-related greenhouse gas emissions come from transportation.
With global carbon emissions on the rise, electric cars are going to be a crucial part of the future.
Self-driving cars are going to be more efficient and spend less time/fuel in traffic.
While reading Bill Gates’s new book, &lt;a href=&quot;https://www.goodreads.com/book/show/52275335-how-to-avoid-a-climate-disaster&quot;&gt;‘How to Avoid a Climate Disaster: The Solutions We Have and the Breakthroughs We Need’&lt;/a&gt;,
I’ve been researching the self-driving and electric vehicle market and its economics.&lt;/p&gt;

&lt;h2 id=&quot;market-leaders&quot;&gt;Market Leaders&lt;/h2&gt;
&lt;p&gt;Tesla clearly has a first-to-market advantage, but it is not the only savior, and by all metrics it is highly overvalued.
No one has capitalized more on the EV hype than Tesla. $TSLA has a &lt;a href=&quot;https://wolfstreet.com/2021/01/02/tesla-finally-almost-hit-500000-deliveries-2-years-behind-its-2016-promise-for-a-global-market-share-of-0-7/&quot;&gt;larger market cap&lt;/a&gt; than the next 9 major car manufacturers &lt;a href=&quot;https://www.cnbc.com/2020/12/14/tesla-valuation-more-than-nine-largest-carmakers-combined-why.html&quot;&gt;combined&lt;/a&gt;, the likes of $F, $GM, $TM, etc.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/tsla-market-cap.png&quot; alt=&quot;TSLA Market Cap&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Currently, Ford and GM are the biggest competitors, with multiple electric vehicles coming out soon.
Ford is coming out with an electric model of its extremely famous F-150 by the end of 2021.
GM plans to sell only electric cars by 2035.
Ford plans to sell only electric cars in Europe by 2030.
Several other successful automakers with a proven track record, such as Daimler/Mercedes-Benz, BMW, and Jaguar, have plans to build more and more electric vehicles.&lt;/p&gt;

&lt;h2 id=&quot;challenges&quot;&gt;Challenges&lt;/h2&gt;
&lt;p&gt;To make EVs the de facto standard, several challenges need to be solved.&lt;/p&gt;

&lt;h3 id=&quot;energy-density-problem&quot;&gt;Energy Density Problem&lt;/h3&gt;
&lt;p&gt;Interestingly, EV powertrains are a lot cheaper and simpler than ICE (internal combustion engine) powertrains, which also means less maintenance.
Battery energy density is the biggest hurdle to successfully replacing fossil-fuel-powered vehicles.
It becomes impractical to build large, high-range electric vehicles because the weight of the battery packs makes the vehicle too heavy.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/battery-energy-density.jpg&quot; alt=&quot;battery energy density&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To make a practically successful electric vehicle, you need to match the range that gas-powered vehicles provide.
Gas-powered vehicles might be less efficient at fully utilizing their fuel, but the battery cells required for the same range weigh a lot more.&lt;/p&gt;

&lt;p&gt;Panasonic is the leading battery cell manufacturer, and energy density is a limitation that all EV manufacturers face.
That’s why the major breakthroughs in EV adoption have been in short-to-medium-range vehicles, and why long-haul electric trucks are harder to manufacture.&lt;/p&gt;

&lt;h3 id=&quot;charging&quot;&gt;Charging&lt;/h3&gt;
&lt;p&gt;No one wants to wait for an hour every few hundred miles, so you need more charging stations and faster battery charging.
To ensure that users can drive long distances without having to worry about charging, good-quality charging-station infrastructure is needed.
Charging stations cost a lot of money to build, making it difficult to cover larger areas.
The &lt;a href=&quot;https://www.forbes.com/sites/bradtempleton/2019/12/19/competing-electric-car-charging-standards-can-be-easily-fixed/&quot;&gt;lack of a standard&lt;/a&gt; charging plug is no help.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/ev-charging-plugs.jpg&quot; alt=&quot;ev charging plugs&quot; /&gt;&lt;/p&gt;

&lt;p&gt;China is going to be a big market for EV manufacturers.
Nio has approached the charging problem in China from a unique angle: a Battery-as-a-Service subscription model.
They are building cars with battery-swapping capability and charging stations that can swap a battery pack in &lt;a href=&quot;https://www.caranddriver.com/news/a33670482/nio-swappable-batteries-lease/&quot;&gt;just 5 minutes&lt;/a&gt;.
Having started in November 2018, Nio had already completed &lt;a href=&quot;https://insideevs.com/news/448165/nio-completed-1-millionth-battery-swap/&quot;&gt;a million swaps&lt;/a&gt; by the end of 2020.&lt;/p&gt;

&lt;video muted=&quot;&quot; autoplay=&quot;&quot; controls=&quot;&quot;&gt;
  &lt;source src=&quot;/blog/img/ev/battery-swap.mp4&quot; type=&quot;video/mp4&quot; alt=&quot;Battery Swap in Action&quot; /&gt;
&lt;/video&gt;

&lt;h2 id=&quot;autonomous-vehicle&quot;&gt;Autonomous Vehicle&lt;/h2&gt;
&lt;p&gt;Driving is a tedious and pretty lackluster task, especially on long routes.
Given a choice, I’d rather read a book than drive myself.
Still, you need to get from point A to point B somehow.
Humans make mistakes: they lose attention and get tired.
So much so that human error plays a part in 90% of crashes.
Machines, on the other hand, can do this task without getting tired at all.&lt;/p&gt;

&lt;p&gt;Attempts at autonomy aren’t new, but recent advances in sensors, computing power, and machine learning are making it far more feasible now.
Lex Fridman has a very good &lt;a href=&quot;https://www.youtube.com/playlist?list=PLrAXtmErZgOeY0lkVCIVafdGFOTi45amq&quot;&gt;lecture series on deep learning&lt;/a&gt; for self-driving vehicles.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/autonomy-history.webp&quot; alt=&quot;History of Autonomy&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Autonomous vehicles are going to be the standard in the future.
Self-driving cars are going to change society more than anything else in the recent past.
Attaining semi-autonomy is within reach, but full autonomy will require significant advances.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/semi-vs-full-autonomy.webp&quot; alt=&quot;Semi vs Full Autonomy&quot; /&gt;&lt;/p&gt;

&lt;p&gt;With all vehicles on the road being &lt;a href=&quot;https://www.theglobeandmail.com/globe-drive/self-driving-cars-are-going-to-dramatically-change-our-world-so-when-does-the-revolution-begin/article32650833/&quot;&gt;level-5 self-driving vehicles&lt;/a&gt;, car crashes will be a thing of the past.
Traffic can also be significantly reduced through coordination among fully autonomous vehicles on the road.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube-nocookie.com/embed/4CZc3erc_l4&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;According to Navigant’s &lt;a href=&quot;https://www.cnet.com/roadshow/news/self-driving-study-navigant-research-tesla-waymo-cruise/&quot;&gt;research&lt;/a&gt;, Ford and GM are among the leaders in the race for autonomous vehicle tech.
TSLA has hype on its side; F &amp;amp; GM have experience, scale, higher earnings/profits, and likely the tech too.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/2020-navigant-av-leaderboard.webp&quot; alt=&quot;TSLA vs F vs GM autonomous driving&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Apple is teaming up with Hyundai, but it’s still years away from production.&lt;/p&gt;

&lt;p&gt;The primary intent of owning a car is to go from point A to point B.
An autonomous taxi service can reduce the cost of transportation significantly, something that hasn’t changed much since the first Ford rolled off the assembly line.
Google’s Waymo has teamed up with Chrysler, and it is &lt;a href=&quot;https://blog.waymo.com/2020/10/waymo-is-opening-its-fully-driverless.html&quot;&gt;targeting&lt;/a&gt; the ride-hailing market, already offering driverless rides in Arizona.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/ev/waymo-on-road.gif&quot; alt=&quot;Waymo taxi&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Electric vehicles need to be the go-to choice for consumers.
EV is a hot space with a lot of hype and upcoming competition.
There are several other areas that need innovation to make electric cars the de facto choice.
Tesla doesn’t have all the answers to the problems our planet faces.
It has been a driver of EV growth and helped move the legacy automakers.
A fully autonomous shared ride-hailing vehicle fleet can reduce the total cost of ownership, dramatically reduce the cost of transportation, and reduce traffic congestion.
It is going to be one of many options; we need more innovation, and we need it to happen sooner!&lt;/p&gt;

</description>
        <pubDate>Sun, 28 Feb 2021 04:55:46 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/autonomous-electric-vehicles</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/autonomous-electric-vehicles</guid>
        
        <category>thoughts</category>
        
        <category>personal</category>
        
        <category>opinion</category>
        
        <category>electric-vehicle</category>
        
        <category>autonomous-vehicle</category>
        
        <category>self-driving</category>
        
        <category>deep-learning</category>
        
        <category>machine-learning</category>
        
        <category>ml</category>
        
        
      </item>
    
      <item>
        <title>Deep Learning for Image Classification</title>
        <description>&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;At &lt;a href=&quot;https://www.zomato.com&quot;&gt;Zomato&lt;/a&gt;, we manage more than half a billion images, which are integral to various aspects of our platform. Every day, we process close to 100,000 new images, contributing to petabytes of data, with a daily influx of approximately 500 GB of fresh visual content.&lt;/p&gt;

&lt;p&gt;In this blog, we delve into how we built a neural network-based machine learning model to classify these images into categories like food, ambiance, menu, and more. This post is particularly valuable for professionals at startups deploying their first machine learning use case, students eager to learn how to turn innovative ideas into practical solutions, and anyone interested in the technical nuances of scaling deep learning models in a production environment.&lt;/p&gt;

&lt;h3 id=&quot;what-youll-learn&quot;&gt;What You’ll Learn&lt;/h3&gt;

&lt;p&gt;This post is driven by our journey of taking a deep learning model from concept to production, addressing the unique challenges of operating at Zomato’s scale. Whether you’re a data scientist, an engineer refining your deployment strategies, a tech enthusiast, or a professional exploring the practical applications of AI, the insights shared here are relevant across various stages of a machine learning project.&lt;/p&gt;

&lt;p&gt;By the end of this blog, you will have gained:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Comprehensive Understanding of Image Classification&lt;/strong&gt;: Explore how we systematically approached the categorization of vast amounts of visual data into meaningful categories using neural networks.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Valuable Practical Insights on Deployment&lt;/strong&gt;: Understand the challenges we faced when deploying our first deep learning model at scale, and learn how we successfully navigated these obstacles.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Key Lessons from Real-World Implementation&lt;/strong&gt;: Discover practical lessons from our experience deploying this model in production, including reflections on what we would do differently if we were to undertake this project today.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Whether you’re just starting out, looking to refine your machine learning deployment process, or simply interested in the application of deep learning at scale, this blog post offers actionable insights to help you navigate the complexities of building and deploying AI models effectively.&lt;/p&gt;

&lt;h2 id=&quot;the-need-for-image-classification&quot;&gt;The Need for Image Classification&lt;/h2&gt;

&lt;p&gt;As a restaurant search, discovery, and delivery platform, Zomato’s primary sources of images are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;User-Uploaded Images&lt;/strong&gt;: These are images uploaded by users when they visit or order from a restaurant and write reviews.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Team-Collected Images&lt;/strong&gt;: These are images our team gathers from restaurants while listing them on the platform.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Image classification serves several critical functions at Zomato:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Enhanced User Experience&lt;/strong&gt;: By categorizing images into collections such as food and ambiance, we can help users quickly find ambiance images. Previously, we manually tagged around 10-20 images per restaurant as food or ambiance shots. To enhance the user experience, we aimed to categorize all images uploaded across the platform.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Content Balance&lt;/strong&gt;: The majority of images uploaded to Zomato are food-related, which can overshadow ambiance images. Classifying images allows us to surface ambiance shots more effectively, improving the visual balance of restaurant pages.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Content Quality Assurance&lt;/strong&gt;: The quality of content on our platform is paramount. We have a dedicated team of moderators who work tirelessly to ensure that only the best content is showcased to our users. Automated tagging, such as identifying human faces or selfies, can significantly improve our photo moderation turnaround time.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Menu Management&lt;/strong&gt;: Similarly, if an image appears to be a menu, we want our content team to review it to ensure only the highest quality menu images—those manually verified by our data collection team are shown to users.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;building-the-classifier&quot;&gt;Building the Classifier&lt;/h2&gt;

&lt;p&gt;Image classification is fairly straightforward from a technical standpoint, especially when working in a Jupyter notebook. However, our challenge was magnified by the fact that this was our first deep learning project to be deployed in production, and the scale was daunting. We needed to build a system capable of moderating nearly half a million images daily. The initial model was trained in 2016, and this blog post not only recounts our experience from that time but also provides insights into how we would approach retraining today.&lt;/p&gt;

&lt;p&gt;To streamline the entire process—from data gathering to preprocessing, model training, and validation—we utilized &lt;a href=&quot;https://github.com/spotify/luigi&quot;&gt;Luigi&lt;/a&gt;. Luigi allowed us to create a DAG-based pipeline, ensuring that each step was dependent on the completion of the previous ones. This approach was crucial for maintaining the integrity and flow of the pipeline. Luigi also provided a user-friendly visual interface, which made it easier to monitor the progress of our data and model pipeline.&lt;/p&gt;
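
&lt;p&gt;To make this concrete, here is a minimal sketch of what a Luigi pipeline of this shape can look like. The task names, labels, and paths are illustrative, not our actual pipeline.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import luigi


class DownloadImages(luigi.Task):
    label = luigi.Parameter()

    def output(self):
        # Luigi skips any task whose declared output already exists.
        return luigi.LocalTarget(&apos;data/%s/images.done&apos; % self.label)

    def run(self):
        # ... fetch the images for this label from S3 here ...
        with self.output().open(&apos;w&apos;) as f:
            f.write(&apos;done&apos;)


class TrainModel(luigi.Task):
    def requires(self):
        # The DAG: training depends on every label&apos;s download task.
        return [DownloadImages(label=l) for l in (&apos;food&apos;, &apos;ambiance&apos;, &apos;menu&apos;, &apos;human&apos;)]

    def output(self):
        return luigi.LocalTarget(&apos;models/classifier.h5&apos;)

    def run(self):
        # ... preprocess, train, and save the model here ...
        pass


if __name__ == &apos;__main__&apos;:
    luigi.build([TrainModel()], local_scheduler=True)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;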

&lt;h3 id=&quot;dataset-gathering&quot;&gt;Dataset Gathering&lt;/h3&gt;

&lt;p&gt;Before we could demonstrate the effectiveness of this “new deep learning” approach to our PMs, we needed a substantial amount of labeled data. We started with four primary labels: food, ambiance, menu, and human. In the future, we planned to expand these categories to include indoor shots, outdoor shots, drinks, and dishes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/clazzify/fahm-collage.png&quot; alt=&quot;food, ambiance, menu, human image collage&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;food--ambiance&quot;&gt;Food &amp;amp; Ambiance&lt;/h4&gt;

&lt;p&gt;At Zomato, we had manually tagged images classified as food and ambiance shots. We downloaded 50,000 images for each category to build our classification dataset.&lt;/p&gt;

&lt;h4 id=&quot;menu&quot;&gt;Menu&lt;/h4&gt;

&lt;p&gt;Generating the dataset for menus was the most straightforward task. Given Zomato’s vast collection of manually tagged and categorized menus (one of the foundational elements of the company), we downloaded 50,000 menu images from S3, distributed across randomly selected restaurants.&lt;/p&gt;

&lt;h4 id=&quot;humans&quot;&gt;Humans&lt;/h4&gt;

&lt;p&gt;Curating the dataset for humans was more challenging. We initially used the &lt;a href=&quot;http://www.cs.ucf.edu/~liujg/YouTube_Action_dataset.html&quot;&gt;YouTube dataset&lt;/a&gt;, which includes images with mixed scenes. For example, some images contain humans, but they might also exhibit characteristics of an ambiance shot, leading to potential misclassifications. Our strategy was to train a basic model with this dataset, generate approximate labels, and have our internal moderation team quickly correct them—significantly speeding up the labeling process compared to starting from scratch.&lt;/p&gt;

&lt;p&gt;To address the need for face shots, which were limited in the YouTube dataset, we incorporated the &lt;a href=&quot;http://vis-www.cs.umass.edu/lfw/&quot;&gt;LFW dataset by UMass&lt;/a&gt;, also known as the Labeled Faces in the Wild dataset.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/clazzify/lfw_six_face_panels.jpg&quot; alt=&quot;lfw images preview&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;dataset-preprocessing&quot;&gt;Dataset Preprocessing&lt;/h3&gt;

&lt;p&gt;After gathering the data, our next step was preprocessing. We had a large collection of images categorized into food, ambiance, menu, and human. For model training, it was essential to iterate over this data efficiently and feed it into Keras.&lt;/p&gt;

&lt;p&gt;To handle this, we used the &lt;a href=&quot;https://en.wikipedia.org/wiki/Hierarchical_Data_Format&quot;&gt;Hierarchical Data Format&lt;/a&gt; (&lt;a href=&quot;https://www.h5py.org/&quot;&gt;HDF5&lt;/a&gt;) to create an out-of-memory iterable dataframe. With the &lt;a href=&quot;http://docs.h5py.org/en/stable/quick.html&quot;&gt;pythonic interface&lt;/a&gt; provided by &lt;a href=&quot;https://github.com/h5py/h5py&quot;&gt;h5py&lt;/a&gt;, we could slice and manipulate terabytes of data as if they were numpy arrays in memory.&lt;/p&gt;
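
&lt;p&gt;As a rough illustration, here is how such an out-of-memory dataset can be created and sliced with h5py; the file name, dataset names, and sizes below are placeholders, not our production values.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import h5py
import numpy as np

n = 1000  # illustrative; the real datasets held 50,000 images per class

# One-time creation: write images and labels into a single HDF5 file.
with h5py.File(&apos;images.h5&apos;, &apos;w&apos;) as f:
    images = f.create_dataset(&apos;images&apos;, shape=(n, 227, 227, 3), dtype=&apos;uint8&apos;)
    labels = f.create_dataset(&apos;labels&apos;, shape=(n,), dtype=&apos;uint8&apos;)
    for i in range(n):
        images[i] = np.zeros((227, 227, 3), dtype=&apos;uint8&apos;)  # stand-in for a real image
        labels[i] = 0                                        # stand-in for a class id

# Training time: slice batches lazily; only the requested rows are read from disk.
with h5py.File(&apos;images.h5&apos;, &apos;r&apos;) as f:
    batch_x = f[&apos;images&apos;][0:128]  # numpy array of shape (128, 227, 227, 3)
    batch_y = f[&apos;labels&apos;][0:128]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;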

&lt;p&gt;We resized each image to 227x227 pixels and performed several cleaning steps. Additionally, we augmented the dataset by creating multiple variations of each image through rotation, scaling, zooming, and cropping. In future retraining efforts, we plan to explore using the RecordIO format for storing images in classification tasks.&lt;/p&gt;
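
&lt;p&gt;Keras ships these transformations in its ImageDataGenerator utility. A minimal sketch follows; the parameter values are illustrative, not the ones we tuned for production.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from keras.preprocessing.image import ImageDataGenerator

# Each epoch sees randomly rotated, shifted, zoomed, and flipped variants.
datagen = ImageDataGenerator(
    rotation_range=20,       # random rotation, in degrees
    width_shift_range=0.1,   # random horizontal shift, approximating crops
    height_shift_range=0.1,  # random vertical shift, approximating crops
    zoom_range=0.2,          # random zoom in/out
    horizontal_flip=True,
)
# x_train: (n, 227, 227, 3) array; y_train: one-hot labels
# model.fit_generator(datagen.flow(x_train, y_train, batch_size=64), ...)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;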

&lt;h3 id=&quot;training-the-model&quot;&gt;Training the Model&lt;/h3&gt;

&lt;p&gt;We began our journey with &lt;a href=&quot;https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf&quot;&gt;AlexNet&lt;/a&gt;, a well-established model in 2016 with multiple &lt;a href=&quot;https://github.com/Zomato/convnets-keras&quot;&gt;open source implementations&lt;/a&gt; available. Alongside AlexNet, we experimented with other architectures like &lt;a href=&quot;https://arxiv.org/pdf/1512.00567.pdf&quot;&gt;Inception v3&lt;/a&gt; and &lt;a href=&quot;https://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf&quot;&gt;GoogLeNet&lt;/a&gt;. While these models served us well at the time, today there are more accurate and efficient options available, such as ResNet, MobileNet, and others.&lt;/p&gt;

&lt;p&gt;We chose &lt;a href=&quot;https://keras.io/&quot;&gt;Keras&lt;/a&gt; as our framework due to its flexibility, particularly its ability to switch backend engines (e.g., Theano, TensorFlow) in the future. In 2016, installing TensorFlow wasn’t as straightforward as it is today (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pip install tensorflow&lt;/code&gt;), so we opted for &lt;a href=&quot;https://github.com/Theano/Theano&quot;&gt;Theano&lt;/a&gt; as our backend engine. Theano provided reliable and consistent results and was easier to set up with Keras during that period. Although Keras remains our preferred choice for writing models, if we were to do this now, we would leverage a platform like &lt;a href=&quot;https://aws.amazon.com/sagemaker/&quot;&gt;AWS Sagemaker&lt;/a&gt; for training.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/clazzify/alexnet-layers.png&quot; alt=&quot;AlexNet layers description image&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We initially trained our models on in-house GPU servers before transitioning to &lt;a href=&quot;https://aws.amazon.com/ec2/instance-types/#Accelerated_Computing&quot;&gt;AWS GPU p2.xlarge instances&lt;/a&gt; to scale our efforts. Rather than using transfer learning on an existing ImageNet model, we trained our models from scratch to better fit the unique characteristics of our restaurant industry domain photos. We worked with 50,000 images for each of our four classes: food, ambiance, menu, and human. As illustrated in the graph below, our efforts resulted in achieving approximately 92% validation accuracy.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/clazzify/accuracy-loss-graph.png&quot; alt=&quot;Accuracy-Loss Graph&quot; /&gt;&lt;/p&gt;
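
&lt;p&gt;For readers looking for a starting point, below is a minimal four-class convnet in the spirit of AlexNet, written with today’s Keras layer names. The layer sizes are illustrative and not our production architecture.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(96, (11, 11), strides=4, activation=&apos;relu&apos;, input_shape=(227, 227, 3)),
    MaxPooling2D((3, 3), strides=2),
    Conv2D(256, (5, 5), padding=&apos;same&apos;, activation=&apos;relu&apos;),
    MaxPooling2D((3, 3), strides=2),
    Flatten(),
    Dense(512, activation=&apos;relu&apos;),
    Dropout(0.5),
    Dense(4, activation=&apos;softmax&apos;),  # food, ambiance, menu, human
])
model.compile(optimizer=&apos;sgd&apos;, loss=&apos;categorical_crossentropy&apos;, metrics=[&apos;accuracy&apos;])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;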

&lt;h3 id=&quot;production-deployment&quot;&gt;Production Deployment&lt;/h3&gt;

&lt;p&gt;For serving the model, we developed an internal API using Flask. We enhanced it with authentication layers and deployed it within our internal VPC network. While today, tools like ONNX and TensorFlow Serving are commonly used for model inference, back in 2016, the landscape for ML model inference was still maturing. As a result, we chose to proceed with a Flask-based API.&lt;/p&gt;
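
&lt;p&gt;A stripped-down sketch of what such an endpoint can look like is shown below. The route, label names, model path, and preprocessing constants are assumptions for illustration, not our internal API.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import cv2
import numpy as np
from flask import Flask, jsonify, request
from keras.models import load_model

app = Flask(__name__)
model = load_model(&apos;classifier.h5&apos;)  # hypothetical path; loaded once at startup
LABELS = [&apos;food&apos;, &apos;ambiance&apos;, &apos;menu&apos;, &apos;human&apos;]

@app.route(&apos;/classify&apos;, methods=[&apos;POST&apos;])
def classify():
    # Decode the uploaded image entirely in memory and resize to the model input.
    file_bytes = np.frombuffer(request.files[&apos;image&apos;].read(), dtype=np.uint8)
    img = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
    img = cv2.resize(img, (227, 227)).astype(&apos;float32&apos;) / 255.0  # assumed normalization
    scores = model.predict(np.expand_dims(img, axis=0))[0]
    return jsonify(dict(zip(LABELS, [float(s) for s in scores])))

if __name__ == &apos;__main__&apos;:
    app.run(host=&apos;0.0.0.0&apos;, port=5000)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;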

&lt;p&gt;We containerized the API using Docker, with a Miniconda3 base image. After every code merge, Jenkins would run unit tests and build the final Docker image, which included both the application code and the latest version of the model. Automated tests were then executed on this image to validate the inference accuracy on a predefined set of images. Once these tests passed, the Docker image was deployed to AWS Elastic Beanstalk, where the API could automatically scale based on incoming request load.&lt;/p&gt;
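
&lt;p&gt;Such automated tests can be as simple as asserting the top label on a small, manually verified golden set. Here is a hedged sketch against the endpoint above; the fixture paths and expected labels are made up.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import requests

# Golden set: images whose labels were verified manually (paths are made up).
GOLDEN = [
    (&apos;tests/fixtures/butter_chicken.jpg&apos;, &apos;food&apos;),
    (&apos;tests/fixtures/menu_card.jpg&apos;, &apos;menu&apos;),
]

def test_inference_on_golden_set():
    for path, expected in GOLDEN:
        with open(path, &apos;rb&apos;) as f:
            resp = requests.post(&apos;http://localhost:5000/classify&apos;, files={&apos;image&apos;: f})
        scores = resp.json()
        # The top-scoring label must match the manually verified one.
        assert max(scores, key=scores.get) == expected
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;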

&lt;p&gt;Once the API was live, every time an image was uploaded to Zomato, it was queued for processing. Multiple workers would pick the image from the queue, request inference scores from the API, and save these scores in our database.&lt;/p&gt;
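
&lt;p&gt;The worker side reduces to a polling loop. Below is a sketch assuming an SQS-style queue and the endpoint above; the queue name and internal host are hypothetical.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import boto3
import requests

sqs = boto3.resource(&apos;sqs&apos;)
queue = sqs.get_queue_by_name(QueueName=&apos;image-classification&apos;)  # hypothetical queue

while True:
    # Long-poll the queue; each message body carries an image URL.
    for message in queue.receive_messages(WaitTimeSeconds=20):
        image_bytes = requests.get(message.body).content
        resp = requests.post(&apos;http://classifier.internal/classify&apos;,  # hypothetical host
                             files={&apos;image&apos;: image_bytes})
        # ... save resp.json() scores to the database here ...
        message.delete()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;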

&lt;p&gt;Initially, we utilized this setup on the backend for moderation and various other internal use cases. On the product side, we made this &lt;a href=&quot;https://twitter.com/ylogx/status/844817269297311744&quot;&gt;live&lt;/a&gt; for Food &amp;amp; Ambiance classification. It was first integrated into our web platform, with subsequent releases adding it to our mobile apps. The image below illustrates the impact of using image classification, showing the results before and after its implementation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/clazzify/food-ambiance-in-product.png&quot; alt=&quot;Food Ambiance - results before and after classification&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This example highlights how image classification can make it easier to find ambiance shots, especially when the initial images on the restaurant page are predominantly food shots.&lt;/p&gt;

&lt;h2 id=&quot;evolution&quot;&gt;Evolution&lt;/h2&gt;

&lt;p&gt;From our first model, we learned to streamline our data-gathering and model-training processes significantly, reducing the turnaround time from idea to trained model and thereby the time to deployment. Future blog posts will cover our evolving ML training processes and other models in production. Stay tuned for updates.&lt;/p&gt;

&lt;p&gt;We are rapidly expanding our machine learning team, which has grown 5x in just the last year. Check out our &lt;a href=&quot;https://www.zomato.com/careers&quot;&gt;careers page&lt;/a&gt; if you’re interested in joining us.&lt;/p&gt;

</description>
        <pubDate>Sat, 02 Feb 2019 18:37:40 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/zomato/ml/images/classification</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/zomato/ml/images/classification</guid>
        
        <category>zomato</category>
        
        <category>ml</category>
        
        <category>classification</category>
        
        <category>deep learning</category>
        
        <category>neural network</category>
        
        <category>alexnet</category>
        
        
      </item>
    
      <item>
        <title>Use In-Memory Buffers to Avoid Disk IO</title>
        <description>&lt;p&gt;At &lt;a href=&quot;https://www.zomato.com/blog/&quot;&gt;Zomato&lt;/a&gt;, we handle a vast number of images, with close to a hundred thousand new images daily. Often, we need to download, process, and then pass these images to our models. The traditional workflow involves fetching an image from a URL, saving it to a file, and then passing that file path for further processing.&lt;/p&gt;

&lt;h3 id=&quot;traditional-workflow&quot;&gt;Traditional Workflow&lt;/h3&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;logging&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;tempfile&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;cv2&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;requests&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;download_image&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Downloading image from url: %s&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;file_descriptor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tempfile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mkstemp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prefix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;image-&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;suffix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;.jpg&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Saving file: %s&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_descriptor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;wb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filename&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;https://chaudhary.page.link/test-zomato-img&apos;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;image_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;download_image&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;resized_img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# preprocess(resized_img)
# prediction_score = model.predict(resized_img)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;While this approach works for a few images, it creates significant unnecessary disk IO when processing millions of images at Zomato’s scale. Additionally, in a dockerized environment, it leaves numerous temporary files behind on the container filesystem.&lt;/p&gt;

&lt;h3 id=&quot;optimized-workflow-with-in-memory-buffers&quot;&gt;Optimized Workflow with In-Memory Buffers&lt;/h3&gt;

&lt;p&gt;To eliminate unnecessary disk IO, we can use in-memory buffers. In Python, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;io.BytesIO&lt;/code&gt; allows you to create a buffer in RAM that can be used like a file object; it is freed automatically when closed, or when it goes out of scope if you use a context manager.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BytesIO&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;cv2&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;requests&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;https://chaudhary.page.link/test-zomato-img&apos;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;image_data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BytesIO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;file_bytes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asarray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;bytearray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dtype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uint8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imdecode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IMREAD_COLOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;image_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;resized_img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# preprocess(resized_img)
# prediction_score = model.predict(resized_img)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
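
&lt;p&gt;Since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BytesIO&lt;/code&gt; also works as a context manager, the buffer can be released automatically instead of calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;close()&lt;/code&gt; by hand. A minimal variant of the snippet above:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;with BytesIO(response_object.content) as image_data:
    file_bytes = np.asarray(bytearray(image_data.read()), dtype=np.uint8)
# the buffer is freed automatically on leaving the with-block
img = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
resized_img = cv2.resize(img, (299, 299))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;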

&lt;p&gt;Using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;imdecode&lt;/code&gt; directly on the response bytes, we can simplify the process further and drop the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BytesIO&lt;/code&gt; buffer altogether.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;cv2&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;requests&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;https://chaudhary.page.link/test-zomato-img&apos;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;file_bytes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asarray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;bytearray&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response_object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dtype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uint8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imdecode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_bytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IMREAD_COLOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;resized_img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cv2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# preprocess(resized_img)
# prediction_score = model.predict(resized_img)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;performance-analysis&quot;&gt;Performance Analysis&lt;/h3&gt;

&lt;p&gt;To analyze the performance of these methods, I conducted a simple test. Here are the results on my system:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;With File IO: 35.4 ms ± 2.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
With Bytes IO: 35.1 ms ± 3.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
With Direct Decode: 34.6 ms ± 1.74 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
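
&lt;p&gt;The output format above matches IPython’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%timeit&lt;/code&gt;. To reproduce similar numbers in a plain script, a minimal sketch could look like the following, where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;process_image&lt;/code&gt; is a stand-in for any of the three variants above:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import statistics
import timeit

def process_image():
    ...  # download + decode + resize, any of the three variants above

# 7 runs of 10 loops each, mirroring the %timeit output above
runs = timeit.repeat(process_image, repeat=7, number=10)
per_loop_ms = [1000 * t / 10 for t in runs]
print(f&apos;{statistics.mean(per_loop_ms):.1f} ms ± {statistics.stdev(per_loop_ms):.2f} ms per loop&apos;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;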

&lt;p&gt;The in-memory approaches eliminate the unnecessary disk IO, which this timing test does not capture; that is why the wall-clock difference looks minimal. Splitting the process into separate scripts and running them under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;strace&lt;/code&gt; (for example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;strace -c -e trace=open,openat python3 script.py&lt;/code&gt;) shows the number of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;open&lt;/code&gt; calls, which is lower for the in-memory methods.&lt;/p&gt;

&lt;p&gt;You can find the code to generate these performance numbers &lt;a href=&quot;https://gist.github.com/7b5d7f0957a4aa3c84c010f3d7f27643&quot;&gt;here&lt;/a&gt;. Let me know if you achieve similar results.&lt;/p&gt;


&lt;h3 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;Using in-memory buffers optimizes image-processing workflows by removing unnecessary disk IO. The per-image wall-clock gain is small, but at a large scale such as Zomato’s, avoiding millions of temporary files adds up to real resource savings, particularly in dockerized environments.&lt;/p&gt;
</description>
        <pubDate>Mon, 12 Nov 2018 18:29:16 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/in-memory-buffers</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/in-memory-buffers</guid>
        
        <category>python</category>
        
        <category>BytesIO</category>
        
        <category>buffers</category>
        
        <category>in-memory buffers</category>
        
        <category>zomato</category>
        
        
      </item>
    
      <item>
        <title>KDE Containerization Talk at Akademy 2018</title>
<description>&lt;p&gt;This August I got the opportunity to be a part of the biggest gathering of KDE developers - &lt;a href=&quot;https://akademy.kde.org&quot;&gt;Akademy&lt;/a&gt; &lt;a href=&quot;https://akademy.kde.org/2018&quot;&gt;2018&lt;/a&gt;.
The Akademy conference gathers hundreds of KDE developers together for almost an entire week.&lt;/p&gt;

&lt;p&gt;It was held at TU Wien (Technical University of Vienna) in the beautiful city of &lt;a href=&quot;https://en.wikipedia.org/wiki/Vienna&quot;&gt;Vienna&lt;/a&gt;, Austria,
from Saturday 11th to Friday 17th August 2018.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/akademy/2018/tu-wien-front.JPG&quot; alt=&quot;TU Wien Front Gate&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The Akademy conference, as usual, had 2 days of talks by KDE contributors,
followed by the rest of the week comprising informal BoF (birds of a feather) sessions, a team outing and a lot more.&lt;/p&gt;


&lt;h2 id=&quot;talk-containerizing-kde&quot;&gt;Talk: Containerizing KDE&lt;/h2&gt;

&lt;p&gt;At the conference, &lt;a href=&quot;https://anumittal.in&quot;&gt;Anu&lt;/a&gt; was amazing enough to let me be a part of her talk.
You should definitely go subscribe to Anu’s &lt;a href=&quot;https://anumittal.in&quot;&gt;blog&lt;/a&gt;.
We presented a &lt;a href=&quot;https://youtu.be/DuVWaCq_Cz4?t=14m45s&quot;&gt;talk&lt;/a&gt; on the containerization of KDE applications.&lt;/p&gt;

&lt;p&gt;In this talk we discussed various containerization techniques.
We also demonstrated how containerization of KDE can be useful for developers and end users.&lt;/p&gt;

&lt;iframe width=&quot;700&quot; height=&quot;390&quot; src=&quot;https://www.youtube-nocookie.com/embed/DuVWaCq_Cz4?start=885&quot; frameborder=&quot;0&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h3 id=&quot;overview&quot;&gt;Overview&lt;/h3&gt;
&lt;p&gt;Setting up a development environment for software can be time-consuming and at times a bit confusing.
There are many libraries and packages that need to be installed, and they might conflict with existing system packages.
There are various ways to containerize an application; we discussed two major approaches - &lt;a href=&quot;https://www.docker.com&quot;&gt;Docker&lt;/a&gt; and &lt;a href=&quot;https://www.flatpak.org/&quot;&gt;Flatpak&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;docker&quot;&gt;Docker&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://www.docker.com&quot;&gt;Docker&lt;/a&gt; helps a developer by setting up a sandboxed development environment in a container, which can be used for debugging, testing or developing a new feature.
You can run multiple such environments in parallel, e.g. stable &amp;amp; development environments.&lt;/p&gt;

&lt;h4 id=&quot;installing-docker&quot;&gt;Installing Docker&lt;/h4&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    apt-transport-https &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    ca-certificates &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    curl &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    software-properties-common
    
curl &lt;span class=&quot;nt&quot;&gt;-fsSL&lt;/span&gt; https://download.docker.com/linux/ubuntu/gpg | &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-key add -
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;add-apt-repository &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
   &lt;span class=&quot;s2&quot;&gt;&quot;deb [arch=amd64] https://download.docker.com/linux/ubuntu &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;
   &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;lsb_release &lt;span class=&quot;nt&quot;&gt;-cs&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;
   stable&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get update
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;docker-ce

&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker run hello-world
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can check out more specific information on the Docker website &lt;a href=&quot;https://docs.docker.com/install/linux/docker-ce/ubuntu/#install-using-the-repository&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id=&quot;running-kde-applications-using-docker&quot;&gt;Running KDE applications using docker&lt;/h4&gt;
&lt;p&gt;KDE Neon is a project focused on building tooling that makes it easy to run KDE applications on Docker.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget https://cgit.kde.org/docker-neon.git/plain/neondocker/neondocker.rb
&lt;span class=&quot;nb&quot;&gt;chmod&lt;/span&gt; +x neondocker.rb
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;gem &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;docker-api
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;ruby-dev

./neondocker.rb okular
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can find more information about KDE Neon dockerization &lt;a href=&quot;https://community.kde.org/Neon/Docker&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;flatpak&quot;&gt;Flatpak&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://www.flatpak.org/&quot;&gt;Flatpak&lt;/a&gt; provides a sandbox environment in which users can run applications in isolation from the rest of the system.
Flatpak is tightly coupled with Linux and mainly focuses on bundling and sandboxing desktop applications on Linux hosts.&lt;/p&gt;

&lt;h4 id=&quot;installing-flatpak&quot;&gt;Installing Flatpak&lt;/h4&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;add-apt-repository ppa:alexlarsson/flatpak
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt update
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;flatpak
    
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;gnome-software-plugin-flatpak
flatpak remote-add &lt;span class=&quot;nt&quot;&gt;--if-not-exists&lt;/span&gt; flathub https://flathub.org/repo/flathub.flatpakrepo

&lt;span class=&quot;c&quot;&gt;# sudo reboot now&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can check out more specific information on the Flatpak website &lt;a href=&quot;https://www.flatpak.org/setup/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id=&quot;running-kde-applications-using-flatpak&quot;&gt;Running KDE applications using flatpak&lt;/h4&gt;

&lt;p&gt;There is a &lt;a href=&quot;https://github.com/KDE/flatpak-kde-applications&quot;&gt;wide list&lt;/a&gt; of KDE applications available via Flatpak.
To run an application like Okular, you need just a couple of commands:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;flatpak remote-add &lt;span class=&quot;nt&quot;&gt;--if-not-exists&lt;/span&gt; kdeapps &lt;span class=&quot;nt&quot;&gt;--from&lt;/span&gt; https://distribute.kde.org/kdeapps.flatpakrepo
flatpak &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;kdeapps org.kde.okular
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can find out more information about KDE and Flatpak &lt;a href=&quot;https://community.kde.org/Guidelines_and_HOWTOs/Flatpak&quot;&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;come-be-a-part-of-the-kde-community-&quot;&gt;Come be a part of the KDE community :)&lt;/h2&gt;

&lt;p&gt;Coding is not the only way to contribute to KDE. &lt;img src=&quot;/blog/img/akademy/2018/vienna-dessert.jpg&quot; alt=&quot;Vienna, Austria&quot; width=&quot;100&quot; /&gt;
There are many different ways in which you can contribute; I can name 10 right away:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Bug Reporting&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging&quot;&gt;Bug Triaging&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.kde.org/community/donations/index.php&quot;&gt;Donation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://community.kde.org/Get_Involved/translation&quot;&gt;Translation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Visual and Human Interface Design&lt;/li&gt;
  &lt;li&gt;Documentation&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://community.kde.org/Get_Involved/promotion&quot;&gt;Promotion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Accessibility&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://community.kde.org/Get_Involved/development&quot;&gt;Development&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Add your project to KDE &lt;a href=&quot;https://community.kde.org/Incubator&quot;&gt;Incubator&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Check out the community wiki for more information about &lt;a href=&quot;https://community.kde.org/Get_Involved&quot;&gt;contributing to KDE&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/img/akademy/2018/kde.india.jpg&quot; alt=&quot;KDE India&quot; /&gt;&lt;/p&gt;

</description>
        <pubDate>Thu, 27 Sep 2018 20:02:31 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/akademy/2018</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/akademy/2018</guid>
        
        <category>kde</category>
        
        <category>kubuntu</category>
        
        <category>akademy</category>
        
        <category>2018</category>
        
        
      </item>
    
      <item>
        <title>Models - Support Vector Machine</title>
        <description>&lt;p&gt;Classifying data is a common task in machine learning.
Suppose some given data points each belong to one of two classes, and the goal is to decide which class a new data point will be in.
In the case of support vector machines, a data point is viewed as a p-dimensional vector (a list of p numbers), and we want to know whether we can separate such points with a (p-1)-dimensional hyperplane. This is called a linear classifier.&lt;/p&gt;

&lt;p&gt;There are many hyperplanes that might classify the data. One reasonable choice as the best hyperplane is the one that represents the largest separation, or margin, between the two classes.
So we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized.&lt;/p&gt;

&lt;p&gt;If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum margin classifier; or equivalently, the perceptron of optimal stability.&lt;/p&gt;

&lt;h4 id=&quot;usages&quot;&gt;Usages&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Classification&lt;/li&gt;
  &lt;li&gt;Regression&lt;/li&gt;
&lt;/ul&gt;

&lt;h6 id=&quot;pros&quot;&gt;Pros&lt;/h6&gt;

&lt;ul&gt;
  &lt;li&gt;Accuracy&lt;/li&gt;
  &lt;li&gt;Works well on smaller cleaner datasets&lt;/li&gt;
  &lt;li&gt;It can be more memory-efficient, because the decision function uses only a subset of the training points (the support vectors)&lt;/li&gt;
&lt;/ul&gt;

&lt;h6 id=&quot;cons&quot;&gt;Cons&lt;/h6&gt;

&lt;ul&gt;
  &lt;li&gt;Isn’t suited to larger datasets as the training time with SVMs can be high&lt;/li&gt;
  &lt;li&gt;Less effective on noisier datasets with overlapping classes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;maximum-margin&quot;&gt;Maximum Margin&lt;/h3&gt;

&lt;p&gt;Refer to &lt;a href=&quot;https://www.youtube.com/watch?v=_PwhiWxHK8o&quot;&gt;this lecture&lt;/a&gt; by MIT OCW&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;    # Decision rule: classify an unknown sample u as positive if
    w . u + b &amp;gt;= 0

    # Constraints on the training samples
    w . x_pos + b &amp;gt;= +1
    w . x_neg + b &amp;lt;= -1
    # or, folding both into one with labels y in {+1, -1}:
    y * (w . x + b) &amp;gt;= 1
    y * (w . x + b) - 1 == 0   # for samples on the gutter (support vectors)

    # Width of the margin, using one support vector from each side
    width = (x_pos - x_neg) . (w / |w|)
          = ((1 - b) + (1 + b)) / |w|
          = 2 / |w|

    # Maximizing 2/|w|  ==  minimizing |w|  ==  minimizing (1/2) |w|^2

    # The resulting minimization depends only on dot products (xi . xj),
    # which is what makes the kernel trick possible&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h3 id=&quot;kernel-trick&quot;&gt;Kernel Trick&lt;/h3&gt;
&lt;p&gt;In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick,
implicitly mapping their inputs into high-dimensional feature spaces.&lt;/p&gt;

&lt;h5 id=&quot;how-to-select-support-vector-machine-kernels&quot;&gt;How to Select Support Vector Machine Kernels&lt;/h5&gt;

&lt;p&gt;When to use linear:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../img/svm/separable_linear.png&quot; alt=&quot;&quot; /&gt;
&lt;img src=&quot;../img/svm/separable_rbf.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;When to use rbf:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../img/svm/circle_linear.png&quot; alt=&quot;&quot; /&gt;
&lt;img src=&quot;../img/svm/circle_rbf.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;How did RBF do this?&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../img/svm/circle_rbf_dimension_explaination.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The RBF kernel SVM’s decision region is in fact also a linear decision region, just in a transformed space. What the RBF kernel SVM actually does is create non-linear combinations of your features to lift your samples into a higher-dimensional feature space, where a linear decision boundary can separate your classes.&lt;/p&gt;
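
&lt;p&gt;To see this in practice, here is a small self-contained sketch (my own example, not the code behind the images above) comparing a linear and an RBF kernel on concentric-circle data:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the original 2D space
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in (&apos;linear&apos;, &apos;rbf&apos;):
    clf = SVC(kernel=kernel, C=1, gamma=&apos;scale&apos;).fit(X_train, y_train)
    # rbf separates the circles (score near 1.0); linear cannot (near 0.5)
    print(kernel, clf.score(X_test, y_test))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;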

&lt;h3 id=&quot;python&quot;&gt;Python&lt;/h3&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;    # Import library
    from sklearn import svm

    # Assumes you have X (predictors) and y (targets) for the training
    # set, and x_test (predictors) for the test set

    # Create an SVM classification object; the kernel, gamma and C
    # values can all be tuned (see the kernel section above)
    model = svm.SVC(kernel=&apos;linear&apos;, C=1, gamma=1)

    # Train the model on the training set and check the score
    model.fit(X, y)
    model.score(X, y)

    # Predict output
    predicted = model.predict(x_test)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

</description>
        <pubDate>Sun, 03 Jun 2018 18:30:00 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/model/svm</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/model/svm</guid>
        
        <category>models</category>
        
        <category>svm</category>
        
        <category>support vector machine</category>
        
        <category>ml</category>
        
        
      </item>
    
      <item>
        <title>Paper Summary - End to End Interpretation of French Street Name Signs Dataset</title>
        <description>&lt;p&gt;This is a summary for the research paper &lt;a href=&quot;https://arxiv.org/abs/1702.03970&quot;&gt;End-to-End Interpretation of the French Street Name Signs Dataset&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a model that takes multiple Street View shots of street-name signs as input and outputs the name in the format that is shown directly in Google Maps -
a fully end-to-end model. This includes reading the image, parsing the text, converting the text to the Google Maps standard, and combining the text from multiple images into the
most accurate version. A pretty interesting problem and solution. This is one of the papers that inspired the Tesseract LSTM model.&lt;/p&gt;

&lt;p&gt;First of all, they broke street-sign transcription (image to text) into simpler problems for their human moderators.
They detected the street signs using a neural network that produced bounding boxes for the signs. Then they collected
multiple views of the same sign using the geo-coordinates of each capture. Each image was then transcribed using OCR,
reCAPTCHA and humans in turn: OCR seeded the data for reCAPTCHA, humans verified the reCAPTCHA input, and incorrect
transcriptions were forwarded to human moderators. They never transcribed the text exactly as it was shown in the image,
but the way they wanted it to be shown in Google Maps.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/U9gaBqG.png&quot; alt=&quot;img-fsns-tiles&quot; /&gt;&lt;/p&gt;


&lt;h2 id=&quot;recurrent-model---street&quot;&gt;Recurrent Model - STREET&lt;/h2&gt;
&lt;p&gt;Then, using this dataset, they trained the &lt;a href=&quot;https://github.com/tensorflow/models/tree/master/street&quot;&gt;STREET model&lt;/a&gt; (StreetView Tensorflow Recurrent End-to-End
Transcription) on the end-to-end problem: taking a set of 4 views of a street sign as input and transcribing the
street name, as it should appear in Maps, as output.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/gu4JEjs.png&quot; alt=&quot;image fsns network&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;cnn&quot;&gt;CNN&lt;/h3&gt;
&lt;p&gt;The input is detiled into 4 images of 150x150 each, and two rounds of convolution with max pooling reduce the
dimensions from 150x150 to 25x25.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/tpyIs9I.png&quot; alt=&quot;img-fsns-conv&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;text-finding--reading&quot;&gt;Text Finding &amp;amp; Reading&lt;/h3&gt;
&lt;p&gt;Vertically summarizing Long Short-Term Memory (LSTM) cells are used to find text lines.
A vertically summarizing LSTM is a summarizing LSTM that scans the input &lt;strong&gt;vertically&lt;/strong&gt;.
It is thus expected to compute a vertical summary of its input, which will be taken from the last vertical timestep.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/GWEZWhb.png&quot; alt=&quot;img-fsns-lstm&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Three different vertical summarizations are done and then combined later:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Upward to find the top textline.&lt;/li&gt;
  &lt;li&gt;Separate upward and downward LSTMs, with depth-concatenated outputs, to find the middle
textline.&lt;/li&gt;
  &lt;li&gt;Downward to find the bottom textline.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Although each vertically summarizing LSTM sees the same input, and could theoretically summarize
the entirety of what it sees, they are organized this way so that they only have to produce a summary
of the most recently seen information.&lt;/p&gt;
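
&lt;p&gt;As a rough illustration (my own Keras approximation, not the paper’s actual code), a vertically summarizing LSTM can be written as an LSTM over the height axis, applied independently at every horizontal position and keeping only the last timestep:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import tensorflow as tf

# A 25x25 feature map with 64 channels, as produced by the CNN layers above
feature_map = tf.keras.Input(shape=(25, 25, 64))           # (H, W, C)

# Make width the outer axis so each column is scanned separately
columns = tf.keras.layers.Permute((2, 1, 3))(feature_map)  # (W, H, C)

# LSTM over the height axis per column; by default only the last
# vertical timestep is returned - the vertical summary
summary = tf.keras.layers.TimeDistributed(
    tf.keras.layers.LSTM(64))(columns)                     # (W, 64)

model = tf.keras.Model(feature_map, summary)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;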

&lt;p&gt;Since the &lt;em&gt;middle line is harder to find, it gets two LSTMs working in opposite directions&lt;/em&gt;.
Each output from the CNN layers is passed to a separate bidirectional horizontal LSTM to recognize the text.
Bidirectional LSTMs have been shown to be able to read text with high accuracy.
The outputs of the bidirectional LSTMs are concatenated in the &lt;strong&gt;x-dimension&lt;/strong&gt;, to string the text lines out in
reading order.&lt;/p&gt;

&lt;h3 id=&quot;character-position-normalization-and-combination-of-individual-outputs&quot;&gt;Character Position Normalization and Combination of individual outputs&lt;/h3&gt;
&lt;p&gt;Since all four input images may have text positioned differently, the network is given the ability to shuffle data in the x
dimension by adding two more LSTM layers - one scanning left to right and the other right to left.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/Y7JHNKd.png&quot; alt=&quot;img-fsns-cpn&quot; /&gt;&lt;/p&gt;

&lt;p&gt;After this, a unidirectional LSTM is used to combine the four views of each input image to produce the most accurate
text. This is also the layer that learns the Title Case normalization. A 50% dropout is added between the reshape steps for
regularization.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/FP4ebp7.png&quot; alt=&quot;img-fsns-comb&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;final-network&quot;&gt;Final Network&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/UBTlcBE.png&quot; alt=&quot;img-fsns-layers&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;paper&quot;&gt;Paper&lt;/h2&gt;
&lt;p&gt;Here’s the full research paper with the important parts highlighted by me.&lt;/p&gt;

&lt;iframe class=&quot;scribd_iframe_embed&quot; title=&quot;End-To-End Interpretation of the French Street Name Signs Dataset - 1702.03970&quot; src=&quot;https://www.scribd.com/embeds/460525114/content?start_page=1&amp;amp;view_mode=scroll&amp;amp;access_key=key-5KQPMHWBafa6zoyUUk3c&quot; data-auto-height=&quot;false&quot; data-aspect-ratio=&quot;0.7535505430242272&quot; scrolling=&quot;no&quot; width=&quot;100%&quot; height=&quot;850&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;You can &lt;a href=&quot;https://www.scribd.com/document/460525114/End-To-End-Interpretation-of-the-French-Street-Name-Signs-Dataset-1702-03970#download&quot;&gt;download the pdf&lt;/a&gt; for free from Scribd.&lt;/p&gt;

</description>
        <pubDate>Wed, 21 Feb 2018 05:30:51 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/papers/ml/fsns</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/papers/ml/fsns</guid>
        
        <category>paper</category>
        
        <category>summary</category>
        
        <category>lstm</category>
        
        <category>cnn</category>
        
        <category>ml</category>
        
        
      </item>
    
      <item>
        <title>Spark Streaming</title>
        <description>&lt;h2 id=&quot;extract-transform-load-etl&quot;&gt;Extract Transform Load (ETL)&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&quot;https://en.wikipedia.org/wiki/Extract,_transform,_load&quot;&gt;ETL&lt;/a&gt; process is to &lt;em&gt;fetch&lt;/em&gt; data from different types of systems, &lt;em&gt;structure&lt;/em&gt; it, and &lt;em&gt;save&lt;/em&gt; it into the destination database.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/xyD2KsE.jpg&quot; alt=&quot;ETL Pipeline&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;batch&quot;&gt;Batch&lt;/h2&gt;
&lt;p&gt;In the case of a batch job, the query will be run on the data saved at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;source-path&lt;/code&gt; and the transformed data will be saved at the destination &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dest-path&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/I7uQvCT.png&quot; alt=&quot;Batch Job&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;streaming&quot;&gt;Streaming&lt;/h2&gt;
&lt;p&gt;In the case of a streaming job, the query runs continuously on the data from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;source-path&lt;/code&gt;, and transformed data is appended to the destination &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dest-path&lt;/code&gt; again and again as new data comes in.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/SYOgWWV.png&quot; alt=&quot;Batch Job converted to streaming&quot; /&gt;&lt;/p&gt;
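
&lt;p&gt;In Structured Streaming, the batch and streaming versions of the same query look almost identical. A rough sketch (the paths, schema and column names below are placeholders, not a real job):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.appName(&apos;etl&apos;).getOrCreate()

# Batch: read everything at source-path once, write once to dest-path
batch_df = spark.read.json(&apos;source-path&apos;)
batch_df.select(&apos;user_id&apos;, &apos;event&apos;).write.parquet(&apos;dest-path&apos;)

# Streaming: the same query, run continuously as new files arrive
stream_df = spark.readStream.schema(batch_df.schema).json(&apos;source-path&apos;)
(stream_df.select(&apos;user_id&apos;, &apos;event&apos;)
    .writeStream
    .format(&apos;parquet&apos;)
    .option(&apos;path&apos;, &apos;dest-path&apos;)
    .option(&apos;checkpointLocation&apos;, &apos;checkpoint-path&apos;)
    .start())
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;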

&lt;h3 id=&quot;merging-static-data-db-with-streaming-data&quot;&gt;Merging static data (DB) with streaming data&lt;/h3&gt;
&lt;p&gt;There might be use cases where you want to merge static data (e.g. MySQL) with the streaming data. You can do this as follows:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/8tyNqcT.png&quot; alt=&quot;Joining Streaming Data&quot; /&gt;&lt;/p&gt;
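
&lt;p&gt;Continuing the sketch above, a stream-static join is just a regular &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;join&lt;/code&gt; between the streaming DataFrame and a DataFrame read from the database (the JDBC details are placeholders):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Static dimension table, e.g. from MySQL (placeholder connection details)
static_df = spark.read.jdbc(
    url=&apos;jdbc:mysql://db-host:3306/app&apos;,
    table=&apos;users&apos;,
    properties={&apos;user&apos;: &apos;reader&apos;, &apos;password&apos;: &apos;...&apos;})

# Spark re-plans the join against the static side for every micro-batch
joined = stream_df.join(static_df, on=&apos;user_id&apos;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;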

&lt;h2 id=&quot;executing-the-job&quot;&gt;Executing the Job&lt;/h2&gt;

&lt;h3 id=&quot;batch-execution&quot;&gt;Batch Execution&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;https://i.imgur.com/21afWnk.png&quot; alt=&quot;Batch Plan&quot; /&gt;
&lt;img src=&quot;https://i.imgur.com/6dXWnmn.png&quot; alt=&quot;Batch Plan Execution&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;stream-execution&quot;&gt;Stream Execution&lt;/h3&gt;
&lt;p&gt;From the planner’s logical plan, an incremental execution plan is generated on top of it:
&lt;img src=&quot;https://i.imgur.com/JV1wQcb.png&quot; alt=&quot;Incremental&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=rl8dIzTpxrI&quot;&gt;Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
        <pubDate>Tue, 25 Jul 2017 19:05:00 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/spark-streaming</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/spark-streaming</guid>
        
        <category>zomato</category>
        
        <category>ml</category>
        
        <category>machine learning</category>
        
        <category>spark</category>
        
        <category>streaming</category>
        
        <category>spark streaming</category>
        
        
      </item>
    
      <item>
        <title>Docker 101</title>
        <description>&lt;p&gt;When working in multi node environment like Spark/Hadoop clusters, docker diminishes the barrier to entry. By barrier to entry, I mean the need to have a constantly running EMR cluster, when you are still in development phase. With Docker, you can quickly setup a 4-5 node cluster on a single machine and start coding your spark job. You can understand &lt;a href=&quot;https://www.docker.com/what-docker&quot;&gt;what Docker is&lt;/a&gt; and &lt;a href=&quot;https://www.docker.com/use-cases&quot;&gt;why you would use Docker&lt;/a&gt; on these links.&lt;/p&gt;

&lt;h3 id=&quot;benefits&quot;&gt;Benefits&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;You can very easily version-control your environment&lt;/li&gt;
  &lt;li&gt;The barrier to entry for working with clusters (Spark/Hadoop etc.) drops considerably. You no longer need access to an EMR cluster, which has a cost associated with it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;installing-docker&quot;&gt;Installing Docker&lt;/h2&gt;
&lt;p&gt;Follow &lt;a href=&quot;https://docs.docker.com/engine/installation/&quot;&gt;this official guide&lt;/a&gt;&lt;/p&gt;

&lt;h4 id=&quot;manual&quot;&gt;Manual&lt;/h4&gt;
&lt;p&gt;For &lt;a href=&quot;https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/#install-docker-ce&quot;&gt;Ubuntu&lt;/a&gt;, the quick steps are:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   &quot;deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable&quot;
sudo apt-get update
sudo apt-get install docker-ce
sudo docker run hello-world
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;with-ansible&quot;&gt;With Ansible&lt;/h4&gt;
&lt;p&gt;You can use the following command to have Ansible install Docker for you:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo python2.7 -m pip install ansible \
  &amp;amp;&amp;amp; sudo ansible-galaxy install --force angstwad.docker_ubuntu \
  &amp;amp;&amp;amp; echo &apos;- hosts: all
  roles:
      - angstwad.docker_ubuntu
  &apos; &amp;gt; /tmp/docker_ubuntu.yml \
  &amp;amp;&amp;amp; sudo ansible-playbook /tmp/docker_ubuntu.yml -c local -i &apos;localhost,&apos;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;setting-up-a-cluster&quot;&gt;Setting up a cluster&lt;/h2&gt;
&lt;p&gt;Follow &lt;a href=&quot;https://bigdatagurus.wordpress.com/2017/03/01/how-to-start-spark-cluster-in-minutes/&quot;&gt;this post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You will be able to run a local spark cluster with 4 commands.&lt;/p&gt;

&lt;h4 id=&quot;quick-overview&quot;&gt;Quick overview:&lt;/h4&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mkdir spark_cluster; cd spark_cluster

echo &apos;version: &quot;2&quot;

services:
  master:
    image: singularities/spark
    command: start-spark master
    hostname: master
    ports:
      - &quot;6066:6066&quot;
      - &quot;7070:7070&quot;
      - &quot;8080:8080&quot;
      - &quot;50070:50070&quot;
  worker:
    image: singularities/spark
    command: start-spark worker master
    environment:
      SPARK_WORKER_CORES: 1
      SPARK_WORKER_MEMORY: 2g
    links:
      - master
&apos; &amp;gt; docker-compose.yml

sudo docker-compose up -d

# sudo docker-compose scale worker=2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;extending-other-images&quot;&gt;Extending other images&lt;/h2&gt;
&lt;p&gt;With Docker you can build on top of someone else’s image. For example, here I will extend the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;singularities/spark&lt;/code&gt; image, make my custom Spark configuration changes, and push the final version to my own Docker Hub repo.&lt;/p&gt;

&lt;h3 id=&quot;pushing-your-changes-to-docker-hub&quot;&gt;Pushing your changes to Docker hub&lt;/h3&gt;
&lt;p&gt;To create a fork from a base repo (singularities/spark), these are the steps:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker run &lt;span class=&quot;nt&quot;&gt;-it&lt;/span&gt; singularities/spark  &lt;span class=&quot;c&quot;&gt;# Run base repo. This will open a shell&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# Make your changes to the image in this container&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker login &lt;span class=&quot;nt&quot;&gt;--username&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;chaudhary &lt;span class=&quot;nt&quot;&gt;--password&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;lol
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker commit &amp;lt;container ID from docker ps&amp;gt; chaudhary/my-repo-name  &lt;span class=&quot;c&quot;&gt;# Commit changes&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker tag &amp;lt;image ID from docker images&amp;gt; chaudhary/my-repo-name  &lt;span class=&quot;c&quot;&gt;# Tag for pull to work properly&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker push chaudhary/my-repo-name
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now that you have pushed this image, you can start a new container from this image as shown below:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;docker run &lt;span class=&quot;nt&quot;&gt;-it&lt;/span&gt; chaudhary/my-repo-name
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;
&lt;p&gt;For more information read the official &lt;a href=&quot;https://docs.docker.com/get-started/&quot;&gt;getting started guide&lt;/a&gt;.&lt;/p&gt;

</description>
        <pubDate>Sat, 22 Jul 2017 20:39:00 +0000</pubDate>
        <link>https://shubham.chaudhary.xyz/blog/docker/101</link>
        <guid isPermaLink="true">https://shubham.chaudhary.xyz/blog/docker/101</guid>
        
        <category>zomato</category>
        
        <category>development environment</category>
        
        <category>docker</category>
        
        <category>spark cluster</category>
        
        
      </item>
    
  </channel>
</rss>
